1 Introduction

The origin of neutrino mass and mixing remains one of the important open questions in fundamental physics [1, 2]. It clearly requires the introduction of new particles beyond the particle content of the Standard Model (SM). Qualitatively, we can expect these new particles to induce novel experimental signatures, such as lepton number violation (LNV) and charged lepton flavor violation (LFV), which are either forbidden or highly suppressed in the SM. Arguably, the cleanest method to identify the new particle(s) would be via their direct production at a high-energy collider. By studying the subsequent decays of these new particles to SM particles, preferably involving LNV and/or LFV to reduce SM background, one might be able to pinpoint the underlying neutrino mass model. A summary of existing collider constraints on various neutrino mass models can be found in Refs. [3, 4]. Similarly, a summary of the LFV constraints can be found in Refs. [5, 6].

All past and current high-energy colliders constructed so far [7] involve electron or proton beams and are therefore particularly sensitive to new particles that couple to electrons or quarks. An entirely new class of couplings could be probed using muon colliders, originally proposed long ago [8]. The main advantage is that leptons provide a much cleaner collision environment than hadrons, and muon beams suffer less synchrotron radiation loss than electron beams, thus making muon colliders capable of reaching higher center-of-mass energies with a reasonable-size circular ring design [9, 10]. They have gained considerable attention in recent years [11,12,13,14,15], as novel muon cooling techniques are now available [16], and other technical difficulties related to the muon lifetime and radiation seem solvable [15], making muon colliders an increasingly realistic and desirable option. Most work has been done in the context of future \(\mu ^+\mu ^-\) colliders [17], which would mimic LEP [18] and could reach a center of mass energy of 10 TeV or more.

Here, we will focus on a different experimental setup, \(\mu \)TRISTAN [19], which is a proposed high-energy lepton collider using the ultra-cold antimuon technology developed at J-PARC [20]. It can run in the \(\mu ^+ e^-\) mode with \(\sqrt{s} = 346\) GeV, and later, in the \(\mu ^+ \mu ^+\) mode [21] with \(\sqrt{s}=2\) TeV or higher. It can serve as a Higgs factory and do precision physics [22]. Other new physics studies for the \(\mu ^+ e^-\) and \(\mu ^+ \mu ^+\) collider options can be found in Refs. [23,24,25,26,27,28], respectively. As we will show in this article, the unique initial states of \(\mu \)TRISTAN make it especially sensitive to neutrino mass models involving leptophilic neutral and/or doubly-charged scalars, allowing for direct production and study of these new scalars in regions of parameter space otherwise untestable. We take examples from both tree- and loop-level neutrino mass models. Specifically, we use the Zee model [29], Zee–Babu model [30, 31], cocktail model [32], and type-II seesaw model [33,34,35,36,37] as concrete examples, and we consider the cleanest final states (with the least SM background), i.e., the LFV channels \(\mu ^+ e^- \rightarrow \ell _\alpha ^+\ell _\beta ^-\) and \(\mu ^+\mu ^+\rightarrow \ell _\alpha ^+\ell _\beta ^+\) mediated by the scalars, as well as the associated production of scalars with a photon or Z boson.Footnote 1 We show that \(\mu \)TRISTAN can provide unprecedented sensitivity well beyond existing constraints and complementary to future low-energy LFV searches.

The rest of this article is organized as follows: in Sect. 2 we briefly describe the details of the \(\mu \)TRISTAN collider. In Sect. 3 we go through several neutrino mass models (both radiative and tree-level), derive \(\mu \)TRISTAN’s sensitivity, and compare it to other observables, notably low-energy lepton flavor violation. We conclude in Sect. 4.

2 \(\mu \)TRISTAN

The ultra-cold antimuon technology developed for the muon anomalous magnetic moment and electric dipole moment experiment at J-PARC [20] uses laser ionization of muonium atoms to provide a low-emittance \(\mu ^+\) beam, which can be re-accelerated to high energies [38]. Allowing a 1 TeV \(\mu ^+\) beam to collide with a high-intensity \(e^-\) beam at the TRISTAN (Transposable Ring Intersecting Storage Accelerators in Nippon [39]) energy of 30 GeV in a storage ring of the same size as TRISTAN (3 km circumference), one can realize the \(\mu ^+e^-\) mode of \(\mu \)TRISTAN with a center-of-mass energy \(\sqrt{s}=346~\text {GeV}.\)Footnote 2 Taking into account muon decay, the deliverable instantaneous luminosity for a single detector at any collision point in the storage ring is estimated as \(4.6\times 10^{33}~\text {cm}^{-2}~\text {s}^{-1}\) [22], which translates to an integrated luminosity of \(100~\text {fb}^{-1}~\text {year}^{-1}.\)

Using the same 3 km storage ring and 1 TeV \(\mu ^+\) beams, one can also consider a \(\mu ^+\mu ^+\) collider [21] with \(\sqrt{s}=2~\text {TeV}\) (or 6 TeV for the larger ring option). The beam intensity will be lower than in the \(\mu ^+e^-\) mode due to both muons decaying in the storage ring. The instantaneous luminosity is estimated as \(5.7\times 10^{32}~\text {cm}^{-2}~\text {s}^{-1}\) [22], which translates to an integrated luminosity of \(12~\text {fb}^{-1}~\text {year}^{-1}.\)
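
As a quick cross-check of the quoted collision energies, the following minimal sketch evaluates the center-of-mass energy of two head-on ultra-relativistic beams, \(\sqrt{s}\simeq 2\sqrt{E_1 E_2},\) neglecting the lepton masses. The function and variable names are illustrative only.

```python
import math

def sqrt_s_head_on(e1_gev: float, e2_gev: float) -> float:
    """Center-of-mass energy (GeV) of two head-on beams, masses neglected."""
    return 2.0 * math.sqrt(e1_gev * e2_gev)

# Beam energies quoted in the text (GeV).
sqrt_s_mu_e = sqrt_s_head_on(1000.0, 30.0)     # mu+ e- mode
sqrt_s_mu_mu = sqrt_s_head_on(1000.0, 1000.0)  # mu+ mu+ mode

print(f"mu+ e-  : sqrt(s) = {sqrt_s_mu_e:.1f} GeV")
print(f"mu+ mu+ : sqrt(s) = {sqrt_s_mu_mu:.0f} GeV")
```

With the 1 TeV and 30 GeV beams this gives about 346 GeV, and 2 TeV for two 1 TeV beams, matching the values used throughout.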

The precise luminosity numbers depend on various efficiencies for the muon production, as well as the detailed designs of the muon accelerator and storage ring. For instance, a higher luminosity is, in principle, achievable with better focusing of the \(e^-\) beam (compared to the \(\mu ^+\) beam [20]), following the SuperKEKB design [40]. We will use the numbers given above from Ref. [22] as realistic but conservative order-of-magnitude estimates to work with. Assuming negligible SM background for the LFV signals we study below, the above-mentioned luminosities correspond to a minimum signal cross section of 0.09 (0.75) fb in the \(\mu ^+e^-\) \((\mu ^+\mu ^+)\) mode in order to achieve \(3\sigma \) sensitivity with 1 year runtime. To be conservative, we will use a signal cross section of 0.1 (1) fb in the \(\mu ^+e^-\) \((\mu ^+\mu ^+)\) mode to derive our sensitivity limits. These limits can be easily scaled for a longer runtime. For instance, 10 years of runtime with 1 ab\(^{-1}\) integrated luminosity can achieve the same level of sensitivity with a signal cross section ten times smaller, thus being capable of probing a larger model parameter space than what is shown here.
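
The arithmetic behind these thresholds can be sketched as follows. The quoted minimum cross sections and yearly luminosities correspond to roughly nine signal events per year in either mode; reading this as the underlying counting criterion is our inference from the numbers, and the helper names are hypothetical.

```python
# Yearly integrated luminosity (fb^-1) and minimum signal cross section (fb)
# quoted in the text for the two running modes.
MODES = {
    "mu+ e-": {"lumi_per_year": 100.0, "sigma_min": 0.09},
    "mu+ mu+": {"lumi_per_year": 12.0, "sigma_min": 0.75},
}

def expected_events(sigma_fb: float, lumi_per_year: float, years: float = 1.0) -> float:
    """Signal events N = sigma * integrated luminosity (no efficiencies)."""
    return sigma_fb * lumi_per_year * years

def sigma_needed(n_events: float, lumi_per_year: float, years: float) -> float:
    """Cross section (fb) that yields n_events in the given runtime."""
    return n_events / (lumi_per_year * years)

for name, mode in MODES.items():
    n_1yr = expected_events(mode["sigma_min"], mode["lumi_per_year"])
    sigma_10yr = sigma_needed(n_1yr, mode["lumi_per_year"], years=10.0)
    print(f"{name}: {n_1yr:.1f} events/year, "
          f"same yield in 10 years needs {sigma_10yr:.3f} fb")
```

Since the event count scales linearly with runtime in this background-free approximation, ten years of running reach a cross section ten times smaller.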

Since the details of the \(\mu \)TRISTAN detector design and acceptance efficiencies are currently unknown, we will only impose basic trigger-level cuts on the transverse momenta and pseudorapidity of the outgoing leptons and photons, i.e., the default MadGraph5 cuts \(p_T^{\ell ,\gamma }>10\) GeV and \(|\eta ^{\ell ,\gamma }|<2.5\) [41] while calculating the cross sections in the \(\mu ^+\mu ^+\) option. For the asymmetric beams in the \(\mu ^+e^-\) option, we only keep the trigger-level \(p_T\) cuts and remove the \(\eta \) cuts because the final state particles are boosted in the \(\mu ^+\) direction; the detector should be designed to cover the small-angle region from the beam direction.
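
To illustrate why a symmetric \(\eta \) window is not appropriate for the asymmetric beams, the sketch below estimates the longitudinal boost of the \(\mu ^+e^-\) center-of-mass frame for collinear, effectively massless beams. For massless particles the pseudorapidity shifts additively under such a boost. All names are illustrative, and detector effects are ignored.

```python
import math

E_MU, E_E = 1000.0, 30.0  # beam energies in GeV, as quoted in the text

# Velocity and rapidity of the center-of-mass frame along the mu+ direction.
beta_cm = (E_MU - E_E) / (E_MU + E_E)
y_boost = 0.5 * math.log(E_MU / E_E)

def lab_polar_angle_deg(eta_cm: float) -> float:
    """Lab polar angle (degrees from the mu+ direction) of a massless
    particle emitted with pseudorapidity eta_cm in the center-of-mass frame."""
    eta_lab = eta_cm + y_boost
    return math.degrees(2.0 * math.atan(math.exp(-eta_lab)))

print(f"beta_cm = {beta_cm:.3f}, rapidity shift = {y_boost:.2f}")
print(f"90 deg in the CM frame -> {lab_polar_angle_deg(0.0):.1f} deg in the lab")
```

Under these assumptions the rapidity shift is about 1.75, so a particle emitted at right angles in the center-of-mass frame emerges at roughly 20 degrees from the \(\mu ^+\) beam direction in the laboratory.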

We will use unpolarized beams for both \(\mu ^+e^-\) and \(\mu ^+\mu ^+\) modes to derive our sensitivity limits. Although the surface antimuons produced by the \(\pi ^+\) decay are 100% polarized due to the \(V-A\) nature of the weak interaction, the final polarization of the antimuon beam depends on a detailed understanding of the beam emittance under the applied magnetic field, which in some cases can reduce the polarization down to 25% [22]. Similarly, the beam polarization option for the \(e^-\) beam is still under discussion for the SuperKEKB upgrade [42]. Including realistic beam polarization effects could modify our cross sections by a factor of a few due to the chiral nature of the scalar couplings.

Fig. 1 Relevant Feynman diagrams for the processes involving the neutral scalar H in the Zee model at \(\mu \)TRISTAN

3 Neutrino mass models with leptophilic scalars

The leptonic initial states and clean environment at \(\mu \)TRISTAN provide an unprecedented opportunity to directly probe heavy leptophilic particles with possible LFV interactions. We will mainly focus on the leptophilic neutral and doubly-charged scalars that arise in well-known neutrino mass models, both tree-level and radiative, such as the Zee model [29], Zee–Babu model [30, 31], cocktail model [32], and type-II seesaw model [33,34,35,36,37]. If kinematically allowed, a neutral scalar H with sizable LFV coupling \(e\mu \) can be resonantly produced in \(\mu ^+e^-\) collisions either by itself or in association with a photon or Z boson, as shown in Fig. 1a and b respectively, thus providing unparalleled sensitivity to the LFV scalar sector. Even for \(m_H>\sqrt{s},\) the dilepton channels \(\mu ^+e^-\rightarrow \ell _\alpha ^+\ell _\beta ^-\) and \(\mu ^+\mu ^+\rightarrow \ell _\alpha ^+\ell _\beta ^+,\) shown in Fig. 1c and d, respectively, are sensitive to the LFV couplings of H and give rise to a contact-interaction-type bound on the scalar parameter space. Similarly, a doubly-charged scalar can be resonantly produced at a \(\mu ^+\mu ^+\) collider, either by itself or in association with a photon or Z boson (see Fig. 3). The higher center-of-mass energy of the \(\mu ^+\mu ^+\) option at \(\mu \)TRISTAN allows us to probe doubly-charged scalars beyond the current LHC constraints [43]. We only focus on the LFV final states, as they are free from the SM background (modulo lepton misidentification, whose rate is negligible at lepton colliders [44, 45]). Also, we do not consider processes involving singly-charged scalars, as they necessarily involve neutrinos in the final state, making it harder to separate our signal from the SM background.

Table 1 Predictions for the sum of neutrino masses \(\sum _j m_j,\) the effective \(0\nu \beta \beta \) Majorana neutrino mass \(\langle m_{\beta \beta }\rangle ,\) and the Dirac CP phase \(\delta _{\textrm{CP}}\) from the texture zeros employed in the Zee model, using the \(3\sigma \) normal-ordering ranges for the oscillation parameters from NuFit 5.2 [52]

3.1 Zee model

In the Zee model [29], the SM scalar sector with one Higgs doublet \(H_1\) is extended by adding a second Higgs doublet \(H_2\) and an \(SU(2)_L\)-singlet charged scalar \(\eta ^+.\) The relevant Lagrangian terms are given by

$$\begin{aligned} \mathcal {L}\supset \mu H_1 H_2 \eta ^- - f\bar{L}^c L \eta ^+ - \tilde{Y}\bar{\ell } L \tilde{H}_1- Y \bar{\ell } L \tilde{H}_2 +{\text {H.c.}}, \end{aligned}$$
(1)

where the superscript c stands for charge conjugate and \(\tilde{H}_a\equiv i\sigma _2H_a^\star \) \((a=1,2,\) \(\sigma _2\) is the second Pauli matrix). We have suppressed the flavor and \(SU(2)_L\) indices. Note that the Yukawa coupling matrix f is anti-symmetric in flavor space, while Y is an arbitrary complex coupling matrix. We go to the Higgs basis [46, 47], where only \(H_1\) acquires a vacuum expectation value, \(\langle H_1\rangle \equiv v/\sqrt{2} \simeq 174~\text {GeV},\) and the charged leptons obtain a diagonal mass matrix \(M_\ell = \tilde{Y} v/\sqrt{2}.\) We work in the alignment limit [48], as preferred by the LHC Higgs data [49], where the neutral scalars of \(H_2\) (the CP-even H and the CP-odd A) do not mix with the neutral Higgs contained in \(H_1\) that can be identified as the SM Higgs boson. The \(\mu \) term in the Lagrangian (1) will induce a mixing of \(\eta ^+\) with the charged scalar contained in \(H_2\) upon electroweak symmetry breaking; we denote the mixing angle by \(\phi \) and the two mass eigenstates by \(h^+\) and \(H^+,\) see Refs. [50, 51] for details.

The simultaneous presence of f, Y, and \(\mu \) breaks lepton number by two units and leads to a one-loop Majorana neutrino mass matrix

$$\begin{aligned} M^\nu = \kappa \left( f M_\ell Y + Y^T M_\ell f^T\right) , \end{aligned}$$
(2)

with prefactor \(\kappa \equiv (16\pi ^2)^{-1} \sin 2\phi \log (m_{h^+}^2/m_{H^+}^2).\) This matrix is manifestly symmetric and can be diagonalized as usual via

$$\begin{aligned} M^\nu = U \,\text {diag}(m_1,m_2,m_3)\, U^T, \end{aligned}$$
(3)

where U is the unitary Pontecorvo–Maki–Nakagawa–Sakata matrix and \(m_j\) the neutrino masses. Through neutrino oscillations we have obtained information about the mass splittings and the three mixing angles in U. The overall neutrino mass scale, ordering, and CP phases are unknown, although their ranges are partially restricted [52].
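
That the matrix in Eq. (2) is symmetric for any antisymmetric f and arbitrary complex Y follows from \((f M_\ell Y)^T = Y^T M_\ell f^T\) for diagonal \(M_\ell .\) A small numerical illustration with random placeholder couplings (not fitted values, and with \(\kappa \) set to one) is sketched below in plain Python.

```python
import random

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*a)]

random.seed(1)

def rnd():
    return random.uniform(-1.0, 1.0)

# Diagonal charged-lepton mass matrix in GeV.
m_lep = [[0.000511, 0.0, 0.0], [0.0, 0.10566, 0.0], [0.0, 0.0, 1.77686]]

# Random antisymmetric f and random complex Y (placeholders only).
a = [[rnd() for _ in range(3)] for _ in range(3)]
f = [[a[i][j] - a[j][i] for j in range(3)] for i in range(3)]
y = [[complex(rnd(), rnd()) for _ in range(3)] for _ in range(3)]

kappa = 1.0  # overall loop prefactor, irrelevant for the symmetry check
first = matmul(matmul(f, m_lep), y)                         # f M_l Y
second = matmul(matmul(transpose(y), m_lep), transpose(f))  # Y^T M_l f^T
m_nu = [[kappa * (first[i][j] + second[i][j]) for j in range(3)]
        for i in range(3)]

# Largest deviation from symmetry; should vanish up to rounding.
max_asym = max(abs(m_nu[i][j] - m_nu[j][i])
               for i in range(3) for j in range(3))
print(f"max |M_ij - M_ji| = {max_asym:.2e}")
```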

With the parametrization of Refs. [53, 54] we can express Y in terms of \(M^\nu \) and f. The \(\mu ^+ e^-\) run of \(\mu \)TRISTAN will be uniquely sensitive to \(Y_{e\mu }\) and \(Y_{\mu e},\) see Fig. 1a–c, so we investigate Y textures where one of these entries is non-vanishing, which is hardly a restriction. The simultaneous presence of \(Y_{e\mu }\) and \(Y_{e e}\) (or \(Y_{\mu \mu })\) however would induce large LFV amplitudes, e.g. \(\mu \rightarrow e \gamma \) and \(\mu \rightarrow 3 e\) [55,56,57,58,59], leaving little parameter space for \(\mu \)TRISTAN to probe. To evade LFV constraints and simplify our analysis, we will set as many Y entries to zero as possible, leading to the four benchmark textures

$$\begin{aligned} Y_{A_1}&\propto \begin{pmatrix} 0 &{} 1 &{} 0\\ 0 &{} 0 &{} - \frac{2 m_e}{m_\mu } \frac{M^\nu _{e\tau }}{M^\nu _{\mu \mu }} \\ 0 &{} 0 &{} 0 \end{pmatrix} \sim \begin{pmatrix} 0 &{} 1 &{} 0\\ 0 &{} 0 &{} 0.0035 \\ 0 &{} 0 &{} 0 \end{pmatrix}, \end{aligned}$$
(4)
$$\begin{aligned} Y_{B_2}&\propto \begin{pmatrix} 0 &{} 1 &{} 0\\ -\frac{m_e}{m_\mu } \frac{M^\nu _{e e}}{M^\nu _{\mu \mu }} &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{pmatrix} \sim \begin{pmatrix} 0 &{} 1 &{} 0\\ 0.013 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{pmatrix}, \end{aligned}$$
(5)
$$\begin{aligned} Y_{B_3}&\propto \begin{pmatrix} 0 &{} 0 &{} 1\\ -\frac{m_e}{2 m_\mu } \frac{M^\nu _{e e}}{M^\nu _{\mu \tau }} &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{pmatrix} \sim \begin{pmatrix} 0 &{} 0 &{} 1 \\ 0.0023 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{pmatrix}, \end{aligned}$$
(6)
$$\begin{aligned} Y_{B_4}&\propto \begin{pmatrix} 0 &{} 1 &{} 0\\ 0 &{} 0 &{} 0 \\ -\frac{m_e}{2 m_\tau } \frac{M^\nu _{e e}}{M^\nu _{\mu \tau }} &{} 0 &{} 0 \end{pmatrix} \sim \begin{pmatrix} 0 &{} 1 &{} 0\\ 0 &{} 0 &{} 0 \\ 0.00013 &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$
(7)

All these Y textures lead to viable two-zero textures in \(M^\nu \) [60], indicated by their common name as a subscript, following the nomenclature of Ref. [61]. The \(M^\nu \) two-zero textures predict the unknown parameters in the neutrino sector, i.e., the lightest neutrino mass and the three phases. We show in Table 1 the predictions for the sum of neutrino masses \(\sum _j m_j\) (testable via cosmology [62]), the effective mass parameter for neutrinoless double beta decay \(\langle m_{\beta \beta }\rangle =\sum _i U^2_{ei}m_i\) (testable in the next-generation experiments [63]), and the Dirac CP phase (testable in neutrino oscillation experiments [64, 65]). Notice that the \(\sum m_\nu \) predictions of the B textures are already in tension [66] with limits from cosmology, \(\sum m_\nu < 0.12~\hbox {eV}\) [67],Footnote 3 but perfectly in line with laboratory constraints [71].

The many zeros in these four Y benchmarks ensure highly suppressed LFV. Indeed, none of them gives rise to the most stringent LFV modes, \(\mu \rightarrow e\gamma \) and \(\mu \rightarrow 3e,\) despite the non-zero \(e\mu \) entry in Y. However, all cases induce muonium–antimuonium oscillation [72,73,74,75] through those \(e\mu \) entries, which will turn out to be an important constraint. In addition, all textures except for \(Y_{B_2}\) also give rise to LFV tauon decays. Furthermore, all textures contribute to \((g-2)_\mu ,\) although the \(2\sigma \)-preferred region turns out to be already excluded by the muonium constraint.

The overall scale of Y is degenerate with f and \(\kappa \) from Eq. (2) and can effectively be adjusted at will. The \(e\mu \) entry of Y is then a free parameter, subject only to perturbative unitarity constraints. The second non-zero entry of Y is not free, however, but rather predicted by lepton masses and neutrino mass matrix entries. The latter are essentially predicted due to the two-zero textures in \(M^\nu ,\) allowing us to predict the Y entries, as already shown above. For \(A_1,\) \(B_2,\) and \(B_4,\) we find a large \(e\mu \) entry in Y that drives the H production at \(\mu \)TRISTAN, plus a suppressed second Y entry that induces LFV. For \(B_3,\) the \(e\tau \) entry dominates and \(\mu \)TRISTAN’s reach is severely limited by tau LFV. Notice that we are focusing on such extreme textures just for the sake of illustration to emphasize \(\mu \)TRISTAN’s complementarity to other experimental probes.

Fig. 2 \(\mu \)TRISTAN sensitivity to the Zee model parameter space for various channels as shown in Fig. 1. The shaded regions are excluded: Purple (pink) shaded from LEP (LHC) dilepton data, green shaded from \((g-2)_\mu ,\) and gray shaded from muonium oscillation. The future muonium (ILC) sensitivity is shown by the black (purple) dashed line (curve). The solid black lines show the \(\tau \) LFV constraints for different Y textures \((A_1, B_3, B_4)\)

Assuming H to be the lightest scalar, the textures \(Y_{A_1},\) \(Y_{B_3},\) and \(Y_{B_4}\) lead to \(\tau ^-\rightarrow \mu ^-\mu ^\pm e^\mp ,\) \(\tau ^-\rightarrow e^-\mu ^\pm e^\mp ,\) and \(\tau ^-\rightarrow e^-e^\pm \mu ^\mp ,\) respectively, which give limits of order \(|Y_{\tau \alpha } Y_{\beta \delta }|< (m_H/5~\text {TeV})^2,\) as shown by the solid black lines in Fig. 2. For all textures except \(B_3\) these are very suppressed by the small \(Y_{\tau \alpha }\) entry. For those textures, as well as for the \(Y_{B_2}\) texture which does not give rise to tau (or muon) LFV decay, the most important LFV process is the \(|\Delta L_\mu | =|\Delta L_e|=2\) conversion of muonium \((M = e^-\mu ^+)\) to antimuonium \((\bar{M}= e^+\mu ^-)\) [72,73,74,75], which only requires the \(Y_{e\mu }\) entry we are interested in for \(\mu \)TRISTAN. The conversion probability is currently limited to \(P(M\leftrightarrow \bar{M}) < 8.3 \times 10^{-11}\) at \(90\%\) CL by the MACS experiment at PSI [76], while a sensitivity at the level of \(\mathcal {O}(10^{-14})\) is expected in the future by the proposed MACE experiment [77]. The current MACS limit sets stringent constraints on the Yukawa couplings \(Y_{e\mu }\) and \(Y_{\mu e}\):

$$\begin{aligned} |Y_{e\mu ,\mu e}| < \frac{m_H}{0.85~{\hbox {TeV}}}. \end{aligned}$$
(8)

This is the most important limit for \(\mu \)TRISTAN, as shown in Fig. 2 by the gray-shaded region (current) and black dotted line (future).

The muonium limit can be significantly weakened due to destructive interference in the \(M-\bar{M}\) amplitude [78] if we choose \(m_A\simeq m_H,\) which renders even the future MACE projection insensitive to our parameter space of interest. However, for \(m_H\simeq m_A\ll m_{H^+},\) we would generate large oblique parameters due to custodial symmetry breaking [79, 80]; this puts an upper limit on the mass splitting between the neutral and charged scalars in the Zee model [54, 78]. On the other hand, the leptophilic charged scalars in this model are constrained from slepton searches at the LHC because the slepton decay \(\tilde{\ell }^+\rightarrow \ell ^+\tilde{\chi }^0\) mimics a charged scalar decay \(H^+\rightarrow \ell ^+\nu \) in the massless neutralino limit. The current LHC bound is \(m_{H^+}>425\) GeV at 90% CL [81] for \(\textrm{BR}(H^+\rightarrow \mu ^+\nu _e)=1.\) To evade the muonium bound while satisfying the global electroweak precision constraint [82, 83], we then require \(m_H\simeq m_A\gtrsim 320~\text {GeV},\) making direct H production in \(\mu \)TRISTAN’s \(\mu ^+ e^-\) mode difficult. To extend our analysis to lighter H, we therefore assume the scalar hierarchy \(m_H\ll m_A\simeq m_{H^+},\) subject to the muonium constraint from Eq. (8).Footnote 4 Moreover, to set the scale of neutrino masses, we choose the f couplings to be much smaller than Y and can hence neglect the \(\eta ^\pm \)-mediated processes at \(\mu \)TRISTAN entirely.

Having established our benchmark scenarios and relevant LFV signatures, we can study this region of the Zee-model parameter space at \(\mu \)TRISTAN. The relevant Feynman diagrams and processes are shown in Fig. 1. Away from the s-channel resonance at \(\sqrt{s}\sim m_H,\) the dilepton cross section takes on the simple form

$$\begin{aligned} \sigma (\mu ^+ e^-\rightarrow \mu ^- e^+) \simeq \frac{|Y_{e\mu }|^4}{64\pi s} {\left\{ \begin{array}{ll} 1, &{} m_H\ll \sqrt{s},\\ \frac{s^2}{12 m_H^4} , &{} m_H\gg \sqrt{s}. \end{array}\right. } \end{aligned}$$
(9)

This was numerically verified in MadGraph5_aMC@NLO [41] using the general 2HDM FeynRules model file [84]. The exact analytic expression for the cross section is not very illuminating, and therefore, we do not show it here. We demand this cross section to be of order \(0.1~{\hbox {fb}}\) (after applying the cuts specified in Sect. 2) for a discovery, since this flavor-violating channel is background-free. The textures \(A_1,\) \(B_2,\) and \(B_4\) dominantly induce this channel.Footnote 5 We show the \(\mu \)TRISTAN reach of this process \(\mu ^+ e^-\rightarrow \mu ^- e^+\) in Fig. 2 (solid red curve), after applying the basic trigger cuts. We find that the \(\mu \)TRISTAN sensitivity surpasses the current limit from muonium conversion for \(m_H > 50~{\hbox {GeV}}.\) The \(B_3\) texture is the only one that is already too constrained by tau LFV to give large \(\sigma (\mu ^+ e^-\rightarrow \ell ^+_\alpha \ell ^-_\beta ).\) Future muonium data can cover almost the entire relevant parameter space for \(\mu \)TRISTAN’s dilepton mode in the Zee model, offering confirmation potential in case of a discovery.
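
For orientation, the two limiting forms of Eq. (9) can be turned into rough coupling reaches and compared with the muonium bound of Eq. (8). The sketch below is a back-of-the-envelope estimate only: it ignores the trigger cuts and the resonance region, uses the standard conversion \(1~\text {GeV}^{-2}\simeq 3.894\times 10^{11}~\text {fb},\) and all function names are hypothetical.

```python
import math

GEV2_TO_FB = 3.894e11    # 1 GeV^-2 expressed in fb
SQRT_S = 346.0           # GeV, mu+ e- mode
SIGMA_TARGET_FB = 0.1    # target signal cross section used in the text

def reach_light(sqrt_s: float = SQRT_S) -> float:
    """|Y_emu| giving the target cross section for m_H << sqrt(s), Eq. (9)."""
    s = sqrt_s ** 2
    return (SIGMA_TARGET_FB * 64.0 * math.pi * s / GEV2_TO_FB) ** 0.25

def reach_heavy(m_h: float, sqrt_s: float = SQRT_S) -> float:
    """|Y_emu| giving the target cross section for m_H >> sqrt(s), Eq. (9)."""
    s = sqrt_s ** 2
    return m_h * (SIGMA_TARGET_FB * 768.0 * math.pi / (s * GEV2_TO_FB)) ** 0.25

def muonium_bound(m_h: float) -> float:
    """Current upper limit on |Y_emu| from Eq. (8), with m_H in GeV."""
    return m_h / 850.0

y_light = reach_light()
m_cross = 850.0 * y_light  # mass above which the light-H reach beats Eq. (8)
print(f"light-H reach: |Y| ~ {y_light:.3f}")
print(f"reach beats the muonium bound above m_H ~ {m_cross:.0f} GeV")
print(f"m_H = 1 TeV: reach {reach_heavy(1000.0):.2f} "
      f"vs muonium bound {muonium_bound(1000.0):.2f}")
```

In this approximation the light-scalar reach is \(|Y_{e\mu }|\sim 0.05,\) which drops below the muonium bound for \(m_H\) above roughly 40 GeV, in the same ballpark as the 50 GeV found with cuts applied. In the contact-interaction regime the reach scales as \(|Y_{e\mu }|\sim m_H/(3.7~\text {TeV}).\)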

In Fig. 2, we also show the existing collider constraints from LEP \(e^+e^-\rightarrow \mu ^+\mu ^-\) data (purple shaded) [85, 86] and from LHC \(pp\rightarrow e\mu \) data (pink shaded) [87, 88].Footnote 6 The future ILC sensitivity from \(e^+e^-\rightarrow \mu ^+\mu ^-H\) is also shown by the pink dashed curve [51, 89, 90] for comparison with the \(\mu \)TRISTAN sensitivity. The green-shaded region is excluded by demanding the H contribution to \((g-2)_\mu \) not to exceed \(5\sigma \) deviation between the world average of the SM prediction [91] and the experimental value [92].Footnote 7

For the associated production of H with a photon or a Z boson (cf. Fig. 1b), the cross sections for small \(m_H\ll \sqrt{s}\) take the form

$$\begin{aligned}&\sigma (\mu ^+ e^-\rightarrow H\gamma ) \simeq \frac{\alpha _{\text {EM}} |Y_{e\mu }|^2}{8 s}\,\log \left( \frac{s}{m_e m_\mu }\right) , \end{aligned}$$
(10)
$$\begin{aligned}&\sigma (\mu ^+ e^-\rightarrow H Z) \simeq \frac{\alpha _{\text {EM}} |Y_{e\mu }|^2 \, (s-m_Z^2)}{32 s_w^2 c_w^2 s^2 }\left[ \frac{s}{4 m_Z^2} \right. \nonumber \\&\quad \left. - (1-2 s_w^2 + 4 s_w^4) - (1-4 s_w^2 + 8 s_w^4) \log \left( \frac{m_H m_Z}{s-m_Z^2}\right) \right] , \end{aligned}$$
(11)

where \(\alpha _\textrm{EM}\) is the electromagnetic fine-structure constant, and \(s_w\equiv \sin \theta _w\) \((c_w\equiv \cos \theta _w)\) is the (co)sine of the weak mixing angle. These cross sections are typically larger than the dilepton channel but are open only for \(m_H \lesssim \sqrt{s}\) for the photon case (or \(\sqrt{s}-m_Z\) for the Z case). The photon cross section exhibits an infrared divergence for \(\sqrt{s}\rightarrow m_H\) that is regulated by the cut \(p_T^\gamma >10~{\hbox {GeV}},\) reducing the total cross section compared to the analytical expression above. The Z cross section is well behaved near the kinematic threshold but diverges for \(m_H\rightarrow 0,\) which is of no concern for us. As can be seen in Fig. 2, both modes are important for \(\mu \)TRISTAN and cover parameter space that cannot be probed with other colliders or LFV searches.Footnote 8 The H scalars subsequently decay promptly into \(\mu ^\pm e^\mp ,\) and half of these decays are background-free even without any momentum reconstruction.
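
To give a feeling for the relative size of the photon-associated channel, the following sketch evaluates Eq. (10) and the light-scalar limit of Eq. (9) for the same coupling. It uses the low-energy value \(\alpha _\textrm{EM}\simeq 1/137\) (an assumption; the running to collider scales changes the result at the ten-percent level) and omits the \(p_T^\gamma \) cut, which the text notes reduces the cross section. Function names are illustrative.

```python
import math

GEV2_TO_FB = 3.894e11         # 1 GeV^-2 expressed in fb
ALPHA_EM = 1.0 / 137.036      # low-energy fine-structure constant (assumed)
M_E, M_MU = 0.000511, 0.10566 # lepton masses in GeV
SQRT_S = 346.0                # GeV, mu+ e- mode

def sigma_h_gamma_fb(y_emu: float, sqrt_s: float = SQRT_S) -> float:
    """Eq. (10): sigma(mu+ e- -> H gamma) for m_H << sqrt(s), no cuts."""
    s = sqrt_s ** 2
    sigma = ALPHA_EM * y_emu ** 2 / (8.0 * s) * math.log(s / (M_E * M_MU))
    return sigma * GEV2_TO_FB

def sigma_dilepton_light_fb(y_emu: float, sqrt_s: float = SQRT_S) -> float:
    """Eq. (9): sigma(mu+ e- -> mu- e+) for m_H << sqrt(s), no cuts."""
    s = sqrt_s ** 2
    return y_emu ** 4 / (64.0 * math.pi * s) * GEV2_TO_FB

y = 0.01  # example coupling, chosen only for illustration
print(f"H gamma : {sigma_h_gamma_fb(y):.2f} fb")
print(f"dilepton: {sigma_dilepton_light_fb(y):.2e} fb")
```

Because the associated-production rate scales as \(|Y_{e\mu }|^2\) rather than \(|Y_{e\mu }|^4,\) it dominates at small couplings, consistent with the statement that these channels are typically larger than the dilepton one.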

The Zee model also makes predictions for \(\mu \)TRISTAN’s \(\mu ^+\mu ^+\) mode, as there are t-channel diagrams for \(\mu ^+\mu ^+\rightarrow \ell ^+ \ell '^+\) (cf. Fig. 1d). All textures except \(B_3\) induce the background free \(\mu ^+\mu ^+\rightarrow e^+ e^+,\) with testable allowed cross sections for \(m_H>300~{\hbox {GeV}},\) as shown in Fig. 2 by the brown curve. We find that the H sensitivity in this channel is worse than or comparable to the dilepton channel in the \(\mu ^+e^-\) mode, so it can only be used as a secondary channel for verifying any signal found in \(\mu \)TRISTAN’s first run.

Before we move on to other neutrino mass models, let us briefly comment on the discrepancy in the muon magnetic moment [92]. While the status of the SM prediction is currently unclear, it is worthwhile to entertain the possibility that the discrepancy is real and a sign for new physics. The benchmark values taken above are incapable of explaining \((g-2)_\mu \) due to LFV constraints. A recent study [54] has shown that the Zee model is in principle able to explain \((g-2)_\mu ,\) but this requires one of the following textures:

$$\begin{aligned} Y = \begin{pmatrix} 0 &{} 0 &{} 0\\ 0 &{} \times &{} \times \\ 0 &{} \times &{} \times \end{pmatrix} \text { or } \begin{pmatrix} \times &{} 0 &{} \times \\ 0 &{} \times &{} 0 \\ \times &{} 0 &{} \times \end{pmatrix}. \end{aligned}$$
(12)

The first (second) requires \(M^\nu _{ee} = 0\) \((M^\nu _{\mu \mu } = 0)\) and effectively conserves electron (muon) number, which makes it obvious that muon LFV is evaded, including muonium conversion. The first texture could only show up in \(\mu \)TRISTAN’s \(\mu ^+\mu ^+\) run via \(\mu ^+\mu ^+\rightarrow \mu ^+\tau ^+\) or \(\tau ^+\tau ^+;\) the second texture can give \(\mu ^+ e^- \rightarrow \mu ^+ \tau ^-\) in \(\mu \)TRISTAN’s first run. A dedicated study of this scenario will be postponed until the \((g-2)_\mu \) anomaly is clarified.

Overall, we see that \(\mu \)TRISTAN could probe the Zee model in regions of parameter space that are inaccessible by other means. An exhaustive study of the Zee model at \(\mu \)TRISTAN goes beyond the scope of this work, but the benchmarks discussed here indicate a very promising situation.

3.2 Zee–Babu model

In the Zee–Babu model [30, 31], we extend the SM by two \(SU(2)_L\)-singlet scalars \(h^+\) and \(k^{++}\) with hypercharge 1 and 2, respectively, which have the following couplings relevant for neutrino masses:

$$\begin{aligned} -\mathcal {L}\supset f\bar{L}^c L h^+ + g\bar{\ell }^c \ell \, k^{++} + \mu h^- h^- k^{++}+{\text {H.c.}}\end{aligned}$$
(13)

The matrix g (f) is symmetric (antisymmetric) in flavor space. Taken together, these couplings break lepton number and generate a Majorana neutrino mass matrix

$$\begin{aligned} M^\nu \simeq 16\mu \, I(m_h,m_k)\, f M_\ell g^* M_\ell f, \end{aligned}$$
(14)

where \(I(m_h,m_k)\) is a two-loop function [94, 95]. The antisymmetry of f leads to \(\det M^\nu = 0\) and thus predicts one massless neutrino.
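
The vanishing determinant follows because any antisymmetric \(3\times 3\) matrix has \(\det f = 0.\) A minimal numerical illustration of Eq. (14), using random placeholder couplings and setting the prefactor \(16\mu \, I(m_h,m_k)\) to one, is sketched below.

```python
import random

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix via cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

random.seed(2)

def rnd():
    return random.uniform(-1.0, 1.0)

# Diagonal charged-lepton mass matrix in GeV.
m_lep = [[0.000511, 0.0, 0.0], [0.0, 0.10566, 0.0], [0.0, 0.0, 1.77686]]

# Random antisymmetric f and random complex symmetric g (placeholders only).
a = [[rnd() for _ in range(3)] for _ in range(3)]
f = [[a[i][j] - a[j][i] for j in range(3)] for i in range(3)]
b = [[complex(rnd(), rnd()) for _ in range(3)] for _ in range(3)]
g = [[b[i][j] + b[j][i] for j in range(3)] for i in range(3)]
g_conj = [[g[i][j].conjugate() for j in range(3)] for i in range(3)]

# M_nu proportional to f M_l g^* M_l f, Eq. (14) with unit prefactor.
m_nu = matmul(matmul(matmul(matmul(f, m_lep), g_conj), m_lep), f)

det_f = det3(f)
det_m_nu = det3(m_nu)
print(f"det f = {det_f:.1e}, |det M_nu| = {abs(det_m_nu):.1e}")
```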

Fig. 3 Relevant Feynman diagrams for the doubly-charged scalars in the Zee–Babu, cocktail, and triplet seesaw models

Similar to the Zee model, we can make the overall scale of g as large as we want and compensate for that with a smaller f matrix or \(\mu \) coupling. For simplicity we assume \(h^+\) to be very heavy and the f couplings to be small, effectively decoupling \(h^+.\) This leaves us with the doubly charged \(k^{++}\) with coupling matrix g. At \(\mu \)TRISTAN’s \(\mu ^+\mu ^+\) run, this \(k^{++}\) leads to dilepton and associated production signatures as long as \(g_{\mu \mu }\ne 0,\) see Fig. 3a, b. We show \(\mu \)TRISTAN’s reach and competing constraints in Fig. 4, having computed the cross sections with MadGraph5_aMC@NLO [41] using the model file given in Ref. [96].

Fig. 4 \(\mu \)TRISTAN sensitivity to the Zee–Babu and cocktail model parameter space for various channels as shown in Fig. 3. The shaded purple region is excluded from LHC dilepton data [43], the dashed purple line shows the HL-LHC reach [96]. The diagonal non-solid lines indicate LFV constraints on the coupling products \(|g_{\mu \mu } g_{\alpha \beta }|.\) For the Zee–Babu g texture from Eq. (15), only \(g_{\mu \mu }\) and \(g_{\mu \tau }\) are relevant. For the cocktail-model texture from Eq. (18), mainly \(g_{\mu \mu }\) and \(g_{e\tau }\) are relevant

\(\mu \)TRISTAN can easily probe a large region of parameter space as long as \(g_{e\mu }\) is somewhat suppressed compared to \(g_{\mu \mu }\) to evade the \(\mu \rightarrow e \gamma \) constraint. This is hardly a restriction and we can even find g textures that eliminate almost all LFV constraints, e.g.

$$\begin{aligned} g^*&\propto \begin{pmatrix} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} -\frac{m_\mu }{m_\tau }\, \frac{M^\nu _{\mu \tau }}{M^\nu _{\tau \tau }} \\ 0 &{} -\frac{m_\mu }{m_\tau }\, \frac{M^\nu _{\mu \tau }}{M^\nu _{\tau \tau }} &{} \frac{m_\mu ^2}{m_\tau ^2}\, \frac{M^\nu _{\mu \mu }}{M^\nu _{\tau \tau }} \end{pmatrix} \sim \begin{pmatrix} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0.1 \\ 0 &{} 0.1 &{} 5 \times 10^{-3} \end{pmatrix}. \end{aligned}$$
(15)

This structure does not lead to any LFV involving electrons. The only process we could worry about is \(\tau \rightarrow 3\mu ,\) which is however not particularly stringent and could be further suppressed by tuning \(|M^\nu _{\mu \tau }/M^\nu _{\tau \tau }|\ll 1.\) \(\mu \)TRISTAN has a large region of testable parameter space even without this tuning. Notice that the dominant \(g_{\mu \mu }\) entry here leads to the dominant channels \(\mu ^+\mu ^+\rightarrow \mu ^+\mu ^+\) and \(\mu ^+\mu ^+\rightarrow \gamma /Z\, (k^{++}\rightarrow \mu ^+\mu ^+);\) these are not exactly background free, even though invariant mass distributions and angular observables can be used to isolate new-physics contributions. The subleading channels \(\mu ^+\mu ^+\rightarrow \mu ^+\tau ^+\) and \(\mu ^+\mu ^+\rightarrow \gamma /Z\, (k^{++}\rightarrow \mu ^+\tau ^+)\) on the other hand are smoking-gun observables.

The texture from Eq. (15) does not induce any interesting signatures in the \(\mu ^+ e^-\) run, but other textures might, see Fig. 3c. For example, a \(\mu \mu \) and ee entry in g would give the very clean \(\mu ^+e^-\rightarrow \mu ^- e^+ \) (in addition to \(\mu ^+\mu ^+\rightarrow e^+ e^+),\) allowed by current muonium-conversion constraints, as shown in Fig. 4.

We also show other relevant constraints in Fig. 4. The \((g-2)_\mu \) excluded region is shown by the black shaded region in the top-left corner. The vertical pink shaded region is the current LHC bound [43], and the vertical pink dashed line is the future HL-LHC sensitivity [96]. Thus, we find that \(\mu \)TRISTAN will probe a wide range of the Zee–Babu model parameter space well beyond the HL-LHC sensitivity. Similar sensitivities are also achievable at a future \(\mu ^+\mu ^-\) collider [97].

3.3 Cocktail model

The cocktail model [32] is an SM extension by two \(SU(2)_L\)-singlet scalars \(h^-\) and \(k^{++},\) as well as a second Higgs doublet \(H_2.\) The field content is reminiscent of the Zee and Zee–Babu models, but here an extra \({\mathbb {Z}}_2\) symmetry is imposed under which \(h^-\) and \(H_2\) are odd, which leaves the following relevant terms in the Lagrangian:

$$\begin{aligned} -\mathcal {L}&\supset g\bar{\ell }^c \ell \, k^{++} + \mu h^- h^- k^{++} +\kappa \tilde{H}_2^\dagger H_1 h^-\nonumber \\&\quad +\xi \tilde{H}_2^\dagger H_1 h^+ k^{--} +\frac{\lambda _5}{2} (H_1^\dagger H_2)^2 +{\text {H.c.}}, \end{aligned}$$
(16)

where g is once again a symmetric Yukawa matrix in flavor space. Lepton number is broken explicitly if all the above couplings are non-zero. We assume parameters in the scalar potential so that \(\langle H_2\rangle = 0,\) leaving the \({\mathbb {Z}}_2\) unbroken. In that case, Majorana neutrino masses arise at three-loop level:

$$\begin{aligned} M^\nu \simeq \frac{F_{\text {cocktail}}}{(16\pi ^2)^3\,m_{k^{++}}}\, M_\ell g M_\ell , \end{aligned}$$
(17)

where \(F_{\text {cocktail}}\) is a complicated dimensionless loop function that depends on scalar masses and couplings [98, 99]. The three-loop suppression factor and the additional suppression by charged-lepton masses require large entries in g that are easily in the non-perturbative regime, even when all scalar masses are close to their experimental limits and the scalar-potential couplings are as large as allowed by perturbative unitarity. To keep g perturbative and evade stringent constraints from muon LFV, one is more or less forced to consider the two-zero texture \(A_1\) for \(M^\nu \) [98, 99], which then results in a g matrix

$$\begin{aligned} g&\propto \begin{pmatrix} 0 &{} 0 &{} 1 \\ 0 &{} \frac{m_e m_\tau }{m_\mu ^2}\, \frac{M^\nu _{\mu \mu }}{M^\nu _{e\tau }} &{} \frac{m_e}{m_\mu }\, \frac{M^\nu _{\mu \tau }}{M^\nu _{e\tau }} \\ 1 &{} \frac{m_e}{m_\mu }\, \frac{M^\nu _{\mu \tau }}{M^\nu _{e\tau }} &{} \frac{m_e}{m_\tau }\, \frac{M^\nu _{\tau \tau }}{M^\nu _{e\tau }} \end{pmatrix} \sim \begin{pmatrix} 0 &{} 0 &{} 1 \\ 0 &{} 0.24 &{} 0.01 \\ 1 &{} 0.01 &{} 6\times 10^{-4} \end{pmatrix} \end{aligned}$$
(18)

and the neutrino-parameter predictions from the first row of Table 1. The strongest LFV constraint mediated by \(k^{++}\) then comes from \(\tau ^-\rightarrow e^+ \mu ^-\mu ^-,\) requiring \(|g_{e\tau }| < 0.17 \, m_{k^{++}}/ {\textrm{TeV}},\) although, by coincidence, \(\mu \rightarrow e\gamma \) gives essentially the same limit for this texture.
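The structure of Eq. (18) follows from inverting Eq. (17): up to an overall factor, \(g_{\alpha \beta }\propto M^\nu _{\alpha \beta }/(m_\alpha m_\beta ),\) normalized here to \(g_{e\tau } = 1.\) The following minimal sketch illustrates this scaling. The neutrino-mass-matrix ratios in it are placeholder order-one values chosen purely for illustration (they are not the Table 1 predictions), and the charged-lepton masses are rounded; the function and variable names are hypothetical.

```python
# Sketch of the inversion of Eq. (17): g_ab ∝ M^nu_ab / (m_a * m_b),
# normalized such that g_{e tau} = 1, as in Eq. (18).

# Charged-lepton masses in MeV (rounded values).
M_LEPTON = {"e": 0.511, "mu": 105.66, "tau": 1776.9}
FLAVORS = ("e", "mu", "tau")

# ASSUMPTION: placeholder ratios M^nu_ab / M^nu_{e tau} for an A_1 texture
# (M^nu_ee = M^nu_emu = 0). These order-one numbers are illustrative only
# and are NOT taken from the paper's Table 1.
MNU_RATIO = {
    ("e", "e"): 0.0, ("e", "mu"): 0.0, ("e", "tau"): 1.0,
    ("mu", "mu"): 3.0, ("mu", "tau"): 2.0, ("tau", "tau"): 2.0,
}


def mnu_ratio(a, b):
    """Symmetric lookup of M^nu_ab / M^nu_{e tau}."""
    return MNU_RATIO[(a, b)] if (a, b) in MNU_RATIO else MNU_RATIO[(b, a)]


def g_entry(a, b):
    """Entry g_ab relative to g_{e tau}, following the pattern of Eq. (18)."""
    prefactor = M_LEPTON["e"] * M_LEPTON["tau"] / (M_LEPTON[a] * M_LEPTON[b])
    return prefactor * mnu_ratio(a, b)


g = [[g_entry(a, b) for b in FLAVORS] for a in FLAVORS]

for row in g:
    print("  ".join(f"{entry:9.2e}" for entry in row))
```

With these placeholder ratios the hierarchy of the entries is driven almost entirely by the charged-lepton mass ratios, which is why the \(e\tau \) entry dominates for this texture.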

The LFV constraints of this texture are severe enough that \(\mu \)TRISTAN in the \(\mu ^+ e^-\) mode would not observe the characteristic \(\mu ^+ e^-\rightarrow \tau ^+ \mu ^-,\) see Fig. 4. However, \(\mu \)TRISTAN in the \(\mu ^+\mu ^+\) run could potentially see \(\mu ^+\mu ^+\rightarrow e^+ \tau ^+\) or \(\mu ^+\mu ^+\rightarrow k^{++} \gamma /Z\) followed by prompt \(k^{++}\rightarrow e^+\tau ^+\) decays.

Notice that the \({\mathbb {Z}}_2\) symmetry renders the lightest of the \({\mathbb {Z}}_2\)-odd particles contained in \(H_2\) and \(h^-\) stable. We can choose scalar-potential parameters to make this one of the neutral scalars inside \(H_2,\) which could then form dark matter. We will not discuss this here since the connection to \(\mu \)TRISTAN is very limited.

3.4 Type-II or triplet seesaw

In the type-II or triplet seesaw mechanism [33,34,35,36,37], we extend the SM by an \(SU(2)_L\)-triplet with hypercharge \(+2,\) usually written as the \(SU(2)_L\) matrix

$$\begin{aligned} \Delta = \begin{pmatrix} \Delta ^+/\sqrt{2} &{} \Delta ^{++}\\ \Delta ^0 &{} -\Delta ^+/\sqrt{2} \end{pmatrix} . \end{aligned}$$
(19)

This triplet couples to the left-handed lepton doublets \(L_{e,\mu ,\tau }\) and the SM scalar doublet H, giving rise to the Lagrangian

$$\begin{aligned} -\mathcal {L}\supset Y \bar{L}^c \textrm{i}\sigma _2 \Delta L +\mu H^\dagger \textrm{i}\sigma _2 \Delta H^*+ {\text {H.c.}}\end{aligned}$$
(20)

This Lagrangian breaks lepton number and induces a small vacuum expectation value \(\langle \Delta ^0\rangle = v_\Delta /\sqrt{2},\) which in turn generates the Majorana neutrino mass matrix \(M^\nu = \sqrt{2} Y v_\Delta .\) The Yukawa couplings thus inherit the structure from the neutrino mass matrix but come with an unknown scaling factor \(v_\Delta .\)
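As a quick numerical illustration of the relation \(M^\nu = \sqrt{2} Y v_\Delta ,\) the sketch below converts a neutrino-mass-matrix entry into the corresponding Yukawa coupling. The value \(M^\nu _{\mu \mu } = 0.05~\text {eV}\) is the benchmark used for Fig. 5; the \(v_\Delta \) values scanned are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math


def yukawa_from_mass(m_nu_ev, v_delta_ev):
    """Yukawa entry Y = M^nu / (sqrt(2) * v_Delta), both inputs in eV."""
    return m_nu_ev / (math.sqrt(2.0) * v_delta_ev)


M_NU_MUMU_EV = 0.05  # benchmark |M^nu_mumu| in eV

# ASSUMPTION: illustrative triplet VEVs in eV, not values singled out in the text.
for v_delta_ev in (0.1, 1.0, 10.0):
    y_mumu = yukawa_from_mass(M_NU_MUMU_EV, v_delta_ev)
    print(f"v_Delta = {v_delta_ev:5.1f} eV  ->  Y_mumu = {y_mumu:.4f}")
```

The inverse scaling with \(v_\Delta \) makes explicit why sizable production Yukawa couplings correspond to eV-scale (or smaller) triplet vacuum expectation values.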

In the limit of \(v_\Delta \ll v,\) the mass eigenstates that dominantly come from the triplet, \(H^{++}\simeq \Delta ^{++},\) \(H^+\simeq \Delta ^{+},\) \(H\simeq \sqrt{2} {\text {Re}}\,\Delta ^0,\) and \(A\simeq \sqrt{2}{\text {Im}}\,\Delta ^0,\) have mass splittings

$$\begin{aligned} m_H^2 \simeq m_A^2 \simeq m_{H^+}^2+\frac{\lambda _4 v^2}{4}\simeq m_{H^{++}}^2+\frac{\lambda _4 v^2}{2} , \end{aligned}$$
(21)

specified exclusively by the coupling \(\lambda _4\, H^\dagger \Delta \Delta ^\dagger H\) [100, 101]. For simplicity we will assume an almost degenerate spectrum here, even though a mass splitting could resolve [102,103,104] the recently observed discrepancy in CDF’s W-boson mass measurement [105]. The large Yukawa couplings required to produce \(\Delta ^{++}\) at \(\mu \)TRISTAN also lead to strong constraints from searches at the LHC, which exclude masses below 1 TeV [43] and can be improved at the HL-LHC [106].
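To get a feeling for the size of the splittings in Eq. (21), the following sketch evaluates the singly charged and neutral scalar masses for a given \(m_{H^{++}}\) and \(\lambda _4.\) It assumes the normalization \(v\simeq 246~\text {GeV}\) for the electroweak vacuum expectation value, which is not stated explicitly in this section; the function name and the example inputs are hypothetical.

```python
import math

# ASSUMPTION: electroweak VEV normalization v ~ 246 GeV.
V_EW_GEV = 246.22


def triplet_spectrum(m_hpp_gev, lambda4):
    """Return (m_H+, m_H ~ m_A) in GeV from Eq. (21), valid for v_Delta << v."""
    shift = lambda4 * V_EW_GEV**2 / 4.0          # lambda_4 v^2 / 4
    m_hp_sq = m_hpp_gev**2 + shift               # m_{H+}^2
    m_h_sq = m_hpp_gev**2 + 2.0 * shift          # m_H^2 ~ m_A^2
    if m_hp_sq <= 0.0 or m_h_sq <= 0.0:
        raise ValueError("lambda4 too negative: tachyonic scalar mass")
    return math.sqrt(m_hp_sq), math.sqrt(m_h_sq)


# Illustrative inputs: a 1 TeV doubly charged scalar and lambda_4 = 1.
m_hp, m_h = triplet_spectrum(1000.0, 1.0)
print(f"m_H+ = {m_hp:.1f} GeV, m_H = m_A = {m_h:.1f} GeV")
```

For TeV-scale masses and perturbative \(\lambda _4\) the splittings stay at the percent level, consistent with treating the spectrum as almost degenerate.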

Even more importantly, the triplet scalars induce LFV decays, for example [101, 107,108,109]

$$\begin{aligned}&{\text {BR}}(\mu \rightarrow e\gamma )\simeq \frac{\alpha _{\text {EM}} \left| (M^{\nu \,\dagger } M^\nu )_{e\mu }\right| ^2}{48\pi G_F^2 v_\Delta ^4}\left( \frac{1}{m_{H^{+}}^2}+\frac{8}{m_{H^{++}}^2}\right) ^2, \end{aligned}$$
(22)
$$\begin{aligned}&{\text {BR}}(\mu ^+\rightarrow e^+e^-e^+)\simeq 4\frac{\left| M^\nu _{ee} M^\nu _{\mu e}\right| ^2}{G_F^2 v_\Delta ^4 m_{H^{++}}^4}, \end{aligned}$$
(23)

where \(G_F\) is the Fermi coupling constant. The decay \(\mu \rightarrow e\gamma \) is particularly important because the prefactor \( \left| (M^{\nu \,\dagger } M^\nu )_{e\mu }\right| ^2\) is completely specified by the known neutrino oscillation parameters [110] and is bounded from below by \((0.016~{\hbox {eV}})^4,\) using the \(2\sigma \) range from NuFit 5.2 [52]. The current limit \({\text {BR}}(\mu \rightarrow e\gamma )<4.2\times 10^{-13}\) [111] then gives \(m_{\Delta ^{++}}> 1.5~{\hbox {TeV}}( {\textrm{eV}}/v_\Delta ).\) The \(\mu \rightarrow e\gamma \) limit can be improved by almost an order of magnitude with MEG-II [112, 113] but will eventually be surpassed by muon conversion in Mu2e [114, 115], which probes the same coupling in our case and effectively has a sensitivity down to \({\text {BR}}(\mu \rightarrow e\gamma )<2\times 10^{-14}.\) This would improve the limit to \(m_{\Delta ^{++}}> 3~{\hbox {TeV}}({\textrm{eV}}/v_\Delta ).\)
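The quoted mass limits can be cross-checked by inverting Eq. (22). The sketch below does this under the assumption of a degenerate spectrum, \(m_{H^+} = m_{H^{++}},\) so that the mass bracket becomes \((9/m_{H^{++}}^2)^2,\) and uses the minimal prefactor \((0.016~\text {eV})^4.\) The numerical values of \(\alpha _{\text {EM}}\) and \(G_F\) are standard inputs not quoted in the text, and the function name is hypothetical.

```python
import math

ALPHA_EM = 1.0 / 137.036       # fine-structure constant
G_F_GEV = 1.1663788e-5         # Fermi constant in GeV^-2
PREFACTOR_ROOT_EV = 0.016      # |(M^dagger M)_{e mu}|^(1/2) lower bound in eV


def mass_times_vev_limit(br_limit, prefactor_root_ev=PREFACTOR_ROOT_EV):
    """Lower bound on m_{Delta++} * v_Delta in TeV * eV from Eq. (22).

    ASSUMPTION: degenerate m_{H+} = m_{H++}, so (1/m^2 + 8/m^2)^2 = 81/m^4.
    """
    mass_factor = 81.0
    # (m * v_Delta)^4 = alpha * 81 * prefactor^4 / (48 pi G_F^2 BR);
    # with G_F in GeV^-2 and prefactor in eV the result is in (GeV * eV)^4.
    ratio_gev4 = ALPHA_EM * mass_factor / (48.0 * math.pi * G_F_GEV**2 * br_limit)
    limit_gev_ev = prefactor_root_ev * ratio_gev4**0.25
    return limit_gev_ev / 1000.0  # convert GeV to TeV


for label, br in (("current limit", 4.2e-13), ("Mu2e-equivalent", 2e-14)):
    print(f"{label}: m * v_Delta > {mass_times_vev_limit(br):.2f} TeV eV")
```

Because the bound scales only with the fourth root of the branching ratio, an improvement of the experimental limit by a factor of about 20 roughly doubles the reach in \(m_{\Delta ^{++}} v_\Delta .\)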

Notice that the other LFV decays, notably \(\mu \rightarrow 3 e\) [116], could give even stronger limits on \( v_\Delta m_{\Delta ^{++}},\) especially with the upcoming Mu3e [117], but depend on the so-far unknown neutrino parameters such as the lightest neutrino mass and the Majorana CP phases. These allow us, for example, to set \(M^\nu _{ee} = 0\) and thus eliminate \(\mu \rightarrow 3e\) entirely. For simplicity we will therefore ignore these other LFV processes and only consider the unavoidable \(\mu \rightarrow e \gamma .\)

Fig. 5

\(\mu \)TRISTAN sensitivity to the triplet/type-II seesaw model parameter space for various channels as shown in Fig. 3. We have set \(M^\nu _{\mu \mu } = 0.05~\text {eV}\) to fix the \(\Delta ^{++}\mu \mu \) coupling, see text for details

In Fig. 5, we show the LFV and LHC constraints together with the \(\mu \)TRISTAN sensitivities in various channels. We have implemented the model file in FeynRules [84] and computed the cross sections using MadGraph5_aMC@NLO [41]. To specify the production Yukawa coupling \(Y_{\mu \mu }\) we set \(M^\nu _{\mu \mu } =0.05~{\hbox {eV}};\) this choice satisfies the cosmological bound \(\sum m_\nu < 0.12~{\hbox {eV}}\) [67]. Were it not for this bound, we could go to larger \(M^\nu _{\mu \mu } \) values and increase the \(\mu \)TRISTAN cross sections without changing the LFV bound.

The cross section \(\sigma (\mu ^+\mu ^+\rightarrow \ell ^+_\alpha \ell ^+_\beta )\) scales with \(|M^\nu _{\mu \mu }|^2|M^\nu _{\alpha \beta }|^2,\) at least away from the resonance. The on-shell produced \(\Delta ^{++}\) has decay rates into charged leptons proportional to \(|M^\nu _{\alpha \beta }|^2.\) Our current lack of information about the lightest neutrino mass and the CP phases precludes us from making definite predictions for these final states, but this will improve with future neutrino data [118]. Generically, we expect final states with more muons and tauons than electrons at \(\mu \)TRISTAN from \(\Delta ^{++}\) processes for normal-ordered neutrino masses. Diboson decays \(\Delta ^{++}\rightarrow W^+ W^+\) are heavily suppressed by \(v_\Delta \) in our region of interest [119,120,121,122]. Similarly, the cascade decays of \(\Delta ^{++}\) involving neutral or singly-charged scalars depend on the choice of mass spectrum and can be ignored here.

Unlike for the doubly charged scalars in the Zee–Babu or cocktail models, the \(\Delta ^{++}\) in the triplet model cannot generate clean \(\mu ^+ e^-\rightarrow \ell ^+ \ell '^-\) signatures in \(\mu \)TRISTAN’s first run, since this region of parameter space is already excluded by \(\mu \rightarrow e\gamma \) (Fig. 5).

3.5 Other neutrino mass models

The \(\mu ^+\mu ^+\) mode of \(\mu \)TRISTAN will also be uniquely sensitive to the LNV/LFV signatures arising from other neutrino mass models. For instance, the heavy neutral leptons appearing in type-I [123,124,125,126,127] and type-III [128] seesaw models will induce a clean LNV signal \(\mu ^+\mu ^+\rightarrow W^+W^+\rightarrow \textrm{jets},\) which is the muon-sector analog [21] of the inverse neutrinoless double beta decay \(e^-e^-\rightarrow W^-W^-\) [129,130,131,132]. This channel has recently been analyzed in Refs. [27, 133], so we will not repeat this analysis here. Similarly, the \(\mu \)TRISTAN sensitivities for the neutral and/or doubly-charged scalars derived here can also be applied to other models, such as the left–right symmetric model [134,135,136] and other radiative neutrino mass models [58], although the connection to neutrino mass may not be as direct as in the models studied here.

4 Conclusion

Neutrino masses provide the most convincing laboratory evidence for physics beyond the SM, making searches for the underlying new particles highly motivated. In this article, we have shown that \(\mu ^+ e^-\) and \(\mu ^+\mu ^+\) colliders in the vein of the recently proposed \(\mu \)TRISTAN experiment offer a new way to search for a variety of neutrino mass models. As exemplified by several benchmark scenarios of the popular Zee, Zee–Babu, cocktail, and triplet seesaw models, we showed that \(\mu \)TRISTAN could probe regions of parameter space that are out of reach of other experiments, be it future hadron colliders or future low-energy LFV searches.