1 Introduction

The observed neutrino mass matrix \([m_\nu ]\) [1] requires lepton flavour-changing New Physics beyond the Standard Model (SM). Observing lepton flavour-changing processes other than neutrino oscillations would therefore give complementary information on New Physics in the lepton sector.

Flavour-changing contact interactions among charged leptons (which we refer to as LFV) have not yet been observed, but upcoming experiments aim to improve the sensitivity to a few processes in the \(\mu \rightarrow e\) sector by orders of magnitude, and to probe a wide palette of \(\tau \rightarrow l\) processes with lesser sensitivity. This is summarised in Table 1; it suggests that LFV could be discovered in \(\mu \rightarrow e\), while \(\tau \rightarrow l\) is more promising for distinguishing among models. For a review of \(\mu \rightarrow e\) LFV, see e.g. [2].

The aim of this project is to explore what can be learned about New Physics in the lepton sector from observations of \(\mu \rightarrow e \gamma ,\mu \rightarrow e {\bar{e}} e\) and/or \(\mu A \rightarrow \! eA \). For instance, it would be ideal if the data could indicate properties of the New Physics model, such as whether new particles interact with lepton doublets or singlets or both, whether LFV occurs among SM particles at loop or tree level, or whether LFV is related to \([m_\nu ]\), baryogenesis or New Physics in the quark flavour sector.

In order to quantify what \(\mu \rightarrow e\) data could tell us about models, we study these questions in a bottom-up EFT approach, assuming \(\Lambda _{NP}\approx \) TeV. We translate the data from the experimental scale to \(\Lambda _{NP}\) using EFT, then match three “representative” models to the experimentally allowed Wilson coefficient space, and explore which differences among the models can be identified by the data.

Model predictions for LFV have been widely studied; in particular, there is a large literature devoted to calculating LFV rates in neutrino mass models (for a review, see e.g. [3, 4]; or for example [5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39]) and other Standard Model extensions such as leptoquarks [40,41,42,43,44,45,46,47,48,49]. We hope that our bottom-up EFT approach can give a complementary perspective on the well-studied relations between models and observables. Our study differs from top-down analyses in two ways. Firstly, we suppose that upcoming \(\mu \rightarrow e\) experiments will measure 12 Wilson coefficients, and not just three rates. So we are using a more optimistic/futuristic parametrisation of the LFV observables, using as input everything they could tell us. This allowed us to show in a previous publication [50] that the three models we consider could be ruled out by upcoming data. Secondly, model studies frequently scan over the model parameter space; this allows one to estimate correlations among LFV observables, but the results depend on the choice of measure on model parameter space. We circumvent the issue of measure and the need to scan by parametrising the models in terms of “Jarlskog-like” invariants [51], which contribute to the observables. Also, the operator coefficients are allowed to be complex, which is consistent with the hints of leptonic CP violation in neutrino observations [52].

This manuscript gives more complete results than presented in [50] – in particular highlighting the appearance of model “invariants” in the coefficients at the experimental scale, and explores the impact of some complementary observables on the model predictions for \(\mu \rightarrow e\). Section 2 reviews \(\mu \rightarrow e\) observables and the EFT formalism implemented here, then Sect. 3 summarises the three TeV-scale models that we consider, which are the type II seesaw [19,20,21,22], an inverse seesaw [29,30,31], and a scalar leptoquark [40,41,42,43,44,45,46,47,48,49] which can fit the \(R_D\) anomaly [53,54,55,56,57]. The matching of the models to the EFT is relegated to Appendix C. Section 4 gives the twelve observable Wilson coefficients at the experimental scale (including the coefficients for \(\mu A \rightarrow \! eA \) on heavy targets, missing in [50]), expressed in terms of “invariant” combinations of model and SM parameters. Section 5 explores the interplay of \(\mu \rightarrow e\) flavour change with some complementary observables in the models we consider. Then we discuss what we learned about bottom-up reconstruction in Sect. 6, and conclude in Sect. 7.

2 Observables and notation

In a bottom-up perspective, one starts from data and how to parametrise it, which is reviewed in Sect. 2.1. Section 2.2 reviews our EFT formalism; it allows us to obtain expressions for the observable Wilson coefficients in terms of operator coefficients at the weak scale, which are given in Appendix B.

2.1 Observables

Some models considered here generate a Majorana mass matrix for the three light neutrinos. At energies below the weak scale (taken \(\simeq m_W\)), it can be included in the Lagrangian as [1]

$$\begin{aligned} \delta \mathcal{L}_{<m_W}= - \overline{\nu _{\alpha }}\frac{[m_\nu ]^{\alpha \beta }}{2}\nu ^c_\beta +h.c. \end{aligned}$$
(2.1)

where Greek indices indicate the charged lepton mass eigenstate basis, square brackets indicate a matrix, and \([m_\nu ]_{\alpha \beta } = U_{\alpha i} m_i U_{\beta i}\) for \( m_i\) real and positive. The leptonic mixing matrix U is parameterised as in [1] by three mixing angles, one “Dirac” phase and two “Majorana” phases. In our study, we approximate the Dirac phase as \(\delta \approx 3\pi /2\) [58], and take the Majorana phases as free parameters. The Majorana phases and the lightest neutrino mass \(m_{min}\) affect the elements of \([m_\nu ]\), but not the flavour-changing elements of

$$\begin{aligned} {[}m_\nu m^\dagger _\nu ]_{\alpha \beta } = [U D_mD_m U^\dagger ]_{\alpha \beta } - m_j^2 [U U^\dagger ]_{\alpha \beta } \end{aligned}$$
(2.2)

for \(\alpha \ne \beta \in \{e,\mu ,\tau \}\) and \(D_m = \textrm{Diag}\, (m_1,m_2, m_3)\), because the Majorana phases cancel and the \(|m_i^2 -m_j^2|\) are determined in neutrino oscillations.
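The statement around Eq. (2.2) can be checked numerically. The sketch below builds U in the standard parametrisation and compares the \(e\mu \) elements of \([m_\nu ]\) and \([m_\nu m^\dagger _\nu ]\) for two choices of the Majorana phases and \(m_{min}\). The mixing angles and mass-squared differences are illustrative round numbers (assumptions, not values taken from this text), normal ordering is assumed, and the function names are hypothetical.

```python
import cmath
import math

# Illustrative inputs (assumptions): angles in radians, mass splittings in eV^2
TH12, TH13, TH23 = 0.583, 0.150, 0.855
DELTA = 1.5 * math.pi            # Dirac phase, as approximated in the text
DM21_SQ, DM31_SQ = 7.4e-5, 2.5e-3


def mixing_matrix(alpha1, alpha2):
    """U = V(th12, th13, th23, delta) * diag(1, e^{i alpha1/2}, e^{i alpha2/2})."""
    s12, c12 = math.sin(TH12), math.cos(TH12)
    s13, c13 = math.sin(TH13), math.cos(TH13)
    s23, c23 = math.sin(TH23), math.cos(TH23)
    ed = cmath.exp(1j * DELTA)
    v = [
        [c12 * c13, s12 * c13, s13 / ed],
        [-s12 * c23 - c12 * s23 * s13 * ed, c12 * c23 - s12 * s23 * s13 * ed, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ed, -c12 * s23 - s12 * c23 * s13 * ed, c23 * c13],
    ]
    phases = [1.0, cmath.exp(0.5j * alpha1), cmath.exp(0.5j * alpha2)]
    return [[v[a][i] * phases[i] for i in range(3)] for a in range(3)]


def nu_matrices(m_min, alpha1, alpha2):
    """Return ([m_nu], [m_nu m_nu^dagger]) in eV and eV^2 (normal ordering assumed)."""
    masses = [m_min, math.sqrt(m_min**2 + DM21_SQ), math.sqrt(m_min**2 + DM31_SQ)]
    u = mixing_matrix(alpha1, alpha2)
    idx = range(3)
    # [m_nu]_{ab} = U_{ai} m_i U_{bi}
    m = [[sum(u[a][i] * masses[i] * u[b][i] for i in idx) for b in idx] for a in idx]
    # [m m^dagger]_{ab} = sum_c m_{ac} conj(m_{bc})
    mmd = [[sum(m[a][c] * m[b][c].conjugate() for c in idx) for b in idx] for a in idx]
    return m, mmd


m_a, mmd_a = nu_matrices(m_min=0.0, alpha1=0.0, alpha2=0.0)
m_b, mmd_b = nu_matrices(m_min=0.05, alpha1=1.0, alpha2=2.0)
# e-mu element: differs in [m_nu], agrees in [m_nu m_nu^dagger] up to rounding
print(abs(m_a[0][1] - m_b[0][1]), abs(mmd_a[0][1] - mmd_b[0][1]))
```

The second printed number should vanish up to floating-point rounding, while the first does not, which is the cancellation described in the text.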

Lepton flavour changing processes can be broadly classified as: those that change lepton flavour by two units such as muonium–anti-muonium oscillations, those that change both lepton and quark flavour by one unit such as \(K\rightarrow \mu ^\pm e^\mp \), and those that change lepton flavour by one unit and are otherwise flavour diagonal. The three classes are independent under Renormalisation Group running below the scale of the new particles responsible for LFV (footnote 1). We focus on \(\mu \rightarrow e\) interactions from the last class, which are otherwise flavour diagonal. This is motivated by the very restrictive bounds on a handful of processes in this sector, for which the experimental sensitivity is planned to improve by several orders of magnitude in the next few years (see Table 1). This differs from \(\tau \rightarrow l\) \((l\in \{ \mu ,e\})\) processes, where the upper bound on the branching ratio for a large variety of LFV \(\tau \) decays is currently of order \( 10^{-8}\), and Belle II aims for sensitivities \(\mathcal{O} (10^{-9}\rightarrow 10^{-10})\). So from an EFT perspective, the \(\tau \rightarrow l\) sector allows one to independently measure almost every Wilson coefficient up to a New Physics scale TeV (at tree level) [60]; whereas in the \(\mu \rightarrow e\) sector, fewer coefficients are probed at greater accuracy, motivating the use of Renormalisation Group Equations (RGEs) in the \(\mu \rightarrow e\) sector.

Table 1 Some \(\mu \rightarrow e\) processes, with current experimental bounds, and the estimated reach of upcoming (and future) experiments. For \(\mu \rightarrow e \gamma \), MEG II [63] aims to reach \(BR \sim 6\times 10^{-14}\). The Mu3e experiment [65] aims for \(BR(\mu \rightarrow e {\bar{e}} e) \sim 10^{-14}\) with the current beam and data-taking starting in 2025, then \(\sim 10^{-16}\) with a beam upgrade [77]. The COMET [67] and Mu2e [68, 69] experiments will search for \(\mu \rightarrow e\) conversion on light nuclei where the \(\mu \)-A bound state is long-lived; a different experimental approach, e.g. the PRISM/PRIME [71, 73] or ENIGMA [72] proposals would be required to improve also the sensitivity to heavy nuclei with shorter lifetimes. Some \(\tau \rightarrow l\) processes are included for comparison, with Belle II expectations for a luminosity of 50 ab\(^{-1}\) [75]

The three processes we consider are \(\mu \rightarrow e \gamma \), \(\mu \rightarrow e {\bar{e}} e\) and \(\mu A \rightarrow \! eA \). In the latter, a \(\mu ^-\) is captured by a nucleus, where it can transform into an electron via various interactions; we restrict to those which are coherent across the nucleus, or “spin-independent” (footnote 2; some details are given in Appendix A). At the experimental scale, these three processes can be parametrised by the dimensionless operator coefficients \(\{C\}\) of the following Lagrangian [2, 79,80,81]:

$$\begin{aligned} \delta \mathcal{L}_{<m_W}= & {} \frac{1}{v^2}\sum _{X\in \{L,R\}}\Big ( C^{e\mu }_{D,X} (m_\mu {\overline{e}} \sigma ^{\alpha \beta }P_{X} \mu ) F_{\alpha \beta }\nonumber \\{} & {} + C^{e\mu ee}_{S,XX} ({\overline{e}} P_X \mu ) ({\overline{e}} P_X e )\nonumber \\{} & {} + C^{e\mu ee}_{V,LX} ({\overline{e}} \gamma ^\alpha P_L \mu ) ({\overline{e}} \gamma _\alpha P_X e ) \nonumber \\{} & {} + C^{e\mu ee}_{V,RX} ({\overline{e}} \gamma ^\alpha P_R \mu ) ({\overline{e}} \gamma _\alpha P_X e )\nonumber \\{} & {} + C^{e\mu }_{Al,X} \mathcal{O}_{Al,X} + C^{e\mu }_{Au\perp ,X} \mathcal{O}_{Au \perp ,X} \Big ) + h.c. \end{aligned}$$
(2.3)

where \(\mathcal{O}_{Al} \) is the combination of operators contributing to spin-independent (SI) \(\mu \rightarrow e\) conversion on light targets (footnote 3) such as Titanium (used by SINDRUM [66, 70]) or Aluminium (to be used by COMET [67] and Mu2e [68, 69]), and \( \mathcal{O}_{Au\perp }\) is an independent combination probed by heavy targets such as Gold (used by SINDRUM [70]). At the experimental scale, these operators would describe \(\mu \rightarrow e\) interactions with nucleons, which can be matched to operators involving quarks at a scale \(\sim \) 2 GeV. We use the quark basis here, because it is more convenient for comparing to models. At 2 GeV, the operators are (footnote 4)

$$\begin{aligned}{} & {} \mathcal{O}_{Al,X} \simeq ({\overline{e}} P_X \mu ) \Big ( 0.692 ({\overline{u}} u) + 0.699 ({\overline{d}} d) + 0.0341 ({\overline{s}} s)\nonumber \\{} & {} + 0.00440 ({\overline{c}} c)+ 0.00128 ({\overline{b}} b) \Big ) \nonumber \\{} & {} + ({\overline{e}} \gamma ^\alpha P_X \mu ) \Big ( 0.125 ({\overline{u}} \gamma _\alpha u) + 0.128 ({\overline{d}} \gamma _\alpha d) \Big ) \end{aligned}$$
(2.4)
$$\begin{aligned}{} & {} \mathcal{O}_{Au,X} \simeq \cos \theta _A \mathcal{O}_{Al,X} + \sin \theta _A\mathcal{O}_{Au\perp ,X} \end{aligned}$$
(2.5)
$$\begin{aligned}{} & {} \mathcal{O}_{Au\perp ,X} \simeq -({\overline{e}} P_X \mu ) \Big ( 0.2 ({\overline{u}} u) +0.1 ({\overline{d}} d)+ 0.008({\overline{s}} s)\nonumber \\{} & {} +0.001({\overline{c}} c)+ 0.0003({\overline{b}} b) \Big ) \nonumber \\{} & {} + ({\overline{e}} \gamma ^\alpha P_X \mu ) \Big ( 0.56 ({\overline{u}} \gamma _\alpha u) + 0.8 ({\overline{d}} \gamma _\alpha d) \Big ) \end{aligned}$$
(2.6)

where \(\theta _A\) is the misalignment angle between the vectors in operator space corresponding to \(\mu \rightarrow e\) conversion on Gold and Aluminium, and Appendix A briefly reviews these results.
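As a rough consistency check on Eqs. (2.4) and (2.6), the quoted coefficients can be treated as components of vectors in the quark-operator basis. The sketch below computes their norms and overlap; using a plain Euclidean inner product in this basis is an assumption made here for illustration (the actual construction is in Appendix A), and the names are hypothetical.

```python
import math

# Coefficients copied from Eqs. (2.4) and (2.6), ordered as
# scalar (u, d, s, c, b) followed by vector (u, d)
O_AL = [0.692, 0.699, 0.0341, 0.00440, 0.00128, 0.125, 0.128]
O_AU_PERP = [-0.2, -0.1, -0.008, -0.001, -0.0003, 0.56, 0.8]


def dot(a, b):
    """Euclidean inner product (an assumed metric on coefficient space)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cos_angle(a, b):
    """Cosine of the angle between two coefficient vectors."""
    return dot(a, b) / (norm(a) * norm(b))


print(norm(O_AL), norm(O_AU_PERP), cos_angle(O_AL, O_AU_PERP))
```

With these inputs both vectors come out close to unit length and their overlap is at the level of a few percent, which is what one would expect for an approximately orthonormal pair whose second member is quoted to one or two significant figures.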

The Branching Ratios (BRs) can be written as

$$\begin{aligned} BR(\mu \rightarrow e \gamma )= & {} 384 \pi ^2 (|C^{e\mu }_{D L}|^2 + |C^{e\mu }_{D R}|^2) \end{aligned}$$
(2.7)
$$\begin{aligned} BR(\mu \rightarrow e {\bar{e}} e)= & {} \frac{|C^{e\mu ee}_{S,LL}|^2+ |C^{e\mu ee}_{S,RR}|^2}{8} \nonumber \\{} & {} +2 |C^{e\mu ee}_{V,RR} + 4eC^{e\mu }_{D, L}|^2\nonumber \\{} & {} +2| C^{e\mu ee}_{V,LL} + 4eC^{e\mu }_{D,R}|^2 \nonumber \\{} & {} + (64 \ln \frac{m_\mu }{m_e} -136) (|eC^{e\mu }_{D,R}|^2\nonumber \\{} & {} +|eC^{e\mu }_{D,L}|^2)+ |C^{e\mu ee}_{V,RL} + 4eC^{e\mu }_{D, L}|^2\nonumber \\{} & {} + |C^{e\mu ee}_{V,LR} + 4eC^{e\mu }_{D, R} |^2 \end{aligned}$$
(2.8)
$$\begin{aligned} BR_{SI}(\mu Al \rightarrow e Al)= & {} B_{Al} ( |d_{Al} C^{e\mu }_{D R}\nonumber \\{} & {} +C_{Al,L}|^2 + |d_{Al} C^{e\mu }_{D L} + C_{Al, R}|^2) \nonumber \\ \end{aligned}$$
(2.9)
$$\begin{aligned} BR_{SI}(\mu Au \rightarrow e Au)= & {} B_{Au} (|d_{Au}C^{e\mu }_{D R}\nonumber \\{} & {} +C_{Au,L}|^2 + |d_{Au} C^{e\mu }_{D L} + C_{Au, R}|^2)\nonumber \\ \end{aligned}$$
(2.10)

where only the Spin Independent contribution to \(\mu A \rightarrow \! eA \) is included based on the results of [79] (we do not use the recent results of [82]), the \(B_A\) are target-nucleus dependent constants discussed in Appendix A, and \(d_A\equiv I_{A,D}/(4|\vec {u}_A|)\) are given after Eq. (A.16). In all cases, the outgoing electrons are approximated as chiral because they are relativistic. The final states containing electrons of different chirality therefore do not interfere, giving independent constraints on the coefficients generating those final states. The experimental limits on \(\mu \rightarrow e \gamma \) and \(\mu \rightarrow e {\bar{e}} e\) therefore constrain eight coefficients – the two dipoles and six four-lepton coefficients – to be in the vicinity of zero. The correlation matrix and resulting bounds are discussed in [83] and we repeat the bounds here for completeness:

$$\begin{aligned} |C_{D,X}|^2\le & {} \frac{B_{\mu \rightarrow e \gamma } }{ 384\pi ^2 + e^2(64\ln \frac{m_\mu }{m_e}- 136) \frac{B_{\mu \rightarrow e \gamma }}{ B_{\mu \rightarrow e {\bar{e}} e}} } \nonumber \\ |C_{V,XX}|^2\le & {} \frac{ B_{\mu \rightarrow e {\bar{e}} e}}{ 2}\left( 1 + \frac{32e^2B_{\mu \rightarrow e \gamma }}{ 205e^2B_{\mu \rightarrow e \gamma } +384\pi ^2 B_{\mu \rightarrow e {\bar{e}} e}}\right) \nonumber \\ |C_{V,XY}|^2\le & {} B_{\mu \rightarrow e {\bar{e}} e} \left( 1 + \frac{16e^2B_{\mu \rightarrow e \gamma }}{205e^2B_{\mu \rightarrow e \gamma } +384\pi ^2 B_{\mu \rightarrow e {\bar{e}} e}}\right) \nonumber \\ \end{aligned}$$
(2.11)

where \(B_{process}\) is the experimental upper bound on the Branching Ratio for the process, and \(e^2(64\ln \frac{m_\mu }{m_e}- 136)\) is written as \( 205e^2\). Although the discussion of [83] implicitly supposed the coefficients were real, the same bounds apply for complex coefficients, because the real and imaginary components of the coefficients are constrained to lie inside the same ellipse around the origin.
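Equations (2.7), (2.8) and (2.11) are straightforward to evaluate. The following sketch implements them as written; the value of \(\alpha _{em}\), the mass ratio \(m_\mu /m_e\), and the two example branching-ratio limits are illustrative inputs supplied here (assumptions, since Table 1 is not reproduced in this section), and all function names are hypothetical.

```python
import math

ALPHA_EM = 1 / 137.036                 # assumed low-energy value
E2 = 4 * math.pi * ALPHA_EM            # e^2
LOG_TERM = 64 * math.log(206.768) - 136  # 64 ln(m_mu/m_e) - 136, about 205


def br_mu_to_e_gamma(c_d_l, c_d_r):
    """Eq. (2.7)."""
    return 384 * math.pi**2 * (abs(c_d_l)**2 + abs(c_d_r)**2)


def br_mu_to_3e(c_s_ll=0, c_s_rr=0, c_v_ll=0, c_v_rr=0,
                c_v_lr=0, c_v_rl=0, c_d_l=0, c_d_r=0):
    """Eq. (2.8); coefficients may be complex."""
    e = math.sqrt(E2)
    return ((abs(c_s_ll)**2 + abs(c_s_rr)**2) / 8
            + 2 * abs(c_v_rr + 4 * e * c_d_l)**2
            + 2 * abs(c_v_ll + 4 * e * c_d_r)**2
            + LOG_TERM * E2 * (abs(c_d_r)**2 + abs(c_d_l)**2)
            + abs(c_v_rl + 4 * e * c_d_l)**2
            + abs(c_v_lr + 4 * e * c_d_r)**2)


def coefficient_bounds_sq(b_meg, b_3e):
    """Eq. (2.11): upper bounds on |C|^2 from the two experimental limits."""
    denom = LOG_TERM * E2 * b_meg + 384 * math.pi**2 * b_3e
    return {
        "D": b_meg / (384 * math.pi**2 + LOG_TERM * E2 * b_meg / b_3e),
        "V_XX": b_3e / 2 * (1 + 32 * E2 * b_meg / denom),
        "V_XY": b_3e * (1 + 16 * E2 * b_meg / denom),
    }


# Example limits (assumed inputs of the order of published bounds)
bounds = coefficient_bounds_sq(b_meg=4.2e-13, b_3e=1.0e-12)
for name, value in bounds.items():
    print(name, math.sqrt(value))
```

Two limiting cases serve as sanity checks: for a very weak \(\mu \rightarrow e {\bar{e}} e\) limit the dipole bound reduces to \(B_{\mu \rightarrow e \gamma }/384\pi ^2\), and for a vanishing dipole a single vector coefficient at its bound saturates \(B_{\mu \rightarrow e {\bar{e}} e}\).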

The dipole coefficient is not constrained by \(\mu A \rightarrow e A\) because it contributes in interference, but the bound on \(|C_{Al,X}|^2 \) (and in principle \(|C_{Au\perp ,X}|^2 \)) is affected in the expected way by the independent bounds on the dipole:

$$\begin{aligned} |C_{Al,X}|^2 \le \frac{B_{\mu Al \rightarrow e Al}}{B_{Al}} + \frac{d_{Al}^2B_{\mu \rightarrow e \gamma } }{ 384\pi ^2}. \end{aligned}$$

In this work, we do not consider future improvements in the experimental reach for \(\mu A \rightarrow \! eA \) on heavy targets, so the current MEG bound on the dipole coefficient ensures that it is negligible in \(\mu Au \rightarrow e Au\). The current bound on Gold therefore implies

$$\begin{aligned} |C_{Au\perp ,X}|^2 \le \frac{{B}_{\mu Au \rightarrow e Au} }{B_{Au} \sin ^2\theta _A}+ \frac{\cos ^2\theta _A {B}_{\mu Al \rightarrow e Al}}{B_{Al} \sin ^2\theta _A} \end{aligned}$$
(2.12)

where \(\theta _A\) is the misalignment angle between the vectors in operator space corresponding to \(\mu \rightarrow e\) conversion on Gold and Aluminium, given after Eq. (A.15).

The current bounds on the twelve observable coefficients are given in Table 2, as well as the bounds that could be set, if upcoming experiments do not observe \(\mu \rightarrow e\) processes. The \(\mu A \rightarrow \! eA \) bounds given here differ from our previous paper [50] because \(\mathcal{O}_{Au\perp ,X}\) is here defined in terms of operators with quarks, so differs from the nucleon definition of [50]. The matching of quarks with nucleons is discussed in Appendix A.

Table 2 Current and possible future bounds on the observable coefficients, with \(X,Y \in \{L,R\}, X\ne Y\). The current bound on \(C_{Al,X}\) is from \(\mu \textrm{Ti}\rightarrow e \textrm{Ti}\) [66], and the future bound assumes BR( \(\mu \textrm{Al}\rightarrow e \textrm{Al}) \le 10^{-16}\). The current and future bounds on \(C_{Au\perp ,X}\) are identical because we do not include the reach of the proposed PRISM/PRIME and ENIGMA experiments among our upcoming results

In this theoretical study, we optimistically consider that these twelve coefficients can be measured with the same reach with which they can be constrained, as given in Table 2. Polarising the muon could allow one to distinguish between coefficients of operators with an L vs an R projector in the \({\bar{e}}-\mu \) bilinear in \(\mu \rightarrow e \gamma \) and \(\mu \rightarrow e {\bar{e}} e\) [2] as well as in \(\mu A \rightarrow \! eA \) [84, 85]. In the case of \(\mu \rightarrow e {\bar{e}} e\), the angular distributions of the three-particle final state allow one to determine the magnitude of various coefficients and some phases, as discussed in [86,87,88]. Vector operators \(\mathcal{O}_{VXX}\) and scalar operators \(\mathcal{O}_{SYY}\) (for \(Y \ne X\)) induce the same angular distributions, but could in principle be distinguished by measuring final state helicities [86,87,88]. Our models do not generate the scalar operators within the reach of upcoming experiments (see Sect. 5.3).

2.2 Bottom-up EFT

For an introduction to EFT, see for instance [89,90,91,92,93,94,95,96]. Our EFT consists of an operator basis and Renormalisation Group Equations (RGEs) to look after scale evolution. The previous section presented an operator basis describing the observables at low energy; a more complete set of operators is required in translating the observables from the experimental to the New Physics scale, because the RGEs mix operators.

Our New Physics scale is at \(\Lambda _{NP}\approx \) TeV, so we need a QED\(\times \)QCD invariant basis of operators below the weak scale, included in the Lagrangian as

$$\begin{aligned} \delta \mathcal{L}_{<m_W}= & {} - \overline{\nu _{\alpha }}\frac{[m_\nu ]^{\alpha \beta }}{2}\nu ^c_\beta + \sum _{n \ge 2} \sum _{A,\zeta }\frac{C_A^{\zeta } {\mathcal {O}}_A^{(4+n)\zeta }}{v^n} + h.c. \nonumber \\ \end{aligned}$$
(2.13)

where \(v = 174\) GeV, the superscript \(\zeta \) gives the flavour indices, and the operator subscripts indicate the Lorentz structure and particle content. So for instance

$$\begin{aligned} {\mathcal {O}}_{V,XY}^{e\mu uu} = ({\overline{e}} \gamma ^\alpha P_X \mu ) ({\overline{u}} \gamma _\alpha P_Y u),~~~X,Y \in \{L,R\}. \end{aligned}$$

We write the SM Yukawa matrices as \([Y_e], [Y_u]\) and \( [Y_d]\), and the Yukawa eigenvalue of fermion f as \(y_f\).

As discussed in Appendix C1, we did not find a simple and reliable recipe to express the coefficients of the Lagrangian (2.13) in terms of model parameters, because a TeV is not so far from the weak scale. The SMEFT operator basis [97, 98] and RGEs [99, 100] are appropriate above the weak scale, but the EFT expansions (in operator dimension and loop\(\times \)logarithm) do not converge quickly. So it is a poor approximation to match to dimension six SMEFT operators, run to the weak scale at one loop, then match to the Lagrangian (2.13). In principle, we could match the model directly to the Lagrangian (2.13), but that requires explicitly calculating all the loop diagrams between the TeV and the weak scale. Finally, we opt to present the matching results in the basis of Eq. (2.13), but at the TeV-scale (such that the RGEs include the QED and QCD loops). So we will not be using the SMEFT basis, but for comparing to the literature, we implement it as

$$\begin{aligned} \delta \mathcal{L}_{SMEFT}= & {} - \frac{C_5^{\alpha \beta }}{2v} (\overline{\ell _{\alpha }} \varepsilon H^*)(\ell ^c_{\beta } \varepsilon H^*) + \sum _{A,\zeta }\frac{C_A^{\zeta } {\mathcal {O}}_A^{\zeta }}{v^2} + h.c.\nonumber \\ \end{aligned}$$
(2.14)

where \([m_\nu ]^{\alpha \beta } = C_5^{\alpha \beta }v \).

The low energy EFT includes all LFV operators of dimension six, and some relevant operators of dimension seven (see [80] for a list). In this low-energy EFT, we ensure that operators appear only once by requiring \(\zeta = e \mu \ldots \). Operators of dimension five and six (\(n= 1,2\)) are included in SMEFT, but without the \(+h.c.\) for hermitian operators. In SMEFT, we follow the convention that each flavour index in \(\zeta \) runs over all three generations (so some operators are repeated in the SMEFT Lagrangian, causing some factor of 2 or 4 differences between coefficients in SMEFT vs the low-energy EFT).

The doublet and singlet leptons are in the charged lepton mass eigenstates \(\{e,\mu ,\tau \}\) [59], which can differ from the diagonal-Yukawa basis as discussed in Appendix C2. The singlet quarks are labelled by their flavour, and the quark doublets are in the u-type mass basis, with generation indices that run \(1\rightarrow 3\).

In quantum field theories, the coefficients of renormalisable and non-renormalisable operators evolve with scale. For non-renormalisable operators, this evolution of the operator coefficients, lined up in a row vector \(\vec {C}\), can be described as

$$\begin{aligned} \mu \frac{d}{d\mu } \vec {C} \equiv \frac{d}{d t} \vec {C} = \vec {C} [\Gamma ] + \vec {C}[\vec {X}] \vec {C}^\dagger + \cdots \end{aligned}$$
(2.15)

where \(t =\ln \mu \), the effects of renormalisable interactions on the non-renormalisable operators are described by the matrix \([\Gamma ]\), and \([\vec {X}]\) is a three-index tensor that schematically represents the effect of non-renormalisable interactions on the evolution of other non-renormalisable interactions (for instance, the mixing of a pair of four-fermion operators into another four-fermion operator via a fish diagram).

The RGEs we implement automatically for the QED\(\times \) QCD-invariant EFT are an improved leading log approximation to \([\Gamma ]\) [101], where most of the anomalous dimensions are at one loop, augmented by the two-loop vector to dipole mixing (because this mixing vanishes at one loop and is \(\mathcal{O} (10^{-3})\) [102]). We also consider the mixing of two dimension six operators into a dimension eight operator described by \([\vec {X}]\), and when relevant, include these contributions by hand in matching the model to EFT at \(\Lambda _{NP}\). Since the dimension eight operators have two additional Higgs legs, they are particularly relevant when the Higgs has \(\mathcal{O} (1)\) couplings to loop particles, such as the top quark in the leptoquark model. These dimension eight contributions are discussed more fully in Sect. 5.3 and Appendix C.

Solving the RGEs allows one to translate operator coefficients from the experimental scale to the New Physics scale, where the coefficients can be calculated in a model. We aim to resum the QCD running, and include the electroweak effects in perturbation theory (\(\mathcal{O}(\alpha \ln )\) and occasionally \(\mathcal{O}(\alpha ^2 \ln ^2)\)). If the matrix \([\Gamma ]\) were scale-independent, then the solution would be \(\sim \exp \Gamma t\); however, \([\Gamma ]\) describes SM loop corrections, and SM parameters run in various ways with scale. For \([\Gamma ]\) scale-dependent, the RGEs can be solved by scale-ordering, analogous to the familiar time-ordering that allows one to solve a similar equation for the time-translation operator in quantum field theory. We include the scale-dependence of \(\alpha _s\) (at one loop), and the associated running of quark Yukawas, but neglect the running due to other SM couplings (including the top Yukawa). So the effect of QCD is to renormalise some coefficients and rescale some electroweak anomalous dimensions, such that the first terms in the perturbative solution of the RGE (2.15) are

$$\begin{aligned} C_J(m )\simeq & {} C_J(\Lambda ) \eta _m^{a_J} - C_K(\Lambda ) \eta _m^{a_K} {\widetilde{\Gamma }}_{KJ} \ln \frac{\Lambda }{m} \end{aligned}$$
(2.16)

where \({\widetilde{\Gamma }}_{KJ} = f_{KJ} \Gamma _{KJ} (m)\) [103, 104] with \(\Gamma \) the electroweak anomalous dimension matrix, no sum on JK and:

$$\begin{aligned} f_{KJ}= & {} \frac{( 1 - \eta _m^{a_J-a_K -a_I+1 }) }{(1 +a_J - a_K-a_I)(1-\eta _m)},~~~\eta _m = \frac{\alpha _s(m(m))}{\alpha _s(\Lambda _{NP})},\nonumber \\ ~~~ \eta= & {} \frac{\alpha _s(2\textrm{GeV})}{\alpha _s(\Lambda _{NP})} \end{aligned}$$
(2.17)

where \(a_{J} = -\gamma ^s_{J}/2\beta _0\) for \(\alpha _s\gamma ^s_{J}/4\pi \) the QCD anomalous dimension of the coefficients \(C_{J}\) (\(a_T=-4/23\), \(a_S= a_D = 12/23\) with 5 flavours), \(\beta _0\) is the 1-loop QCD beta function coefficient, and the parameters in the electroweak anomalous dimension cause it to run as \(\Gamma _{KJ}(m) = \eta ^{a_I} \Gamma _{KJ}(\Lambda )\). Finally, once the RGEs have been solved, a coefficient at the experimental scale (for instance \(C_{D,X} (m_\mu )\)), can be written at \(\Lambda _{NP}\) as a weighted sum of the coefficients of operators that can contribute via loops to \(\mu \rightarrow e \gamma \).
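The ingredients of Eqs. (2.16) and (2.17) can be sketched numerically as follows. This is a minimal illustration, assuming one-loop running of \(\alpha _s\) with a fixed number of flavours (no threshold matching), a reference value \(\alpha _s(m_Z)=0.118\), and the standard one-loop QCD anomalous dimensions \(\gamma ^s_S=-8\) and \(\gamma ^s_T=8/3\), which reproduce the \(a_S\) and \(a_T\) quoted above. None of these inputs or function names are taken from the text.

```python
import math


def beta0(n_f):
    """One-loop QCD beta-function coefficient."""
    return 11 - 2 * n_f / 3


def alpha_s_one_loop(mu, alpha_ref=0.118, mu_ref=91.19, n_f=5):
    """One-loop alpha_s(mu), scales in GeV, fixed n_f (assumption)."""
    return alpha_ref / (1 + beta0(n_f) / (2 * math.pi) * alpha_ref * math.log(mu / mu_ref))


def eta(m, lam):
    """eta_m = alpha_s(m) / alpha_s(Lambda_NP), as in Eq. (2.17)."""
    return alpha_s_one_loop(m) / alpha_s_one_loop(lam)


def a_from_gamma(gamma_s, n_f=5):
    """a_J = -gamma^s_J / (2 beta_0)."""
    return -gamma_s / (2 * beta0(n_f))


def f_kj(eta_m, a_j, a_k, a_i):
    """QCD rescaling factor of Eq. (2.17), with its removable singularities handled."""
    p = 1 + a_j - a_k - a_i
    if abs(eta_m - 1) < 1e-12:      # no QCD running: the factor tends to 1
        return 1.0
    if abs(p) < 1e-12:              # limit p -> 0 of (1 - eta^p) / p is -ln(eta)
        return math.log(eta_m) / (eta_m - 1)
    return (1 - eta_m**p) / (p * (1 - eta_m))


def c_low(c_j, c_k, eta_m, a_j, a_k, gamma_tilde_kj, lam, m):
    """First terms of Eq. (2.16) for a single mixing partner K."""
    return c_j * eta_m**a_j - c_k * eta_m**a_k * gamma_tilde_kj * math.log(lam / m)


a_s = a_from_gamma(-8.0)     # scalar and dipole: 12/23
a_t = a_from_gamma(8 / 3)    # tensor: -4/23
print(a_s, a_t, eta(2.0, 1000.0))
```

Because thresholds are ignored, the value of \(\eta \) obtained this way is only indicative; the point of the sketch is the structure of \(f_{KJ}\), which reduces to 1 when the QCD exponents cancel.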

It can be shown [80] that almost every operator involving 3 or 4 legs that induces a \(\mu \rightarrow e\) interaction (but no other flavour change) contributes to the amplitude for \(\mu \rightarrow e \gamma \), \(\mu \rightarrow e {\bar{e}} e\) and/or \(\mu A \rightarrow \! eA \) suppressed at most by a factor of order \(10^{-3}\). This suggests that the RGEs are relevant to include, because they ensure that a small handful of processes are sensitive to almost any \(\mu \rightarrow e\) operator. Although almost all operators can contribute, there remain only 12 constraints. Rather than dealing with a large correlation matrix corresponding to the usual operator basis, we use a scale-dependent basis that corresponds to the twelve experimentally probed directions. This was proposed in [80], and the recipe we follow is outlined in [81].

The RGEs can be solved to express an operator coefficient at the experimental scale, e.g. the dipole coefficient, in terms of operator coefficients at the NP scale:

$$\begin{aligned} C_D(m_\mu ) \simeq \vec {C} (\Lambda _{NP}) \cdot \vec {e}_D. \end{aligned}$$

The directions in coefficient space corresponding to the twelve vectors \(\{e_O\}\) then form a basis for the observable subspace at \(\Lambda _{NP}\). The elements of most of these vectors at or just above the electroweak scale are given in [80] (the coefficient combinations probed by \((\mu Al \rightarrow e Al)\) and \((\mu Au \rightarrow e Au)\) are given for completeness in Appendix B).

3 The models

The three TeV-scale New Physics models that we consider are the type II and inverse seesaw models, and a scalar leptoquark. These models are selected for their diverse (footnote 5) lepton-flavour-changing predictions: LFV is controlled by the neutrino mass matrix in the type II seesaw, is independent of the neutrino mass matrix in the inverse seesaw, and the leptoquark can mediate \(\mu A \rightarrow \! eA \) at tree level, as well as addressing anomalies in the quark flavour sector.

The type II seesaw model [19,20,21,22] is an economical neutrino mass model, where the SM particle content is extended with a colour-singlet, SU(2) triplet scalar \(\Delta \), of hypercharge \(Y=+1\) (in the normalization where the lepton doublets have \(Y=-1/2\)). The SM Lagrangian at the mass scale of the triplet is augmented by

$$\begin{aligned} \delta \mathcal{L}_\Delta= & {} (D_\rho \Delta ^I)^\dagger D^\rho \Delta ^I -M_\Delta ^2|\Delta |^2\nonumber \\{} & {} +\frac{1}{2}\left( f_{\alpha \beta }\,\overline{\ell ^c_\alpha } (i\tau _2)\tau _I \ell _\beta \Delta ^I +M_\Delta \lambda _H\, H^T (i\tau _2) \tau _I H \Delta ^{*I}+\mathrm{h.c.}\right) \nonumber \\{} & {} + \lambda _3 (H^\dagger H) (\Delta ^{I*} \Delta ^{I})+\lambda _4\textrm{Tr}(\Delta ^{I*}\tau _{I}\tau _{J}\tau _{K}\Delta ^{K})(H^\dagger \tau _J H) + \dots , \nonumber \\ \end{aligned}$$
(3.1)

where \(\ell \) are the left-handed SU(2) doublets, \(M_\Delta \) is the triplet mass which we take \(\sim \) TeV, f is a symmetric complex \(3\times 3\) matrix proportional to the light neutrino mass matrix and whose indices \( \alpha ,\beta \) run over \(\{e,\mu ,\tau \}\), \(\{\tau _I\}\) are the Pauli matrices, and the \(\lambda \)’s are real dimensionless couplings. A feature of this model, that is shared with some other neutrino mass models, is that the SU(2) singlet leptons do not acquire new interactions, so LFV is expected to involve the doublet leptons. The phenomenology of the type II seesaw has been widely studied at colliders [106,107,108,109,110,111] and for low-energy LFV [23,24,25,26,27,28, 112,113,114].

In Effective Field Theory, the neutrino mass matrix generated in the type II seesaw model can be obtained by matching, at \(M_\Delta \), the tree-level diagram in the left panel of Fig. 5 onto the dimension five neutrino mass operator appearing in Eq. (2.14). This gives a neutrino mass matrix

$$\begin{aligned} {[}m_\nu ]^{\alpha \beta }\simeq \frac{ [f]^{\alpha \beta *} \lambda _H v^2}{M_\Delta } \end{aligned}$$
(3.2)

which can also be obtained in the model by minimising the potential for the Higgs and the electrically neutral component of the triplet, \((\Delta ^1 + i\Delta ^2)/\sqrt{2}\), which obtains a vev \( -\lambda _H v^2/(\sqrt{2}M_\Delta )\). This model is reputed to be predictive for LFV, because the lepton flavour-changing couplings \(f^{\alpha \beta }\) are proportional to the light neutrino mass matrix. However, LFV is not suppressed by the small neutrino mass scale, because lepton number change involves \([f]^{\alpha \beta *} \lambda _H\), so for a sufficiently small Higgs to triplet coupling, \(\lambda _H \sim 10^{-12}\), it is possible to have \(M_\Delta \sim \) TeV and \(f^{\alpha \beta }\) of \(\mathcal{O}(1)\).
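The order-of-magnitude claim can be verified directly from Eq. (3.2). The short sketch below evaluates the neutrino mass scale for a single representative coupling f, using \(v=174\) GeV as in Sect. 2.2; the function names are hypothetical and flavour structure is ignored.

```python
V_GEV = 174.0  # Higgs vev in the normalisation used in the text


def type2_neutrino_mass_ev(f, lambda_h, m_delta_gev, v_gev=V_GEV):
    """|m_nu| ~ |f| lambda_H v^2 / M_Delta from Eq. (3.2), returned in eV."""
    return abs(f) * lambda_h * v_gev**2 / m_delta_gev * 1e9


def lambda_h_for_mass(m_nu_ev, f, m_delta_gev, v_gev=V_GEV):
    """Invert Eq. (3.2) for the Higgs-triplet coupling lambda_H."""
    return m_nu_ev * 1e-9 * m_delta_gev / (abs(f) * v_gev**2)


# O(1) coupling, TeV-scale triplet, lambda_H ~ 1e-12
print(type2_neutrino_mass_ev(f=1.0, lambda_h=1e-12, m_delta_gev=1000.0))
```

For these inputs the result is about 0.03 eV, which is of the order of the observed neutrino mass splittings.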

The second model we consider is the inverse (type I) seesaw [29,30,31], which, like the type II seesaw, naturally generates small Majorana neutrino masses from new particles that can be at the TeV-scale with \(\mathcal{O}(1)\) LFV couplings. In this model, n gauge singlet Dirac fermions, \(\Psi _a^T = (S_a, N_a)\), are added to the SM particle content, with approximately lepton number conserving interactions. Lepton number changing interactions can be included via Yukawa couplings and/or Majorana masses (see [115] for a “basis-independent” discussion of the options); we choose to allow small Majorana masses for \(S_a\) and write the Lagrangian as

(3.3)

where a, b run from 1 to n, \(N_a\) and \(S_a\) are respectively right- and left-handed and in the eigenbases of M, \(Y_\nu \) is a complex \(3\times n\) dimensionless matrix, we take \(M_a\) of \(\mathcal{O}\)(TeV), and \(\mu \) is an \(n\times n\) Majorana mass matrix with \(\mu _{ab}\ll M_c\). For vanishing \(\mu \), lepton number is conserved, and the \(N_a\) combine with the \(S_a\) into Dirac singlet neutrinos, which can have lepton flavour changing interactions \(Y_\nu \) with the SM doublets. Like the type II seesaw, this model is expected to induce LFV among doublet leptons, but unlike the type II seesaw, it has several non-degenerate heavy new particles, and will induce low energy LFV processes via loop diagrams because the flavour-changing \(Y_\nu \) couples \(\ell \) to two heavy particles.

The inverse seesaw models induce LFV [32,33,34,35,36,37,38,39, 116], they can lead to non-unitarity of the lepton mixing matrix [117,118,119,120], and the singlets could be discovered at colliders for suitable mass ranges [121,122,123,124,125,126,127]. With small \(\mu _{ab}\), the leading contribution to the active neutrino mass matrix is

$$\begin{aligned} {[}m_\nu ]^{\alpha \beta } \simeq [Y_\nu M^{-1} \mu M^{-1} Y_\nu ^T]^{\alpha \beta } v^2. \end{aligned}$$
(3.4)

So the flavour-changing \(Y_\nu \) can be \(\mathcal{O}\)(1) because small \(\mu _{ab}\) gives small \(m_\nu \), and the flavour change is expected to be independent of the active neutrino mass matrix as can be seen for n = 3 by solving Eq. (3.4) for \(\mu _{ab}\).
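The statement that Eq. (3.4) can be solved for \(\mu _{ab}\) when n = 3 can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: the Yukawa matrix, the singlet masses, the target \([m_\nu ]\) and the normalisation \(v = 174\) GeV are all illustrative assumptions. It inverts Eq. (3.4) as \(\mu = M Y_\nu ^{-1} [m_\nu ] (Y_\nu ^T)^{-1} M / v^2\) and checks that the result reproduces the input mass matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

v = 174.0                                # GeV; assumed normalisation of the Higgs vev
M = np.diag([1000.0, 1500.0, 2000.0])    # GeV; illustrative singlet masses

# Illustrative O(1) Yukawa matrix, kept close to the identity so it is invertible.
Y = np.eye(3) + 0.2 * (rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

# Illustrative complex symmetric light-neutrino mass matrix, of order 0.05 eV (in GeV).
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
m_nu = 0.05e-9 * (A + A.T) / 2

# Invert Eq. (3.4): mu = M Y^{-1} m_nu (Y^T)^{-1} M / v^2
Y_inv = np.linalg.inv(Y)
mu = M @ Y_inv @ m_nu @ Y_inv.T @ M / v**2

# Insert mu back into Eq. (3.4) and compare with the input mass matrix.
M_inv = np.linalg.inv(M)
m_rec = Y @ M_inv @ mu @ M_inv @ Y.T * v**2
rel_err = np.linalg.norm(m_rec - m_nu) / np.linalg.norm(m_nu)

print("largest |mu_ab| in GeV:", np.abs(mu).max())
print("relative reconstruction error:", rel_err)
```

In this sketch any invertible \(Y_\nu \) is compatible with a given \([m_\nu ]\), with the flavour structure of the neutrino masses absorbed into \(\mu \), which remains far below the singlet masses.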

Finally, our third model includes a leptoquark – for a review of this class of coloured and charged bosons, see e.g. [41] – which is chosen to fit the anomalies in \(R_{D^*}\) and/or \(R_D\) [53,54,55,56,57]. It is an SU(2)-singlet scalar denoted \(S_1\) in the notation of [40] – not to be confused with the singlet fermions \(\{S_a\}\) of Eq. (3.3) – and has interactions

$$\begin{aligned} \mathcal{L}_S= & {} (D_\rho S_1)^\dagger D^\rho S_1 - m_{LQ}^2 S_1^\dagger S_1\nonumber \\{} & {} + (- \lambda ^{\alpha j}_{L} {\overline{\ell }}_\alpha i \tau _2 q_j^c + \lambda ^{\alpha j}_{R} {\overline{e}}_\alpha u_j^c ) S_1\nonumber \\{} & {} + (\lambda ^{\alpha j*}_{L} \overline{q^c}_j i \tau _2 \ell _\alpha + \lambda ^{ \alpha j*}_{R} \overline{u^c}_j e_\alpha ) S_1^{\dagger } \nonumber \\{} & {} + \lambda _4 H^\dagger H S_1^\dagger S_1 +\cdots \end{aligned}$$
(3.5)

where the generation indices are \(\alpha \in \{ e,\mu ,\tau \}\) and \(j\in \{u,c,t\}\), and the sign of the doublet contraction is taken to give \(+ \lambda ^{\alpha j}_L \overline{e_L} (u_L)^c S_1\). The leptoquark mass is \(m_{LQ} \approx \) TeV, consistent with the CMS and ATLAS searches [128,129,130,131,132] which exclude leptoquarks with sub-TeV masses that are pair-produced via strong interactions, and decay to specific final states. Some leptoquark-Higgs interactions are included in Eq. (3.5) because they appear in the matching results of Appendix C3c, but their contributions to LFV observables are negligible assuming perturbative couplings. This Lagrangian does not lead to neutrino masses (as mentioned in Appendix C3c), but features \(\mu A \rightarrow \! eA \) at tree level and LFV interactions for singlet and doublet leptons (so unlike the seesaw models, it induces scalar and tensor operators).

The low-energy phenomenology of leptoquarks (see e.g. [42,43,44,45,46,47,48,49]) attracted attention in recent years due to various B-physics anomalies. In particular, the excesses in the ratios [53,54,55,56,57]

$$\begin{aligned} R_X = \frac{BR(B\rightarrow X_c {\bar{\tau }}\nu )}{BR(B\rightarrow X_c {\bar{\ell }}\nu )} \end{aligned}$$
(3.6)

where \(X_c = D, D^*\ldots \), could indicate a new charged-current four-fermion interaction involving b and c quarks, a \(\tau \) and a (tau) neutrino. The \(S_1\) leptoquark can generate this interaction with various Lorentz structures, while preserving lepton flavour (if \(S_1\) only interacts with \(\tau \)’s and \(\nu _\tau \)’s, one could attribute \(\tau \) flavour to the leptoquark). So fitting \(R_D\) and/or \(R_{D^*}\) involves leptoquark coupling constants that differ from the \(\lambda _X^{e q}\), \(\lambda _X^{\mu q}\) that are relevant for \(\mu \rightarrow e\) processes, and in this manuscript we neglect the \(\lambda _X^{\tau q}\) couplings and possible correlations among \(\mu \rightarrow e\) and \(\tau \rightarrow l\) processes that could arise in this model (see footnote 6). The LFV predictions of the \(S_1\) leptoquark have been discussed, for instance, in [133, 134].

4 The observable coefficients in terms of model parameters

In this section, we give the coefficients of the observable Lagrangian (2.3) as a function of model and SM parameters. Some of these results were already presented in [50]. The coefficients are expressed in terms of \(\mu \rightarrow e\) flavour-changing combinations of model and SM parameters, which we refer to as “invariants”. These are a convenient stepping-stone between the models and the observables, because they concisely identify the masses and coupling constants that the observables depend on, and give the functional dependence. We briefly discuss the invariants in Sect. 4.1 and list the coefficients in Sect. 4.2.

4.1 Comments on invariants

Invariants were introduced [51, 135] as products of Lagrangian coupling constants (or matrices), in order to have a Lagrangian-basis-independent measure of symmetry breaking in a model. However, it may be unclear how these elegant constructions relate to observable S-matrix elements for symmetry-violating processes, because the mass and scale-dependence of S-matrix elements can be intricate.

For instance, the original invariant constructed to measure CP violation in the quark sector of the Standard Model can be written in terms of Lagrangian Yukawa matrices as [51, 135, 136]

$$\begin{aligned} \frac{1}{3} \textrm{Tr} \Big \{ \left[ Y_uY_u^\dagger , Y_dY_d^\dagger \right] ^3 \Big \}= & {} 2i J (y_t^2 - y_c^2) (y_t^2 - y_u^2) (y_c^2 - y_u^2)\nonumber \\{} & {} \times (y_b^2 - y_s^2) (y_b^2 - y_d^2) (y_s^2 - y_d^2),~~ \nonumber \\ \textrm{with}~~ J= & {} \textrm{Im}\{ V_{us}V_{cs}^* V_{cb}V_{ub}^*\} \end{aligned}$$
(4.1)

but it is unclear at what scale to evaluate the quark masses (or equivalently, Yukawa couplings). This invariant can be compared to the parameter \(\epsilon _K\) (see e.g. [91] for a brief review) that contributes to CP-violation in \(K-{\bar{K}}\) mixing, and for which the result at next-to or next-to-next-to leading log has been expressed in a rephasing invariant form in [137]. Restricting to the leading log QCD corrections for simplicity, one can write

$$\begin{aligned} \epsilon _K\propto & {} J \Big ( \textrm{Re}\{V_{td}V_{ts}^*V_{ud}^*V_{us}\}\eta ^{tt} S(x_t) \nonumber \\{} & {} + 2 |V_{ud}|^2|V_{us}|^2\eta ^{ut} (S(x_c) -S(x_c,x_t))\Big ) \end{aligned}$$
(4.2)

where \( x_Q = m_Q^2(m_Q)/m_W^2\), the Inami-Lim functions [138] are

$$\begin{aligned} S(x_t)= & {} \frac{4x_t -11 x^2_t + x_t^3}{4(1-x_t)^2} - \frac{3x_t^3}{2(1-x_t)^3}\ln x_t \\ S(x_c)-S(x_c,x_t)= & {} x_c\left( 1-\ln \frac{x_t}{x_c}\right) \nonumber \\{} & {} + \frac{3x_cx_t}{4(1-x_t)} + \frac{3x_c x_t^2}{4(1-x_t)^2}\ln x_t, \end{aligned}$$

and \(\eta ^{pq}\) are the appropriate QCD corrections for the Inami-Lim functions (see [137]). So one sees that there is only a remote relation between the Lagrangian invariant of Eq. (4.1), and Eq. (4.2), which allows a numerically precise prediction for the observable \(\epsilon _K\).
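The algebraic identity in Eq. (4.1) can be checked numerically. The sketch below is only an illustration under stated assumptions: a random unitary matrix stands in for the CKM matrix V, the Yukawa eigenvalues are arbitrary \(\mathcal{O}(1)\) numbers (realistic hierarchical values would make the check numerically delicate), and the basis is chosen such that \(Y_uY_u^\dagger \) is diagonal and \(Y_dY_d^\dagger = V\, \textrm{diag}(y_d^2,y_s^2,y_b^2)\, V^\dagger \).

```python
import numpy as np

rng = np.random.default_rng(0)

# Random unitary matrix standing in for the CKM matrix V (illustrative only).
Z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
V, _ = np.linalg.qr(Z)

# Arbitrary O(1) "Yukawa eigenvalues", chosen only to keep the check well conditioned.
y_u, y_c, y_t = 0.3, 0.7, 1.1
y_d, y_s, y_b = 0.2, 0.5, 0.9

# Basis where Y_u Y_u^dagger is diagonal and Y_d Y_d^dagger = V diag(y_d^2) V^dagger.
H_u = np.diag([y_u**2, y_c**2, y_t**2]).astype(complex)
H_d = V @ np.diag([y_d**2, y_s**2, y_b**2]) @ V.conj().T

# Left-hand side of Eq. (4.1): (1/3) Tr{[H_u, H_d]^3}
C = H_u @ H_d - H_d @ H_u
lhs = np.trace(C @ C @ C) / 3

# Right-hand side: 2 i J times the products of Yukawa-squared differences,
# with J = Im{V_us V_cs^* V_cb V_ub^*}.
J = np.imag(V[0, 1] * np.conj(V[1, 1]) * V[1, 2] * np.conj(V[0, 2]))
rhs = (2j * J
       * (y_t**2 - y_c**2) * (y_t**2 - y_u**2) * (y_c**2 - y_u**2)
       * (y_b**2 - y_s**2) * (y_b**2 - y_d**2) * (y_s**2 - y_d**2))

print("lhs =", lhs)
print("rhs =", rhs)
```

The check is purely algebraic, so it says nothing about the scale at which the Yukawa couplings should be evaluated, which is the point raised in the text.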

In the models considered here, we obtain expressions for the observable coefficients at \(m_\mu \) (which are \(\approx \) S-matrix elements) in terms of “invariants”, which are flavour-changing combinations of Lagrangian parameters at specified scales, and encode the dependence of LFV observables on running SM and model coupling constants and masses. In practice, these combinations are frequently products of matrices in flavour space, which are manifestly invariant under flavour-basis transformations, hence we refer to them as “invariants”. For instance, in the type II seesaw model, photon penguin diagrams (contributing to \(\mu \rightarrow e {\bar{e}} e\)) generate the coefficient \(C_{V,LR}^{e\mu ee}\) proportional to the invariant:

$$\begin{aligned}{} & {} \frac{v^2}{M_\Delta ^2}\, [m_\nu \ln \frac{[\tilde{m}_e \tilde{m}_e^\dagger ]}{M_\Delta ^2}\, m_\nu ^\dagger ]^{e \mu }\, = \frac{v^2}{M_\Delta ^2}\left( [m_\nu m_\nu ^\dagger ]^{e \mu } \ln \frac{m_\mu ^2}{M_\Delta ^2}\right. \nonumber \\{} & {} \left. + [m_\nu ]^{e\tau } [m_\nu ^*]^{ \mu \tau } \ln \frac{\tilde{m }_\tau ^2}{m_\mu ^2}+[m_\nu ]^{ee} [m_\nu ^*]^{ \mu e } \ln \frac{\tilde{m }_e^2}{m_\mu ^2}\right) , \end{aligned}$$
(4.3)

where \([\tilde{m}_e]\) is the charged lepton mass matrix, but with its eigenvalues replaced by \(m_\alpha \rightarrow \tilde{m }_\alpha = \max \{m_\alpha ,\sqrt{q^2}\}\), which is the kinematic cutoff of the logarithm, with \(q^2\) the (four-momentum)\(^2\) of the photon. (So the cutoff of the \(\tau \) loop is \(m_\tau \), but \(\tilde{m}_e \approx m_\mu \) so the last term in the parentheses of Eq. (4.3) vanishes, or equivalently there is no long-distance contribution to the matrix element, because \(q^2 \sim m_\mu ^2\) in most of the phase space [139].) Equation (4.3) exemplifies our relatively simple invariants, constructed by multiplying matrices, which encode the correct scale evolution of coupling constants and masses, provided that the scale separations are large enough for EFT to be reliable (see footnote 7).
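The decomposition in Eq. (4.3) is a statement about inserting a diagonal matrix of logarithms between \(m_\nu \) and \(m_\nu ^\dagger \), and it can be verified numerically. The following is a sketch under illustrative assumptions: a random complex symmetric matrix stands in for \([m_\nu ]\), \(M_\Delta = 1\) TeV is an arbitrary choice, the cutoffs are taken as \(\tilde{m}_e = \tilde{m}_\mu = m_\mu \) and \(\tilde{m}_\tau = m_\tau \), and the overall factor \(v^2/M_\Delta ^2\) is dropped.

```python
import numpy as np

rng = np.random.default_rng(2)

# Random complex symmetric matrix standing in for [m_nu] (arbitrary units).
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
m_nu = (A + A.T) / 2

m_mu, m_tau = 0.1057, 1.777   # GeV
M_Delta = 1000.0              # GeV; assumed triplet mass

# Cutoffs for alpha = e, mu, tau: the electron-loop logarithm is cut off at ~ m_mu.
m_tilde = np.array([m_mu, m_mu, m_tau])
e, mu, tau = 0, 1, 2

# Left-hand side of Eq. (4.3), without the overall v^2 / M_Delta^2.
log_matrix = np.diag(np.log(m_tilde**2 / M_Delta**2))
lhs = (m_nu @ log_matrix @ m_nu.conj().T)[e, mu]

# Right-hand side: universal logarithm plus flavour-dependent remainders.
rhs = ((m_nu @ m_nu.conj().T)[e, mu] * np.log(m_mu**2 / M_Delta**2)
       + m_nu[e, tau] * np.conj(m_nu[mu, tau]) * np.log(m_tilde[tau]**2 / m_mu**2)
       + m_nu[e, e] * np.conj(m_nu[mu, e]) * np.log(m_tilde[e]**2 / m_mu**2))

print("lhs =", lhs)
print("rhs =", rhs)
```

With these cutoffs the electron term carries \(\ln 1 = 0\), so only the \(\tau \) term departs from the combination \([m_\nu m_\nu ^\dagger ]^{e\mu }\) fixed by oscillation data.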

However, our simple invariants apply only for models with a single mass scale for LFV; for models with many heavy LFV particles around the scale \(\Lambda _{NP}\), such as the inverse seesaw, the operator coefficients obtained in matching are not linear products of matrices in flavour space (see e.g. Eqs. 4.10, 4.12). Constructions that involve Inami-Lim functions of mass matrix eigenvalues are no doubt also “invariant” under Lagrangian basis transformations, but like Eq. (4.2), this invariance is not manifest.

Our invariants have other attractive features, beyond the correct scale-dependence to parametrise S-matrix elements. As expected they measure \(\mu \rightarrow e\) flavour change in the model, and they also identify the products of model parameters relevant to observables. This second feature allows one to circumvent the necessity of scanning over model parameters. For instance, in the inverse seesaw model, \([Y_\nu ]\) is a 3\(\times \)3 matrix of unknown complex numbers, so naively one must scan over them all, and possibly impose some texture. However, in the case of degenerate singlets, there are only two invariants – \([Y_\nu Y_\nu ^\dagger ]_{e\mu }\) and \([Y_\nu Y_\nu ^\dagger Y_\nu Y_\nu ^\dagger ]_{e\mu }\) – which are complex numbers of magnitude between 0 and 1. This raises the question of whether one could reconstruct the model Lagrangian from a sufficient number of invariants.

Finally, the invariant of Eq. (4.3) illustrates an interesting dynamical mechanism to break properties of the model. Recall that \( [m_\nu m_\nu ^\dagger ]^{e \mu }\) is a function only of neutrino oscillation parameters – which are measured – so the second term of the second expression exhibits the log-induced dependence on the unknown Majorana phases and neutrino mass scale. This logarithmic breaking of a relation between model-matrices is reminiscent of the “log-GIM” mechanism in the quark sector [140], where \(\Delta F = 1\) FCNC operators can be mediated by similar penguin diagrams at \(\mathcal{O}(G_F \alpha _e\ln m_W/m_c)\), with a charm quark in the loop (see footnote 8).

4.2 Predictions at the experimental scale

This section lists the model predictions for the observable coefficients of the Lagrangian (2.3), partially presented in Ref. [50]. The model parameters are given in the Lagrangians of Sect. 3. The expressions given here occasionally differ from [50], because we made numerically insignificant modifications of the lower cutoff of some logarithms, in order to obtain more elegant invariants with the physically correct cutoffs.

4.3 \(\mu \rightarrow e \gamma \)

In the inverse seesaw model, the dipole coefficients are

$$\begin{aligned} C^{e\mu }_{D,R}\simeq & {} - \frac{e}{32 \pi ^2 } [Y_\nu M^{-1} (M^\dagger )^{-1}Y_\nu ^\dagger ]^{e\mu } v^2 \nonumber \\{} & {} \times {\left( 1- 16 \frac{\alpha _e}{4\pi } \ln \frac{M}{m_\mu }\right) }, \end{aligned}$$
(4.4)

where the parentheses include the flavour-universal \(\mathcal{O}(10\%)\) QED running, and M is the singlet mass scale. The couplings \(Y^{\alpha a}_\nu \) can be of order one – which could be especially motivated in the \(\tau \) sector – so \(\mathcal{O}(Y^4)\) combinations can be larger than \(\mathcal{O}(Y^2)\), and could appear at dimension eight when two additional Higgs legs on the sterile neutrino line are replaced by the Higgs vev, or at two-loop when the Higgs legs are closed. We estimate the \({\mathcal {O}}(Y^4)\) terms to be suppressed with respect to the coefficient in Eq. (4.4) by \(\sim v^2/M^2\) or \(\sim 1/(16\pi ^2)\), and we therefore expect them to not modify significantly the correlations between the \(\mu \rightarrow e\) observables that the model can predict [50].
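The size of the flavour-universal QED running in the parentheses of Eq. (4.4) can be estimated directly. This is a small sketch with assumed inputs (\(\alpha _e = 1/137.036\), \(m_\mu = 0.1057\) GeV and singlet mass scales of 1 and 10 TeV), intended only to show that the correction is of the quoted \(\mathcal{O}(10\%)\) size.

```python
import math

alpha_e = 1 / 137.036   # assumed value of the fine-structure constant
m_mu = 0.1057           # GeV


def qed_dipole_factor(M_GeV: float) -> float:
    """Running factor 1 - 16 (alpha_e / 4 pi) ln(M / m_mu) from Eq. (4.4)."""
    return 1 - 16 * alpha_e / (4 * math.pi) * math.log(M_GeV / m_mu)


factor_1TeV = qed_dipole_factor(1000.0)
factor_10TeV = qed_dipole_factor(10000.0)

print(f"M = 1 TeV : factor = {factor_1TeV:.3f}")
print(f"M = 10 TeV: factor = {factor_10TeV:.3f}")
```

Because the dependence on M is logarithmic, raising the singlet scale by an order of magnitude changes the factor by only about two percent.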

For the type II seesaw,

$$\begin{aligned} C^{e\mu }_{D,R}\simeq & {} \frac{3e}{128 \pi ^2} \left[ \frac{[m_\nu m_\nu ^\dagger ]_{e\mu }}{\lambda ^2_H v^2}\left( 1-16 \frac{\alpha _e}{4\pi }\ln \frac{M_\Delta }{m_\mu } \right) \right. \nonumber \\{} & {} \left. + \frac{\alpha _e}{\pi } \frac{116}{27}\left( \ln \frac{M_\Delta }{m_\mu } \frac{[m_\nu m_\nu ^\dagger ]_{e\mu }}{\lambda _H^2 v^2 } - \ln \frac{m_\tau }{m_\mu } \frac{[m_\nu ^*]_{\mu \tau } [m_\nu ]_{e\tau }}{\lambda _H^2 v^2 }\right) \right] \nonumber \\ \end{aligned}$$
(4.5)
$$\begin{aligned}\simeq & {} \frac{3e}{128 \pi ^2} \left[ \frac{[m_\nu m_\nu ^\dagger ]_{e\mu }}{\lambda ^2_H v^2} \Big (1 + \frac{32}{27} \frac{\alpha _e}{4\pi } \ln \frac{M_\Delta }{m_\tau } \Big )\right. \nonumber \\{} & {} \left. +\frac{116 \alpha _e}{27\pi } \ln \frac{m_\tau }{m_\mu } \sum _{\alpha \in e\mu } \frac{[m^*_\nu ]_{\mu \alpha } [m_\nu ]_{e\alpha } }{\lambda _H^2 v^2 }\right] \end{aligned}$$
(4.6)

where the second term in the bracket in Eq. (4.5) arises from the two-loop vector to dipole mixing [102]. In both seesaw models, the coefficient \(C^{e\mu }_{D,L}\) is suppressed by a factor \(m_e/m_\mu \), and can be obtained from Eqs. (4.6, 4.4) by multiplying by \(m_e/m_\mu \).
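The step from Eq. (4.5) to Eq. (4.6) regroups the logarithms, using \([m_\nu m_\nu ^\dagger ]_{e\mu } = \sum _\alpha [m_\nu ]_{e\alpha }[m_\nu ^*]_{\mu \alpha }\). The sketch below checks the coefficient \(32/27\) exactly and compares the two brackets numerically. The values of X (standing for \([m_\nu m_\nu ^\dagger ]_{e\mu }/\lambda _H^2v^2\)) and T (standing for the \(\tau \) term) are arbitrary illustrative numbers, and \(\alpha _e\) and \(M_\Delta \) are assumed inputs.

```python
import math
from fractions import Fraction

# Coefficient of X ln(M_Delta/m_mu) in units of alpha_e/(4 pi):
# -16 from the dipole running plus 4 * 116/27 from the two-loop mixing.
coeff = Fraction(-16) + 4 * Fraction(116, 27)

alpha_e = 1 / 137.036                 # assumed
a = alpha_e / (4 * math.pi)
m_mu, m_tau, M_Delta = 0.1057, 1.777, 1000.0   # GeV; M_Delta is illustrative

X = 1.0 + 0.5j   # illustrative [m_nu m_nu^dagger]_{e mu} / (lambda_H^2 v^2)
T = 0.3 - 0.2j   # illustrative [m_nu^*]_{mu tau} [m_nu]_{e tau} / (lambda_H^2 v^2)

L_M = math.log(M_Delta / m_mu)
L_tau = math.log(m_tau / m_mu)

# Square bracket of Eq. (4.5)
bracket_45 = (X * (1 - 16 * a * L_M)
              + (alpha_e / math.pi) * (116 / 27) * (L_M * X - L_tau * T))

# Square bracket of Eq. (4.6); the sum over alpha in {e, mu} equals X - T
bracket_46 = (X * (1 + (32 / 27) * a * math.log(M_Delta / m_tau))
              + (116 * alpha_e / (27 * math.pi)) * L_tau * (X - T))

diff = bracket_46 - bracket_45
print("coefficient:", coeff)
print("relative difference:", abs(diff) / abs(bracket_45))
```

In this sketch the two forms differ by \(16\,\frac{\alpha _e}{4\pi }\ln \frac{m_\tau }{m_\mu }\) times X, a few percent for these inputs, which is compatible with the \(\simeq \) relating Eqs. (4.5) and (4.6).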

For the leptoquark, which interacts with singlet and doublet leptons, the dipole coefficients are

$$\begin{aligned} \frac{ m_{LQ}^2}{ v^2}C^{e\mu }_{D,X} (m_\mu )\simeq & {} \frac{ e [\lambda _Y\lambda ^\dagger _Y]^{e\mu } }{128\pi ^2} \left( 1- 16 \frac{\alpha _e}{4\pi } \ln \frac{m_{LQ}}{m_\mu }\right) \nonumber \\{} & {} + \frac{ 2\alpha ^2_e }{9 \pi ^2 e} \left[ \lambda _Y \ln \frac{m_{LQ}}{\tilde{m}_Q} \lambda ^\dagger _Y \right] ^{e\mu } \nonumber \\{} & {} -\frac{ \alpha _e }{2\pi e y_\mu } f_{TD} \left[ \lambda _Y Y_u \left( \eta ^{a_T}_{\tilde{m}_Q} \ln \frac{m_{LQ}}{\tilde{m}_Q} - \frac{5}{4} \right) \lambda ^\dagger _X \right] ^{e\mu } \nonumber \\ \end{aligned}$$
(4.7)

where \(X\ne Y\in \{L,R\}\), \(f_{TD}\), \(a_T\) and \(\eta \) are related to the QCD running and defined at Eq. (2.17), and the \(\tilde{m}_Q\) serving as lower cutoff for the logarithms (here and further in the manuscript) is

$$\begin{aligned} \tilde{m}_Q = \textrm{max} \{ m_Q(m_Q), 2~\textrm{GeV}\} \end{aligned}$$

because the quarks are matched to nucleons at 2 GeV. The first term in Eq. (4.7) is the matching contribution (times its QED running), the second term is the 2-loop mixing of tree vector operators into the dipole, and the third term is the one loop mixing of tensor operators to dipoles.

The last term of Eq. (4.7) requires some discussion, because the first log-enhanced term in the parentheses arises in the RGEs between \(m_{LQ}\rightarrow m_\mu \), but the second “finite”, or not-log-enhanced term is formally of higher order in EFT. It is included because it is of comparable magnitude to the log-enhanced term in the case of internal top quarks – that is, as discussed in Appendix C1, the scale ratio \(m_{LQ}/m_t\) is not large, so our matching conditions at \(\Lambda _{NP}\) are constructed to reproduce the results of matching to a QCD\(\times \)QED invariant EFT at \(m_W\). And finally, although the finite part is only required for the top quark, a quark-flavour-summed expression is given in order to retain the “invariant” formulation. (The light quark contributions are numerically negligible in this expression, which is fortunate because their QCD running is also not correct.)
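To see why the finite term matters only for the top quark, one can compare \(\ln (m_{LQ}/\tilde{m}_Q)\) with the constant \(5/4\) for each up-type quark. The sketch below uses assumed inputs: \(m_{LQ} = 1\) TeV and approximate running masses \(m_c(m_c) \approx 1.27\) GeV and \(m_t(m_t) \approx 163\) GeV; the up-quark entry is a placeholder, since only the fact that it lies below 2 GeV matters. The QCD factor \(\eta ^{a_T}\) is ignored here.

```python
import math

m_LQ = 1000.0   # GeV; assumed leptoquark mass

# Approximate running masses m_Q(m_Q) in GeV (assumed inputs; the up-quark
# value is a placeholder, any value below 2 GeV gives the same cutoff).
m_Q = {"u": 0.002, "c": 1.27, "t": 163.0}


def m_tilde(m: float) -> float:
    """Lower cutoff of the logarithms: max{m_Q(m_Q), 2 GeV}."""
    return max(m, 2.0)


logs = {q: math.log(m_LQ / m_tilde(m)) for q, m in m_Q.items()}
finite_over_log = {q: 1.25 / logs[q] for q in logs}

for q in m_Q:
    print(f"{q}: cutoff = {m_tilde(m_Q[q]):6.1f} GeV, "
          f"log = {logs[q]:.2f}, (5/4)/log = {finite_over_log[q]:.2f}")
```

For these inputs the finite piece is about 70% of the logarithm for the top quark and about 20% for the lighter quarks, in line with the remark that the scale ratio \(m_{LQ}/m_t\) is not large.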

We do not consider dimension 8 contributions to the dipole, because they are suppressed \(\propto v^2/M^2\), and do not allow one to circumvent the parametric suppression of the dimension six term. For instance, in the inverse seesaw, such terms also have the loop and \(y_\mu \) suppression applying to (4.4).

4.3.1 \(\mu \rightarrow e {\bar{e}} e\)

The decay \(\mu \rightarrow e {\bar{e}} e\) can be mediated at the experimental scale by the dipole operators, and vector and scalar four-lepton operators (see Eq. (2.8)). We do not give results for the scalar coefficients, because they are effectively vanishing: in matching, all three models induce coefficients that are smaller than the upcoming experimental sensitivity, and SM interactions that could transform some other LFV operator into a scalar are suppressed by lepton Yukawas, so negligible as well. Unfortunately, scalar coefficients \(C_{S,XX}\) are indistinguishable from vectors \(C_{V,YY}\) in the angular distribution of \(\mu \rightarrow e {\bar{e}} e\), so the absence of scalar coefficients in these models would be challenging to test.

For the type II seesaw, the vector four-lepton coefficients arise at tree level, as illustrated in the middle diagram of Fig. 5, and in the RGEs via QED penguin diagrams:

$$\begin{aligned} C^{e\mu ee}_{V,LL}\simeq & {} \frac{[m^*_{\nu }]_{\mu e}[m_{\nu }]_{e e}}{2 \lambda _H^2v^2} + { \frac{\alpha _e}{{ 3}\pi \lambda _H^2v^2} \Big [ m^\dagger _{\nu } \ln \left( \frac{M_\Delta }{\tilde{m}_\alpha }\right) m_{\nu } \Big ]_{\mu e }}\nonumber \\ \end{aligned}$$
(4.8)
$$\begin{aligned} C^{e\mu ee}_{V,LR}\simeq & {} { \frac{\alpha _e}{ 3\pi \lambda _H^2v^2} \Big [ m^\dagger _{\nu } \ln \left( \frac{M_\Delta }{\tilde{m}_\alpha }\right) m_{\nu } \Big ]_{\mu e }} . \end{aligned}$$
(4.9)

where \(\alpha \) is the index of the intermediate charged lepton. The operators \(\mathcal{O}^{e\mu ff }_{V,RR}\) and \(\mathcal{O}^{e\mu ff }_{V,RL}\), where the flavour-change is among singlet leptons, have coefficients below upcoming experimental sensitivities because they are suppressed \(\propto y_e y_\mu \). This is also the case for the inverse seesaw.

In the case of the inverse seesaw, both vector operators arise via loop diagrams, with propagating singlets and Higgses:

$$\begin{aligned} C^{e\mu ee}_{V,LL}\simeq & {} v^2\frac{\alpha _e}{4\pi }\bigg (-1.8[Y_\nu M_a^{-2}\left( \frac{11}{6}+\ln \left( \frac{m^2_W}{M_a^2}\right) \right) Y_\nu ^\dagger ]_{e\mu }\nonumber \\{} & {} + 2.7 [Y_{\nu }(Y^\dagger _\nu Y_\nu )_{ab}\frac{1}{M^2_{a}-M^2_{b}} \ln \left( \frac{M^2_{a}}{M^2_{b}}\right) Y_\nu ^\dagger ]_{e\mu } \nonumber \\{} & {} +2.5 Y^{e a}_\nu Y^{*\mu a}_\nu Y^{e b}_\nu Y^{*e b}_\nu \frac{1}{M^2_{a}-M^2_{b}}\ln \left( \frac{M^2_{a}}{M^2_{b}}\right) \nonumber \\{} & {} +{\mathcal {O}}\left( \frac{\alpha _e}{4\pi }\right) \bigg ) \end{aligned}$$
(4.10)
$$\begin{aligned}\rightarrow & {} \frac{v^2}{M^2}\left( 3.3\times 10^{-3}(Y_\nu Y^\dagger _\nu )_{e\mu }(1+0.56 (Y_\nu Y^\dagger _\nu )_{ee})\right. \nonumber \\{} & {} \left. +1.55\times 10^{-3}(Y_\nu Y^\dagger _\nu Y_\nu Y^\dagger _\nu )_{e\mu }\right) \end{aligned}$$
(4.11)
$$\begin{aligned} C^{e\mu ee}_{V,LR}\simeq & {} v^2\frac{\alpha _e}{4\pi } \bigg ( 1.5[Y_\nu M_a^{-2} \left( \frac{11}{6} +\ln \left( \frac{m^2_W}{M_a^2}\right) \right) Y_\nu ^\dagger ]_{e\mu }\nonumber \\{} & {} -2.7 [Y_{\nu }(Y^\dagger _\nu Y_\nu )_{ab} \frac{1}{M^2_{a}-M^2_{b}} \ln \left( \frac{M^2_{a}}{M^2_{b}}\right) Y_\nu ^\dagger ]_{e\mu }\nonumber \\{} & {} +{\mathcal {O}}\left( \frac{\alpha _e}{4\pi }\right) \bigg )\rightarrow \frac{v^2}{M^2} \end{aligned}$$
(4.12)
$$\begin{aligned}{} & {} \times \left( -2.8\times 10^{-3}(Y_\nu Y^\dagger _\nu )_{e\mu }-1.6\times 10^{-3}(Y_\nu Y^\dagger _\nu Y_\nu Y^\dagger _\nu )_{e\mu }\right) \nonumber \\ \end{aligned}$$
(4.13)

where the first expression for a coefficient is for arbitrary singlet masses of \(\mathcal{O}\)(TeV), and after the arrow is the simplified formula when the singlet mass\(^2\) differences are less than \(v^2\) [50]. In the first expression, the first two terms arise from Z and \(\gamma \) penguins above the electroweak scale (the Higgs propagates in the loop), and the last one is from boxes.
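The passage from Eq. (4.12) to Eq. (4.13) can be illustrated with a short numerical sketch. It assumes \(\alpha _e = 1/137.036\), \(m_W = 80.4\) GeV and a common singlet mass \(M = 1\) TeV; these inputs are guesses that happen to reproduce the quoted numbers and are not taken from the text. The function name `loop_function` is a hypothetical label for the combination \(\ln (M_a^2/M_b^2)/(M_a^2-M_b^2)\), which tends to \(1/M^2\) for degenerate singlets.

```python
import math

alpha_e = 1 / 137.036   # assumed
m_W = 80.4              # GeV; assumed
M = 1000.0              # GeV; assumed common singlet mass


def loop_function(M_a: float, M_b: float) -> float:
    """ln(M_a^2 / M_b^2) / (M_a^2 - M_b^2), with its degenerate limit 1 / M_a^2."""
    if M_a == M_b:
        return 1.0 / M_a**2
    return math.log(M_a**2 / M_b**2) / (M_a**2 - M_b**2)


# Nearly degenerate singlets: the loop function approaches 1 / M^2.
near_degenerate = loop_function(1.001 * M, M) * M**2

# Degenerate-limit coefficients of Eq. (4.12), in units of v^2 / M^2.
prefactor = alpha_e / (4 * math.pi)
penguin = 11 / 6 + math.log(m_W**2 / M**2)
c_YY = 1.5 * prefactor * penguin   # multiplies (Y Y^dagger)_{e mu}
c_YYYY = -2.7 * prefactor          # multiplies (Y Y^dagger Y Y^dagger)_{e mu}

print(f"M^2 * loop function near degeneracy: {near_degenerate:.4f}")
print(f"coefficient of (YY^dag)_emu       : {c_YY:.2e}")
print(f"coefficient of (YY^dag YY^dag)_emu: {c_YYYY:.2e}")
```

With these assumed inputs the two coefficients come out close to the \(-2.8\times 10^{-3}\) and \(-1.6\times 10^{-3}\) of Eq. (4.13).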

Finally the leptoquark can generate flavour-changing lepton currents involving either singlet or doublet leptons. We give here the coefficient for left-handed leptons; the result for singlets is obtained by interchanging \(L\leftrightarrow R\):

$$\begin{aligned} \frac{ m_{LQ}^2}{ v^2} C_{V,LX}^{e\mu ee} (m_\mu )\simeq & {} -\frac{N_c }{64\pi ^2} [\lambda _L \lambda _L^\dagger ]^{e \mu } [\lambda _X \lambda _X^\dagger ]^{ee}\nonumber \\{} & {} \times \left( 1\mp 12 \frac{\alpha _e}{4\pi } \ln \frac{m_{LQ}}{m_\mu } \right) \nonumber \\{} & {} { +} \frac{\alpha _e}{3\pi } \left[ \lambda _L \ln \frac{m_{LQ}}{m_Q} \lambda _L^\dagger \right] ^{e\mu } - g^e_X \frac{N_c }{16\pi ^2} \nonumber \\{} & {} \times \left[ \lambda _L Y_u \left( \ln \frac{m_{LQ}}{m_Q} -\frac{5}{6} \right) Y_u^\dagger \lambda ^\dagger _L \right] ^{e\mu }\nonumber \\ \end{aligned}$$
(4.14)

where \(g^e_L= -1+2\sin ^2\theta _W\), \(g^e_R=2\sin ^2\theta _W\), the first term represents the box diagram at \(m_{LQ}\) (and its QED running to \(m_\mu \), with −/\(+\) for \(X=\)/\(\ne L\)), the second term is the log-enhanced photon penguin that mixes the tree operators \(\mathcal{O}_{VLL}^{QQ}\) (for \(Q\in \{u,c,t\}\)) into 4-lepton operators, and the last term is the contribution of the Z-penguins without the negligible QED running. Since we only retain the part of the Z-penguin that is proportional to \(y_Q^2\) (see Appendix C4), the dominant contribution arises from the top quark, where the “finite” (not log-enhanced) part of the diagrams is included because the logarithm is not large (see the discussion after Eq. (4.7) or in Appendix C1).

4.3.2 \(\mu \textrm{Al}\rightarrow e \textrm{Al}\)

The SINDRUM II experiment searched for \(\mu A \rightarrow \! eA \) on Titanium (\(Z = 22\)) and Gold (\(Z= 79\)), setting the bounds listed in Table 1. Upcoming experiments will start with an Aluminium target, which, according to the analysis of [79], probes a combination of coefficients similar to that probed by Titanium. This section therefore gives expressions for the coefficient on Aluminium, \(C_{Al,X}\), expressed in the quark operator basis; the conversion ratio on Aluminium is given in Eq. (A.16).

The photon penguin diagrams in the type II seesaw model generate a vector \(\mu \rightarrow e\) operator on u and d quarks, giving

$$\begin{aligned} {C}^{e\mu }_{Al,L}\simeq & {} -\frac{\alpha _e}{72\pi \lambda _H^2v^2} \Big [ m^\dagger _{\nu } \ln \left( \frac{M_\Delta }{\tilde{m}_\alpha }\right) m_{\nu } \Big ]_{\mu e }, \end{aligned}$$
(4.15)

where \(\tilde{m}_\tau = m_\tau \) and \(\tilde{m}_\alpha = m_\mu \) for \(\alpha \in \{e,\mu \}\) because the logarithm is cut off by the momentum transfer from the leptons to the nucleus [143], which is of order \(m_\mu \). Notice that the penguin diagram of Fig. 5 generates a 2-lepton-2-quark operator at scales above 2 GeV, where quarks are matched to nucleons, then it continues to mix into a 2-lepton-2-proton operator, which is why the logarithm cuts off at \(m_\mu \).

The inverse seesaw generates vector operators, via Z and \(\gamma \) penguins above the weak scale:

$$\begin{aligned} {C}^{e\mu }_{Al,L}\simeq & {} v^2\frac{\alpha _e}{4\pi }\bigg (-0.05[Y_\nu M_a^{-2}\left( \frac{11}{6}+\ln \left( \frac{m^2_W}{M_a^2}\right) \right) Y_\nu ^\dagger ]_{e\mu }\nonumber \\{} & {} +0.09 [Y_{\nu }(Y^\dagger _\nu Y_\nu )_{ab}\frac{1}{M^2_{a}-M^2_{b}}\ln \left( \frac{M^2_{a}}{M^2_{b}}\right) Y_\nu ^\dagger ]_{e\mu }~~~~ \end{aligned}$$
(4.16)
$$\begin{aligned}{} & {} +{\mathcal {O}}\left( \frac{\alpha _e}{4\pi }\right) \bigg )\rightarrow \frac{v^2}{M^2}\nonumber \\{} & {} \times \left( 8.6\times 10^{-5} (Y_\nu Y^\dagger _\nu )_{e\mu }+5.6\times 10^{-5}(Y_\nu Y^\dagger _\nu Y_\nu Y^\dagger _\nu )_{e\mu }\right) \nonumber \\ \end{aligned}$$
(4.17)

For the leptoquark, which induces scalar and vector 2l2q operators, we obtain

$$\begin{aligned} {C}^{e\mu }_{Al,L}\simeq & {} \Bigg (0.032 \lambda ^{e u}_L \lambda ^{\mu u *}_L \left( 1{ +} \frac{{ 2}\alpha }{\pi } \ln \frac{m_{LQ}}{2\textrm{GeV}}\right) \nonumber \\{} & {} +0.033\left( \frac{g^2}{64\pi ^2} \lambda ^{e u}_L \lambda ^{\mu u *}_L \ln \frac{m_{LQ}}{m_W}\right) \nonumber \\{} & {} - 0.086\frac{\alpha }{6 \pi } \left[ \lambda _L \ln \frac{m_{LQ}}{\tilde{m}_Q} \lambda ^\dagger _L \right] ^{e\mu }\nonumber \\{} & {} -4.1\times 10^{-4} \left[ \lambda _LY_u \left( \ln \frac{m_{LQ}}{\tilde{m}_Q}- \frac{5}{6}\right) Y_u^\dagger \lambda ^\dagger _L \right] ^{e\mu } \nonumber \\{} & {} -\eta ^{a_S} \left( 0.16 \lambda ^{e u}_L \lambda ^{\mu u *}_R + \frac{0.035 m_N}{27m_c} \lambda ^{e c}_L \lambda ^{\mu c *}_R \right) \nonumber \\{} & {} \times \left( 1+ \frac{13 \alpha }{ 6\pi } \ln \frac{m_{LQ}}{2\textrm{GeV}} \right) \nonumber \\{} & {} +\frac{0.094m_N}{27m_t} \lambda ^{e t}_L \lambda ^{\mu t *}_R \Bigg )\times \frac{v^2}{m_{LQ}^2}\, \end{aligned}$$
(4.18)

which includes the tree vector coefficient on u quarks with its QED running, the electroweak box contribution to the d vector, the QED then Z penguin contributions to the u and d vectors (where we took \(V_{ud} \simeq 1, \sin ^2 \theta _W \simeq 1/4\)), and the scalar u, c and t contributions (where the QCD running of the top contribution is negligible). As in Eq. (4.14), the Z-penguin contribution only includes the diagrams \(\propto y_Q^2\), with their “finite parts”.

4.3.3 \(\mu A \rightarrow \! eA \) on heavy targets

Changing the target in \(\mu A \rightarrow \! eA \) allows one to probe a different combination of operator coefficients [79]. This is discussed quantitatively in Appendix A, where Eqs. (A.12, A.13, A.14) give the operators probed by light and heavy targets in the quark basis. The SINDRUM II experiment searched for \( \mu A \rightarrow \! eA \) on Gold (see Table 1), and there are plans based on the proposal of Refs. [71, 73] to build experiments that could probe \(\mu A \rightarrow \! eA \) on heavy targets (see Table 1). However, we consider these experiments to be too far in the future for the purposes of our study, so we suppose that the data for Gold remains the bound of SINDRUM II given in Table 1. The operator probed by Gold can be decomposed into the operator probed by light targets, plus the remaining part, approximately given in Eq. (2.6). In this section, we discuss the coefficient of this orthogonal part, which can be written as

$$\begin{aligned} C_{Au\perp ,L}\simeq & {} - \Big ( 0.2 C^{e\mu uu}_{SR} +0.1 C^{e\mu dd}_{SR} +0.0075 C^{e\mu ss}_{SR} \nonumber \\{} & {} + 0.001C^{e\mu cc}_{SR}+ 0.0003 C^{e\mu bb }_{SR} \Big )\nonumber \\{} & {} + 0.56 C^{e\mu uu}_{VL}+ 0.7956 C^{e\mu dd}_{VL}. \end{aligned}$$
(4.19)

The type II seesaw only generates a vector operator on protons (no scalar operators, and no vector operator on neutrons), so once the coefficient is measured on a first target, it can be predicted on any other. That is, Gold probes approximately the same four-fermion operator as light targets, and since the dipole and proton vector coefficients are weighted by approximately 1/4 and 1/2 in both the amplitudes on Gold and Aluminium, the ratio of the rates is

$$\begin{aligned} \frac{BR(\mu Au \rightarrow eAu)}{BR(\mu Al \rightarrow eAl)}\approx \frac{{\widetilde{B}}_{Au}}{{\widetilde{B}}_{Al}} \approx 2 \pm \mathcal{O}(10\%), \end{aligned}$$

where \({\widetilde{B}}_{Al}\) and \({\widetilde{B}}_{Au}\) are given after Eq. (A.5).

In the inverse seesaw, \(\mu A \rightarrow \! eA \) on heavy targets could give complementary information, because the Z penguin contribution generates vector \(\mu \rightarrow e\) operators on both protons and neutrons:

$$\begin{aligned} {C}^{e\mu }_{Au\perp ,L}\simeq & {} v^2\frac{\alpha _e}{4\pi }\bigg (-0.5[Y_\nu M_a^{-2}\left( \frac{11}{6}+\ln \left( \frac{m^2_W}{M_a^2}\right) \right) Y_\nu ^\dagger ]_{e\mu }\nonumber \\{} & {} +0.8[Y_{\nu }(Y^\dagger _\nu Y_\nu )_{ab}\frac{1}{M^2_{a}-M^2_{b}}\ln \left( \frac{M^2_{a}}{M^2_{b}}\right) Y_\nu ^\dagger ]_{e\mu }\qquad \end{aligned}$$
(4.20)
$$\begin{aligned}{} & {} +{\mathcal {O}}\left( \frac{\alpha _e}{4\pi }\right) \bigg )\rightarrow \frac{v^2}{M^2}\Big (1.1\times 10^{-4} (Y_\nu Y^\dagger _\nu )_{e\mu }\nonumber \\{} & {} +4.9\times 10^{-4}(Y_\nu Y^\dagger _\nu Y_\nu Y^\dagger _\nu )_{e\mu }\Big ) \end{aligned}$$
(4.21)

The coefficient is nonetheless a combination of the same invariants that feature in the other \(\mu \rightarrow e\) operators, and so can be predicted with a combination of \(\mu \rightarrow e\) observations. For instance, in the nearly degenerate limit, it can be written as a linear combination of the light targets coefficient and the dipole.
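The linear relation mentioned for the nearly degenerate limit can be made explicit by solving a 2\(\times \)2 system. The sketch below takes the degenerate-limit numbers of Eqs. (4.17) and (4.21) as printed, together with the dipole of Eq. (4.4) evaluated with assumed inputs (\(\alpha _e = 1/137.036\), \(M = 1\) TeV), all in units of \(v^2/M^2\). The invariants `I1` and `I2`, standing for \((Y_\nu Y^\dagger _\nu )_{e\mu }\) and \((Y_\nu Y^\dagger _\nu Y_\nu Y^\dagger _\nu )_{e\mu }\), are given arbitrary illustrative values.

```python
import math
import numpy as np

alpha_e = 1 / 137.036   # assumed
m_mu, M = 0.1057, 1000.0   # GeV; M is an assumed singlet mass scale

# Degenerate-limit coefficients in units of v^2/M^2, multiplying
# I1 = (Y Y^dag)_{e mu} and I2 = (Y Y^dag Y Y^dag)_{e mu}.
a1, a2 = 8.6e-5, 5.6e-5   # Aluminium, Eq. (4.17)
b1, b2 = 1.1e-4, 4.9e-4   # Gold (orthogonal part), Eq. (4.21)

# Dipole of Eq. (4.4): depends on I1 only.
e_charge = math.sqrt(4 * math.pi * alpha_e)
running = 1 - 16 * alpha_e / (4 * math.pi) * math.log(M / m_mu)
d = -e_charge / (32 * math.pi**2) * running

# Solve C_Au = x * C_Al + y * C_D for all I1, I2:
#   I1 coefficients: a1 * x + d * y = b1
#   I2 coefficients: a2 * x         = b2
x, y = np.linalg.solve(np.array([[a1, d], [a2, 0.0]]), np.array([b1, b2]))

# Check with arbitrary illustrative values of the invariants.
I1, I2 = 0.3 - 0.1j, 0.05 + 0.2j
C_Al = a1 * I1 + a2 * I2
C_Au = b1 * I1 + b2 * I2
C_D = d * I1
residual = abs(C_Au - (x * C_Al + y * C_D))

print(f"x = {x:.3f}, y = {y:.3f}, residual = {residual:.1e}")
```

The weights x and y depend only on the printed numerical coefficients and the assumed inputs, not on the values of the invariants, which is what makes the Gold coefficient predictable from the other two measurements in this limit.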

In the leptoquark model, we neglect the scalar coefficients on d, s and b quarks, because the model does not generate scalar operators with down-type quarks at tree level, and the estimates of Appendix C3c suggest that the loop-induced coefficients are below experimental sensitivity. So we obtain

$$\begin{aligned} C_{Au\perp ,L}\simeq & {} \left( \eta ^{a_S} \left( 0.1 \frac{\lambda ^{e u}_L \lambda ^{\mu u *}_R }{2} + 0.001 \frac{\lambda ^{e c}_L \lambda ^{\mu c *}_R }{2} \right) \right. \\{} & {} \times \left( 1+ \frac{13 \alpha }{ 6\pi } \ln \frac{m_{LQ}}{2\textrm{GeV}} \right) \\{} & {} +0.14 \lambda ^{e u}_L \lambda ^{\mu u *}_L \left( 1+ \frac{2\alpha }{\pi } \ln \frac{m_{LQ}}{2\textrm{GeV}}\right) \\{} & {} + 0.4 \frac{g^2}{32\pi ^2 } \left[ \lambda _L V]^{ed} [V^\dagger \lambda ^\dagger _L \right] ^{d\mu } \\{} & {} -0.32 \frac{\alpha }{9\pi } [\lambda _L\ln \frac{m_{LQ}}{m_Q} \lambda ^\dagger _L ]^{e\mu } \\{} & {} \left. - \frac{ N_c }{16\pi ^2 } \Big (0.28 (g_L^u+ g_R^u) + 0.4 (g_L^d+ g_R^d) \Big ) [\lambda _L Y_u \right. \\{} & {} \times \left. \left( \ln \frac{m_{LQ}}{m_Q}- \frac{5}{6}\right) Y_u^\dagger \lambda ^\dagger _L]^{e\mu } \right) \times \frac{v^2}{m_{LQ}^2} \end{aligned}$$

where the terms, in order, are the scalar up and charm quark contribution with their QED and QCD running, the tree vector up quark contribution with its QED running, the leptoquark-W box matching onto the vector operator for down quarks (neglecting the QED running), then the QED penguin contribution to the vector coefficients for u and d quarks, and finally the Z penguin contributions to both u and d vector coefficients.

In this leptoquark model, different operators contribute to the \(\mu \rightarrow e\) conversion rates on Gold and Aluminium, so \({C}_{Au\perp ,X} \) is independent from \(C_{Al,X} \), implying that \(C_{Au\perp ,X} \) could be just below the current SINDRUM II bound (see Table 2) even if \(\mu \rightarrow e\) flavour change is not observed in upcoming experiments. Furthermore, we anticipate that this will remain true, even if the definition of \( \mathcal{O}_{Au\perp } \) changes as theoretical calculations are updated (the definitions of the \( \mathcal{O}_{A} \)s in Refs. [79, 82] appear different). The point is that cancellations can occur, for instance among vector and scalar coefficients in the coefficient on Aluminium, allowing the coefficient on Gold to be relatively large.

5 Phenomenology

The type II seesaw model is reputed to be predictive, because the lepton flavour-changing couplings of the scalar boson \(\Delta \) are proportional to the observed neutrino mass matrix. However, we observed in [50] that knowing the neutrino oscillation parameters does not predict the observable \(\mu \rightarrow e\) coefficients. So Sects. 5.1 and 5.2 explore the connections between \(\mu \rightarrow e\) processes and other lepton flavour- and number-changing observables in the type II seesaw. The remaining two subsections respectively discuss how \(\mu \rightarrow e \gamma \) constraints suppress dimension eight operators in the leptoquark model (Sect. 5.3), and the impact of allowing operator coefficients to be complex (Sect. 5.4).

5.1 The neutrino mass scale in the type II seesaw

In this section, we explore how the predictions of the type II seesaw model for \(\mu \rightarrow e\) observables are influenced by the lightest neutrino mass scale, denoted as \(m_{\textrm{min}}.\)

The lightest neutrino mass plays a crucial role in determining the relevant coefficients in the model, namely \(C^{e\mu }_{D,R}\), \(C^{e\mu ee}_{V,LL}\), and \(C^{e\mu ee}_{V,LR}\propto C_{Al,L},\ C_{Au\perp ,L}\). The tau mass cut-off in the two-loop vector mixing and one-loop photon penguin introduces a term \(\propto [m^*_\nu ]_{\mu \tau } [m_\nu ]_{e\tau }\) (see Eqs. 4.5 and 4.9) that gives rise to the dipole and \(C^{e\mu ee}_{V,LR}\) dependence on the unknown neutrino parameters. Additionally, the \(\mu \rightarrow 3e_L\) vector depends on \(m_{\textrm{min}}\) from both the tree-level \(\propto [m_{\nu }]_{ee}[m^*_{\nu }]_{\mu e}\) and the photon penguin contributions.

The magnitude of these unknown terms increases with the lightest neutrino mass, making them more relevant when \(m_{\textrm{min}}\) is large. For example, if we allow \(m_{\textrm{min}}\sim 0.2\) eV, the two-loop vector-to-dipole mixing can reach the size of the one-loop matching contribution to the dipole, and certain choices of Majorana phases could allow for a cancellation that suppresses the dipole coefficient. This possibility has not been pointed out in the literature before (at leading order, the dipole cannot vanish in the type II seesaw), but the cancellation requires a high neutrino mass scale \(m_{\textrm{min}}\). Values \(m_{\textrm{min}}\sim 0.2\) eV are compatible with the laboratory constraint extracted from the tritium decay end-point, which yields \(\sqrt{\sum _i m^2_i |U_{ei}|^2}< 0.8\) eV (\(90\%\) CL) [144], but are disfavoured by the cosmological bounds on the sum of neutrino masses \(\sum m_i\). Assuming \(\Lambda \)CDM, CMB data constrain the sum to be \(\sum m_i<0.26\) eV (\(95\%\) CL), while in combination with BAO measurements the constraint is stricter, \(\sum m_i<0.12\) eV (\(95\%\) CL) [145].
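The comparison between \(m_{\textrm{min}}\) and the laboratory and cosmological bounds quoted above can be sketched numerically. The following is a minimal illustration, assuming approximate oscillation inputs (the splittings and mixing angles below are round illustrative values, not numbers taken from this paper) and hypothetical helper names.

```python
import math

# Assumed, approximate oscillation inputs (illustrative values, not from the text).
DM21_SQ = 7.4e-5   # solar mass-squared splitting, eV^2
DM31_SQ = 2.5e-3   # magnitude of the atmospheric splitting, eV^2
S12_SQ, S13_SQ = 0.31, 0.022


def masses(m_min, ordering="NO"):
    """Return (m1, m2, m3) in eV for a given lightest mass and mass ordering."""
    if ordering == "NO":            # normal ordering: m1 is the lightest
        m1 = m_min
        m2 = math.sqrt(m_min**2 + DM21_SQ)
        m3 = math.sqrt(m_min**2 + DM31_SQ)
    else:                           # inverted ordering: m3 is the lightest
        m3 = m_min
        m1 = math.sqrt(m_min**2 + DM31_SQ)
        m2 = math.sqrt(m1**2 + DM21_SQ)
    return m1, m2, m3


def mass_sum(m_min, ordering="NO"):
    """Sum of the three neutrino masses, the quantity bounded by cosmology."""
    return sum(masses(m_min, ordering))


def m_beta(m_min, ordering="NO"):
    """Effective mass sqrt(sum_i m_i^2 |U_ei|^2) probed at the tritium end-point."""
    m1, m2, m3 = masses(m_min, ordering)
    c13_sq = 1.0 - S13_SQ
    ue_sq = ((1.0 - S12_SQ) * c13_sq, S12_SQ * c13_sq, S13_SQ)
    return math.sqrt(sum(m**2 * u for m, u in zip((m1, m2, m3), ue_sq)))


for m_min in (0.0, 0.04, 0.2):
    print(m_min, round(mass_sum(m_min), 3), round(m_beta(m_min), 3))
```

With these assumed inputs, \(m_{\textrm{min}}=0.2\) eV satisfies the tritium bound while exceeding both cosmological limits on \(\sum m_i\), in line with the statement in the text.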

Fig. 1

\(\frac{|C^{e\mu ee}_{V,LR}|}{|C^{e\mu }_{D,R}|}\) and \(\frac{|C^{e\mu ee}_{V,LL}|}{|C^{e\mu }_{D,R}|}\) as functions of \(m_{\textrm{min}}\) for normal and inverted ordering in blue and red respectively, for all possible values of the Majorana phases. We consider only the one-loop matching contribution to the dipole, since for \(m_{\textrm{min}}\lesssim 0.2\) eV the two-loop vector-to-dipole mixing is negligible. The shaded blue region corresponds to values of \(m_{\textrm{min}}\) above the upper bound preferred by cosmology, \(m_{\textrm{min}}\lesssim 0.04\) eV.

By imposing the cosmological upper bound \(m_{\textrm{min}}\lesssim 0.04\) eV, the \(\mu \rightarrow e\) coefficients are also constrained. In this case, the dipole is approximately unaffected by the two-loop contribution and is completely determined from the neutrino oscillation parameters, apart from the overall LFV scale. When the dipole coefficient is non-vanishing, the ratios

$$\begin{aligned} \frac{|C^{e\mu ee}_{V,LL}|}{|C^{e\mu }_{D,R}|},\qquad \frac{|C^{e\mu ee}_{V,LR}|}{|C^{e\mu }_{D,R}|}, \end{aligned}$$
(5.1)

are finite for any value of the Majorana phases. Figures 1a and 1b illustrate that, imposing the upper bound \(m_\textrm{min}\lesssim 0.04\) eV and allowing the Majorana phases to vary freely, these ratios are bounded from above:

$$\begin{aligned}{} & {} m_{\textrm{min}}\le 0.04\ \textrm{eV}\quad \rightarrow \quad \frac{|C^{e\mu ee}_{V,LL}|}{|C^{e\mu }_{D,R}|}\le {\left\{ \begin{array}{ll} 4.3\times 10^{3}\ {} &{}\mathrm{(IO)} \\ 2\times 10^{3}\ {} &{}\mathrm{(NO)} \\ \end{array}\right. },\nonumber \\{} & {} \qquad \frac{|C^{e\mu ee}_{V,LR}|}{|C^{e\mu }_{D,R}|}\le {\left\{ \begin{array}{ll} 28\ {} &{}\mathrm{(IO)} \\ 21\ {} &{}\mathrm{(NO)} \\ \end{array}\right. } \end{aligned}$$
(5.2)

The cosmological bound is generally insufficient to constrain the ratios from below, because for both \(\mu \rightarrow 3e\) vectors there exist \(m_{\textrm{min}}\) values such that the operator coefficient can vanish. If \(m_{\textrm{min}}\) is measured, for instance by observing neutrinoless double beta decay, additional information can be obtained on the \(\mu \rightarrow e\) coefficients. The penguin cannot vanish for \(m_{\textrm{min}}\lesssim 0.02\) eV regardless of the ordering, while the \(\mu \rightarrow 3e_L\) vector is also non-vanishing for \(m_\textrm{min}\lesssim 10^{-3}\) eV:

$$\begin{aligned} m_{\textrm{min}}\lesssim 0.02\ \textrm{eV}\quad &\rightarrow \quad {\left\{ \begin{array}{ll} 4.7 &{}\mathrm{(IO)} \\ 8.5 &{}\mathrm{(NO)} \\ \end{array}\right. } \le \frac{|C^{e\mu ee}_{V,LR}|}{|C^{e\mu }_{D,R}|}\le {\left\{ \begin{array}{ll} 15 &{}\mathrm{(IO)} \\ 10 &{}\mathrm{(NO)} \\ \end{array}\right. }\nonumber \\ m_{\textrm{min}}\lesssim 10^{-3}\ \textrm{eV}\quad &\rightarrow \quad {\left\{ \begin{array}{ll} 17 &{}\mathrm{(IO)} \\ 20 &{}\mathrm{(NO)} \\ \end{array}\right. } \le \frac{|C^{e\mu ee}_{V,LL}|}{|C^{e\mu }_{D,R}|}\le {\left\{ \begin{array}{ll} 2.3 \times 10^{3} &{}\mathrm{(IO)} \\ 62 &{}\mathrm{(NO)} \\ \end{array}\right. } \end{aligned}$$
(5.3)

Therefore, if the ratio of coefficients \(\frac{|C^{e\mu ee}_{V,LR}|}{|C^{e\mu }_{D,R}|}\) or \(\frac{|C^{e\mu ee}_{V,LL}|}{|C^{e\mu }_{D,R}|}\) were observed outside the ranges of Eq. (5.3), it would be possible to infer a lower bound on the neutrino mass scale (if it originates from the type II seesaw mechanism). Since these ranges are significantly narrower in the NO case, measuring the ordering (which is expected to be determined in the upcoming years) would be particularly useful to pinpoint the interplay between the lightest neutrino mass and the \(\mu \rightarrow e\) predictions in the type II seesaw. Furthermore, if the mass ordering is normal, future beta decay experiments might be able to constrain \(m_{\textrm{min}} \lesssim 0.02\, \text{eV}\) (the Project 8 experiment [146] aims at a 90% CL sensitivity of \(40\, \text{meV}\) on the effective neutrino mass in beta decay, which corresponds to \(m_{\textrm{min}} \simeq 0.04\, \text{eV}\)). In this case, measuring the ratio of coefficients \(\frac{|C^{e\mu ee}_{V,LR}|}{|C^{e\mu }_{D,R}|}\) outside the range [8.5, 10] would exclude the type II seesaw model.
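The logic of this paragraph can be written as a small lookup on the ranges of Eq. (5.3). This is only a sketch: the table copies the numbers of Eq. (5.3), while the function name and the treatment of the approximate bounds (\(\lesssim \)) as sharp thresholds are simplifying assumptions.

```python
# Ranges of Eq. (5.3): (m_min threshold in eV, lower edge, upper edge) of the
# ratio |C_V| / |C_D|, for each vector coefficient and mass ordering.
RANGES = {
    ("V,LR", "IO"): (0.02, 4.7, 15.0),
    ("V,LR", "NO"): (0.02, 8.5, 10.0),
    ("V,LL", "IO"): (1e-3, 17.0, 2.3e3),
    ("V,LL", "NO"): (1e-3, 20.0, 62.0),
}


def inferred_mmin_lower_bound(ratio, coefficient, ordering):
    """Within the type II seesaw, return the m_min value (eV) that must be
    exceeded if the measured ratio lies outside the range of Eq. (5.3),
    or None if the ratio is compatible with a smaller m_min."""
    threshold, low, high = RANGES[(coefficient, ordering)]
    if low <= ratio <= high:
        return None
    return threshold


# A ratio of 12 for C_V,LR is outside [8.5, 10] in NO but inside [4.7, 15] in IO.
print(inferred_mmin_lower_bound(12.0, "V,LR", "NO"))
print(inferred_mmin_lower_bound(12.0, "V,LR", "IO"))
```

If \(m_{\textrm{min}}\) is independently constrained to lie below the returned threshold, the same out-of-range measurement would instead exclude the model, as stated for the NO range [8.5, 10].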

5.2 \(\tau \) LFV in the type II seesaw

In this section, we explore whether the type II seesaw model can be predictive when one considers flavour-changing decays of muons and taus. In particular, we focus on observations of \(\tau \rightarrow 3 l \) decays in the case where \(\mu \rightarrow e\) decays are not seen. In practice, the Majorana phases are fixed in this section to ensure that the tree contribution to \(\mu \rightarrow e {\bar{e}} e\) vanishes, and we study the \(\tau \rightarrow 3 l \) rates as a function of the neutrino mass scale.

There are four \(\Delta \)LF\(=\)1 \(\tau \rightarrow 3 l \) decays (\(\tau \rightarrow \mu {\bar{\mu }} \mu \), \( \rightarrow e{\bar{e}} e\), \( \rightarrow e {\bar{\mu }} \mu \) and \(\rightarrow e {\bar{e}} \mu \)) and two that are \(\Delta \)LF\(=\)2 (\( \rightarrow e{\bar{\mu }} e\), and \(\rightarrow \mu {\bar{e}} \mu \)), as given in [147]. They can all be mediated at tree level by \(\Delta \) exchange as illustrated in the middle diagram of Fig. 5. The branching ratios for the four \(\Delta \)LF\(=\)1 decays are analogous to Eq. (2.8)

$$\begin{aligned} BR(\tau \rightarrow l_L {\overline{l}} l)= & {} 0.18 \Big \{ 2| C^{l\tau l l}_{V,LL} + 4eC^{l \tau }_{D,R}|^2 \nonumber \\{} & {} + (64 \ln \frac{m_\tau }{m_l} -136) |eC^{l \tau }_{D,R}|^2 + |C^{l \tau l l}_{V,LR} \nonumber \\{} & {} + 4eC^{l \tau }_{D, R} |^2 \Big \}\end{aligned}$$
(5.4)
$$\begin{aligned} BR(\tau \rightarrow \tilde{l}_L {\overline{l}} l)= & {} 0.18 \Big \{ | C^{\tilde{l}\tau l l}_{V,LL} + 4eC^{\tilde{l} \tau }_{D,R}|^2 \nonumber \\{} & {} + (64 \ln \frac{m_\tau }{m_l} -136) |eC^{\tilde{l} \tau }_{D,R}|^2 \nonumber \\{} & {} + |C^{\tilde{l} \tau l l}_{V,LR} + 4eC^{\tilde{l} \tau }_{D, R} |^2 \Big \} \\ BR(\tau \rightarrow l_L \overline{\tilde{l}} l)= & {} 0.18 | C^{l\tau l \tilde{l} }_{V,LL} |^2 \nonumber \end{aligned}$$
(5.5)

where we restrict to a fixed chirality of the flavour-changing lepton bilinear, \(l,\tilde{l} \in \{e,\mu \}, l\ne \tilde{l}\) and the factor 0.18 accounts for the hadronic \(\tau \) decays: \(\Gamma (\tau \rightarrow e{\bar{\nu }} \nu )/\Gamma (\tau \rightarrow \textrm{all}) \simeq 0.18\). The final pair of branching ratios (last line of Eq. 5.5) are \(\Delta \)LF\(=\)2; we assume such coefficients run with QED like other four-lepton vector operators. The current experimental constraints (see Table 1) are of order

Fig. 2

Ratios BR(\(\tau \rightarrow 3l\))/BR(\(\mu \rightarrow e {\bar{e}} e\)) in the type II seesaw model as a function of \(m_1\) in inverted ordering (so \(m_1 \ge \sqrt{\Delta ^2_{atm}}\)) for vanishing \([m_\nu ]_{e \mu }\). This illustrates that Belle II could observe specific \(\tau \rightarrow 3l\) decays, even if \(\mu \rightarrow e {\bar{e}} e\) and \(\mu \rightarrow e \gamma \) are not observed at upcoming experiments because the tree contribution to \(\mu \rightarrow e {\bar{e}} e\) vanishes with \([m_\nu ]_{e \mu }\). Some \(\tau \rightarrow 3l\) processes (in black) also vanish at tree level, and we include \(\tau \rightarrow e \gamma \) and \(\tau \rightarrow \mu \gamma \) to illustrate our claim that they are undetectable at Belle II in the type II seesaw model (the BR\((l_i\rightarrow l_j \gamma )\) are also divided by BR\((\mu \rightarrow e {\bar{e}} e)\)).

Fig. 3

Same as Fig. 2 for normal ordering in the case where \(\mu \rightarrow e {\bar{e}} e\) is suppressed because \([m_\nu ]_{e \mu }\) vanishes (so )

The type II seesaw can predict \(\tau \rightarrow 3 l \) within the reach of Belle II, in spite of the current bound BR\((\mu \rightarrow e {\bar{e}} e) \le 10^{-12}\). For instance, in normal ordering with vanishing lightest neutrino mass \(m_1\)

$$\begin{aligned} \frac{ BR(\mu \rightarrow e {\bar{e}} e)}{BR (\tau \rightarrow \mu {\bar{\mu }} \mu )}\sim & {} 5 \left| \frac{s_{13} \Delta _{sol} }{\sqrt{2} \Delta _{atm}}\right| ^2 \sim 10^{-3} \end{aligned}$$
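As a rough numerical check of the estimate above, the ratio can be evaluated with approximate oscillation inputs. This sketch assumes that \(\Delta _{sol}\) and \(\Delta _{atm}\) denote the square roots of the solar and atmospheric mass-squared splittings (as the notation \(\sqrt{\Delta ^2_{atm}}\) in the caption of Fig. 2 suggests), and uses illustrative values for the splittings and for \(s^2_{13}\) that are not taken from the paper.

```python
import math

# Assumed, approximate inputs (illustrative values).
S13_SQ = 0.022       # sin^2(theta_13)
DM_SOL_SQ = 7.4e-5   # solar mass-squared splitting, eV^2
DM_ATM_SQ = 2.5e-3   # atmospheric mass-squared splitting, eV^2


def mu3e_over_tau3mu():
    """Evaluate 5 |s13 Delta_sol / (sqrt(2) Delta_atm)|^2 with Delta = sqrt(Delta m^2)."""
    s13 = math.sqrt(S13_SQ)
    delta_sol = math.sqrt(DM_SOL_SQ)
    delta_atm = math.sqrt(DM_ATM_SQ)
    return 5.0 * (s13 * delta_sol / (math.sqrt(2.0) * delta_atm)) ** 2


print(f"{mu3e_over_tau3mu():.1e}")
```

Under these assumptions the ratio comes out at the \(10^{-3}\) level, consistent with the order of magnitude quoted in the equation.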

However, in the coming years, the Mu3e experiment will improve the sensitivity to \(\mu \rightarrow e {\bar{e}} e\), so the allowed range for the \(\tau \rightarrow l\) rates will depend on the results of Mu3e. To be concrete, we suppose that \(\mu \rightarrow e\) flavour change is not observed at MEG II or Mu3e. So in the following, we suppose that the Majorana phases (and the lightest neutrino mass \(m_{min}\)) are fixed such that \(C_{VLL}^{e\mu ee}\) vanishes at tree level, implying that the \(\mu \rightarrow e {\bar{e}} e\) rate is suppressed by \(\mathcal{O}(\alpha ^2)\), because it is mediated by the dipole and the penguin-induced \(C_{VLR}^{e\mu ee}\). Recall that the tree-level contribution to the coefficient \(C_{VLL}^{e\mu ee}\), which is proportional to \([m]^*_{\mu e} [m]_{ee}\), can vanish with \([m]_{ee}\) in normal ordering (NO) for (as is well known from neutrinoless double beta decay [148]), and it can also vanish with \([m]_{e\mu }\) both in normal and inverted ordering for .

There are six \(\tau \rightarrow 3 l \) decays, whose tree-level amplitudes are proportional to products of only five neutrino mass matrix elements. This suggests at least one relation among these decays for generic \(m_{min}\) and Majorana phases – but this relation could be difficult to test in general, because \(m_{min}\) and the Majorana phases can accidentally suppress almost any element of the neutrino mass matrix (as we saw for \([m_\nu ]_{e \mu }\)). This section considers a scenario where either \( [m_\nu ]_{e\mu } \) or \([m_\nu ]_{ee}\) vanishes in order to suppress \(\mu \rightarrow e\) rates, so testable predictions can be expected.

For \(m_{ee}\rightarrow 0\), one of the \(\Delta LF\) = 2 processes vanishes (at the order we calculate):

$$\begin{aligned} BR(\tau \rightarrow e {\bar{\mu }} e) \rightarrow 0,&~~~&\textrm{for}~ [m_\nu ]_{e e} \rightarrow 0. \end{aligned}$$
(5.6)

Similarly, the tree contribution to \(\tau \rightarrow e {\bar{e}} e\) vanishes, so that only the dipole and penguin contributions to this decay remain. A signature of the type II seesaw with vanishing \([m_\nu ]_{ee}\) is therefore that these decays are suppressed simultaneously with the matrix element for neutrinoless double beta decay.

In the case where \(\mu \rightarrow e {\bar{e}} e\) is suppressed because \([m_\nu ]_{e\mu }\) vanishes (and ), there are identities among the tree-level coefficients \({ C}_{V,LL}^{l\tau \sigma \rho } \propto m_{l\sigma }m_{\tau \rho }^* \) (which by default dominate the rates for the decays \(\tau \rightarrow l{\overline{\rho }} \sigma \)):

$$\begin{aligned} \frac{m_{e e }}{ m_{\mu \mu }}= & {} \frac{{ C}_{V,LL}^{e \tau e e }}{ C_{V,LL}^{\mu \tau \mu e}}= \frac{{ C}_{V,LL}^{e\tau e \mu }}{ C_{V,LL}^{\mu \tau \mu \mu }},~~~ \frac{m^*_{\tau e }}{ m^*_{\tau \mu }}= \frac{{ C}_{V,LL}^{\mu \tau \mu e }}{ C_{V,LL}^{\mu \tau \mu \mu }}= \frac{{ C}_{V,LL}^{e\tau e e }}{ C_{V,LL}^{e\tau e \mu }}\nonumber \\= & {} \frac{{ C}_{V,LL}^{e\tau \mu e }}{ C_{V,LL}^{e\tau \mu \mu }}. \end{aligned}$$
(5.7)
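The identities of Eq. (5.7) follow from the factorised form \({ C}_{V,LL}^{l\tau \sigma \rho } \propto m_{l\sigma }m_{\tau \rho }^* \), and can be checked numerically. The sketch below uses an arbitrary symmetric complex matrix with \(m_{e\mu }=0\); the entries are made-up numbers chosen only for illustration, and the helper name is hypothetical. The last ratio in Eq. (5.7) is not checked, since both coefficients in it are proportional to \(m_{e\mu }\).

```python
E, MU, TAU = 0, 1, 2

# Arbitrary symmetric complex "mass matrix" with m_emu = 0 (illustrative numbers only).
m = [
    [0.3 + 0.1j, 0.0, 0.2 - 0.4j],
    [0.0, 0.7 + 0.2j, 0.5 + 0.5j],
    [0.2 - 0.4j, 0.5 + 0.5j, 1.0 + 0.0j],
]


def c_vll(l, sigma, rho):
    """Tree-level C_{V,LL}^{l tau sigma rho}, up to a common factor:
    m_{l sigma} * conj(m_{tau rho})."""
    return m[l][sigma] * m[TAU][rho].conjugate()


# First chain of Eq. (5.7): m_ee / m_mumu.
lhs1 = m[E][E] / m[MU][MU]
ratio1a = c_vll(E, E, E) / c_vll(MU, MU, E)
ratio1b = c_vll(E, E, MU) / c_vll(MU, MU, MU)

# Second chain of Eq. (5.7): conj(m_tau_e) / conj(m_tau_mu).
lhs2 = m[TAU][E].conjugate() / m[TAU][MU].conjugate()
ratio2a = c_vll(MU, MU, E) / c_vll(MU, MU, MU)
ratio2b = c_vll(E, E, E) / c_vll(E, E, MU)

print(abs(lhs1 - ratio1a), abs(lhs1 - ratio1b))
print(abs(lhs2 - ratio2a), abs(lhs2 - ratio2b))
```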

As a result, for values of the Majorana phases fixed to suppress \(\mu \rightarrow e {\bar{e}} e\) (and for compatible \(m_{min}\)), ratios of \(\tau \rightarrow 3 l \) decays are predicted. We plot in Figs. 3 (for normal ordering) and 2 (for inverted ordering) the rates of various \(\tau \rightarrow 3 l \) decays, normalised to the rate for \(\mu \rightarrow e {\bar{e}} e\) (see footnote 9), for the case where \(\mu \rightarrow e {\bar{e}} e\) is suppressed by \([m_\nu ]_{e\mu } \rightarrow 0\). These figures show [23, 25] that Belle II could see \(\tau \rightarrow 3 l \) decays for BR and BR, and that the branching ratios follow the predictions of Eq. (5.7). Furthermore, if several \(\tau \rightarrow 3 l \) decays were observed (including \(\tau \rightarrow e {\bar{e}} e\)), we anticipate that one could compare their relative branching ratios with the type II seesaw predictions shown in Figs. 2 and 3, and either exclude the model or deduce constraints on \(m_1\) and the mass ordering (see footnote 10). Regarding \(\mu A \rightarrow \! eA \), we estimate that BR\((\mu Al \rightarrow eAl) \sim 10^2\) BR\((\mu \rightarrow e {\bar{e}} e)\) in the case of vanishing \([m_\nu ]_{e\mu }\) in the type II seesaw, because \(C_{V,LR}^{e\mu ee}\) and \(\tilde{C}_{Al,L}\) are both induced by the photon penguin, see Eqs. (4.9, 4.15).

Finally, the decays \(\tau \rightarrow e \gamma \) and \(\tau \rightarrow \mu \gamma \) will not be observed at Belle II if neutrino masses arise via the type II seesaw model, because this requires \(\mu \rightarrow e \gamma \) or \(\mu \rightarrow e {\bar{e}} e\) larger than the current constraints (this conclusion is fully general and does not assume that the tree-level contribution to \(\mu \rightarrow e {\bar{e}} e\) vanishes). The dipole coefficients \(C_{D,R}^{l \tau }\), at the experimental scale \(m_\tau \), can be written analogously to Eq. (4.6), with the index replacement \(e\rightarrow l\) and \(\mu \rightarrow \tau \), and without the second term, because the RG running ends at \(m_\tau \) for all flavours in the loop. At the order we calculate, the \(\tau \) dipole coefficients are therefore given by \([m_\nu m_\nu ^\dagger ]_{l\tau }\), so are independent of the neutrino mass scale and Majorana phases (see Eq. 2.2), and the ratio

$$\begin{aligned} \frac{BR(\tau \rightarrow \mu \gamma )}{BR(\tau \rightarrow e \gamma )} \simeq \left| \frac{[m_\nu m_\nu ^\dagger ]_{\mu \tau }}{[m_\nu m_\nu ^\dagger ]_{e\tau }} \right| ^2 \sim \frac{1}{2 s^2_{13}} \end{aligned}$$

is predicted. However, in order to be within the reach of Belle II (BR, see Table 1), these branching ratios need to be much larger than BR(\(\mu \rightarrow e \gamma )\), which can be engineered via a cancellation in \(C_{D,R}^{e\mu }\) for specific Majorana phases at a large neutrino mass scale. But when this cancellation arises, the model predicts a larger branching ratio for \(\mu \rightarrow e {\bar{e}} e\) than for \(\tau \rightarrow l \gamma \), because the coefficient \(C_{VLL}^{e\mu ee} \propto m_1^2\) arises at tree level.
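The statement that \([m_\nu m_\nu ^\dagger ]_{l\tau }\) is independent of the neutrino mass scale and of the Majorana phases can be illustrated numerically. The sketch below assumes the standard parametrisation of the mixing matrix, \(m_\nu m_\nu ^\dagger = U\,\textrm{diag}(m_i^2)\,U^\dagger \) (the complex-conjugate convention gives the same magnitudes), normal ordering, and approximate oscillation inputs that are illustrative rather than taken from the paper.

```python
import cmath
import math

# Assumed, approximate oscillation inputs for normal ordering (illustrative only).
S12_SQ, S23_SQ, S13_SQ = 0.31, 0.50, 0.022
DM21_SQ, DM31_SQ = 7.4e-5, 2.5e-3      # eV^2
DELTA = 1.5 * math.pi                  # Dirac phase

E, MU, TAU = 0, 1, 2


def pmns(alpha21=0.0, alpha31=0.0):
    """Mixing matrix in the standard parametrisation, with two Majorana phases."""
    s12, s23, s13 = (math.sqrt(x) for x in (S12_SQ, S23_SQ, S13_SQ))
    c12, c23, c13 = (math.sqrt(1.0 - x) for x in (S12_SQ, S23_SQ, S13_SQ))
    ed = cmath.exp(1j * DELTA)
    v = [
        [c12 * c13, s12 * c13, s13 / ed],
        [-s12 * c23 - c12 * s23 * s13 * ed, c12 * c23 - s12 * s23 * s13 * ed, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ed, -c12 * s23 - s12 * c23 * s13 * ed, c23 * c13],
    ]
    phases = [1.0, cmath.exp(0.5j * alpha21), cmath.exp(0.5j * alpha31)]
    return [[v[a][i] * phases[i] for i in range(3)] for a in range(3)]


def mm_dagger(a, b, m_min=0.0, alpha21=0.0, alpha31=0.0):
    """Element (a, b) of m_nu m_nu^dagger = U diag(m_i^2) U^dagger."""
    u = pmns(alpha21, alpha31)
    m_sq = [m_min**2, m_min**2 + DM21_SQ, m_min**2 + DM31_SQ]
    return sum(u[a][i] * m_sq[i] * u[b][i].conjugate() for i in range(3))


def tau_dipole_ratio(**kwargs):
    """|[m m^dagger]_{mu tau} / [m m^dagger]_{e tau}|^2, the predicted BR ratio."""
    return abs(mm_dagger(MU, TAU, **kwargs) / mm_dagger(E, TAU, **kwargs)) ** 2


print(tau_dipole_ratio())
print(tau_dipole_ratio(m_min=0.1, alpha21=1.0, alpha31=2.0))
print(1.0 / (2.0 * S13_SQ))
```

The first two printed values agree up to rounding, because the Majorana phases cancel in \(U\,\textrm{diag}(m_i^2)\,U^\dagger \) and a common shift of the \(m_i^2\) drops out of the off-diagonal elements by unitarity. With these inputs both are of the same size as \(1/(2 s^2_{13})\).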

5.3 The dipole constraints on boxes for the leptoquark

The leptoquark Lagrangian of Eq. (3.5) allows the leptoquark to interact with doublet and singlet leptons, so it can induce lepton flavour-changing dipole, tensor and scalar operators, without any suppression by the lepton Yukawas. Nonetheless, we neglect four-lepton scalar operators in this model, because the coefficients are suppressed below the upcoming experimental reach by the dipole constraint. This section aims to show that suppression.

Fig. 4

A diagram that generates the four-lepton scalar operator \(\mathcal{O}_{SXX}^{e \mu ll}\) in the model (on the left), in the RGEs of SMEFT (centre) and in the RGEs of the QCD\(\times \)QED invariant EFT (on the right). (There is another diagram with \(l_Y\leftrightarrow e_Y\).)

We are interested in scalar four-lepton operators \(\mathcal{O}_{S,XX}\), such as those in the observable Lagrangian of Eq. (2.3). These operators occur at dimension eight in SMEFT with a pair of Higgs legs, for instance in the form \(({\overline{\ell }}_eH \mu _R) ({\overline{\ell }}_eH e_R)\) and can be generated in the leptoquark model via box diagrams with Higgs legs, as illustrated on the left in Fig. 4. Equivalently, these operators are generated in the dimension six\(^2 \rightarrow \) dimension eight RGEs of SMEFT or the low energy EFT, respectively by the fish diagrams at the centre or right of Fig. 4. It is straightforward to see from the model diagram, that

$$\begin{aligned} C_{SRR}^{e \mu ee} \propto N_c [\lambda _L Y^*_u \lambda _R^\dagger ]^{e\mu } [\lambda _L Y^*_u \lambda _R^\dagger ]^{ee} \frac{v^4}{32 \pi ^2 m_{LQ}^4} \ln \frac{\Lambda _{NP}}{m_W} \end{aligned}$$
(5.8)

so that this coefficient could be \( \sim 3\times 10^{-5}\) for leptoquark couplings of \(\mathcal{O}(1)\) to the top quark. This is marginally above the current experimental bound, which is \(|C_{SRR}^{e \mu ee}|\le 2.8 \times 10^{-6}\) from SINDRUM [64]. So now we want to show that the product \(\lambda _L^{\alpha t} \lambda _R ^{\beta t *}\) is constrained to be for almost any combination of lepton flavours \(\alpha , \beta \).

In the presence of both the \(\lambda _L\) and \(\lambda _R\) interactions, the leptoquark matches onto 2l2q tensor operators involving the up-type quarks (see Eq. C.11). These tensor operators then mix to the dipole (see the last term of Eq. 4.7), generating an “invariant” that is not identical to the one appearing in Eq. (5.8):

$$\begin{aligned} \Delta C_{DR}^{ \alpha \beta }\propto & {} \frac{\alpha }{\pi y_\beta }[\lambda _L Y_u^* \eta _{m_Q}\ln \frac{m_{LQ}}{m_Q} \lambda _R^\dagger ]^{\alpha \beta } \nonumber \\{} & {} \quad \sim \frac{\alpha }{\pi y_\beta }\Big ( \mathcal{O} \Big ( \lambda ^{\alpha t}_L y_t \lambda _R^{\beta t *} \Big )\nonumber \\{} & {} + \mathcal{O} \Big ( \lambda ^{\alpha c}_L y_c \lambda _R^{\beta c *} \Big )+\cdots \Big ). \end{aligned}$$
(5.9)

However, the term \(\propto y_t\) in \([\lambda _L Y_u^* \eta _{m_Q}\ln \frac{m_{LQ}}{m_Q} \lambda _R^\dagger ]^{e \mu }\) is of comparable magnitude to the term \(\propto y_t\) of \([\lambda _L Y_u^* \lambda _R^\dagger ]^{e \mu }\) from Eq. (5.8).

Now we want to argue that this term , because if it is larger it exceeds the experimental bound on \(\Delta C_{DR}^{ \alpha \beta }\), so it must be small enough to cancel against the next biggest term. In this argument, we assume that , and use a different normalisation of the dipole operator:

$$\begin{aligned} \widetilde{\mathcal{O}}_{D}^{\alpha \beta }= & {} {\overline{\ell }}_\alpha H \sigma ^{\rho \sigma }P_{R} e_\beta F_{\rho \sigma } \end{aligned}$$
(5.10)

where the lepton Yukawa is removed. This is more convenient for comparing the experimental bounds [61, 62, 149,150,151,152,153,154] on NP contributions to the coefficient of the redefined dipole operator, expressed as a matrix in flavour space:

$$\begin{aligned} |[{\widetilde{C}}_{D}]^{\alpha \beta }| \le \left[ \begin{array}{ccc} 2.3\times 10^{-6} &{} 6\times 10^{-12} &{}7\times 10^{-8}\\ 6\times 10^{-12}&{} 4\times 10^{-7} &{}7\times 10^{-8}\\ 7\times 10^{-8} &{} 7\times 10^{-8}&{} - \end{array} \right] \end{aligned}$$
(5.11)

where on the diagonal are the \((g-2)_\beta \) constraints [151, 152] using \(|\Delta a_\beta | \simeq 2m_\beta |C_D^{\beta \beta } + C_D^{\beta \beta *}|/(e v)\), and the EDM constraints [153, 154] on \(\mathcal{I}m \{C_D^{\beta \beta }\} \simeq - v d_\beta /2\) [155] are , and . The contribution of Eq. (5.9), with \(\lambda _L^{\alpha t} \lambda _R^{\alpha t *}\) of order 1, is larger than the bounds of Eq. (5.11) on all the dipole coefficients, except possibly \([{\widetilde{C}}_{D}]^{\tau \tau }\). So the top contribution must cancel against the next largest contribution to the dipole, which is relatively suppressed at least by the charm or \(\tau \) Yukawa, because the dipole operator requires a single Higgs leg. So this implies that

for all flavour combinations \(\alpha \beta \) except \(\tau \tau \), so the scalar four-lepton coefficients are below upcoming experimental sensitivity and can be neglected.

We also explored the possibility that constraints on vector four-quark operators [156], for instance from meson–antimeson oscillations, could be used to set bounds on the vector four-lepton operators. Our hope was that the quark sector could set bounds on the eigenvalues \(\{\lambda _i\}\) of the leptoquark coupling matrices

$$\begin{aligned}{}[\lambda _L] = V_l^\dagger D_\lambda V_Q \end{aligned}$$

where \( D_\lambda =\) diag \(\{\lambda _1, \lambda _2, \lambda _3\}\), and \(V_l\) and \(V_Q\) are unitary matrices. \(K^0\! -\! \overline{K^0}\) and \(D^0\! -\! \overline{D^0}\) mixing constrain two independent combinations of \( V_Q^\dagger D^2_\lambda V_Q \), but in order to set an upper bound on the eigenvalues, at least one more constraint would be required, and we did not find useful constraints involving tops.

5.4 Complex coefficients – what changes?

The coefficients of the observable operators are generally complex numbers, and it is not immediately clear whether experiments can fully determine these coefficients when non-zero phases are present. When considering upper limits on branching ratios, the only difference is that we have two identical 12-dimensional ellipses, one in the space of the real parts and one in the space of the imaginary parts of the coefficients. This happens because the branching fractions are functions of the absolute values of operator coefficient combinations, which result in a quadrature sum of the real and imaginary components. Therefore, the two do not mix, and the rate is a sum of the same positive-definite quadratic form evaluated on the real and on the imaginary parts, to each of which the upper limit applies separately.
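The separation into real and imaginary parts can be made explicit for a single term of the rate. The sketch below takes a combination of the form \(|C_V + 4eC_D|^2\), as it appears in the branching ratios of this paper, and checks that it equals the same real quadratic form evaluated once on the real parts and once on the imaginary parts. The value of \(e\) and the sample coefficients are illustrative assumptions.

```python
import math

E_CHARGE = math.sqrt(4.0 * math.pi / 137.036)  # assumed value of the QED coupling e


def quadratic_form(cv, cd):
    """Real quadratic form (cv + 4 e cd)^2 for real arguments."""
    return (cv + 4.0 * E_CHARGE * cd) ** 2


def rate_term(cv, cd):
    """|C_V + 4 e C_D|^2 for complex coefficients."""
    return abs(cv + 4.0 * E_CHARGE * cd) ** 2


# Made-up complex coefficients, for illustration only.
cv, cd = 2.0 - 3.0j, -0.5 + 1.2j
split = quadratic_form(cv.real, cd.real) + quadratic_form(cv.imag, cd.imag)
print(rate_term(cv, cd), split)
```

Because the two contributions are separately non-negative, an upper limit on the rate bounds each of them individually, which is the origin of the two identical ellipses.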

The complication arises when we consider the possibility of measuring the complex coefficients from data. Expanding the branching ratios of Eqs. (2.7)-(2.10) in terms of the (complex) coefficients of Eq. (2.3), we find (see footnote 11)

$$\begin{aligned} BR(\mu \rightarrow e_X \gamma )= & {} 384 \pi ^2 |C^{e\mu }_{D,Y}|^2 \end{aligned}$$
(5.12)
$$\begin{aligned} BR(\mu \rightarrow e_X {\overline{e}}_X e_X)= & {} 2 |C^{e\mu ee}_{V,XX}|^2 +32e^2\left( \ln \left( \frac{m_\mu }{m_e}\right) -1\right) |C^{e\mu }_{D,Y}|^2\nonumber \\{} & {} +16e |C^{e\mu ee}_{V,XX} C^{e\mu }_{D,Y}| \cos (\phi _{V,XX}-\phi _{D,Y} ) \end{aligned}$$
(5.13)
$$\begin{aligned} BR(\mu \rightarrow e_X {\overline{e}}_Y e_Y)= & {} |C^{e\mu ee}_{V,XY}|^2 +32e^2\left( \ln \left( \frac{m_\mu }{m_e}\right) -\frac{3}{2}\right) |C^{e\mu }_{D,Y}|^2\nonumber \\{} & {} +8e |C^{e\mu ee}_{V,XY} C^{e\mu }_{D,Y}| \cos (\phi _{V,XY}-\phi _{D,Y} ) \end{aligned}$$
(5.14)
$$\begin{aligned} BR(\mu A\rightarrow e_X A)= & {} B_A\left( d^2_A|C^{e\mu }_{D,Y}|^2+|C^{e\mu }_A|^2 +2d_A |C^{e \mu }_{D,Y}C^{e\mu }_A|\cos (\phi _{A}-\phi _{D,Y} )\right) \end{aligned}$$
(5.15)

where we have defined \(C_{\square }=|C_\square |e^{i\phi _\square }\), \(X\ne Y\) and \(A\) can be Al or Au\(\perp \). The observables only depend on relative phases, so for brevity we relabel \(\phi _\square -\phi _{D,Y}\rightarrow \phi _\square \). We thus have 10 branching fractions that can generally depend on 18 parameters: 10 absolute values and 8 relative phases. For example, observing \(\mu \rightarrow e_X \gamma \) and \(\mu \rightarrow e_X {\overline{e}}_X e_X\) (\(\mu \rightarrow e_X {\overline{e}}_Y e_Y\)) would not be sufficient to measure the real and imaginary parts of the coefficient \(C^{e\mu ee}_{V,XX}\) (\(C^{e\mu ee}_{V,XY}\)). However, some observables may be directly related to the coefficient relative phases. For instance, it has been shown in [86,87,88] that the T-odd asymmetry term

$$\begin{aligned} A^T_{\mu \rightarrow 3e}&\propto -\textrm{Im}\left( C_{D,R} (3C^*_{V,LL}-2C^*_{V,LR})\right) + (L\leftrightarrow R)\nonumber \\ &=3|C^{e\mu ee}_{V,LL} C^{e\mu }_{D,R}| \sin \phi _{V,LL} -2|C^{e\mu ee}_{V,LR} C^{e\mu }_{D,R}| \sin \phi _{V,LR}+(L\leftrightarrow R) \end{aligned}$$
(5.16)

is accessible via the angular distribution of the outgoing electrons/positrons, and could assist in determining the relative phase of the \(\mu \rightarrow 3e\) vectors and dipole. In addition, interpreting data assuming specific models can help in reducing the number of relevant parameters, and measurements may still be used to find regions of parameter space that are incompatible with the model predictions. For the three models we considered here:

  1.

    In the type II seesaw the observable coefficients are related to the neutrino mass matrix, which is complex (given the hints of the CP violating Dirac phase \(\delta \simeq 3\pi /2\) and due to the potential presence of non-zero Majorana phases). In our analysis of the type II seesaw in [50], we identified the region where the model can sit in the space of the angular variables \(\tan \theta =\sqrt{|C_D|^2+|C_{V,LR}|^2}/|C_{V,LL}|\) and \(\tan \phi =|C_D/C_{V,LR}|\), which depend on the absolute values of the observable coefficients. One may wonder whether experiments can identify a point in this space despite having complex phases, which make three branching fractions depend on five parameters (three absolute values and two phases). Since the flavour changing interactions with electron singlets are negligible, the \(A^T_{\mu \rightarrow 3e}\) asymmetry is given by the relative phases of the \(C_{V,LL}, C_{V,LR}\) vectors with the \(C^{e\mu }_{D,R}\) dipole. Combined with the measurements of the branching fractions for \(\mu \rightarrow e_L \gamma \), \(\mu \rightarrow e_L {\overline{e}}_L e_L\) and \(\mu \rightarrow e_L {\overline{e}}_R e_R\), one of five physical parameters would still remain undetermined. Taking advantage of the fact that in the type II seesaw \(C_{A}\propto C_{V,LR}\), a detection of \(\mu A\rightarrow e A\) could be used to extract the value of the last unknown, resulting in the complete knowledge of the relevant complex coefficients (modulo an overall phase). This also opens the interesting possibility of taking advantage of \(\mu \rightarrow e\) observables to determine the unmeasured neutrino parameters, i.e. the lightest neutrino mass and the Majorana phases. Since the complex EFT coefficients depend on these three unknowns, measuring the coefficients could allow one to infer their values. Additionally, since the system is over-constrained, with three complex coefficients being functions of four parameters (including the overall magnitude of LFV), we can use experimental results to check for consistency with the type II seesaw.

  2.

    The operator coefficients in the inverse seesaw are given by the off-diagonal elements of Hermitian matrices, which can be in general complex. In the case of nearly degenerate sterile neutrinos, we have found that the coefficients satisfy linear relations of the following form [50]

    $$\begin{aligned} C^{e\mu ee}_{V,XY}=a_{XY} C^{e\mu }_{A,L}+b_{XY} C^{e\mu }_{D,R} \end{aligned}$$
    (5.17)

    where \(a\) and \(b\) are real numbers, and \(XY=LL,LR\). These relations hold in general and do not assume real coefficients. However, with non-vanishing phases, two observables are not sufficient to fully determine the \(\mu \rightarrow e\) predictions of the inverse seesaw. Observing \(\mu \rightarrow e_L \gamma \) would give the absolute value of the dipole, and measuring \(\mu A\rightarrow e_L A\) would only yield a combination of \(|C^{e\mu }_{2q,L}|\) and the relative phase between the dipole and the four-fermion coefficient. Taking advantage of Eq. (5.17), a \(\mu \rightarrow e_L {\overline{e}}_R e_R\) signal could allow one to solve for \(C^{e\mu }_{2q,L}\) as a complex number. Then, again by means of Eq. (5.17) for \(XY=LR\), we could predict \(BR(\mu \rightarrow e_L {\overline{e}}_R e_R)\) and compare it against experimental results. We conclude that, despite the non-zero phases, the inverse seesaw with degenerate sterile neutrinos is predictive enough that it could be ruled out by a combination of \(\mu \rightarrow e\) observations.

  3.

    In the leptoquark model the \(\mu \rightarrow e\) coefficients depend on a number of invariants that is larger than the number of observables. This means that the model could already fill the experimental ellipse (with the exception of the scalar four lepton directions) even in the case of real couplings. Allowing complex couplings would make the model even less constrained, and thus we do not discuss this case further.
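The phase ambiguity discussed in this section, namely that \(\mu \rightarrow e_X \gamma \) and \(\mu \rightarrow e_X {\overline{e}}_X e_X\) alone cannot fix a complex \(C^{e\mu ee}_{V,XX}\), can be illustrated with Eq. (5.13). The sketch below implements Eq. (5.13) and, for a fixed dipole magnitude, finds a second value of \(|C^{e\mu ee}_{V,XX}|\) at a different relative phase that reproduces the same branching ratio. The value of \(e\), the lepton masses, the function names and the sample (arbitrarily normalised) inputs are illustrative assumptions.

```python
import math

E_CHARGE = math.sqrt(4.0 * math.pi / 137.036)      # assumed value of e
LOG_TERM = math.log(105.658 / 0.510999) - 1.0      # ln(m_mu / m_e) - 1


def br_mu_to_3e_xx(abs_cv, abs_cd, phase):
    """Right-hand side of Eq. (5.13) for given magnitudes and relative phase."""
    return (2.0 * abs_cv**2
            + 32.0 * E_CHARGE**2 * LOG_TERM * abs_cd**2
            + 16.0 * E_CHARGE * abs_cv * abs_cd * math.cos(phase))


def degenerate_abs_cv(abs_cv, abs_cd, phase, new_phase):
    """Return |C_V| at new_phase giving the same BR as (abs_cv, phase),
    or None if no non-negative solution exists."""
    target = br_mu_to_3e_xx(abs_cv, abs_cd, phase)
    # Solve 2 x^2 + b x + c = 0 for x = |C_V|.
    b = 16.0 * E_CHARGE * abs_cd * math.cos(new_phase)
    c = 32.0 * E_CHARGE**2 * LOG_TERM * abs_cd**2 - target
    disc = b**2 - 8.0 * c
    if disc < 0.0:
        return None
    x = (-b + math.sqrt(disc)) / 4.0
    return x if x >= 0.0 else None


# Arbitrarily normalised inputs, for illustration only.
x = degenerate_abs_cv(10.0, 1.0, 0.0, math.pi / 2.0)
print(x, br_mu_to_3e_xx(10.0, 1.0, 0.0), br_mu_to_3e_xx(x, 1.0, math.pi / 2.0))
```

Two different pairs of magnitude and phase give the same rate, so an additional observable such as the T-odd asymmetry of Eq. (5.16) is needed to break the degeneracy.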

6 Discussion

The purpose of bottom-up EFT is to take low-energy experimental information to high-scale models. In this section, we discuss various aspects of this process, in the light of the three models we considered.

6.1 What differences among models can be identified by the data?

We showed in [50] that the data could rule out the models we consider, because the models predict relations between the Wilson coefficients, so are unable to fill the whole ellipse in coefficient space that is accessible to experiments. But it would be more interesting to ask whether \(\mu \rightarrow e\) observations could identify properties of models.

A simple question is whether \(\mu \rightarrow e\) data could distinguish models with LFV couplings to either lepton doublets or singlets, vs models that interact with both. It seems that the answer is yes. If LFV interactions involve only doublets or singlets, the only lepton-chirality-changing interaction in the theory is the Higgs Yukawa. So the coefficients of chirality-changing LFV operators must be proportional to the Yukawa coupling to an odd power. For instance, the dipole coefficients would satisfy

$$\begin{aligned} \frac{C^{e\mu }_{DR}}{C^{e\mu }_{DL}} \propto \left( \frac{m_e}{m_\mu }\right) ^{\pm 1}. \end{aligned}$$

If this relation is not satisfied, it suggests that LFV involves doublets and singlets. If it is satisfied with exponent \(-1\), then it is probable that LFV involves doublets but not singlets. If in addition, the coefficients of other singlet-LFV operators are negligible, it becomes very probable that LFV involves only doublets – although it could result from accidental cancellations in a model with LFV for doublets and singlets.
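The decision logic of this paragraph can be sketched as a simple test on the measured ratio of dipole magnitudes. Since the relation is only a proportionality, the tolerance factor below is a hypothetical stand-in for the unknown \(\mathcal{O}(1)\) coefficient, and the function name and lepton mass values are likewise assumptions made for illustration.

```python
M_E, M_MU = 0.510999, 105.658   # lepton masses in MeV (assumed values)


def dipole_ratio_exponent(abs_c_dr, abs_c_dl, tolerance=3.0):
    """Return +1 or -1 if |C_DR / C_DL| matches (m_e / m_mu)^(+1 or -1) within
    the hypothetical tolerance factor, or None if it matches neither."""
    ratio = abs_c_dr / abs_c_dl
    for exponent in (+1, -1):
        expected = (M_E / M_MU) ** exponent
        if expected / tolerance <= ratio <= expected * tolerance:
            return exponent
    return None


# Exponent -1 is the case the text associates with LFV involving doublets only;
# None suggests that both doublets and singlets are involved.
print(dipole_ratio_exponent(207.0, 1.0))
print(dipole_ratio_exponent(1.0, 1.0))
```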

It would be interesting and useful if the data could also identify other model properties, such as the loop order at which LFV occurs. But our models suggest this is not possible, because loops that occur in matching the model to the EFT are indistinguishable from small couplings, and because coefficients that arise at tree level can be accidentally small, as occurred in the type II seesaw where \(C_{VLL}^{e\mu ee}\) can vanish. For similar reasons, \(\mu \rightarrow e\) observables cannot distinguish NP that interacts only with leptons in its renormalisable Lagrangian from NP that interacts with quarks and leptons.

6.1.1 What is the role of the RGEs?

There are many reasons to use Renormalisation Group Equations in EFT, as is well-known in quark flavour physics. However, the RGEs may not be motivated in the lepton sector, because LFV has yet to be discovered, so precise, scheme-independent predictions are not crucial. Indeed, the RGEs are often not included in the \(\tau \rightarrow l\) sector, where the data separately constrain most coefficients. In the \(\mu \rightarrow e\) sector where there are fewer restrictive bounds, an EFT analysis suggests that the RGEs are relevant because they mix difficult-to-probe operators into well-constrained processes. In this section, we explore whether this occurs in our models.

The inverse seesaw at the TeV scale is an example of a model where the RGEs are unnecessary, because low-energy LFV operators are generated via loop diagrams in matching, and the RGEs just contribute an \(\mathcal{O}(10\%)\) renormalisation. A few properties of the model contribute to this behaviour: first, the new interactions couple one light SM particle with a heavy new particle and a weak boson, so LFV occurs via loops that contribute in matching. Then, there is no significant operator mixing via the RGEs, because the photon penguin vanishes below the weak scale at one loop; that is, LFV operators are vectors (or \(\propto \) a lepton Yukawa coupling) because only the lepton doublet interacts with new particles, and vector operators only mix into each other via the penguins. But in the one-loop penguin diagrams, the gauge boson attaches to the Higgs, so the penguins are only present above the weak scale, and generate 2l2q operators in matching.

The type II seesaw is an example of a model where the leading contribution to some observables arises in matching, and via the RGEs for others. This behaviour can be reproduced in EFT, and can also be obtained in model calculations that judiciously include log-enhanced loops [26]. However, the correct lower limits of the logarithms are crucial to obtain the dependence of LFV observables on the unknown parameters of the neutrino mass matrix in the type II seesaw model. These lower limits should be implemented automatically in EFT, but to our knowledge were previously missing from the literature. So it seems that the RGEs are required in this model, in order to identify the parameter space the model cannot reach.

In the type II seesaw, a careful one-loop model calculation could include all the terms of our leading-log EFT, because we do not resum \((\alpha \ln )^n\) for all n, but rather work to \(\mathcal{O}(\alpha \ln )\). However, our EFT also includes the “leading” vector to dipole mixing at \(\mathcal{O}(\alpha ^2 \ln )\), which we did not find in the literature about this model. This mixing causes the dipole to depend on the unknown neutrino parameters (Majorana phases and \(m_{min}\)), thereby allowing it to vanish. It is “well known” that the 2-loop electroweak contribution to \((g-2)_\mu \) is comparable to the one-loop part, but it seems that the implications of this may not have been implemented in all model calculations. However, it is relatively simple to implement in EFT [101], illustrating the first reason to do EFT: it is the simpler way to get a more precise result.
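The difference between working to \(\mathcal{O}(\alpha \ln )\) and resumming \((\alpha \ln )^n\) can be illustrated with a minimal sketch for a single coefficient that renormalises only itself. This is a toy, not the RGEs used in our analysis: the anomalous dimension `gamma`, the sign convention \(dC/d\ln \mu = \gamma \, \alpha /(4\pi ) \, C\), the fixed value of \(\alpha \) and the two scales are all assumptions chosen for illustration, and operator mixing is ignored.

```python
import math

# Assumption: alpha is held fixed at its low-energy value (its running is neglected).
ALPHA_EM = 1 / 137.036


def leading_log(c_high, gamma, mu_high, mu_low, alpha=ALPHA_EM):
    """Solution truncated at O(alpha * ln): C_high * (1 - gamma * alpha/(4 pi) * ln(mu_high/mu_low))."""
    log = math.log(mu_high / mu_low)
    return c_high * (1.0 - gamma * alpha / (4.0 * math.pi) * log)


def resummed(c_high, gamma, mu_high, mu_low, alpha=ALPHA_EM):
    """All-orders solution of dC/dln(mu) = gamma * alpha/(4 pi) * C at fixed alpha."""
    log = math.log(mu_high / mu_low)
    return c_high * math.exp(-gamma * alpha / (4.0 * math.pi) * log)


# Hypothetical inputs: gamma = 16 is an illustrative value, scales are in GeV.
GAMMA = 16.0
MU_HIGH = 1000.0   # roughly the TeV scale
MU_LOW = 0.10566   # roughly the muon mass

c_ll = leading_log(1.0, GAMMA, MU_HIGH, MU_LOW)
c_rs = resummed(1.0, GAMMA, MU_HIGH, MU_LOW)
print(f"leading log: {c_ll:.4f}, resummed: {c_rs:.4f}, difference: {c_rs - c_ll:.4f}")
```

For these inputs the two results differ at the sub-percent level, which is the sense in which a leading-log treatment of QED effects can be adequate between the TeV scale and the experimental scale. The same comparison would not carry over to QCD running, where the coupling is larger.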

The leptoquark is our model where the RGEs are most useful, because they allow one to simultaneously include the multitude of relevant electroweak loops and large QCD effects in an organised fashion. The RGEs are required to obtain model predictions, because they mix difficult-to-constrain coefficients (such as tensor operators involving top quarks) into observable coefficients like the dipole, while simultaneously including the QCD running of the operators.

6.1.2 Do cancellations among coefficients occur in models?

It is common to make tables listing the “sensitivity” of observables to operator coefficients (e.g. [80, 101]); these “one-at-a-time-bounds” are simple to obtain by allowing a single operator to have a non-zero coefficient, and computing the experimental constraint upon it. It is also common to take these sensitivities as bounds, because it is generally considered unlikely that models generate cancellations among operator coefficients, especially since these coefficients run with scale. However, such cancellations can occur, for instance via the equations of motion, so these “sensitivities” are not true upper bounds (instead, they are the value of the coefficient above which it could be detected).

The operator population in EFTs is often reduced via the equations of motion (as pedagogically discussed in [98, 157]). This can sometimes impose “accidental” but precise cancellations among operator coefficients. An example is discussed in Appendix C4: in a model, the Z penguin diagrams can be \(\propto q^2\) (= the momentum-transfer\(^2\) of the Z), or \(\propto v^2\). Both contributions could contribute to the decay \(Z\rightarrow e^\pm \mu ^\mp \), but the part \(\propto q^2\) gives a negligible contribution to \(\mu \rightarrow e {\bar{e}} e\), due to the \(q^2\sim m_\mu ^2\) suppression. However, if the model is matched to SMEFT, the equations of motion are used to transform the \(q^2\) part of the diagrams into four-fermion and penguin operators (\(\mathcal{O}_{HL3}^{e\mu }\) and \(\mathcal{O}_{HL1}^{e\mu }\)), with coefficients whose sum cancels in low-energy matrix elements where \(q^2 \rightarrow 0\). The “one at a time bounds” on penguin and four-fermion operators miss this cancellation, so would instead suggest that both are strictly constrained by \(\mu \rightarrow e {\bar{e}} e\).
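The logic of a cancellation that escapes one-at-a-time bounds can be sketched with a toy observable that depends only on the sum of two coefficients. Everything in this sketch is hypothetical: the function names, the normalisation `k`, the rate limit and the coefficient values are illustrative placeholders, and the toy rate is not the \(\mu \rightarrow e {\bar{e}} e\) rate of our analysis.

```python
import math


def toy_rate(c_a: complex, c_b: complex, k: float = 1.0) -> float:
    """Toy low-energy rate sensitive only to the sum of the two coefficients."""
    return k * abs(c_a + c_b) ** 2


def one_at_a_time_bound(rate_limit: float, k: float = 1.0) -> float:
    """Largest |C| allowed by the rate limit when the other coefficient is set to zero."""
    return math.sqrt(rate_limit / k)


RATE_LIMIT = 1e-12                      # illustrative experimental upper limit
bound = one_at_a_time_bound(RATE_LIMIT)

# Pick a coefficient a thousand times above its one-at-a-time bound ...
c_a = 1000 * bound
# ... and a partner that cancels it exactly, as the equations of motion can enforce.
c_b = -c_a

print("C_A alone violates the limit:", toy_rate(c_a, 0.0) > RATE_LIMIT)
print("C_A + C_B respects the limit:", toy_rate(c_a, c_b) <= RATE_LIMIT)
```

In this toy, each coefficient taken alone appears tightly constrained, whereas the data only constrain the combination that enters the matrix element; the orthogonal combination is a flat direction.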

In the type II seesaw model, the Z-penguin diagrams give negligible contribution to \(\mu \rightarrow e {\bar{e}} e\) because the \(\propto v^2\) part is suppressed by lepton Yukawas, and the \(\propto q^2\) part is kinematically suppressed as discussed above. So experimental observations do not exclude a \({\bar{e}} \, Z \! \! \! \! / ~ \mu \) interaction just below the sensitivity of the LHC, even though the “one-at-a-time-bounds” suggest that they do. Nonetheless, the model cannot generate such interactions, because they are controlled by the same model parameters as the photon penguin diagrams, which are constrained by \(\mu \rightarrow e {\bar{e}} e\).

So in summary, apparently accidental cancellations can occur among coefficients in EFT; whether this affects the constraints on models is model-dependent.

6.1.3 Does it matter that coefficients are complex?

\(\mu \rightarrow e\) observables define an experimentally accessible 12-dimensional ellipse in the space of the Wilson coefficients, but these are generally complex numbers. Although this does not complicate the analysis when imposing bounds on the coefficients, because the ellipses for the real and imaginary components are identical, allowing for complex phases could generally hinder the determination of the coefficients from data. While the muon polarization and the electron/positron angular observables could in principle identify 12 real coefficients, unobservable directions will be present when considering the full parameter space spanned by coefficients with non-zero phases. Measuring for instance the branching fraction for \(\mu \rightarrow e_X \gamma \) would identify a circle with radius \(\sqrt{BR(\mu \rightarrow e_X \gamma )/384\pi ^2}\) centered at the origin of the complex plane for the dipole coefficient \(C_{DX}\), but without further assumptions the real and imaginary parts would remain undetermined. However, if our goal is to use data to exclude models, the model predictions can help in reducing the number of unmeasurable directions. Interpreting data in light of a particular model can lead to the determination of absolute values as well as relative phases for the coefficients. The neutrino mass models we considered in this paper are an example where this determination is possible.
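The circle in the complex plane can be made concrete with a short sketch that uses only the relation quoted above, \(BR(\mu \rightarrow e_X \gamma ) = 384\pi ^2 |C_{DX}|^2\), in the normalisation of this paper. The function names, the input branching ratio and the sampled phases are hypothetical and chosen for illustration.

```python
import cmath
import math

NORM = 384.0 * math.pi ** 2  # normalisation factor quoted in the text


def dipole_radius(br: float) -> float:
    """Radius |C_DX| of the circle selected by a measured BR(mu -> e_X gamma)."""
    return math.sqrt(br / NORM)


def branching_ratio(c_dx: complex) -> float:
    """Inverse relation BR = 384 pi^2 |C_DX|^2, which does not depend on the phase of C_DX."""
    return NORM * abs(c_dx) ** 2


BR_EXAMPLE = 1e-13  # hypothetical measured branching ratio
radius = dipole_radius(BR_EXAMPLE)

# Sample a few phases on the circle: every point reproduces the same branching ratio.
phases = (0.0, math.pi / 3, math.pi / 2, math.pi)
points = [cmath.rect(radius, phase) for phase in phases]
rates = [branching_ratio(c) for c in points]

for phase, c, rate in zip(phases, points, rates):
    print(f"phase = {phase:.3f}  Re C = {c.real:+.3e}  Im C = {c.imag:+.3e}  BR = {rate:.3e}")
```

All sampled points have different real and imaginary parts but the same branching ratio, so a rate measurement alone fixes the modulus and leaves the phase as an unobservable direction unless a model or a phase-sensitive observable is added.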

In the type II seesaw, the LFV operators are complex because their coefficients are directly related to the neutrino mass matrix, which contains up to three phases in the case of Majorana neutrinos. We discussed in Sect. 5.4 how, taking advantage of the model predictions and making use of observables directly related to the coefficient phases (see Eq. 5.16), one could interpret \(\mu \rightarrow e\) data to determine the (complex) coefficients predicted by the type II seesaw.

Similarly, in the inverse seesaw model, the predictions for flavour-changing observables are determined by the magnitude of off-diagonal matrix elements, and the operators can be complex. In Sect. 5.4, we showed that despite the presence of operator phases, we can use the model predictions to identify points in the experimentally accessible ellipse.

7 Summary

The \(\mu \rightarrow e\) sector is promising for the discovery of LFV, due to the exceptional upcoming experimental sensitivity – to three processes. So this project explored what could be learned about the New Physics in the lepton sector from \(\mu \rightarrow e\) observations, by studying some “representative” TeV-scale models described in Sect. 3. We take the data to be 12 Wilson coefficients, which can be individually constrained and distinguished in measurements (with the exception of vector and scalar four-lepton operators which have indistinguishable angular distributions in \(\mu \rightarrow e {\bar{e}} e\), see Sect. 2).

Bottom-up EFT is an appropriate formalism to compare data with models, because data improves slowly while models can evolve more rapidly. It also gave some relevant effects (such as the two-loop vector to dipole mixing in the type II seesaw, see Eq. 4.6) which we did not find in the literature. Our analysis is at leading order in EFT, meaning that we attempt to include the largest contribution of the model to all the operators to which the data can be sensitive. This includes some 2-loop anomalous dimensions and some operators which are dimension eight in SMEFT. Our notation and assumptions are summarised in Sect. 2. Our models are located at the TeV scale in order to profit from many complementary observables, but the ratio TeV\(/m_W\) is not large, which implies that EFT is poorly motivated between \(m_W\) and the TeV scale (see the discussion in Appendix C1), so in practice we match our models to the QCD\(\times \)QED-invariant EFT that is relevant below \(m_W\).

The observable operator coefficients are given in Sect. 4 in terms of model parameters at the TeV scale, which usually appear with SM parameters in elegant combinations that recall Jarlskog invariants. This unforeseen curiosity (discussed in Sect. 4.1) may be an accident of our simple leading order analysis, which allows analytic expressions. Or possibly it indicates an interesting new role for invariants as stepping stones in the reconstruction of models from EFT coefficients. For instance, our models did not fulfill our expectations: we anticipated that the type II seesaw was predictive because the flavour-changing couplings \(f_{\alpha \beta }\) (Eq. 3.1) are determined by the neutrino mass matrix, and that the inverse seesaw was unpredictive because the \(Y^{\alpha a}\) couplings (Eq. 3.3) are unknown. However, Sect. 4.2 shows that \(\mu \rightarrow e\) flavour change is controlled by two invariants in the inverse seesaw with degenerate singlets, whereas three invariants are needed in the type II model. In any case, it is interesting that \(\mu \rightarrow e\) flavour change in these models is controlled by a few complex numbers of magnitude , which could be obtained for a wide variety of flavour structures in Lagrangians.

We showed in [50] that \(\mu \rightarrow e\) data has the ability to exclude the models we consider, because they cannot fill the whole ellipse in coefficient space accessible to upcoming experiments. In Sect. 5, we explored the more interesting question of whether observations could indicate a model (specifically, the type II seesaw model) by including some complementary observables. In Sect. 5.1 we showed that, in the type II seesaw model, some ratios of \(\mu \rightarrow e\) Wilson coefficients can be confined to relatively narrow intervals (depending on the mass ordering, see Eq. (5.3)) if the lightest neutrino mass is small enough. Therefore, if these ratios were observed outside the ranges quoted in Sect. 5.1, a lower bound on the neutrino mass scale could be inferred (assuming that neutrino masses arise from the type II seesaw mechanism). In Sect. 5.2, we considered \(\tau \)-LFV at Belle II, still in the type II seesaw model, in the case where neither \(\mu \rightarrow e \gamma \) nor \(\mu \rightarrow e {\bar{e}} e\) is observed in upcoming experiments (see Table 1). In this case, the model makes specific predictions for \(\tau \)-LFV ratios that could be observed at Belle II, as a function of the neutrino mass scale and ordering. As a result, Belle II could contribute to constraining the neutrino mass scale and ordering, or rule out the type II seesaw model.

Finally, in the discussion Sect. 6, we addressed some questions that arise in a bottom-up EFT attempt to reconstruct New Physics from (low-energy) data.