1 Introduction

1.1 Motivation

The observation of neutrino flavour oscillations is one of the strongest hints for the existence of particle physics beyond the Standard Model (SM). The oscillations imply that neutrinos have small masses, while the minimal SM predicts that they are massless. At the same time, neutrinos are the only elementary fermions that are known to exist only with left-handed chirality \(\nu _L\). If right-handed neutrinos \(\nu _R\) exist, one could immediately add a Dirac mass term \(\bar{\nu _L}M_D\nu _R\) to the SM Lagrangian in analogy to all other known fermions. The fact that the \(\nu _R\) have not been seen yet could easily be explained because they are “sterile”, i.e., not charged under any known gauge interactions. The same property also makes it possible for them to have a Majorana mass term \(\bar{\nu _R}M_M\nu _R^c\) in addition to the Dirac mass. For eigenvalues of \(M_M\) that are much larger than the observed light neutrino masses, the smallness of the neutrino masses can be explained via the seesaw mechanism [1,2,3,4,5]. Neutrino oscillation data is, however, not sufficient to pin down the value of \(M_M\), known as the seesaw scale, because it is primarily sensitive to the combination \(M_D M_M^{-1} M_D^T\). The range of allowed values spans from a few eV [6] up to the scale of Grand Unification [7]. For specific choices of their Majorana mass, the \(\nu _R\) could in addition solve a number of long-standing problems in cosmology. For instance, they could explain the baryon asymmetry of our Universe via leptogenesis during the decay [8] or production [9, 10] of the heavy neutrinos, or provide a viable dark matter candidate [11, 12]. An overview of the cosmological implications of different choices of \(M_M\) can be found, e.g., in Ref. [13].

Experiments can directly search for heavy neutrinos if \(M_M\) is below the TeV scale. Such searches have been performed at various facilities, including high energy colliders and fixed target experiments. This is the mass range we consider in the present article. In addition, the \(\nu _R\) would indirectly affect precision observables or searches for rare processes. A summary of different existing constraints can be found in the reviews [13,14,15,16,17]. For the future, a wide range of different searches has been proposed; an overview can be found in Refs. [18,19,20]. In order to decide on the best possible search strategy, it is important to understand which parameter region is already ruled out by past experiments. This is in fact a non-trivial question because different observables are correlated in the seesaw model, and the requirement to simultaneously respect all known experimental results imposes stronger constraints on the model parameter space than superimposing individual bounds. Such global constraints can only be derived within a given model. An important quantity in this context is the unknown number n of right handed neutrino flavours. The minimal number that is required to explain the light neutrino oscillation data is \(n=2\), which would necessarily require the lightest SM neutrino to be massless. The minimal number that is required to generate masses for all three SM neutrinos is \(n=3\). This choice is also somewhat appealing in view of the fact that there are three fermion generations in the SM, and it is mandatory for anomaly freedom in many gauge extensions of the SM. The goal of the present work is to impose global constraints on the parameter space of the model with \(n=3\), based on the combination of direct, indirect and cosmological constraints summarised in Sect. 3.

Several authors have previously imposed global constraints on the properties of right handed neutrinos. Here we exclusively focus on models in which the right handed neutrinos can explain the light neutrino oscillation data (footnote 1). This excludes, e.g., most sterile neutrino Dark Matter models, because the feeble coupling that is required to ensure the longevity of such particles implies that their contribution to the light neutrino mass generation can be neglected [22] (footnote 2). One of the most complete studies of indirect constraints on the parameter space for \(n=2\) in the last few years was presented in Ref. [25], where multiple electroweak precision observables and flavour-violating decays were included, along with tests of lepton universality and the unitarity of the CKM matrix. Loop corrections to some of these relations were considered in Ref. [26]. The authors of [27] included direct search constraints and those from big bang nucleosynthesis (BBN). The model with \(n=3\) is much less studied. Recent analyses of indirect constraints include Refs. [28, 29]; direct search constraints and BBN have been added to this in Ref. [30].

1.2 Main improvements compared to previous studies

In this paper, we present the first full frequentist analysis of the \(n=3\) right-handed neutrino (RHN) extension of the SM, for a wide range of RHN masses from about 60 MeV to 500 GeV. We opted for a frequentist analysis rather than a Bayesian analysis since this is best suited to fully explore the valid parameter space while avoiding prior dependence and volume effects of the parameter space. We emphasize, however, that we do not perform a full sampling-based goodness-of-fit analysis and instead resort, for practical reasons, to an approximate treatment of the likelihoods and their sampling statistics. We improve on different aspects of earlier analyses by combining all the strongest limits exerted by experiments as well as indirect signatures in a statistically consistent manner. Previous studies that examined the parameter space for \(n=3\) either used a subset of the constraints included here [28, 29] or used less rigorous statistical methods [30] and focused on specific regions of the parameter space [31].

  • While most previous studies fixed the mixing angles and mass differences in the active neutrino sector to the best fit values as presented in [32], we take into account likelihoods for the active neutrino observables.

  • Electroweak observables require precise calculations for their comparison with the extremely accurate measurements. We therefore use the calculation of the SM prediction for \(\sin \theta _w^{eff}\) up to two-loop order [33].

  • Most studies of lepton flavour violation in neutrino models focus exclusively on the most constraining processes, such as \(\mu \rightarrow e \gamma \) and \(\mu \rightarrow eee\) [25, 30]. In this work we include all lepton flavour violating processes, in particular all leptonic \(\tau \) decays, for which we use the most recent average of experimental results provided by HFLAV [34], as well as \(\mu -e\) conversion in nuclei (Pb, Au and Ti).

  • For neutrinoless double-beta decay, in comparison with [30], we opt to carry out our analysis conservatively; in addition, the upper limit on the effective Majorana mass and hence the mixing is encoded in the form of a (one-sided) Gaussian likelihood, not as a strict cut.

  • Lepton universality tests are often centered on leptonic decays of mesons, K and \(\pi \), \(\tau \)-leptons and W-bosons [25]. We supplement these tests of universality with the recently observed semileptonic decays of B-mesons [35,36,37].

  • We improve the treatment of CKM unitarity with respect to the discussion in Ref. [30].

  • Concerning direct searches, previous studies have used only a subset of the experiments considered here [21, 38], or chose to place a hard cut at the upper limits presented in the individual papers [27, 30]. We implement the strongest constraints over the mass range as likelihoods. The statistical combination of these likelihoods also leads to more accurate profile likelihood contours in comparison to simply overlaying individual limits.

  • We study in detail the flavour mixing pattern of the three RHN, for different values of the lightest neutrino mass. We discuss the limit where the lightest neutrino is massless and the connection to the \(n=2\) case.

We use here the open-source software package GAMBIT [39]. It includes an interface to Diver [40], a differential evolution-based scanner that provides efficient sampling performance for frequentist scans.

This paper is organised as follows. In Sect. 2, the model, parametrisation used and essential quantities are defined. All the observables and experiments that are considered are subsequently discussed in detail in Sect. 3. Our scanning strategy, parameter ranges and applied priors are mentioned in Sect. 4. The results are presented in Sect. 5 and we discuss the implications of the combined constraints for future searches in Sect. 6. In Appendix A we comment on the details of the implementation in GAMBIT, in Appendix B we explicitly give the expressions for the different observables, in Appendix C we provide details on how we interpret our results in view of the criterion of technical naturalness, and in Appendix D we show the different partial likelihoods.

2 Right-handed neutrino physics

2.1 Basic definitions

The addition of three RHNs to the particle content of the Standard Model introduces in total 18 new parameters. In this section we summarise basic relations in the seesaw model and define our notation, following Ref. [30].

The most general renormalisable Lagrangian that can be constructed from SM fields and the \(\nu _R\) has the following form:

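$$\begin{aligned} \mathcal {L}=\mathcal {L}_\mathrm{SM}+i\overline{\nu _R}\gamma ^\mu \partial _\mu \nu _R-\overline{\ell _L}F\nu _R\tilde{\varPhi }-\tilde{\varPhi }^{\dagger }\overline{\nu _R}F^{\dagger }\ell _L-\frac{1}{2}\left( \overline{\nu _R^c}M_M\nu _R+\overline{\nu _R}M_M^{\dagger }\nu _R^c\right) . \end{aligned}$$
(1)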

Here, \(\ell _L=(\nu _L,e_L)^T\) indicate the left-handed leptons (footnote 3) of the SM and \(\varPhi \) is the Higgs doublet with \(\tilde{\varPhi }=\epsilon \varPhi ^{*}\) and \(\epsilon \) being the Levi-Civita tensor. \(M_M\) is the Majorana mass matrix for \(\nu _R\) and F is the Yukawa coupling matrix. We work in a flavour basis where \(M_M=\text {diag}(M_1,M_2,M_3)\).

After electroweak symmetry breaking (EWSB), the complete neutrino mass term reads

$$\begin{aligned} \frac{1}{2}(\bar{\nu _L} \bar{\nu _R^c})\mathcal {M} \begin{pmatrix} \nu _L^c \\ \nu _R \end{pmatrix}\;, \end{aligned}$$
(2)

with

$$\begin{aligned} \mathcal {M}=\begin{pmatrix}{\delta }m_{\nu }^{1loop} &{} M_D \\ M_D^T &{} M_M + {\delta }M_N^{1loop} \end{pmatrix}\;, \end{aligned}$$
(3)

where \(M_D=Fv\), with v the Higgs vacuum expectation value (\(v=174\) GeV in the ground state). We include the one-loop corrections \(\delta m_{\nu }^{1loop}\) and \({\delta }M_N^{1loop}\) because we aim for an analysis that is consistent at second order in the Yukawa couplings F. The mass matrix (3) can be diagonalised by a matrix of the form [26]

$$\begin{aligned} \mathcal {U}= \begin{pmatrix} \cos (\theta ) &{} \sin (\theta ) \\ -\sin (\theta ^\dagger ) &{} \cos (\theta ^\dagger ) \end{pmatrix} \begin{pmatrix} U_{\nu } &{} \\ &{} U_N^{*} \end{pmatrix} \end{aligned}$$
(4)

with

$$\begin{aligned} \cos (\theta )&=\sum _{n=0}^\infty \frac{(-\theta \theta ^\dagger )^n}{(2n)!} \end{aligned}$$
(5)
$$\begin{aligned} \sin (\theta )&=\sum _{n=0}^\infty \frac{(-\theta \theta ^\dagger )^n\theta }{(2n+1)!}. \end{aligned}$$
(6)

Hereby, \(\theta \) indicates the matrix that mediates the mixing between the active neutrinos \(\nu _L\) and the sterile neutrinos \(\nu _R\). We can generally write

$$\begin{aligned} \mathcal {U}^{\dagger }\mathcal {M}\mathcal {U}^{*} =\begin{pmatrix}m_{\nu }^\mathrm{diag} &{} \\ &{} M_N^\mathrm{diag} \end{pmatrix} \end{aligned}$$
(7)

with

$$\begin{aligned} M_N^\mathrm{diag}&=U_N^T M_N U_N=\text {diag}(M_1,M_2,M_3) \end{aligned}$$
(8)
$$\begin{aligned} m_{\nu }^\mathrm{diag}&= U_{\nu }^{\dagger }m_{\nu }U_{\nu }^{*}=\text {diag}(m_1,m_2,m_3). \end{aligned}$$
(9)

The additional complex conjugation of \(U_N\) ensures that the relation among mass and flavour eigenstates will be analogous for left-handed neutrinos (LHNs) and RHNs within the notation. In the second relation in Eq. (8) we have neglected the difference between the eigenvalues of \(M_M\) and \(M_N\), which is of second order in \(\theta \). This is justified for the present purpose because of the experimental constraints on the magnitude of the elements \(\theta _{\alpha I}\), which we discuss further below.
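The matrix series in Eqs. (5) and (6) can be checked numerically. The following is a minimal sketch in plain Python, assuming an arbitrary small \(3\times 3\) matrix \(\theta \) whose entries are purely illustrative. It sums the truncated series and verifies that \(\cos (\theta )\cos (\theta )^\dagger +\sin (\theta )\sin (\theta )^\dagger =\mathbb {I}\), which is one of the conditions for the first factor in Eq. (4) to be unitary. The helper names `matmul`, `dagger` and `cos_sin` are hypothetical.

```python
# Minimal sketch of the truncated series in Eqs. (5) and (6).
# The entries of `theta` below are illustrative assumptions, not fit results.
from math import factorial


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Hermitian conjugate of a matrix stored as nested lists."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def cos_sin(theta, terms=8):
    """Return (cos(theta), sin(theta)), each summed over `terms` terms."""
    n = len(theta)
    minus_a = [[-x for x in row] for row in matmul(theta, dagger(theta))]
    power = [[complex(i == j) for j in range(n)] for i in range(n)]  # identity
    cos_t = [[0j] * n for _ in range(n)]
    sin_t = [[0j] * n for _ in range(n)]
    for k in range(terms):
        power_theta = matmul(power, theta)  # (-theta theta^dagger)^k theta
        for i in range(n):
            for j in range(n):
                cos_t[i][j] += power[i][j] / factorial(2 * k)
                sin_t[i][j] += power_theta[i][j] / factorial(2 * k + 1)
        power = matmul(power, minus_a)
    return cos_t, sin_t


theta = [[0.02 + 0.01j, 0.03j, 0.01 + 0j],
         [0.01 - 0.02j, 0.02 + 0j, 0.04j],
         [0.03 + 0j, 0.01 + 0.01j, 0.02 - 0.03j]]
cos_t, sin_t = cos_sin(theta)

# cos cos^dagger + sin sin^dagger should equal the identity matrix.
cc = matmul(cos_t, dagger(cos_t))
ss = matmul(sin_t, dagger(sin_t))
deviation = max(abs(cc[i][j] + ss[i][j] - (i == j))
                for i in range(3) for j in range(3))
print(f"max deviation from the identity: {deviation:.2e}")
```

For mixings of this size a handful of terms is already sufficient, which is why keeping only the terms up to second order in \(\theta \) is usually adequate in practice.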

2.2 The seesaw limit

The limit of small \(\theta _{\alpha I}\) is usually referred to as the seesaw limit; it corresponds to \(M_D\ll M_M\) (in terms of eigenvalues). It allows the approximation

$$\begin{aligned} \theta = M_D M_M^{-1} = v F M_M^{-1} \end{aligned}$$
(10)

and

$$\begin{aligned} \mathcal {U}=\Bigg [ \begin{pmatrix} \mathbb {I}-\frac{1}{2}\theta \theta ^{\dagger } &{} \theta \\ -\theta ^{\dagger } &{} \mathbb {I}-\frac{1}{2}\theta ^{\dagger }\theta \end{pmatrix} + \mathcal {O}(\theta ^3) \Bigg ]\begin{pmatrix} U_{\nu } &{} \\ &{} U_N^{*} \end{pmatrix}, \end{aligned}$$
(11)

leading to

$$\begin{aligned} m_{\nu }=m_{\nu }^\mathrm{tree}+\delta m_{\nu }^{1loop} \end{aligned}$$
(12)

with

$$\begin{aligned} m_{\nu }^\mathrm{tree}&=-M_D M_M^{-1}M_D^T=-{\theta }M_M\theta ^T=-v^2FM_M^{-1}F^T \end{aligned}$$

and

$$\begin{aligned} M_N = M_M+\frac{1}{2}\left( \theta ^{\dagger }\theta M_M+M_M^T\theta ^T\theta ^{*} \right) + \delta M_{N}^{1loop}\,. \end{aligned}$$
(13)

The loop correction to the light neutrino mass matrix is given by [41]:

$$\begin{aligned} \left( \delta m_{\nu }^{1loop} \right) _{\alpha \beta }=\sum _I F_{\alpha I}M_IF_{I\beta }^Tl(M_I)\,, \end{aligned}$$
(14)

where \(l(M_I)\) is a loop function given by

$$\begin{aligned} l(M_I)= & {} \frac{1}{{(4\pi })^2}\left[ \left( \frac{3\text {ln}[(M_I/m_Z)^2]}{(M_I/m_Z)^2 - 1}\right) \right. \nonumber \\&\left. + \left( \frac{\text {ln}[(M_I/m_H)^2]}{(M_I/m_H)^2 - 1}\right) \right] \,. \end{aligned}$$
(15)

The light and heavy neutrino mass eigenstates are described by the flavour vectors

$$\begin{aligned} \nu =V_{\nu }^{\dagger }\nu _L-U_{\nu }^{\dagger }\theta \nu _R^c+V_{\nu }^T\nu _L^c-U_{\nu }^T\theta ^{*} \nu _R \end{aligned}$$
(16)

and

$$\begin{aligned} N=V_N^{\dagger }\nu _R+\varTheta ^T \nu _L^c+V_N^T\nu _R^c+\varTheta ^{\dagger } \nu _L, \end{aligned}$$
(17)

respectively. We can define the matrices \(V_{\nu }\) and \(V_N\) that represent the mixing between mass and interaction eigenstates in the respective sectors as

$$\begin{aligned} V_{\nu }&\equiv \left( \mathbb {I}-\frac{1}{2}\theta \theta ^{\dagger } \right) U_{\nu } \end{aligned}$$
(18)
$$\begin{aligned} V_N&\equiv \left( \mathbb {I}-\frac{1}{2}\theta ^T\theta ^{*} \right) U_N, \end{aligned}$$
(19)

while mixing between the two sectors is encoded in the matrix

$$\begin{aligned} \varTheta =\theta U_N^{*}\,. \end{aligned}$$
(20)

This quantity is of primary interest because it controls the interactions of the heavy neutrinos with the physical Higgs field h and the gauge bosons W and Z,

$$\begin{aligned}&- \frac{g}{\sqrt{2}}\overline{N_I} \varTheta ^\dagger _{I \alpha }\gamma ^\mu e_{L \alpha } W^+_\mu \nonumber \\&\quad - \frac{g}{\sqrt{2}}\overline{e_{L \alpha }}\gamma ^\mu \varTheta _{\alpha I} N_I W^-_\mu \nonumber \\&\quad - \frac{g}{2\cos \theta _W}\overline{N_I} \varTheta ^\dagger _{I \alpha }\gamma ^\mu \nu _{L \alpha } Z_\mu - \frac{g}{2\cos \theta _W}\overline{\nu _{L \alpha }}\gamma ^\mu \varTheta _{\alpha I} N_I Z_\mu \nonumber \\&\quad - \frac{g}{\sqrt{2}}\frac{M_I}{m_W}\varTheta _{\alpha I} h \overline{\nu _{L \alpha }}N_I - \frac{g}{\sqrt{2}}\frac{M_I}{m_W}\varTheta ^\dagger _{I \alpha } h \overline{N_I}\nu _{L \alpha }. \end{aligned}$$
(21)

Here g is the weak gauge coupling constant and \(\theta _W\) the Weinberg angle. For convenience, we introduce the notation

$$\begin{aligned} U_{\alpha I}^2&\equiv |\varTheta _{\alpha I}|^2 \end{aligned}$$
(22)
$$\begin{aligned} U_I^2&\equiv U_{eI}^2+U_{\mu I}^2+U_{\tau I}^2 \end{aligned}$$
(23)
$$\begin{aligned} U_{\alpha }^2&\equiv \sum _I U_{\alpha I}^2\,. \end{aligned}$$
(24)
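The notation in Eqs. (22)–(24) amounts to taking squared moduli of the entries of \(\varTheta \) and summing over rows or columns. A minimal sketch, with a purely hypothetical matrix `theta_mix` whose rows are the flavours \(e,\mu ,\tau \) and whose columns are the heavy states \(N_1,N_2,N_3\):

```python
# Minimal sketch of the mixing notation in Eqs. (22)-(24).
# The numbers in `theta_mix` are illustrative assumptions only.
FLAVOURS = ("e", "mu", "tau")

theta_mix = [
    [1.0e-4 + 2.0e-4j, 3.0e-5j, 1.0e-5 + 0j],
    [2.0e-4 - 1.0e-4j, 5.0e-5 + 0j, 2.0e-5j],
    [3.0e-4 + 0j, 1.0e-5 + 4.0e-5j, 1.0e-5 - 1.0e-5j],
]

# U_{alpha I}^2 = |Theta_{alpha I}|^2, Eq. (22)
u2 = [[abs(entry) ** 2 for entry in row] for row in theta_mix]

# U_I^2: sum over the flavours alpha for each heavy neutrino I, Eq. (23)
u2_heavy = [sum(u2[a][i] for a in range(3)) for i in range(3)]

# U_alpha^2: sum over the heavy neutrinos I for each flavour alpha, Eq. (24)
u2_flavour = [sum(row) for row in u2]

for i, value in enumerate(u2_heavy, start=1):
    print(f"U_{i}^2 = {value:.3e}")
for name, value in zip(FLAVOURS, u2_flavour):
    print(f"U_{name}^2 = {value:.3e}")
```

Both sets of sums add up to the same total mixing \(\sum _{\alpha ,I}U_{\alpha I}^2\).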

From the relations (3) and (7) it is straightforward to derive the relation

$$\begin{aligned} ({\delta }m_{\nu }^{1loop})_{\alpha \alpha }= & {} \sum _i m_i (V_\nu )_{\alpha i}^2 + \sum _I M_I \varTheta _{\alpha I}^2. \end{aligned}$$
(25)

2.3 The role of the matrix \(U_N\)

In our numerical scan we approximate \(U_N\) by unity (footnote 4). For generic parameter choices this can be justified because we work in a basis where \(M_M\) is diagonal, and the physical mass matrix (13) is also diagonal up to corrections of second order in \(\theta \). These corrections can lead to a large deviation of \(U_N\) from unity only if the eigenvalues of \(M_M\) are quasi-degenerate, so that the \(\mathcal {O}[\theta ^2]\) terms in the matrix (13) are relevant.

If a degeneracy between only two of the RHNs is caused by a symmetry, cf. Sect. 2.5, then it can be shown that the effect of \(U_N\) on the \(U_{\alpha I}^2\) is small even if individual entries of \(U_N\) are larger than the \(U_{\alpha I}^2\) [42]. This means that the production cross sections for heavy neutrinos are not affected. However, the branching ratio between lepton number violating (LNV) and lepton number conserving heavy neutrino decays is affected by \(U_N\) [43]. This has no big effect on our scan because constraints from searches for LNV are sub-dominant in almost the entire mass range that we consider, but it may have important implications for future searches.

\(U_N\) can have a big impact on the individual mixings \(U_{\alpha I}^2\) of each heavy neutrino if all three Majorana masses are degenerate. This can be accommodated in technically natural scenarios discussed in the following Sect. 2.5, cf. in particular footnote 6. The practical impact on experimental searches is, however, limited because most experiments are not able to kinematically resolve small mass splittings and therefore only probe \(U_\alpha ^2\) in this regime (rather than the couplings \(U_{\alpha I}^2\) of individual heavy neutrino flavours). Also in this case observables that are sensitive to LNV are the only ones that are likely to be affected.

Finally, if the degeneracy between the heavy neutrino masses is accidental, then the proof in Ref. [42] does not apply, and \(U_N\) can have a significant effect on the \(U_{\alpha I}^2\) even if only two heavy neutrinos have degenerate masses. Our results contain a significant number of points of this kind because we performed several scans with “agnostic” parameter ranges that do not suppress fine-tuned points, cf. Table 5. However, the fact that experiments are unlikely to resolve the individual resonances in this case implies that they are only sensitive to the quantities \(U_\alpha ^2\), where the summation is to be taken over the mass degenerate heavy neutrino flavours only. As in the previous two cases, the effect of \(U_N\) on the total production rate is minor because the matrix mainly re-distributes coupling between the mass degenerate states. The main effect would again be on LNV observables.

In summary, if any heavy neutrinos are discovered in the future, a comparison between the branching ratios of lepton number violating and lepton number conserving decays will give important insight into the mechanism of neutrino mass generation and will be crucial to identify any underlying symmetries.

2.4 Casas-Ibarra parametrisation

In the current work, we use the Casas-Ibarra (C-I) parametrisation [44], generalised to include the 1-loop correction to the left-handed neutrino mass matrix [45]. This provides a simple way to impose constraints from light neutrino oscillation data in our scan. This parametrisation is based on the observation that \(m_\nu \) in Eq. (12) can be expressed as

$$\begin{aligned} m_\nu =-\theta \tilde{M}\theta ^T \end{aligned}$$
(26)

with

$$\begin{aligned} \tilde{M} = \bigg [ 1 - \frac{1}{v^2} M_M M_N^\mathrm{diag} l(M_N^\mathrm{diag}) \bigg ] M_M. \end{aligned}$$
(27)

Since the loop function is smooth we can neglect the difference in the eigenvalues of \(M_M\) and \(M_N\),

$$\begin{aligned} \tilde{M}_{IJ}\simeq \tilde{M}^\mathrm{diag}_{IJ} = M_I\delta _{IJ}\left( 1 - \frac{M_I^2}{v^2}l(M_I)\right) . \end{aligned}$$
(28)
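The size of the one-loop factor in Eq. (28) can be estimated by combining it with the loop function of Eq. (15). The following sketch assumes numerical values for \(m_Z\) and \(m_H\) and uses hypothetical function names; it is meant only to illustrate the order of magnitude of the correction.

```python
# Minimal sketch of the loop-corrected mass in Eq. (28).
from math import log, pi

V_HIGGS = 174.0  # GeV, Higgs vacuum expectation value as used in the text
M_Z = 91.1876    # GeV, assumed Z boson mass
M_H = 125.1      # GeV, assumed Higgs boson mass


def _log_ratio(x):
    """ln(x) / (x - 1), continued to its limit 1 at x = 1."""
    return 1.0 if abs(x - 1.0) < 1e-8 else log(x) / (x - 1.0)


def loop_function(mass):
    """l(M_I) of Eq. (15), mass in GeV."""
    return (3.0 * _log_ratio((mass / M_Z) ** 2)
            + _log_ratio((mass / M_H) ** 2)) / (4.0 * pi) ** 2


def tilde_mass(mass):
    """Diagonal entry of tilde{M}^diag in Eq. (28), mass in GeV."""
    return mass * (1.0 - mass ** 2 / V_HIGGS ** 2 * loop_function(mass))


for mass in (0.1, 1.0, 10.0, 100.0, 500.0):
    shift = 1.0 - tilde_mass(mass) / mass
    print(f"M = {mass:6.1f} GeV  ->  relative shift = {shift:.3e}")
```

Under these assumptions the relative shift is negligible for masses far below the electroweak scale and reaches the per cent level at the upper end of the mass range considered here.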

In this scheme the sterile neutrino mixing matrix, i.e. the matrix encoding the mixing among LHNs and RHNs (20) can be written as

$$\begin{aligned} \varTheta = iU_{\nu }\sqrt{m_{\nu }^\text {diag}}\mathcal {R}\sqrt{\tilde{M}^\text {diag}}^{-1} \;, \end{aligned}$$
(29)

where \(U_{\nu }\) is the PMNS matrix introduced above, \(m_{\nu }^\text {diag}\) is the diagonalised, one-loop-corrected LHN mass matrix and \(\tilde{M}^\text {diag}\) is the analogous RHN mass matrix, given by (28). Furthermore, \(\mathcal {R}\) is a complex, orthogonal matrix that is parametrised by complex angles \(\omega _{ij}\)

$$\begin{aligned} \mathcal {R} = \mathcal {R}^{23}\mathcal {R}^{13}\mathcal {R}^{12}\;, \end{aligned}$$
(30)

where \(\mathcal {R}^{ij}\) has the non-zero elements

$$\begin{aligned} \mathcal {R}^{ij}_{ii}&= \mathcal {R}^{ij}_{jj} = \cos \omega _{ij}, \end{aligned}$$
(31)
$$\begin{aligned} \mathcal {R}^{ij}_{ij}&= -\mathcal {R}^{ij}_{ji} = \sin \omega _{ij}, \end{aligned}$$
(32)
$$\begin{aligned} \mathcal {R}^{ij}_{kk}&= 1; k \ne i,j\;. \end{aligned}$$
(33)
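The construction of \(\mathcal {R}\) in Eqs. (30)–(33) can be sketched in a few lines. The complex angles below are arbitrary illustrative choices and the function names are hypothetical. The sketch verifies that \(\mathcal {R}\mathcal {R}^T=\mathbb {I}\) holds for complex angles, while individual entries can exceed unity in modulus, so \(\mathcal {R}\) is orthogonal but in general not unitary.

```python
# Minimal sketch of the complex orthogonal matrix R, Eqs. (30)-(33).
import cmath


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose without complex conjugation."""
    return [list(col) for col in zip(*a)]


def rotation(i, j, omega, n=3):
    """R^{ij} with 0-based indices i < j and a (complex) angle omega."""
    r = [[complex(a == b) for b in range(n)] for a in range(n)]
    r[i][i] = r[j][j] = cmath.cos(omega)
    r[i][j] = cmath.sin(omega)
    r[j][i] = -cmath.sin(omega)
    return r


def casas_ibarra_r(w12, w13, w23):
    """R = R^{23} R^{13} R^{12}, Eq. (30)."""
    return matmul(rotation(1, 2, w23),
                  matmul(rotation(0, 2, w13), rotation(0, 1, w12)))


r = casas_ibarra_r(0.3 + 1.5j, -0.7 + 0.2j, 1.1 - 0.4j)
rrt = matmul(r, transpose(r))
max_dev = max(abs(rrt[a][b] - (a == b)) for a in range(3) for b in range(3))
largest = max(abs(x) for row in r for x in row)
print(f"max |R R^T - 1| = {max_dev:.2e}, largest |R_ij| = {largest:.3f}")
```

The growth of the entries with the imaginary parts of \(\omega _{ij}\) is what allows the mixings in Eq. (29) to exceed the naive seesaw expectation.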

Since we work in the flavour basis in which the Yukawa couplings of the charged leptons are diagonal, \(U_{\nu }\) can be parametrised as

$$\begin{aligned} U_{\nu } = V^{23}U_{\delta }V^{13}U_{-\delta }V^{12}\mathrm {diag}(e^{i\alpha _1/2},e^{i\alpha _2/2},1)\;, \end{aligned}$$
(34)

where \(U_{\pm \delta } = \mathrm {diag}(e^{{\mp }i\delta /2},1,e^{{\pm }i\delta /2})\) and \(V^{ij}\), parametrised by the LHN mixing angles \(\theta _{ij}\), has non-zero elements analogous to \(\mathcal {R}\). Furthermore, \(\alpha _1\), \(\alpha _2\) and \(\delta \) are CP-violating phases.
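That Eq. (29) reproduces the light neutrino mass matrix by construction can be verified numerically. The sketch below makes several simplifying assumptions that are not part of the analysis described here: all CP phases in \(U_\nu \) are set to zero, \(U_N\) is set to unity, the loop factor of Eq. (28) is neglected so that the heavy masses stand in for \(\tilde{M}^\mathrm{diag}\), and all numerical inputs are illustrative. It checks that \(-\varTheta \tilde{M}\varTheta ^T=U_\nu m_\nu ^\mathrm{diag}U_\nu ^T\), cf. Eqs. (9) and (26).

```python
# Minimal sketch of the Casas-Ibarra construction, Eq. (29), with assumed inputs.
import cmath
from math import sqrt


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def rotation(i, j, angle):
    """3x3 rotation in the (i, j) plane (0-based), cf. Eqs. (31)-(33)."""
    r = diag([1.0, 1.0, 1.0])
    r[i][i] = r[j][j] = cmath.cos(angle)
    r[i][j] = cmath.sin(angle)
    r[j][i] = -cmath.sin(angle)
    return r


def product_23_13_12(a12, a13, a23):
    return matmul(rotation(1, 2, a23),
                  matmul(rotation(0, 2, a13), rotation(0, 1, a12)))


m_light = [1.0e-12, 8.7e-12, 5.0e-11]  # GeV, assumed light neutrino masses
m_heavy = [1.0, 1.5, 10.0]             # GeV, assumed heavy neutrino masses
u_nu = product_23_13_12(0.59, 0.15, 0.84)                    # phases set to zero
r_ci = product_23_13_12(1.0 + 3.0j, 0.2 + 2.0j, 0.4 - 0.3j)  # complex angles

# Theta = i U_nu sqrt(m_nu^diag) R sqrt(M^diag)^(-1)
left = matmul(u_nu, diag([sqrt(m) for m in m_light]))
right = matmul(r_ci, diag([1.0 / sqrt(m) for m in m_heavy]))
theta_mix = [[1j * x for x in row] for row in matmul(left, right)]

# -Theta M Theta^T should reproduce U_nu m^diag U_nu^T.
lhs = [[-x for x in row]
       for row in matmul(matmul(theta_mix, diag(m_heavy)), transpose(theta_mix))]
rhs = matmul(matmul(u_nu, diag(m_light)), transpose(u_nu))
scale = max(abs(x) for row in rhs for x in row)
rel_dev = max(abs(lhs[i][j] - rhs[i][j])
              for i in range(3) for j in range(3)) / scale

u2_total = sum(abs(x) ** 2 for row in theta_mix for x in row)
u2_naive = sum(m_light) / min(m_heavy)
print(f"relative deviation: {rel_dev:.2e}")
print(f"total U^2 = {u2_total:.3e}, naive estimate = {u2_naive:.3e}")
```

With sizeable imaginary parts of the angles \(\omega _{ij}\), the total mixing in this example exceeds the naive estimate \(\sum _i m_i/M\) while the light neutrino mass matrix remains unchanged.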

The C-I parametrisation scheme generates by construction Yukawa couplings and mixing angles \(\varTheta \) that are consistent with light neutrino oscillation data up to second order in \(\theta \). This has two disadvantages. First, one may find it unsatisfactory that we treat light neutrino oscillation data differently from other constraints. Second, the C-I is a “bottom up” parametrisation. There is usually no simple relation between the C-I parameters and parameters that may be well-motivated from a model building viewpoint, and any theory-motivated prior on the RHNs’ mixings and masses would acquire a rather convoluted form in the C-I parametrisation. In particular, there is no simple way to distinguish “natural” from “fine tuned” parameter choices. Hence, we refrain from performing Bayesian scans in the current work, and instead concentrate on a likelihood-based frequentist treatment. In view of the high dimensionality of the parameter space and the complicated functional form of the different constraints, the disadvantages of the C-I parametrisation are, however, compensated for by the numerical advantage that one gains.

2.5 The symmetry protected scenario

The smallness of the light neutrino masses \(m_i\) can be explained in different ways by the seesaw relation (26). One possibility is that the \(N_I\) are very heavy, i.e., \(M_I\gg v\), in which case the smallness of \(m_i\) is due to the smallness of the ratio \(v/M_I\). This choice for the mass scale(s) \(M_I\) is well-motivated by Grand Unified Theories (footnote 5), but raises the question of radiative corrections to the Higgs potential from the Yukawa couplings of the RHNs [48].

This “hierarchy problem” can be avoided in low scale seesaw scenarios. Low values of \(M_I\) are natural because in the limit \(M_I \rightarrow 0\) the \(B-L\) symmetry in the SM is restored. In this case, however, the smallness of \(m_i\) can no longer be explained efficiently by the suppression of \(v/M_I\), as it typically requires couplings

$$\begin{aligned} \varTheta _{\alpha I} \simeq i (U_\nu )_{\alpha I}\sqrt{\frac{m_i}{M_I}} \ , \ F_{\alpha I} \simeq i (U_\nu )_{\alpha I}\frac{\sqrt{m_i M_I}}{v} \end{aligned}$$
(35)

that are very small, in particular for seesaw scales as low as 100 MeV.
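To illustrate how small these couplings are, the naive estimate (35) can be evaluated for representative numbers. The sketch below sets the modulus of the PMNS element to one and takes \(m_i=0.05\) eV as an assumed light neutrino mass scale; both choices, as well as the function name, are illustrative.

```python
# Minimal sketch of the naive seesaw estimate, Eq. (35), with |(U_nu)| set to 1.
from math import sqrt

V_HIGGS = 174.0  # GeV, Higgs vacuum expectation value as used in the text


def naive_seesaw(m_light_ev, m_heavy_gev):
    """Return (U^2, |F|) for a light mass in eV and a heavy mass in GeV."""
    m_light_gev = m_light_ev * 1.0e-9
    u2 = m_light_gev / m_heavy_gev
    yukawa = sqrt(m_light_gev * m_heavy_gev) / V_HIGGS
    return u2, yukawa


for m_heavy in (0.1, 1.0, 100.0):
    u2, yukawa = naive_seesaw(0.05, m_heavy)
    print(f"M = {m_heavy:6.1f} GeV  ->  U^2 ~ {u2:.1e}, |F| ~ {yukawa:.1e}")
```

For \(M_I=100\) MeV this gives \(U_{\alpha I}^2\sim 5\times 10^{-10}\) and \(|F_{\alpha I}|\sim 10^{-8}\).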

Such small values for fundamental parameters are considered “unnatural” by many theorists [49], though some possible explanations have been proposed [50]. However, this estimate relies on the underlying assumption that there are no cancellations (accidental or otherwise) in the seesaw relation (26), which would allow for much larger \(U_{\alpha I}^2=|\varTheta _{\alpha I}|^2\) than the naive estimate (35) suggests while keeping the eigenvalues \(m_i^2\) of \(m_\nu ^\dagger m_\nu \) small.

Hence, a technically natural [51] way to obtain small neutrino masses \(m_i\) can be realised if the Lagrangian (1) approximately respects a \(B-\bar{L}\) symmetry [52, 53] (cf. also [54]), where \(\bar{L}\) is a generalised lepton number under which combinations of the \(\nu _{Ri}\) are charged. Such \(B - \bar{L}\) symmetry is exact if the Yukawa coupling and mass matrix take the form [55]

$$\begin{aligned} M_M^{B-\bar{L}} =\begin{pmatrix} \bar{M} &{} 0 &{} 0 \\ 0 &{} \bar{M} &{} 0 \\ 0 &{}0 &{} M' \end{pmatrix} \, \ F^{B-\bar{L}} =\begin{pmatrix}\ F_e \ &{} \ iF_e &{} 0 \\ \ F_\mu \ &{} \ iF_\mu &{} 0 \\ F_\tau \ &{} \ iF_\tau &{} 0 \end{pmatrix}\;, \end{aligned}$$
(36)

in which case the light neutrinos are exactly massless \(m_i = 0\). In order to generate non-zero light neutrino masses this symmetry has to be slightly broken, i.e.,

$$\begin{aligned} M_M = M_M^{B-\bar{L}} (1 + \upmu ) \ , \ F = F^{B-\bar{L}}(1 + \upepsilon ), \end{aligned}$$
(37)

where the entries of the matrices \(\upmu \) and \(\upepsilon \) are small symmetry breaking parameters.

If the symmetry is not exact, \(M_M\) can have off-diagonal elements, see for example Ref. [56]. Throughout this work we use a basis in which \(M_M\) is diagonal. The diagonalisation affects the form of the Yukawa matrix F, but as long as the off-diagonal elements of \(\upmu \) are small, this only leads to a small modification of the flavour structure. For the following discussion we will therefore adopt the simpler form [57] (footnote 6)

$$\begin{aligned} M_M= & {} \begin{pmatrix} \bar{M}(1 - \mu ) &{} 0 &{} 0 \\ 0 &{} \bar{M} (1 + \mu ) &{}0 \\ 0 &{}0 &{} M' \end{pmatrix} , \nonumber \\ F= & {} \begin{pmatrix}\ F_e(1 + \epsilon _e) \ &{} \ iF_e(1 - \epsilon _e) &{} F_e \epsilon '_e \\ \ F_\mu (1 + \epsilon _\mu ) \ &{} \ iF_\mu (1 - \epsilon _\mu ) &{} F_\mu \epsilon '_\mu \\ F_\tau (1 + \epsilon _\tau ) \ &{} \ iF_\tau (1 - \epsilon _\tau ) &{} F_\tau \epsilon '_\tau \end{pmatrix}, \end{aligned}$$
(38)

with \(\epsilon _\alpha ', \epsilon _\alpha , \mu \ll 1\) being small symmetry breaking parameters and \(F_\alpha \) being of the order of one. This means that one heavy neutrino practically decouples while the other two approximately form a Dirac spinor with mass \(\bar{M}\).
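The cancellation behind the symmetry protected scenario can be made explicit with the tree-level seesaw formula \(m_\nu =-v^2FM_M^{-1}F^T\). The following sketch builds the textures of Eq. (38) and compares the exact symmetric limit with a slightly broken one. All numerical values are assumptions chosen for illustration; in particular, the \(F_\alpha \) are taken to be small here so that the mixings stay well below unity. The function names are hypothetical.

```python
# Minimal sketch of the B - Lbar protected textures, Eq. (38), at tree level.
V_HIGGS = 174.0  # GeV, Higgs vacuum expectation value as used in the text


def protected_textures(f, eps, eps_prime, mu, m_bar, m_prime):
    """Yukawa matrix F and diagonal Majorana masses in the form of Eq. (38)."""
    yukawa = [[f[a] * (1 + eps[a]), 1j * f[a] * (1 - eps[a]), f[a] * eps_prime[a]]
              for a in range(3)]
    masses = [m_bar * (1 - mu), m_bar * (1 + mu), m_prime]
    return yukawa, masses


def light_mass_matrix(yukawa, masses):
    """Tree-level m_nu = -v^2 F M_M^{-1} F^T for diagonal M_M, in GeV."""
    return [[-V_HIGGS ** 2 * sum(yukawa[a][i] * yukawa[b][i] / masses[i]
                                 for i in range(3))
             for b in range(3)] for a in range(3)]


def largest_entry(matrix):
    return max(abs(x) for row in matrix for x in row)


f = [1.0e-3, 2.0e-3, 1.5e-3]  # assumed values of F_e, F_mu, F_tau
m_bar, m_prime = 10.0, 50.0   # GeV, assumed heavy mass scales

exact = light_mass_matrix(
    *protected_textures(f, [0.0] * 3, [0.0] * 3, 0.0, m_bar, m_prime))
broken = light_mass_matrix(
    *protected_textures(f, [1e-6] * 3, [1e-6] * 3, 1e-6, m_bar, m_prime))

scale = V_HIGGS ** 2 * max(f) ** 2 / m_bar  # size of one uncancelled term
u2_mu = 2.0 * (V_HIGGS * f[1] / m_bar) ** 2  # U_mu^2 from the pseudo-Dirac pair
print(f"exact limit : max |m_nu| = {largest_entry(exact):.2e} GeV")
print(f"broken case : max |m_nu| = {largest_entry(broken):.2e} GeV")
print(f"single term : {scale:.2e} GeV, U_mu^2 ~ {u2_mu:.2e}")
```

In the exact limit the contributions of the first two heavy neutrinos cancel, so the light neutrino masses are controlled by the symmetry breaking parameters rather than by the overall size of the Yukawa couplings.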

In this symmetry protected scenario there is no upper limit on \(U_{\alpha I}^2\) from neutrino oscillation data. In the mass range considered here the upper limit comes from the experimental constraints, while for larger masses there is a theoretical bound \(U_{\alpha I}^2<4\pi (n-1)(v/\bar{M})^2\) from the requirement that the Yukawa couplings remain perturbative [7]. This provides a theoretical motivation for a low scale seesaw with experimentally accessible mixings \(U_{\alpha I}^2\). Specific examples that motivate this limit include “inverse seesaw” [68,69,70,71], “linear seesaw” [72, 73], scale invariant [74] and some technicolour-type models [75, 76] and also the \(\nu \)MSM [10, 52].

2.6 Connection to the model with \(n=2\)

The parametrisation (38) suggests that the \(B-\bar{L}\) symmetric limit for the model with \(n=3\) should contain the model with \(n=2\), as the third heavy neutrino decouples for \(\epsilon '_\alpha \rightarrow 0\). This is, for example, observed in the \(\nu \)MSM. However, some care is required when taking this limit if one wants to be consistent with neutrino oscillation data.

First, it is clear that not all seven symmetry breaking parameters \(\epsilon _\alpha , \epsilon '_\alpha , \mu \) can be set to zero because this would give exactly massless light neutrinos. Which of these parameters are non-zero and how small they are with respect to each other depends on how the symmetry is broken and thus on the particle physics model in which the Lagrangian (1) is embedded. It is not possible to make a model independent statement about the relative size of the \(\epsilon '_\alpha \) in relation to other model parameters.

Second, the parametrisation (38) is not the most general one: If we allow for small off-diagonal elements in the general form (37), then all three heavy neutrinos can have unsuppressed interactions if \(M'\simeq \bar{M}\), cf. footnote 6. Hence, if \(M'\) is degenerate with \(\bar{M}\), one cannot expect to recover the \(n=2\) model even if \(\epsilon _\alpha \ll 1\).

Finally, as discussed in more detail in Appendix C, there are Casas-Ibarra parameter choices that yield small values of \(m_{\nu _0}\), but correspond to highly fine-tuned scenarios where this smallness is due to accidental cancellations. These solutions can imitate the symmetry protected scenario and can also circumvent the seesaw upper limit and thus reach high values of \(U_{\alpha I}^2\).

3 Observables, experiments and likelihoods

Models with heavy right-handed neutrinos, as described above, will alter the SM predictions for different observables that are already significantly constrained by experimental results. In this analysis, we implemented all relevant constraints such as active neutrino likelihoods (3.1) and direct detection experiments which currently exert the strongest bounds over the considered mass range (3.3); these include beam dump and peak search experiments, which looked for RHNs in meson, tau and gauge boson decays. Besides, we similarly include the most relevant indirect constraints: electroweak precision observables (3.2.1), lepton flavour violating processes (3.2.2), lepton universality constraints (3.2.3), BBN (3.2.6), neutrinoless double-beta decay (3.2.5) and CKM unitarity (3.2.4).

In this section, we will focus on the physics and statistics aspects of our likelihood functions. The corresponding implementation of GAMBIT capabilities and module functions associated with the various observables are discussed in detail in Appendix A.

3.1 Active neutrino mixing

In contrast to previous studies, we include likelihoods for the active neutrino mixing observables in our analysis: the three mixing angles \(\theta _{12},\theta _{13},\theta _{23}\), the mass splittings \(\varDelta m^2_{21}\) and \(\varDelta m^2_{3\ell }\) with \(\ell = 1\) for normal ordering and \(\ell = 2\) for inverted ordering, as well as the CP-phase \(\delta _{\mathrm {CP}}\). We use the most recent publicly available results of the global analysis of solar, atmospheric, reactor and accelerator neutrino data in the framework of three neutrino oscillations provided by the NuFIT collaboration (as of January 2018) [32, 77], including

  • the solar neutrino experiments Homestake chlorine [78], Gallex/GNO [79] and SAGE [80], SNO [81], the four phases of Super-Kamiokande [82,83,84] and two phases of Borexino [85,86,87],

  • the atmospheric experiments IceCube/DeepCore [88],

  • the reactor experiments KamLAND [89], Double-Chooz [90], Daya-Bay [91] and Reno [92],

  • the accelerator experiments MINOS [93, 94], T2K [95] and NO\(\nu \)A [96],

  • the cosmic microwave background measurement Planck [97].

For our global fit, we take the provided one-dimensional \(\varDelta \chi ^2\) tables for both orderings of the NuFIT collaboration [77]. For more detailed information, we refer to [32] and references therein. We emphasize that using higher dimensional tables that account for correlations would in general lead to (slightly) more stringent results on the RHN parameter space, hence our treatment can be considered as conservative.

3.2 Indirect constraints

3.2.1 Electroweak precision observables

The leptonic charged currents are modified by the RHNs, and hence the value of \(G_\mu \) that is measured via the muon decay will differ from the actual Fermi constant \(G_F\), which is defined in terms of the fine structure constant and the mass of the Z boson. The correction can be written as [30]

$$\begin{aligned} G_\mu ^2 = G_F^2 (1 - (\theta \theta ^\dagger )_{\mu \mu } - (\theta \theta ^\dagger )_{ee})\ \end{aligned}$$
(39)

and is caused by the non-unitarity of the flavour mixing matrix \(V_\nu \), see Eq. (18), which leads to a slight suppression of the muon decay.

Both the weak mixing angle \(\theta _w\) and the mass of the W boson \(m_W\) depend on \(G_\mu \) at one loop, which means they also get a correction from the active-sterile mixing matrix \(\varTheta \), which is given by [25]

$$\begin{aligned} s^2_w&= [s^2_w]_{SM} \sqrt{1 - (\theta \theta ^\dagger )_{\mu \mu } - (\theta \theta ^\dagger )_{ee}}, \nonumber \\ \frac{m_W^2}{[m_W^2]_{SM}}&= \frac{[s^2_w]_{SM}}{s^2_w} \sqrt{1 - (\theta \theta ^\dagger )_{\mu \mu } - (\theta \theta ^\dagger )_{ee}}\;, \end{aligned}$$
(40)

where \(s_w^2 = \sin ^2\theta _w\). Since experiments typically measure the effective Weinberg angle \(s^2_{eff}\), and assuming the QCD corrections factorize from the leptonic corrections [98], we use for the SM prediction the highly accurate calculation, including corrections up to two-loops, from [33]

$$\begin{aligned} {}[s^2_{eff}]_{SM}&= 0.23152 \pm 0.00010, \nonumber \\ {}[m_W]_{SM}&= 80.361 \pm 0.010 \text { GeV}. \end{aligned}$$
(41)

Other electroweak precision observables affected by the presence of the heavy neutrinos are the decays of the Z and W bosons, in particular the invisible decay width of the Z boson, \(\varGamma _\mathrm{{inv}}\), and the leptonic decays of W. Under the assumption that the radiative corrections factorize from the heavy neutrino contribution, at least up to order \(\theta ^2\) [26, 98], one can write the invisible decay width of the Z as [99]

$$\begin{aligned} \varGamma _\mathrm{{inv}}&= \sum _{i,j} |\varGamma _{Z\rightarrow \nu _i\nu _j}|^{\text {SM}} \Bigg (|V_\nu ^\dagger V_\nu |^2 _{ij} \nonumber \\&+ |V_\nu ^\dagger \varTheta |^2_{ij} \left( 1 - \frac{m_{N_j}^2}{m_Z^2}\right) ^2\left( 1 + \frac{1}{2}\frac{m_{N_j}^2}{m_Z^2}\right) \Bigg ), \end{aligned}$$
(42)

where we have neglected the contribution from \(Z \rightarrow N_i N_j\) due to being of order \(\theta ^4\), and for the SM decay \(Z \rightarrow \nu _i \nu _j\) we use the 2-loop calculation from [100].
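As an illustration, the mass-dependent factor multiplying \(|V_\nu ^\dagger \varTheta |^2_{ij}\) in Eq. (42) can be sketched as follows. The function name, the value of \(m_Z\) and the explicit kinematic cutoff at \(m_{N_j} \ge m_Z\) are assumptions made for this sketch and are not taken from the text.

```python
M_Z = 91.1876  # GeV; illustrative input, not quoted in the text


def z_to_nu_N_suppression(m_N, m_Z=M_Z):
    """Kinematic factor (1 - r)^2 (1 + r/2) of Eq. (42), with r = m_N^2 / m_Z^2.

    Returns 0 when Z -> nu N is kinematically closed (assumption of this sketch).
    """
    if m_N >= m_Z:
        return 0.0
    r = (m_N / m_Z) ** 2
    return (1.0 - r) ** 2 * (1.0 + 0.5 * r)
```

The factor equals one for a massless heavy state and falls monotonically to zero at the Z mass, so the heavy-neutrino contribution to \(\varGamma _\mathrm{{inv}}\) is largest for light RHNs.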

The contribution of heavy neutrinos to the W decay widths to leptons can be written as [25]

$$\begin{aligned} \varGamma _{W\rightarrow l_\alpha \bar{\nu }} = \frac{G_\mu m_W^3}{6\sqrt{2}\pi }\frac{(1-\tfrac{1}{2} (\theta \theta ^\dagger )_{\alpha \alpha })(1-x_\alpha )^2(1 + x_\alpha )}{\sqrt{1 - (\theta \theta ^\dagger )_{\mu \mu } -(\theta \theta ^\dagger )_{ee}}}\;, \end{aligned}$$
(43)

where we defined \(x_\alpha \equiv m_{l_\alpha }^2 / m_W^2\).
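A minimal numerical sketch of Eq. (43) is given below. The numerical inputs for \(G_\mu \) and the lepton masses, as well as the function and argument names, are illustrative assumptions; only \(m_W\) is the SM value quoted in Eq. (41).

```python
import math

G_MU = 1.1663787e-5  # GeV^-2, muon-decay constant (illustrative input)
M_W = 80.361         # GeV, SM value quoted in Eq. (41)
M_LEPTON = {"e": 0.000511, "mu": 0.10566, "tau": 1.77686}  # GeV (illustrative)


def w_leptonic_width(alpha, tt_diag, g_mu=G_MU, m_w=M_W):
    """Gamma(W -> l_alpha nu) in GeV, following Eq. (43).

    tt_diag maps each flavour to the diagonal element (theta theta^dagger)_{alpha alpha}.
    """
    x = (M_LEPTON[alpha] / m_w) ** 2
    prefactor = g_mu * m_w**3 / (6.0 * math.sqrt(2.0) * math.pi)
    numerator = (1.0 - 0.5 * tt_diag[alpha]) * (1.0 - x) ** 2 * (1.0 + x)
    denominator = math.sqrt(1.0 - tt_diag["mu"] - tt_diag["e"])
    return prefactor * numerator / denominator
```

For vanishing mixing the expression reduces to the tree-level width of roughly 0.227 GeV per lepton flavour, which provides a quick sanity check of the normalisation.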

Table 1 Electroweak precision observables measurements and uncertainties, taken from Ref. [101]

We construct Gaussian likelihoods for these observables using the experimental measurements and uncertainties displayed in Table 1. All these observables depend on \(G_\mu \) (Eq. (39)) either directly or through another observable (\(s_w\) or \(m_W\)). Since the experimental measurements of these quantities are independent of each other, we assume them to be uncorrelated.

3.2.2 Lepton flavour violation

Flavour changing neutral processes, such as lepton flavour violation (LFV), are strongly suppressed in the Standard Model at one loop due to the GIM mechanism [102]. Hence, any non-trivial contribution to these processes from physics beyond the Standard Model would dominate over the SM contribution, which in turn makes the experimental determination of these observables a smoking gun of new physics. Several experiments have attempted to measure LFV processes with outstanding precision and they have imposed a set of upper limits on their branching fractions. In Table 2 we list the most significant of these observables, along with the experimental upper bound on their branching ratios and the experiment that provided it.

Table 2 Experimental upper bounds on LFV processes, along with the experiments that provided that bound. When more than one experiment is cited, the HFLAV average is used [34]. All upper bounds are given at the 90% CL

The experimental upper bounds for LFV \(\mu \) and \(\tau \) decays in Table 2 are given as branching fractions with respect to the total decay width of the respective lepton [101, 114],

$$\begin{aligned} \varGamma _{\mu }&= (2.995984 \pm 0.000003) \times 10^{-19} \text { GeV}\;,\nonumber \\ \varGamma _{\tau }&= (2.2670 \pm 0.0039) \times 10^{-12} \text { GeV}\;. \end{aligned}$$
(44)

In the model with three heavy neutrinos the leading contributions to these observables arise from dipole and box diagrams with mixing between the active and sterile neutrinos, given by the active-sterile mixing matrix \(\varTheta \). The relevant LFV processes containing these diagrams are of the form \(l_\alpha ^- \rightarrow l_\beta ^- \gamma \), \(l_\alpha ^- \rightarrow l_\beta ^- l_\beta ^- l_\beta ^+\), \(l_\alpha ^- \rightarrow l_\beta ^- l_\gamma ^- l_\gamma ^+\) and \(l_\alpha ^- \rightarrow l_\gamma ^- l_\gamma ^- l_\beta ^+\). The associated decay widths can be found in Appendix B.1.

Lastly, LFV processes can result in a neutrinoless \(\mu - e\) conversion inside a nucleus. Muons captured by a nucleus typically decay in orbit providing a continuous spectrum of energy for the electron in the final state. In coherent flavour violating conversion, \(\mu ^- N \rightarrow e^- N\), final state electrons have a discrete energy spectrum, corresponding to the mass of the decaying muon. Consequently experiments measure the rate at which this conversion happens, with respect to the rate of capture by the nucleus,

$$\begin{aligned} R_{\mu -e} = \varGamma _{\text {conv}}/\varGamma _{\text {capt}}. \end{aligned}$$
(45)

The corresponding expressions for the conversion ratio, as well as the nuclear parameters for the three nuclei studied, \(\text {Ti}_{22}^{48}\), \(\text {Au}_{79}^{197}\) and \(\text {Pb}_{82}^{208}\), can be found in Appendix B.1.

The likelihoods for these LFV observables are all Gaussian upper limit likelihoods. They are computed as

$$\begin{aligned} \ln \mathcal {L} = \left\{ \begin{array}{lr} -\frac{1}{2}\log (2\pi \sigma ^2), &{} \quad x < x_0 \\ -\frac{1}{2}\log (2\pi \sigma ^2) - \frac{1}{2} \frac{(x - x_0)^2}{\sigma ^2}, &{} \quad x > x_0 \end{array} \right. \;, \end{aligned}$$
(46)

using the experimental data from Table 2. More specifically, we assume a measured value of \(x_0\) for all observables (see footnote 7), and set \(\sigma = v/1.64\) for full Gaussians and \(\sigma = v/1.28\) for one-sided Gaussians, where v is the quoted upper 90% CL limit.
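The upper-limit likelihood of Eq. (46) can be sketched as below. The function name and the default \(x_0 = 0\) are assumptions made for illustration; the conversion from the 90% CL limit v to \(\sigma \) follows the two factors quoted in the text.

```python
import math


def gaussian_upper_limit_lnL(x, limit_90, x0=0.0, one_sided=True):
    """ln L of Eq. (46): flat below x0, Gaussian fall-off above it.

    sigma = limit_90 / 1.28 (one-sided) or limit_90 / 1.64 (full Gaussian).
    x0 = 0 is an assumption of this sketch.
    """
    sigma = limit_90 / (1.28 if one_sided else 1.64)
    norm = -0.5 * math.log(2.0 * math.pi * sigma**2)
    if x <= x0:
        return norm
    return norm - 0.5 * ((x - x0) / sigma) ** 2
```

With this choice a predicted branching ratio exactly at the quoted limit is penalised by \(\varDelta \ln \mathcal {L} = -1.28^2/2 \approx -0.82\) relative to a prediction below \(x_0\).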

3.2.3 Lepton universality

Recent measurements of meson decays [35,36,37] have put into question the flavour-independence of leptonic charged currents, as predicted by the SM. Previous tests of lepton universality performed by LEP and SLC, using lifetime measurements of the tau and muon as well as the partial decay widths of the Z boson, showed no such deviation. This has led to the formulation of many BSM theories attempting to explain the deviation shown in meson decays with sterile neutrinos [99, 115, 116].

The presence of right-handed neutrinos modifies the leptonic currents and thus triggers a contribution to processes testing lepton universality such as in the fully leptonic decays of charged mesons, \(X^+ \rightarrow l^+ \nu \), or the semileptonic decays of B mesons \(B^{0/\pm } \rightarrow X^{0/\pm } l^+ l^-\).

In order to cancel the considerable hadronic uncertainties present in the decays of pseudoscalar mesons, lepton universality tests are best formulated using ratios between lepton species. For fully leptonic and semileptonic decays of mesons, these ratios are expressed as

$$\begin{aligned} R_{\alpha \beta }^X&= \frac{\varGamma (X^+ \rightarrow l_{\alpha }^+\nu _{\alpha })}{\varGamma (X^+ \rightarrow l_{\beta }^+\nu _{\beta })}\;, \end{aligned}$$
(47)
$$\begin{aligned} R_{X}&= \frac{\varGamma (B^{0/\pm } \rightarrow X^{0/\pm } l^+_\alpha l^-_\alpha )}{\varGamma (B^{0/\pm } \rightarrow X^{0/\pm } l^+_\beta l^-_\beta )}\;, \end{aligned}$$
(48)

respectively.

In case of fully leptonic decays, one can express the test of lepton universality in terms of deviations from the SM prediction as

$$\begin{aligned} R_{\alpha \beta }^X = R^X_{\alpha \beta ,SM}(1 + {\varDelta }r_{\alpha \beta }^X)\;, \end{aligned}$$
(49)

where the sterile neutrino contribution can be calculated from the active-sterile mixing matrix \(\varTheta \) as [30, 117]

$$\begin{aligned} {\varDelta }r_{\alpha \beta }^X = \frac{1+\sum _I{|\varTheta _{{\alpha }I}|^2[G_{{\alpha }I} - 1]}}{1+\sum _I{|\varTheta _{{\beta }I}|^2[G_{{\beta }I} - 1]}} - 1\;, \end{aligned}$$
(50)

where we used

$$\begin{aligned} G_{{\alpha }I}&= \vartheta (m_X - m_{l_{\alpha }} - M_I)\frac{r_{\alpha } + r_I - (r_{\alpha } - r_I)^2}{r_{\alpha }(1 - r_{\alpha })^2}\nonumber \\&\quad \cdot \sqrt{1 - 2(r_{\alpha } + r_I) + (r_{\alpha } - r_I)^2}\;, \end{aligned}$$
(51)

with \(\vartheta \) being the Heaviside step function, \(r_{\alpha } \equiv m_{l_{\alpha }}^2/m_X^2\) and \(r_I \equiv M_I^2/m_X^2\). The SM predictions used in Eq. (49) for the tests of lepton universality for pions and kaons are \(R^\pi _{e\mu ,SM} = 1.2354 \times 10^{-4}\) and \(R^K_{e\mu ,SM} = 2.477 \times 10^{-5}\), respectively [118].
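The correction of Eqs. (50) and (51) can be evaluated numerically as in the following minimal sketch. The function names and argument layout are illustrative assumptions and do not reflect the GAMBIT interface. The numerator of \(G_{\alpha I}\) is written with the relative sign for which \(G_{\alpha I} \rightarrow 1\) as \(M_I \rightarrow 0\), so that \({\varDelta }r^X_{\alpha \beta }\) vanishes for massless neutrinos.

```python
import math


def G(m_X, m_l, M_I):
    """Phase-space function of Eq. (51); zero if X -> l N is kinematically closed."""
    if m_X - m_l - M_I <= 0.0:
        return 0.0
    r_a = (m_l / m_X) ** 2
    r_I = (M_I / m_X) ** 2
    lam = 1.0 - 2.0 * (r_a + r_I) + (r_a - r_I) ** 2
    # Sign of the (r_a - r_I)^2 term chosen such that G -> 1 for M_I -> 0.
    return (r_a + r_I - (r_a - r_I) ** 2) / (r_a * (1.0 - r_a) ** 2) * math.sqrt(lam)


def delta_r(m_X, m_la, m_lb, theta2_a, theta2_b, masses):
    """Delta r^X_{alpha beta} of Eq. (50).

    theta2_a[I] = |Theta_{alpha I}|^2, theta2_b[I] = |Theta_{beta I}|^2,
    masses[I] = M_I, all masses in the same units.
    """
    num = 1.0 + sum(t * (G(m_X, m_la, M) - 1.0) for t, M in zip(theta2_a, masses))
    den = 1.0 + sum(t * (G(m_X, m_lb, M) - 1.0) for t, M in zip(theta2_b, masses))
    return num / den - 1.0
```

For RHNs heavier than the decaying meson all \(G_{\alpha I}\) vanish, and the correction reduces to the ratio of \(1 - \sum _I|\varTheta _{\alpha I}|^2\) to \(1 - \sum _I|\varTheta _{\beta I}|^2\), minus one.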

The contribution from heavy right-handed neutrinos to the semileptonic decays of B mesons is much less significant than to the leptonic decays. As argued in Ref. [99], the effect on B decays to charmed mesons, \(B^\pm \rightarrow D l \nu \), is completely negligible. Semileptonic decays to K mesons are more affected, particularly the decays \(B^+ \rightarrow K^+ l^+ l^-\) and \(B^0 \rightarrow K^{*0} l^+ l^-\). Assuming that \(m_l \ll m_{K^{(*)}}\) and that the Wilson coefficient \(C_7 \ll C_9, C_{10}\), one can approximate the ratios \(R_K\) and \(R_{K^*}\) as [119]

$$\begin{aligned} R_{K^{(*)}}&= \frac{\varGamma (B^{\pm /0} \rightarrow K^{\pm /*0} \mu ^+ \mu ^-)}{\varGamma (B^{\pm /0} \rightarrow K^{\pm /*0} e^+ e^-)} \nonumber \\&\approx \frac{|C_{10}^{SM} + \varDelta C_{10}^\mu |^2 + |C_{9}^{SM} + \varDelta C_{9}^\mu |^2}{|C_{10}^{SM} + \varDelta C_{10}^e|^2 + |C_{9}^{SM} + \varDelta C_{9}^e|^2}\;, \end{aligned}$$
(52)

and the BSM contributions to the Wilson coefficients \(\varDelta C_9^\alpha \) and \(\varDelta C_{10}^\alpha \) can be expressed as [120]

$$\begin{aligned} \varDelta C_9^\alpha = - \varDelta C_{10}^\alpha = -\frac{1}{4 s_w^2} \sum _I |\varTheta _{\alpha I}|^2 E(x_t, x_I)\;, \end{aligned}$$
(53)

with \(x_t = m_t^2/m_W^2\), \(x_I = M_I^2/m_W^2\) and the loop function

$$\begin{aligned} E(x,y)= & {} xy \Bigg \{-\frac{3}{4}\frac{1}{(1-x)(1-y)} \nonumber \\&+ \Big (\frac{1}{4} - \frac{3}{2(x-1)} - \frac{3}{4(x-1)^2}\Big )\frac{\log x}{x-y} \nonumber \\&+ \Big (\frac{1}{4} - \frac{3}{2(y-1)} - \frac{3}{4(y-1)^2}\Big )\frac{\log y}{y-x} \Bigg \}\;. \end{aligned}$$
(54)

NNL calculations for the Standard Model contribution to the Wilson coefficients \(C_9\) and \(C_{10}\) used in Eq. (52) give \(C_9^{SM} = 4.211\) and \(C_{10}^{SM} = -4.103\) [121, 122].
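Equations (52)–(54) can be combined into a short numerical sketch of \(R_{K^{(*)}}\). The values of \(C_9^{SM}\) and \(C_{10}^{SM}\) are those quoted in the text; the top mass, the W mass and \(s_w^2\) are illustrative inputs, and all function names are assumptions of this sketch. The loop function is singular at \(x=1\), \(y=1\) and \(x=y\), which the sketch does not regularise.

```python
import math

C9_SM, C10_SM = 4.211, -4.103  # SM Wilson coefficients quoted in the text
M_W, M_TOP, S2W = 80.361, 173.0, 0.23152  # GeV, GeV, dimensionless (illustrative)


def loop_E(x, y):
    """Loop function E(x, y) of Eq. (54); singular at x = 1, y = 1 and x = y."""
    def bracket(z):
        return 0.25 - 1.5 / (z - 1.0) - 0.75 / (z - 1.0) ** 2
    return x * y * (
        -0.75 / ((1.0 - x) * (1.0 - y))
        + bracket(x) * math.log(x) / (x - y)
        + bracket(y) * math.log(y) / (y - x)
    )


def delta_C9(theta2, masses):
    """Delta C_9^alpha of Eq. (53); Delta C_10^alpha = -Delta C_9^alpha."""
    x_t = (M_TOP / M_W) ** 2
    total = sum(t * loop_E(x_t, (M / M_W) ** 2) for t, M in zip(theta2, masses))
    return -total / (4.0 * S2W)


def R_K(theta2_mu, theta2_e, masses):
    """Approximate ratio of Eq. (52) for given |Theta_{mu I}|^2 and |Theta_{e I}|^2."""
    def rate(theta2):
        d9 = delta_C9(theta2, masses)
        return (C10_SM - d9) ** 2 + (C9_SM + d9) ** 2
    return rate(theta2_mu) / rate(theta2_e)
```

For mixings of order \(10^{-3}\) and RHN masses near the electroweak scale the ratio deviates from unity only at the per-mille level, in line with the statement that the effect on semileptonic B decays is small.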

In addition to meson decays, other common tests of lepton universality include the decays of the W boson to leptons as well as \(\tau \) decays. The ratio of decay widths of W to charged leptons \(l_\alpha \) and \(l_\beta \) can be written as [25]

$$\begin{aligned} R^W_{\alpha \beta } = \frac{\varGamma (W^+ \rightarrow l^+_\alpha \nu _\alpha )}{\varGamma (W^+ \rightarrow l_\beta ^+ \nu _\beta )}= \sqrt{\frac{1 - (\theta \theta ^\dagger )_{\alpha \alpha }}{1 - (\theta \theta ^\dagger )_{\beta \beta }}}\;. \end{aligned}$$
(55)

Deviations from the SM for the lepton universality test in \(\tau \) decays follow the same form as in Eq. (50) and the SM prediction is \(R^\tau _{\mu e, SM} = 0.973\) [123].

Table 3 Experimental measurements for all tests of lepton universality

These tests of lepton universality are implemented as Gaussian likelihoods centered on the experimentally measured value. The experimental measurements, with their corresponding uncertainties (see footnote 8), are shown in Table 3. The measurements of \(R^\pi _{e\mu }\) include subleading decays with \(\gamma \)’s, hence the upper limit shown is the PDG average of the ratios of \(\varGamma (\pi ^+ \rightarrow l_\alpha ^+ \nu _\alpha ) + \varGamma (\pi ^+ \rightarrow l_\alpha ^+ \nu _\alpha \gamma )\), based on the measurements in [128,129,130]. Two experimental measurements are shown for \(R_{K^*}\) corresponding to two regions of the dilepton invariant mass \(0.045< q^2 < 1.1 (\text {GeV}^2/c^4)\) for (1) and \(1.1< q^2 < 6.0 (\text {GeV}^2/c^4)\) for (2).

3.2.4 CKM unitarity

The determination of the CKM matrix elements \((V_{CKM}^{exp})_{ab}^i\) is usually done under the implicit assumption of a zero active-sterile mixing matrix, \(\varTheta = 0\). The measurements of the \((V_{CKM}^{exp})_{ab}^i\) therefore need to be adjusted to take into account effects of RHNs.

Firstly, the smallest element of the CKM matrix, \((V_{CKM})_{ub}\), can be neglected in our study as its absolute value \(|(V_{CKM})_{ub}|^2 \sim 10^{-5}\) is much smaller than our sensitivity to the \(\varTheta \) parameter. Hence, under the assumption of the unitarity of the CKM matrix, one can derive the following relation:

$$\begin{aligned} |(V_{CKM})_{ud}|^2 + |(V_{CKM})_{us}|^2 = 1. \end{aligned}$$
(56)

Thus, we use the various experimental measurements of \((V_{CKM}^{exp})_{us}\) [131,132,133] and \((V_{CKM}^{exp})_{ud}\) [134] to simultaneously constrain the true value of \(|(V_{CKM})_{us}|\) and active-sterile mixing matrix \(\varTheta \).

Following Refs. [25, 30], the experimental measurements and true value of CKM matrix element \((V_{CKM})_{us,ud}\) are related via

$$\begin{aligned} |(V_{CKM}^{exp})_{us,ud}^i|^2 = |(V_{CKM})_{us,ud}|^2[1+f^i(\varTheta )]\;, \end{aligned}$$
(57)

where we defined the functions \(f^i\) to encode the contribution of RHNs to the process considered in each experiment. The decay processes considered to extract the value of \(|(V_{CKM}^{exp})_{us}|\), and the \(f(\varTheta )\) functions, are given by [25]

$$\begin{aligned}&K_L \rightarrow \pi ^+e^-\bar{\nu }_e: 1+f^1(\varTheta ) = \frac{G_F^2}{G_{\mu }^2}[1-(\theta \theta ^{\dagger })_{ee}], \end{aligned}$$
(58)
$$\begin{aligned}&K_S \rightarrow \pi ^+e^-\bar{\nu }_e: f^2(\varTheta ) = f^1(\varTheta ), \end{aligned}$$
(59)
$$\begin{aligned}&K^- \rightarrow \pi ^0e^-\bar{\nu }_e: f^3(\varTheta ) = f^1(\varTheta ), \end{aligned}$$
(60)
$$\begin{aligned}&K_L \rightarrow \pi ^+\mu ^-\bar{\nu }_{\mu }: 1+f^4(\varTheta ) = \frac{G_F^2}{G_{\mu }^2}[1-(\theta \theta ^{\dagger })_{\mu \mu }], \end{aligned}$$
(61)
$$\begin{aligned}&K^- \rightarrow \pi ^0\mu ^-\bar{\nu }_{\mu }: f^5(\varTheta ) = f^4(\varTheta ), \end{aligned}$$
(62)
$$\begin{aligned}&\frac{\tau ^- \rightarrow K^-\nu _{\tau }}{\tau ^- \rightarrow \pi ^-\nu _{\tau }}: 1+f^6(\varTheta ) = 1+(\theta \theta ^{\dagger })_{\mu \mu }, \end{aligned}$$
(63)
$$\begin{aligned}&\tau ^- \rightarrow \pi ^-\bar{\nu }_{\tau }:\nonumber \\&\quad 1+f^7(\varTheta ) = 1+(\theta \theta ^{\dagger })_{ee}+(\theta \theta ^{\dagger })_{\mu \mu } -(\theta \theta ^{\dagger })_{\tau \tau }, \end{aligned}$$
(64)
$$\begin{aligned}&\tau \rightarrow s : 1 + f^8(\varTheta ) \nonumber \\&\quad =1 + 0.2(\theta \theta ^\dagger )_{ee} - 0.9(\theta \theta ^\dagger )_{\mu \mu } - 0.2(\theta \theta ^\dagger )_{\tau \tau }. \end{aligned}$$
(65)

The situation is simpler in the determination of the \(|(V_{CKM}^{exp})_{ud}|\) element as the uncertainty is dominated by the superallowed \(0^+ \rightarrow 0^+\) nuclear beta transitions measurements, which need to be modified according to:

$$\begin{aligned} 0^+ \rightarrow 0^+: 1+f^1(\varTheta ) = \frac{G_F^2}{G_{\mu }^2}[1-(\theta \theta ^{\dagger })_{ee}] \end{aligned}$$
(66)

The experimentally measured values of \(|(V_{CKM}^{exp})^i_{us}|\) in each of the decay processes above are listed in Table 4, and the value of \(|(V_{CKM}^{exp})_{ud}|=0.97417\pm 0.00021\) is taken from the world average [134].

Table 4 Experimental values of \((V_{CKM})_{us}\) and the average value of \((V_{CKM})_{ud}\) used in the calculation of the CKM likelihood. The factor \(f_+(0)=0.959\pm 0.005\) is taken from [135]

We thus construct the likelihood for this constraint from a chi-squared function, \(2\ln \mathcal {L} = -\chi ^2\), where the discriminant measures the deviation between the true value \((V_{CKM})_{us,ud}\) and the experimental measurements \((V_{CKM}^{exp})^i_{us,ud}\), and is given by

$$\begin{aligned} \chi ^2 =&\sum _{i=1}^7 \frac{\left( (V_{CKM}^{exp})_{us}^i - (V_{CKM})_{us} \cdot (1+f^i(\varTheta )) \right) ^2}{\sigma _i^2} \nonumber \\&+ \frac{\left( (V_{CKM}^{exp})_{ud} - (V_{CKM})_{ud} \cdot (1+f^1(\varTheta )) \right) ^2}{\sigma ^2} \;. \end{aligned}$$
(67)

Due to the unitarity relation in Eq. (56), the value \((V_{CKM})_{ud}\) is obtained from \((V_{CKM})_{us}\) for every parameter point, and thus the only free floating parameters are the value of \((V_{CKM})_{us}\) and the active-sterile mixing matrix, \(\varTheta \). For simplicity, and since this is the only constraint to depend strongly on the value of \((V_{CKM})_{us}\), we optimise on its value for each \(\varTheta \), which removes the necessity of making \((V_{CKM})_{us}\) part of the scanning model. This approach is similar to the discussion in [30], but we improve upon it by optimising on the true value \((V_{CKM})_{us}\), including the \(\varTheta \) corrections, for each parameter point, rather than the value measured experimentally.
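The optimisation over \((V_{CKM})_{us}\) described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the search interval and the use of a golden-section search are choices made here and are not taken from the GAMBIT implementation. The chi-squared follows Eq. (67) as written, and the measurements and \(f^i(\varTheta )\) values must be supplied by the caller.

```python
import math


def ckm_chi2(v_us, meas_us, f_us, meas_ud, f_ud):
    """chi^2 of Eq. (67) for a trial true |V_us|.

    meas_us: list of (value, sigma) for the |V_us^exp|^i; f_us: matching f^i(Theta).
    meas_ud: (value, sigma) for |V_ud^exp|; f_ud: f(Theta) for 0+ -> 0+ decays.
    """
    v_ud = math.sqrt(1.0 - v_us**2)  # unitarity relation, Eq. (56)
    chi2 = sum(((val - v_us * (1.0 + f)) / sig) ** 2
               for (val, sig), f in zip(meas_us, f_us))
    chi2 += ((meas_ud[0] - v_ud * (1.0 + f_ud)) / meas_ud[1]) ** 2
    return chi2


def profile_v_us(meas_us, f_us, meas_ud, f_ud, lo=0.20, hi=0.25, tol=1e-10):
    """Golden-section minimisation of chi^2 over |V_us| in [lo, hi] (hypothetical range)."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - phi * (b - a), a + phi * (b - a)
        if ckm_chi2(c, meas_us, f_us, meas_ud, f_ud) < ckm_chi2(d, meas_us, f_us, meas_ud, f_ud):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    v_best = 0.5 * (a + b)
    return v_best, ckm_chi2(v_best, meas_us, f_us, meas_ud, f_ud)
```

Because the minimisation is one-dimensional and cheap, repeating it at every sampled \(\varTheta \) adds little cost compared with promoting \((V_{CKM})_{us}\) to a scan parameter.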

3.2.5 Neutrinoless double-beta decay

Double-beta decay refers to the decay of two neutrons into two protons while emitting two electrons and two anti-neutrinos. If neutrinos are Majorana particles, lepton number is violated and neutrinoless double-beta decay (\(0\nu \beta \beta \)) can occur. Besides the exchange of light neutrinos, the exchange of RHNs is similarly possible and would alter the expected effective neutrino mass \(m_{\beta \beta }\). The effective mass is constrained by half-life measurements of \(0\nu \beta \beta \) decay. The most stringent limits are currently set by the GERDA experiment (Germanium) [136] with \(m_{\beta \beta }<0.15-0.33\;\text {eV}\) (90% CL), and KamLAND-Zen (Xenon) [137], \(m_{\beta \beta }<0.061-0.165\;\text {eV}\) (90% CL). The effective mass \(m_{\beta \beta }\) can be theoretically evaluated in terms of the mixings and masses of the light and right-handed neutrinos [138]

$$\begin{aligned} m_{\beta \beta } = \Bigg |\sum _i{(U_{\nu })_{ei}^2 m_i}+\sum _I{\varTheta _{eI}^2 M_I f_A(M_I)}\Bigg |. \end{aligned}$$
(68)

Here, the first term denotes the contribution from LHNs and the second the contribution from RHNs. With a typical momentum exchange of around 100 MeV in \(0\nu \beta \beta \) decay, RHNs with a mass above this threshold participate in the process only virtually. This suppression is taken into account by the factor [138]

$$\begin{aligned} f_A(M) \approx \frac{p^2}{p^2+M^2}. \end{aligned}$$
(69)

The typical momentum exchange \(p^2\) depends not only on the specific isotope in consideration but is also subject to the theoretical model in which the constraints are derived and the value of the nucleon axial-vector constant. An overview is given in [139]. For our analysis, we use the “Argonne” model and the lower of the two values for \(p^2\) (quenched), which yields the most conservative constraints: \(\sqrt{\langle p^2\rangle } = 178 \; \text {MeV}\) for xenon, and \(\sqrt{\langle p^2\rangle } = 159 \; \text {MeV}\) for germanium. A more dedicated analysis of the impact of different limits due to nuclear uncertainties is beyond the scope of this work. Since we are focusing on profile likelihoods for our results, this approach is largely equivalent to profiling over systematic uncertainties assuming a flat prior that spans the entire range of values \(\langle p^2\rangle \) in Ref. [139]. For our analysis we use the experimental values, as stated above, as one-sided Gaussian likelihoods, choosing the higher of the two values in order to remain conservative.
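Equations (68) and (69) translate into the following minimal sketch, with all masses and momenta in GeV. The momentum scales are the xenon and germanium values quoted above; the function and argument names are illustrative assumptions.

```python
P_XE, P_GE = 0.178, 0.159  # GeV, sqrt(<p^2>) for xenon and germanium (from the text)


def f_A(M, p):
    """Suppression factor of Eq. (69) for a heavy neutrino of mass M."""
    return p**2 / (p**2 + M**2)


def m_bb(U_e, m_light, Theta_e, M_heavy, p):
    """Effective mass of Eq. (68).

    U_e[i] = (U_nu)_{ei} and Theta_e[I] = Theta_{eI} may be complex;
    m_light[i] = m_i and M_heavy[I] = M_I are real masses in GeV.
    """
    light = sum(u**2 * m for u, m in zip(U_e, m_light))
    heavy = sum(t**2 * M * f_A(M, p) for t, M in zip(Theta_e, M_heavy))
    return abs(light + heavy)
```

Since the two sums are added before taking the modulus, complex phases in \(\varTheta _{eI}\) can lead to partial cancellations between the light and heavy contributions.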

3.2.6 Big Bang Nucleosynthesis

If RHNs decay shortly before or during BBN, the typical energy of decay products, here \(\sim M_I \ge 50\;\mathrm {MeV}\), is significantly higher than the plasma temperature at that time, \(\sim 100\) keV. Therefore, either by dissociating formed nuclei, or by causing deviations from thermal equilibrium, they will affect the abundances of primordial elements, which are however observationally well constrained. The requirement that the RHN decay happens sufficiently early before BBN implies an upper limit on the lifetime (\(\tau _I\)) of RHNs, or equivalently, a lower bound on the mixing \(U_I^2\) [140]. However, in the presence of multiple RHN species, BBN cannot constrain individual mixing angles \(U_{\alpha I}^2\), Eq. (22), but only the total mixing \(U_I^2\), Eq. (23).

We consider leptonic decay channels for all RHN masses, when kinematically allowed, as well as hadronic decays to mesons and leptons. As shown in [141], for low masses the hadronic decay width is dominated by channels with a single meson and a lepton, while for masses above the hadronisation scale, \(\varLambda _{\text {had}} \sim 1\) GeV, it can be approximated by computing the decay to free quarks. The decay width for each topology is listed in Appendix B.2, with expressions and values for the decay constants taken from [14, 141,142,143,144], along with a detailed comparison of the various expressions.

In the current study, we require the lifetime of each RHN to be less than 0.1 s [145], which is implemented in the likelihood as a step function. In principle, this limit can be weakened if the lightest active neutrino has a mass \(< \mathcal {O}(10^{-3})\) eV, since the RHNs do not necessarily thermalize in this case [146]. We leave, however, the implementation of refined BBN constraints in GAMBIT for future work. Note that a lifetime bound that is stronger by a factor of two would lead to proportionally stronger constraints on the total mixing \(U_I^2\).
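The step-function treatment of the lifetime bound can be illustrated with the short sketch below. The function name, the use of \(-\infty \) for excluded points and the treatment of a vanishing width as excluded are assumptions of this sketch; a practical implementation might instead return a large negative number.

```python
import math

HBAR_GEV_S = 6.582119569e-25  # GeV s, converts a width in GeV into a lifetime in s


def bbn_lnL(total_widths_GeV, tau_max=0.1):
    """Step-function log-likelihood for the BBN lifetime bound.

    Returns 0 if every RHN lifetime hbar / Gamma_I is below tau_max (in seconds),
    and -inf otherwise.
    """
    for width in total_widths_GeV:
        if width <= 0.0 or HBAR_GEV_S / width >= tau_max:
            return -math.inf
    return 0.0
```

Because the bound applies to each RHN separately, a single long-lived state is sufficient to exclude a parameter point.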

3.3 Direct RHN searches

Various experiments search directly for RHNs using different approaches. One can distinguish between three types: peak searches (PIENU), searches at beam dump experiments (PS-191, CHARM, E949, NuTeV), and searches at \(e^+ e^-\) or pp colliders (DELPHI, ATLAS, CMS).

One way to look for RHNs is to search for peaks in the lepton energy spectrum of a meson decay. If, for example, a meson of mass \(m_X\) decays into an RHN of mass \(M_I\) and an electron/muon with mass \(m_{l_{\alpha }}\), this peak will be approximately at

$$\begin{aligned} E_{peak} \simeq \frac{m_X^2 + m_{l_{\alpha }}^2 - M_I^2}{2m_X}\;. \end{aligned}$$
(70)

Even in situations where backgrounds are sizeable, a peak search can hence be used to impose constraints on the mixing.
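As a worked illustration of Eq. (70), the following sketch evaluates the peak position in the rest frame of the decaying meson. The function name and the kinematic check are illustrative assumptions; the pion and electron masses in the usage note are standard values that are not quoted in the text.

```python
def peak_energy(m_X, m_l, M_I):
    """Lepton energy in the meson rest frame for X -> l N, Eq. (70).

    All masses in the same units; raises if the decay is kinematically closed.
    """
    if M_I + m_l > m_X:
        raise ValueError("decay X -> l N is kinematically forbidden")
    return (m_X**2 + m_l**2 - M_I**2) / (2.0 * m_X)
```

For \(\pi ^+ \rightarrow e^+ N\) with \(m_\pi = 139.57\) MeV, \(m_e = 0.511\) MeV and \(M_I = 100\) MeV, the peak sits at about 34 MeV, well below the roughly 70 MeV line of the decay into a massless neutrino.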

In beam dump experiments, the large background signal that is usually present near the target hinders the detection of charged particles that are produced along with the RHNs. On the other hand, RHNs with mass below the D meson scale can be long-lived enough to travel macroscopic distances. Looking for their charged decay products some distance away from the target leads to (almost) background-free experimental situations.

In collision experiments (\(e^+ e^-\) or pp), vector bosons or mesons get produced that subsequently can decay leptonically. The bounds on these processes are then able to constrain the corresponding active-sterile mixing angles in a certain mass range.

To implement the direct detection constraints as likelihoods, we follow two different approaches, depending on the information that is provided in each study. Firstly, some of the experiments found no signal events and had no background counts after cuts (DELPHI, CHARM, PS-191 and NuTeV). In this case, since the processes in the experiments are essentially Poissonian, we construct the likelihood (to observe n events) as a Poisson distribution. The number of expected counts, \(\mu \), is a function of the RHN masses and mixings, i.e. \(\mu = \mu (M_I, U_{{\alpha }I}^4)\) (assuming the experiment does so as well, the fourth power takes both production and decay of RHNs into account). For expected \(\mu \) events and background b, the likelihood is:

$$\begin{aligned} \mathcal {L}(n | \mu ) = (\mu +b)^n\frac{e^{-(\mu +b)}}{n!}\,. \end{aligned}$$
(71)

With no reported detections (\(n = 0\)) and background cuts reducing b to approximately zero,

$$\begin{aligned} \ln \mathcal {L}(n=0 | \mu ) = -\mu \,. \end{aligned}$$
(72)

To connect \(\mu \) with our model parameters, we use the fact that the expected signal counts are proportional to the LHN-RHN mixing, \(\mu \propto U_{{\alpha }I}^4\). The factor of proportionality is set to reproduce the results from the experimental papers (assuming that these limits are based on the common Feldman-Cousins procedure [147], where e.g. a 95% CL upper limit would correspond to an expected number of signal counts of \(\mu =3.09\)).
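The normalisation described above can be sketched as follows. The function and argument names are illustrative assumptions: `U4` stands for the model prediction of the relevant mixing combination at a given RHN mass, and `U4_limit` for the value excluded by the experiment at that mass. The default of 2.44 is the proportionality factor quoted for the 90% CL limits used in this section, while 3.09 would correspond to a 95% CL limit.

```python
def poisson_zero_count_lnL(U4, U4_limit, mu_at_limit=2.44):
    """ln L = -mu of Eq. (72) for n = 0 observed events and zero background.

    The expected signal is normalised such that mu = mu_at_limit when the
    model prediction U4 equals the experimentally excluded value U4_limit.
    """
    mu = mu_at_limit * U4 / U4_limit
    return -mu
```

A parameter point lying exactly on the published exclusion curve therefore receives \(\ln \mathcal {L} = -2.44\), and the penalty grows linearly with the mixing combination above it.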

On the other hand, for experiments that quote non-zero signal events and/or backgrounds, or for which this information is ambiguous (CHARM (\(\nu _{\tau }\) re-interpretation), PIENU, ATLAS and E949), we model the likelihood as a Gaussian upper limit, i.e. as a half-Gaussian with zero mean and width set according to the confidence level at which the results are presented. For example, for an experiment that presents limits at \(90\%\) CL, the quoted limit lies \(1.28\sigma \) from the mean of the half-Gaussian.

It is worth noting that collider experiments often use simplified model assumptions to compute the confidence level intervals presented in their results. Since we use these to construct our likelihoods, we are incorporating these assumptions as well, in spite of the fact that our confidence intervals are computed by profiling over the multidimensional parameter space. Given that a full collider simulation is beyond the scope of this study, we employ the provided simplified model limits as given. We acknowledge, however, that the true limits may be slightly weaker due to, e.g., a reduction of the production cross-section, and we defer the exploration of the differences between the collider predictions of simplified and full models to future work.

3.3.1 PIENU

The PIENU experiment [148] sought to detect RHNs in the mass range of 68–129 MeV by searching for peaks in the energy spectrum of the decay process \(\pi ^+ \rightarrow e^+ \nu \). It was, hence, sensitive to the mixing \(|\varTheta _{eI}|^2 \equiv U_{eI}^2\) and \(\mu \) in Eq. (72) is also taken to scale as \(U_{eI}^2\) in our analysis. Although no peaks were found, exact information on the number of background events is unavailable. Further, production processes in peak searches are, in general, unaffected by the Majorana/Dirac nature of the RHNs; hence, no correction is necessary here.

The constraints on \(U_{eI}^2\) are at 90% CL, so they are implemented in GAMBIT as a half-Gaussian with zero mean and error set at 1.28\(\sigma \).

After our analysis was complete we became aware of the slightly stronger updated constraints presented in Ref. [149], which are not included in our scan.

3.3.2 PS-191

This experiment [150] was designed for the purpose of detecting neutrino decays. RHNs would be produced via either of the following mechanisms: \(\pi ^+/K^+ \rightarrow e^+ \nu _e\), or \(\pi ^+/K^+ \rightarrow \mu ^+ \nu _{\mu }\), and would then decay via \(\nu _R \rightarrow \mu ^- e^+ \nu \), \(\nu _R \rightarrow e^- \mu ^+ \nu \), \(\nu _R \rightarrow e^- \pi ^+\), \(\nu _R \rightarrow \mu ^- \mu ^+ \nu \), \(\nu _R \rightarrow \mu ^- \pi ^+\) or \(\nu _R \rightarrow e^- \pi ^+ \pi ^0\). Thus, PS-191 could constrain the quantities \(U_{eI}^4\) and \(U_{\mu I}^4\) for RHNs with a mass between \(20-450\) MeV.

Having found no signal or background events, it placed constraints on these quantities at 90% CL. We deviate from the original analysis in two ways. The first is necessitated by the fact that in the original analysis, the constraints were derived under the assumption that the RHNs interact only through the charged current. In [38], these limits were re-interpreted with the inclusion of neutral current interactions. Thus, instead of the signal count being proportional to the fourth power of the relevant flavour mixing, it is proportional to \(U_{e/\mu I}^2 \times \sum _{\alpha } c_{\alpha }U_{\alpha I}^2\), with the coefficients given by

$$\begin{aligned} c_e&= \frac{1+4\sin ^2\theta _W+8\sin ^4\theta _W}{4} \nonumber ,\\ c_\mu ,c_\tau&= \frac{1-4\sin ^2\theta _W+8\sin ^4\theta _W}{4}. \end{aligned}$$
(73)

We use these revised bounds here. The limits are encoded in likelihood form as in Eq. (72), with the aforementioned proportionality factor being 2.44.
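The flavour structure of the re-interpreted signal count can be made explicit with a short sketch of Eq. (73). The value of \(\sin ^2\theta _W\) and all names are illustrative assumptions; the overall normalisation to the published limit is left to the caller.

```python
S2W = 0.23152  # sin^2(theta_W), illustrative input


def nc_coefficients(s2w=S2W):
    """Coefficients c_e, c_mu, c_tau of Eq. (73)."""
    c_e = (1.0 + 4.0 * s2w + 8.0 * s2w**2) / 4.0
    c_mu = (1.0 - 4.0 * s2w + 8.0 * s2w**2) / 4.0
    return {"e": c_e, "mu": c_mu, "tau": c_mu}


def signal_combination(flavour, U2, s2w=S2W):
    """U_{flavour I}^2 * sum_alpha c_alpha U_{alpha I}^2, to which the signal count is proportional.

    U2 maps each flavour ("e", "mu", "tau") to U_{alpha I}^2 for a given RHN I.
    """
    c = nc_coefficients(s2w)
    return U2[flavour] * sum(c[a] * U2[a] for a in ("e", "mu", "tau"))
```

With these inputs \(c_e \approx 0.59\) and \(c_\mu = c_\tau \approx 0.13\), so that mixing with a second flavour contributes to the decay rate even when production proceeds through a single flavour.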

3.3.3 CHARM

RHNs were searched for in CHARM [151] using two strategies, one with a neutrino beam from dumping protons on copper (BD) and another using a wide-band neutrino beam (WBB) from primary protons.

In BD, the production of RHNs was assumed to occur through the decay of D mesons. They would then decay via \(\nu _R \rightarrow e^+ e^- \nu _e\), \(\nu _R \rightarrow \mu ^+ \mu ^- \nu _{\mu }\) or \(\nu _R \rightarrow e^+ \mu ^- \nu _e\), \(\mu ^+ e^- \nu _{\mu }\) (and the anti-particle counterparts) and the decay products were looked for.

In WBB, RHN production was assumed to occur via neutrino-nucleus neutral current scattering \(\nu _{\mu } N \rightarrow \nu _R X\). The subsequent decay \(\nu _R \rightarrow \mu R\), R representing hadrons, was then searched for. The limits from the WBB analysis are, however, weaker than those exerted by other experiments in the same mass range, and are not considered here.

The BD analysis yielded no candidate events or background and hence placed limits on \(U_{eI}\) and \(U_{\mu I}\) at 90% CL. Further, the original analysis assumed the possibility of RHNs interacting solely via the charged current; we use the results re-interpreted after the inclusion of neutral current interactions [38] as discussed in Sect. 3.3.2, i.e. the signal count is proportional to \(U_{e/\mu I}^2 \times \sum _{\alpha } c_{\alpha }U_{\alpha I}^2\) and once again use Eq. (72) to represent the likelihood, with the proportionality factor being 2.44.

In [152], the data from the CHARM experiment was re-analyzed assuming that RHNs mix solely with tau-flavoured leptons, and was able to place limits at 90% CL on \(U_{\tau I}\), which we implement as a half-Gaussian with zero mean and error set at 1.28\(\sigma \).

Dirac RHNs were assumed in both the original and tau-specific analyses, so the limits presented are also re-scaled by dividing them by \(\sqrt{2}\).

3.3.4 E949

In this experiment [153,154,155], RHNs were searched for in the decay of kaons produced in a beam dump: \(K^+ \rightarrow \mu ^+ \nu _R\). Constraints on \(U_{\mu I}\) were placed at 90% CL in the mass range \(175-300\) MeV; we also divide the limits by a factor of \(\sqrt{2}\) to account for the Majorana nature of RHNs in our model.

The likelihood is modeled as a half-Gaussian with zero mean, error set at 1.28\(\sigma \) and \(\mu \propto U_{\mu I}^2\).

3.3.5 NuTeV

The NuTeV experiment [156] searched for RHNs through their decay into the following final states: \(\mu e \nu \), \(\mu \mu \nu \), \(\mu \pi \) and \(\mu \rho \). They were assumed to be produced in the decay of mesons. \(90\%\) CL limits on \(U_{\mu I}\) were placed for RHNs with a mass between 0.25 and 2 GeV.

Information about the assumed Dirac or Majorana nature of the RHNs is not present, so we take the conservative route and presume Majorana RHNs were considered in the analysis. No candidate events or background were detected, so the likelihood is modeled as in Eq. (72), with a proportionality factor of 2.44 and \(\mu \) scaling as \(U_{\alpha I}^4\).

3.3.6 DELPHI

At DELPHI [157], \(e^+ e^- \rightarrow Z^0 \rightarrow \nu _R \bar{\nu }\) was the dominant RHN production mechanism; the process \(Z^0 \rightarrow \nu _R \bar{\nu _R}\) would be suppressed due to the additional \(U^2\) factor. The products of the RHN decaying via the charged and neutral currents were then searched for, according to: \(\nu _R \rightarrow \nu Z^{*}\), \(Z^{*} \rightarrow \nu \bar{\nu }\), \(l \bar{l}\), \(q \bar{q}\) or \(\nu _R \rightarrow l' W^{*}\), \(W^{*} \rightarrow \nu \bar{l},q \bar{q'}\). DELPHI could constrain \(\varTheta _{eI}\), \(\varTheta _{\mu I}\) and \(\varTheta _{\tau I}\) for RHNs with a mass between 0.5 and 80 GeV.

Since the RHNs could have existed long enough to travel macroscopic distances of up to 100 cm, different signatures had to be considered and the analysis was split to tackle the short- and long-lived cases separately.

In the short-lived RHN case, depending on the particle mass, two signatures were looked for. For masses less than about 30 GeV, due to the large boost received by the RHNs, the signature would be a monojet. Backgrounds coming from leptonic Z boson decays or \(\gamma \gamma \) processes were accounted for. Higher masses open the decay channel into \(q\bar{q}\) (and a lepton, depending on the channel), and the signature in this case would be two acollinear jets which are also acoplanar with respect to the beam axis. Most of the background in this scenario came from hadronic Z decays with missing energy; a neural network was used to remove all of them from the final data.

Longer-lived RHNs were looked for using displaced vertices and calorimeter clusters. The former was useful in tracking RHNs with an intermediate lifetime; however, a cluster finding algorithm along with vertex reconstruction did not find any signals. Calorimeter clusters were used to detect the longest-lived RHNs, whose decay products would interact with the outermost layers/components of the experimental setup: the signature would be a cluster of hits in a small angular region coincident with the beam collision, which could be traced back to the initial interaction point.

The analysis was carried out assuming Majorana RHNs and yielded one candidate event and no background events. In our analysis, this means the proportionality factor is 3.09 and \(\mu \) scales as \(U_{\alpha I}^4\).

A caveat must be mentioned here: the DELPHI analysis presented bounds on the mixing in a flavour-independent manner: the limit on \(U^2\), as presented in the paper, applies equally to \(U_e^2\), \(U_\mu ^2\) and \(U_\tau ^2\), as they mention. In the mass range under consideration, the mass of the tauon will, of course, influence the strength of the limit and, as they quote, the presented bounds become weaker for masses below \(\sim 4\) GeV. However, the extent of the kinematic suppression due to the tauon mass is not quantitatively discussed; we use the limits as is, noting that it is highly likely that NA62 will subsume these bounds in the near future [158].

3.3.7 ATLAS

The process relevant for RHN production in ATLAS [159] is \(pp \rightarrow (W^{\pm })^{*} \rightarrow l^{\pm } \nu _R\). The RHNs were taken to be heavier than the W boson, allowing them to decay to a lepton and a W boson: \(\nu _R \rightarrow l^{\pm } W^{\mp }\); the W boson would then decay predominantly into a quark–antiquark pair, and the signature of this decay chain was searched for, with either two electrons or two muons in the final state (see footnote 9). Hence, in our analysis, \(\mu \propto U_{\alpha I}^4\), \(\alpha = e, \mu \). The original analysis was carried out under the assumption of Majorana RHNs, so no additional correction is necessary.

The analysis placed 95% CL limits on the two mixing angles in the mass range of 100–500 GeV. Details on the number of observed/expected events and background are available and could be cast into a likelihood function combining Poissonian and Gaussian errors; however, we find that implementing the limits in GAMBIT as a half-Gaussian with zero mean and error set at 1.64\(\sigma \) reproduces the experimental limits well enough for the purpose of a global fit.

3.3.8 CMS

With the LHC having run with a center-of-mass energy of 13 TeV, the CMS detector searched for different event signatures of the same process as ATLAS. \(95\%\) CL limits were calculated for \(U_{eI}\) and \(U_{\mu I}\) for RHNs with mass between 1 GeV and 1.2 TeV [160].

As before, Majorana RHNs were assumed in the analysis, and our implementation of the limits mirrors that of ATLAS.

Note that updated bounds from ATLAS [161] and CMS [162, 163] have been released, but are not included, since these papers came out after our scans were completed. However, the new bounds from ATLAS are comparable to those from DELPHI, and the newer dilepton search from CMS only produces stronger bounds for RHN masses above \(\sim 500\) GeV, which is beyond our range of study.

3.3.9 LHCb

LHCb has performed direct searches for heavy neutrinos. The most recent results [164] were derived with an inconsistent model and have been corrected in Ref. [165]. They are subdominant in the mass range considered here. In Ref. [166] the results of a generic long lived particle search [167] have been re-interpreted in the context of heavy neutrinos. We do not include these results here because the conservative interpretation does not yield stronger bounds than the ones we include.

3.3.10 Other experiments

Further experiments, namely Borexino [168, 169], Bugey [170], SIN [171], BEBC [172], JINR [173], TRIUMF [128, 174], OKA [175, 176], ISTRA [177], NOMAD [178], NA62 [179], Belle [180], KEK [181, 182] and T2K [183], have also published constraints on RHNs. We do not include them here because, with the present data, they are subdominant or cover a different mass range.

4 Scanning strategy and parameter ranges

In this work, we focus on the exploration of the RHN parameter space using frequentist statistics. Our main goal is to establish the ranges of RHN parameters that are not yet explored by experiments, and a frequentist approach delivers a suitable and prior-independent method. We are dealing with a high dimensional parameter space, which we have to project into two-dimensional plots. To this end, the central quantity of interest is the profile likelihood,

$$\begin{aligned} \ln \mathcal {L}_\text {prof}(\theta _1, \theta _2) = \max _{\mathbf {\eta }} \ln \mathcal {L}(\theta _1, \theta _2, \mathbf {\eta })\;, \end{aligned}$$
(74)

which is, for fixed parameters of interest \(\theta _1\) and \(\theta _2\), the maximum value of the (log-)likelihood function that can be obtained when maximizing over the remaining parameters \(\mathbf {\eta }\).
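In practice, a profile likelihood is usually estimated from a finite set of scan samples by binning the two parameters of interest and keeping the best log-likelihood found in each bin. The following is a minimal sketch of that idea; the function name, the sample layout and the equal-width binning are assumptions made for illustration and do not reproduce the plotting pipeline used for the figures.

```python
import math

def profile_likelihood(samples, n_bins, x_range, y_range):
    """Estimate ln L_prof on a regular grid from scan samples.

    samples : iterable of (theta1, theta2, lnL) tuples; all other
              (nuisance) parameters are implicitly maximised over,
              because only the best lnL per bin is kept
    n_bins  : number of bins along each axis
    x_range : (low, high) for theta1; y_range likewise for theta2
    Returns a dict mapping bin indices (i, j) to the maximum lnL.
    """
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    best = {}
    for t1, t2, lnl in samples:
        # Skip points outside the plotted window.
        if not (x_lo <= t1 < x_hi and y_lo <= t2 < y_hi):
            continue
        i = int((t1 - x_lo) / (x_hi - x_lo) * n_bins)
        j = int((t2 - y_lo) / (y_hi - y_lo) * n_bins)
        if lnl > best.get((i, j), -math.inf):
            best[(i, j)] = lnl
    return best
```

Because only the per-bin maximum is retained, adding further samples can never lower the estimate in any bin.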

We emphasize that the main goal of this work is to establish conservative constraints on RHN mixings and masses by profiling over all relevant parameters. We do not perform a proper goodness-of-fit analysis to experimental data, which would require sampling of experimental results; given the large range of included experimental results and the sometimes limited knowledge about individual experiments this is beyond the scope of the current work. Instead, likelihoods are included in an approximate fashion that allows us to reproduce published experimental results, and we use Wilks’ theorem [184] to approximate the sampling statistics of log likelihood ratios and estimate confidence contours when necessary.

Our scanning strategy is designed to explore the complex parameter space of the RHN model such that we obtain reliable results for the projections shown in this work. To this end, we perform a large set of scans with different settings which we then merge into a single dataset. We study the normal (NH) and inverted (IH) hierarchy independently, in order to avoid artificially favouring one over the other due to the different normalisation of the active neutrino likelihoods (cf. Sect. 3.1). Hence, the full set of scans described below is carried out separately for each hierarchy.

4.1 Parameters and priors

The parameter ranges and priors for the original scans can be seen in Table 5. We emphasize that ‘priors’ here do not correspond to priors in the Bayesian sense, but rather determine the efficiency with which different regions of the parameter space are explored. For convergent scans, the results are prior-independent. We have chosen to split the complex angles \(\omega _{ij}\) into their real and imaginary parts. The active-sterile mixings depend strongly on the imaginary parts of \(\omega _{ij}\) \(\left( \varTheta ^2 \sim \frac{\exp (2\text {Im}(\omega ))}{M}\right) \) and large values of Im\(\omega \) produce mixings that are too large to pass any constraints, so we take a conservative range Im\(\omega \in [-15,15]\), and also pre-emptively disallow choices that lead to \(|\varTheta |_{ij}^2 > 1\). As discussed in Sect. 2.5, a condition for an approximate \(B - \bar{L}\) symmetry to be realized is for two RHNs to have almost degenerate masses, which extends the range of the mixings so that they can be probed by experiments. This provides motivation for using a logarithmic prior on the RHN masses, which also allows the scanner to better sample the region close to the limits of the most constraining experiments/observables.

Table 5 Parameter ranges adopted for the full model scans, with \(+\) (−) for normal (inverted) hierarchy of the active neutrino masses

The C-I parametrisation, as defined in Sect. 2.4, together with the particular parametrisation choice of R in Eq. (30), was found to not fully cover the entire parameter space. To circumvent this and ensure that all possible couplings are covered by the scans, we introduce an additional scan parameter \(R_\mathrm{{order}}\) with discrete values in [1, 6], corresponding to each of the possible permutations of the definition of R in terms of \(R^{ij}\). This allows full coverage of the coupling space and, since the likelihood is conceptually independent of the order in R (as confirmed by the data), it ensures a uniform distribution of values in the parameter \(R_\mathrm{{order}}\).
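The role of \(R_\mathrm{{order}}\) can be sketched as follows. The example assumes that R is a product of three complex rotations \(R^{23}\), \(R^{13}\) and \(R^{12}\), one per plane, and that \(R_\mathrm{{order}}\) simply selects one of the \(3! = 6\) orderings of that product. The sign convention of the rotations, the numbering of the orderings and all function names are hypothetical; Eq. (30) remains the authoritative definition.

```python
import cmath
from itertools import permutations

def identity():
    return [[1 if a == b else 0 for b in range(3)] for a in range(3)]

def rotation(i, j, omega):
    """3x3 complex rotation in the (i, j) plane by the complex angle omega."""
    r = identity()
    c, s = cmath.cos(omega), cmath.sin(omega)
    r[i][i], r[i][j] = c, s
    r[j][i], r[j][j] = -s, c
    return r

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Planes of R^23, R^13, R^12 in zero-based indices (assumed labelling).
PLANES = [(1, 2), (0, 2), (0, 1)]
# The six possible orderings of the three factors (hypothetical numbering).
ORDERINGS = list(permutations(range(3)))

def build_R(omegas, r_order):
    """Multiply the three rotations in the order selected by r_order (1..6).

    omegas : dict mapping each plane in PLANES to its complex angle
    """
    result = identity()
    for idx in ORDERINGS[r_order - 1]:
        plane = PLANES[idx]
        result = matmul(result, rotation(plane[0], plane[1], omegas[plane]))
    return result
```

Every ordering yields a complex orthogonal matrix, \(R R^T = 1\), but different orderings map the same angles \(\omega _{ij}\) to different couplings, which is why scanning over the ordering extends the coverage.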

Out of the active neutrino parameters, only \(\alpha _1\) and \(\alpha _2\) are unconstrained by oscillation data, hence they are allowed to vary freely from 0 to \(2\pi \) with flat priors. The ranges for the other neutrino phases and angles are taken as the widest of the \(3\sigma \) ranges, for normal or inverted hierarchy, from the NuFit collaboration [77], also with flat priors. The mass of the lightest active neutrino, \(m_{\nu _0}\), has a definite impact on the lower bound of \(U_I^2\) (23) [30], so we choose a logarithmic prior, which enables us to examine this impact in greater detail than a flat prior would allow and keeps the BBN limits relevant [146]. The upper limit on \(m_{\nu _0}\) is chosen as the broad cosmological bound given by Planck [97], \(\sum m_\nu < 0.23\) eV (see footnote 10). In order to better fit the active neutrino data, the mass splittings \(\varDelta m^2_{21}\) and \(\varDelta m^2_{3l}\) are chosen as scan parameters, where \(l = 1\) and \(\varDelta m^2_{3l} > 0\) for normal hierarchy and \(l = 2\) and \(\varDelta m^2_{3l} < 0\) for inverted hierarchy.

Since the construction of the mixing matrix in the C-I parametrisation depends on \(m_H\) (1-loop correction), as seen in Sect. 2.4, we take \(m_H\) as a nuisance parameter with a Gaussian distribution around its averaged measured value [101] and a flat prior. Other SM parameters are fixed to their PDG values [101].

4.2 Targeted scans

We encountered a number of challenges while sampling the full RHN parameter space. One reason is connected to the behaviour of the likelihood function over the whole parameter range. The adopted scanning algorithm (Diver, see below for details) is designed to find regions of maximum likelihood across the parameter space. However, as we will discuss later when we study the effect of each individual observable, most constraints have flat contributions to the likelihood in a large portion of the parameter space. Hence, the scanner often does not fully explore large regions with equal or worse likelihood. This happens especially near the experimental bounds. Furthermore, although high couplings are possible between the active and sterile neutrino sectors, they often lie in the symmetry protected regime, as described in Sect. 2.5, and/or require severe fine-tuning of the parameters. Again, exploring these regions turned out to be challenging.

Therefore, we designed and performed a large set of targeted scans to fully saturate the experimental bounds, the list of which can be found in Table 6. The design strategies we adopted for these targeted scans can be summarised as follows.

First, all targeted scans were performed using a differential RHN model, where the parameter \(M_2\) is replaced by \(\varDelta M_{21}\), with a logarithmic prior. This allows the exploration of the symmetry protected region, with near degenerate masses for two right-handed neutrinos.

Most of the experimental bounds occur at high couplings. To encourage the scanner to explore the high coupling regions, we therefore added an artificial likelihood to the scan that drives it towards the unexplored boundaries. To saturate the experimental bounds for each coupling \(U_{\alpha I}^2\), \(\alpha =e,\mu ,\tau \), different targeted scans were performed using this coupling slide likelihood on each of the couplings, of the form \(s\log U_{\alpha I}^2 + m \log M_I\). Table 6 shows the parameter that is optimised in each scan, \(\alpha \), and the coefficients, (s, m). This contribution was later removed from the data in the postprocessing stage.
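The coupling slide term and its later removal can be sketched in a few lines. The example assumes natural logarithms (the base is not specified in the text) and uses hypothetical function names; it only illustrates that the artificial term is added to the likelihood driving the scan and subtracted again point by point in postprocessing.

```python
import math

def slide_loglike(U2, M, s, m):
    """Artificial slide term s*log(U^2) + m*log(M); natural log assumed."""
    return s * math.log(U2) + m * math.log(M)

def driving_loglike(physical_lnL, U2, M, s, m):
    """Log-likelihood seen by the scanner: physical part plus slide term."""
    return physical_lnL + slide_loglike(U2, M, s, m)

def remove_slide(driving_lnL, U2, M, s, m):
    """Postprocessing step: recover the physical log-likelihood."""
    return driving_lnL - slide_loglike(U2, M, s, m)
```

For positive s the slide term grows with the coupling, which pushes the population towards the experimental upper bounds; negative coefficients push towards small couplings instead.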

The targeted scans were further split along the \(M_I\) axis following the limits of the various experimental constraints (mostly from direct searches). This ensures that each coupling (with the selection above) saturates the most relevant experimental upper bound in each mass range. Additionally, some scans used different values of \(\varDelta M_{21}\) and/or \(m_{\nu _0}\) to further force the scan into fine-tuned regions of parameter space. The ranges used for \(M_I\), \(\varDelta M_{21}\) and \(m_{\nu _0}\) for each scan are specified in Table 6.

A similar strategy was used to saturate the BBN bound at low couplings. Three scans were performed for each hierarchy, with slide coefficients \((s,m)=(-0.5,-0.5)\) on each coupling \(U_{\alpha I}^2\), \(\alpha =e,\mu ,\tau \). To further optimise on low couplings, these scans were performed for fixed \(m_{\nu _0} = 10^{-10}\) and a narrow range on Im\(\omega \) \(\in [-0.5,0.5]\). With these settings the BBN bound was fully saturated in the explored mass range.

We found that some of the experimental likelihoods provide positive contributions to the total likelihood in specific regions of the parameter space. This forced the scan towards those regions, leaving others unexplored. Although this is a rather interesting feature, and will be discussed in detail later, it prevented a thorough exploration of the full parameter space. We thus chose to remove the likelihood contribution of \(R^K_{e\mu }\) from the total likelihood that drives the scan, adding it later in postprocessing. Other likelihoods with positive contributions, \(\varGamma _\mathrm{{inv}}\), CKM and \(R^\tau _{e\mu }\), tended to force the scan towards large \(U_{\tau I}^2\) couplings. Although desirable to saturate the limits, this also left regions with low \(\tau \) coupling undersampled. Thus, a cut on the coupling \(U_{\tau I}^2\) was enforced in some scans to fully sample all regions.

Table 6 Set of targeted scans performed for normal (N) and inverted (I) hierarchy in addition to the full parameter scans. Parameters not shown in this table are taken as in Table 5

The adopted strategy for scanning was driven by the need to fully sample the parameter space. The results from all the diverse scans were combined into a single dataset after some postprocessing (see below). This does not pose a problem for the statistical interpretation, since we are interested in the profile likelihood, which only becomes more accurately estimated when adding additional chains.

Fig. 1 Profile likelihood in \(M_I\) vs \(U_{eI}^2\) plane for normal (left) and inverted hierarchy (right). Tables with the 90% and 95% CLs for both hierarchies can be found in Zenodo [186]

4.3 Scanning framework

To perform the detailed scans, we make use of the GAMBIT framework, as described in Appendix A, and the differential evolution scanner Diver, version 1.0.4 [40], which is a self-adaptive sampler, capable of sampling the profile likelihood more efficiently than other scanners. We choose a population size of 19200 and a convergence threshold of \(10^{-10}\). After some tests, we have concluded that the aggressive \(\lambda \)jDE setting in Diver provides an improvement on the sampling of the parameter space, since it is more suited for sampling fine-tuned regions.

These scanner settings, including the very low convergence threshold, together with the scanning strategy described above, ensure a thorough exploration of the parameter space, albeit at the price of CPU time. Despite the fact that none of the observables used required heavy computation or simulations, most scans took between 2 and 10 hours of running time on a large number of supercomputer cores varying between 250 and 780. All tests and scans were carried out across several supercomputer facilities, including the MareNostrum supercluster in Barcelona, Marconi in Bologna, LISA/Surfsara through the University of Amsterdam and Prometheus in Krakow.

4.4 Data postprocessing

Upon completion of the scans, a number of postprocessing tasks were performed on the data to prepare it for plotting. As previously mentioned, the first of these tasks was to remove the artificial coupling slide likelihood used to drive the scans to high couplings.

Due to the large number of scans performed and the low convergence threshold used, the size of the samples surpassed 1 TB for each hierarchy, rendering them unmanageable for most plotting routines. We hence performed a few operations on the scan results prior to combining them. With the target of showing profile likelihood plots in the \(M_I\) vs \(U_{\alpha I}^2\) planes, we extracted a subset of the data points optimised in these planes, with a resolution of \(10^{-5}\). Since most scans were targeted to saturate the limits for a particular coupling (see Table 6), we perform this reduction of the data in the respective mass vs coupling two-dimensional planes. The combined set will hence be optimised for all couplings. Additionally, an independent reduction of the data is performed on the planes \(m_{\nu _0}\) vs \(U_{\alpha I}^2\), since we intend to study the effect of \(m_{\nu _0}\) cuts on the coupling limits.

The flavour label of the heavy neutrinos is arbitrary, and the experimental constraints on a heavy neutrino with a given mass cannot depend on the labelling. However, for reasons explained in more detail in Appendix C, the scanning strategy outlined in Sect. 4.2 introduces a bias that suggests that the constraints differ for \(N_1\), \(N_2\) and \(N_3\). To remove this bias in the labels, after combining the reduced datasets for all the scans, we conduct a symmetrization procedure over the combined datasets: we symmetrize over \(M_I\) as well as \(U_{\alpha I}\), which increases the size of the datasets sixfold.
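A symmetrization over the heavy neutrino labels can be sketched as follows. The example assumes that each data point stores the three masses and, for each heavy neutrino, its three flavour couplings; every point is then replaced by its \(3! = 6\) relabelled copies, all carrying the same likelihood. The data layout and the function name are hypothetical.

```python
from itertools import permutations

def symmetrize(point):
    """Return the six relabelled copies of a scan point.

    point : dict with
        "M"   : [M_1, M_2, M_3]
        "U"   : one row per heavy neutrino I, each [U_eI, U_muI, U_tauI]
        "lnL" : log-likelihood of the point (label independent)
    Masses and couplings are permuted together, so each heavy neutrino
    keeps its own mixings under relabelling.
    """
    copies = []
    for perm in permutations(range(3)):
        copies.append({
            "M": [point["M"][i] for i in perm],
            "U": [list(point["U"][i]) for i in perm],
            "lnL": point["lnL"],
        })
    return copies
```

After this step, any projection onto a plane labelled by a fixed index I contains the information from all three heavy neutrinos.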

Lastly, in order to compare with the \(n=2\) case, two further datasets were obtained, for normal and inverted ordering, where the data points are required to lie in the symmetry protected region.

Out of the very large number of data points collected through our scanning procedures, a total of 40.7 million valid data samples were used for plotting. Of these, 11M (normal hierarchy) and 10M (inverted hierarchy) are optimised on the \(M_I\) vs \(U_{\alpha I}^2\) planes, and 9.9M (normal) and 9.7M (inverted) are optimised on the \(m_{\nu _0}\) vs \(U_{\alpha I}^2\) planes. The datasets with points in the symmetry protected region have over 71k and 20k valid data samples for normal and inverted hierarchy, respectively. These samples can be found in Zenodo [186].

Fig. 2 Profile likelihood in \(M_I\) vs \(U_{\mu I}^2\) plane for normal (left) and inverted hierarchy (right). Tables with the 90% and 95% CLs for both hierarchies can be found in Zenodo [186]

Fig. 3 Profile likelihood in \(M_I\) vs \(U_{\tau I}^2\) plane for normal (left) and inverted hierarchy (right). Tables with the 90% and 95% CLs for both hierarchies can be found in Zenodo [186]

Fig. 4 Profile likelihood in \(M_I\) vs \(U_I^2\) plane for normal (left) and inverted hierarchy (right). Overlaid are the lowest limits for various values of \(m_{\nu _0}\) [42]. Tables with the 90% and 95% CLs for both hierarchies can be found in Zenodo [186]

Fig. 5 Profile likelihood in \(M_I\) vs \(|U_{eI}U_{\mu I}|\) plane for normal (left) and inverted hierarchy (right). Tables with the 90% and 95% CLs for both hierarchies can be found in Zenodo [186]

Fig. 6 Profile likelihood in \(M_I\) vs \(|U_{eI}U_{\tau I}|\) plane for normal (left) and inverted hierarchy (right). Tables with the 90% and 95% CLs for both hierarchies can be found in Zenodo [186]

Fig. 7 Profile likelihood in \(M_I\) vs \(|U_{\mu I}U_{\tau I}|\) plane for normal (left) and inverted hierarchy (right). Tables with the 90% and 95% CLs for both hierarchies can be found in Zenodo [186]

4.5 Capped likelihood

The figures in this article show the so-called capped profile likelihood (unless stated otherwise), for which the likelihood at each scanned point is capped so that it corresponds to an equal or worse fit than the SM: \(\mathcal {L} = \min [\mathcal {L}_\mathrm{{SM}}, \mathcal {L}_\mathrm{{RHN}}]\). It can thus be interpreted as an exclusion-only likelihood. Capped likelihoods have been used in previous studies, particularly in the context of collider searches [187, 188]. The rationale behind the use of this capped likelihood is the presence of positive (above SM) contributions to the log likelihood from various observables. Importantly, these ‘excesses’ would not show up as localized features in the total profile likelihood: for any point in the \(M_I-U_{\alpha I}^2\) plane there is enough freedom to find \(M_J\), \(J\ne I\) with values that saturate the excess likelihood. Thus a very large fraction of the parameter points would have the maximum allowed likelihood from the combination of all excesses. This effect forces us to separate the exclusion studies from the possible signal observation. Thus, in most of the paper, we use the capped likelihood to present parameter constraints. The excess likelihoods will be discussed separately in Sect. 5.4.
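In log-likelihood terms the capping is a single comparison per point. The sketch below assumes that both the RHN and the SM log-likelihoods are available for the same set of observables; the function names are hypothetical.

```python
def capped_loglike(lnL_rhn, lnL_sm):
    """Exclusion-only log-likelihood: never better than the SM value."""
    return min(lnL_rhn, lnL_sm)

def cap_samples(samples, lnL_sm):
    """Apply the cap to a list of (parameters, lnL) pairs."""
    return [(params, capped_loglike(lnl, lnL_sm)) for params, lnl in samples]
```

Points that fit the data better than the SM are thereby assigned the SM likelihood, so only the penalties from exclusion limits shape the resulting profile likelihood.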

5 Results and discussion

5.1 General constraints on the RHN mass and mixing

The constraints are shown in Figs. 1, 2, 3 and 4 for the couplings \(U_{\alpha I}^2\) to the active neutrino flavours \(\alpha = (e,\mu ,\tau )\), as well as their combination \(U_I^2 = \sum _\alpha U_{\alpha I}^2\), as functions of the heavy neutrino masses \(M_I\). Here, the second index can refer to any of the heavy neutrino flavours \(I=(1,2,3)\), because their labelling is not physical. Figures 5, 6 and 7 show the combinations of couplings \(U_{\alpha I}U_{\beta I}\) with \(\alpha \ne \beta \). The allowed profile likelihood regions are flat for most of the parameter space, in particular for small couplings \(U_{\alpha I}^2\), and drop smoothly at high couplings following the relevant upper limits. The white lines around the experimental limits mark the 1\(\sigma \) and 2\(\sigma \) contours, which are estimated assuming Wilks’ theorem with 2 degrees of freedom (see footnote 11).

The largest values of mixings \(U_{\alpha I}^2\) and \(U_{\alpha I}U_{\beta I}\) for all flavours are allowed for \(M_I\) above the masses of the weak gauge bosons. In this regime the direct searches at colliders are sub-dominant, and the heavy neutrino properties are primarily constrained from above due to electroweak precision observables, lepton flavour violation and CKM constraints. The upper limits on the couplings \(U_{\alpha I}^2\) and \(U_{\alpha I}U_{\beta I}\) within \(2\sigma \) of the highest likelihood for each hierarchy and flavour in the high mass region can be found in Table 7. It can be readily noticed that the upper limits for the \(\tau \) couplings are much larger than for the other two flavours, which can be understood because the limits from EWPO and LFV are stronger for e and \(\mu \) (see also Sect. 5.4). In particular, the combination \(U_{eI}U_{\mu I}\) has the smallest upper limits, as also shown in Fig. 5, due to strong constraints from LFV observables, specifically \(\mu \rightarrow e\gamma \) and \(\mu - e\) conversion (see Figs. 34 and 35 in Appendix D).
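For 2 degrees of freedom the \(\chi ^2\) distribution has the closed-form cumulative distribution \(1 - e^{-x/2}\), so the log-likelihood drops corresponding to the contours can be computed directly. The sketch below assumes that the 1\(\sigma \) and 2\(\sigma \) contours correspond to the two-sided Gaussian coverage probabilities (about 68.3% and 95.4%); the function name is hypothetical.

```python
import math

def delta_lnL_threshold_2dof(n_sigma):
    """Drop in ln L relative to the best fit that marks an n-sigma contour.

    Assumes Wilks' theorem with 2 degrees of freedom, for which
    CDF(x) = 1 - exp(-x / 2), and that the n-sigma coverage is the
    two-sided Gaussian probability erf(n / sqrt(2)).
    """
    coverage = math.erf(n_sigma / math.sqrt(2.0))
    delta_chi2 = -2.0 * math.log(1.0 - coverage)
    return 0.5 * delta_chi2
```

Under these assumptions the 1\(\sigma \) and 2\(\sigma \) contours lie at \(\varDelta \ln \mathcal {L} \approx 1.15\) and \(\approx 3.09\) below the maximum, i.e. \(\varDelta \chi ^2 \approx 2.30\) and \(\approx 6.18\).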

Table 7 Upper limits on \(U_{\alpha I}^2\) and \(U_{\alpha I}U_{\beta I}\) within 2\(\sigma \) in the high mass region \(M_I \gtrsim 80\) GeV, for normal (N) and inverted (I) hierarchy

For \(M_I\) between the masses of the D mesons and the W boson the limits from direct searches dominate because the heavy neutrinos can be produced efficiently via the s-channel exchange of on-shell W bosons. In this range, the limits from the DELPHI [157] and CMS [160] experiments compete to impose the strongest bound.

Below the D meson mass the constraints on \(U_{e I}^2\) and \(U_{\mu I}^2\) are dominated by direct search constraints from fixed target experiments, in particular CHARM [151] and NuTeV [156] above the kaon mass, PS-191 [150] and E949 [155] between the pion and kaon mass and pion decay experiments at even lower masses. In this regime the global constraints on \(U_{e I}^2\) and \(U_{\mu I}^2\) are in good approximation given by the direct search constraints, as discussed in Sect. 5.2 and Figs. 8, 9 and 10. This is in contrast to the model with \(n=2\), where the global fits rule out a significant mass range below the kaon mass that appears to be allowed if one simply superimposes the direct constraints in the mass-mixing planes [27]. For \(U_{\tau I}^2\), the direct search constraints are much weaker, and the limit from long-lived particle searches by DELPHI remains the most significant one in our scans. Figure 3 shows that direct searches become subdominant for the \(\tau \) coupling and the EWPO limit is saturated for a considerable range of masses below the kaon mass.

For masses below roughly 0.3 GeV the global constraints are stronger than the sum of their ingredients due to an interplay of the lower bound from BBN on the mixings, the upper bounds on \(U_{e I}^2\) and \(U_{\mu I}^2\) from direct searches and the constraints on the heavy neutrino flavour mixing pattern from neutrino oscillation data (discussed further below in Sect. 5.3). The latter disfavours large hierarchies amongst the couplings to individual SM flavours, though these constraints are weaker than in the model with \(n=2\) [27, 190]. This implies that upper bounds on combinations of \(U_{eI}^2\) and \(U_{\mu I}^2\) indirectly constrain \(U_{\tau I}^2\). The BBN constraint on the lifetime does not impose a constraint on any individual coupling \(U_{\alpha I}^2\), but requires at least some of them to be sizeable and practically translates into a lower bound on \(U_I^2\) that is visible in Fig. 4. Both the BBN constraint and the constraint on the flavour mixing pattern (that will be discussed in more detail in Sect. 5.3 and is visible in Fig. 11) lead to the lower and upper bounds on \(U_{\tau I}^2\) that are visible in Fig. 3.

The upper bound on the total mixing \(U_I^2\) from the global constraints can roughly be identified with the bound on \(U_{\tau I}\) across the entire mass range as it is constrained the weakest. The lower bound is again given by the lifetime constraint from BBN. In addition, there is a lower bound from the requirement to explain the light neutrino oscillation data that depends on \(m_{\nu _0}\) and is therefore only visible if one imposes a cut on this unknown quantity. Our results agree with the analytic estimates made in Ref. [42], as will be discussed in Sect. 5.3, and are illustrated in Fig. 4.

Fig. 8 Profile likelihood in \(M_I\) vs \(U_{eI}^2\) plane with \(M_I <10\) GeV and overlaid direct detection limits, for normal (left) and inverted hierarchy (right)

Fig. 9 Profile likelihood in \(M_I\) vs \(U_{\mu I}^2\) plane with \(M_I < 10\) GeV and overlaid direct detection limits, for normal (left) and inverted hierarchy (right)

Fig. 10 Profile likelihood in \(M_I\) vs \(U_{\mu I}^2\) plane with \(M_I < 0.4\) GeV and overlaid direct detection limits, for normal (left) and inverted hierarchy (right)

Fig. 11 \(U_{\alpha I}^2/U_I^2\) (in percent) for different upper limits of \(m_{\nu _0}\) (see legend). Solid (dashed) lines delineate the 1\(\sigma \) (2\(\sigma \)) contours, for normal (left) and inverted hierarchy (right). As discussed in footnote 12, these constraints apply to those heavy neutrinos that can be found experimentally

5.2 Discussion of individual bounds

Figures 8 and 9 show explicitly the effect of direct searches on the upper limits of the e and \(\mu \) couplings in the mass range \(M_I \in [0.1, 10]\) GeV. Most of the limits shown are at 90% CL, with the exception of DELPHI at 95% CL. As expected, they lie between the 1\(\sigma \) and \(2\sigma \) contours. Some of the experimental limits, PS-191 and CHARM, do not directly constrain an individual coupling, but rather the combination \(U_{e/\mu I}^2 \times \sum _{\alpha } c_{\alpha }U_{\alpha I}^2\) (as mentioned in Sects. 3.3.2 and 3.3.3), with the coefficients \(c_\alpha \) from Eq. (73). As we profile over the other two couplings, the strongest limit for the \(\alpha \) flavour for these experiments would correspond to \((U_{I}^\mathrm{{exp}})^2 / c_\alpha \), with \((U_I^\mathrm{{exp}})^2\) being the reported limit by the experiment. Hence the former ratio is what is shown in the figures as the PS-191 and CHARM limits. As observed, \(U_{e I}^2\) is constrained by PS-191 and CHARM in the lowest and next-to-lowest mass regions, whereas they are superseded by the limits from E949 and NuTeV for \(U_{\mu I}^2\). In the lowest mass region for the \(\mu \) coupling it would appear that the E949 bound is in fact not saturated as the experimental limit falls below the data. This is however just an artifact of binning and interpolation in that region and the fact that the E949 limit is quite jagged. To illustrate this, we show in Fig. 10 a zoom into the lowest mass region from Fig. 9, where it can be seen clearly that the profile likelihood follows the limits of E949.

Neutrino oscillation data imposes very strong constraints on the parameter space and disfavours vast volumes of the 18 dimensional model parameter space. In the scenario with \(n=2\) this has a visible effect on the projections of the global constraints on the \(M_I\)-\(U_{\alpha I}^2\) planes [27], in particular for heavy neutrinos lighter than the kaon, where the interplay between neutrino oscillation data, BBN and direct searches rules out most values of \(M_I\). This effect strongly depends on the light neutrino mass ordering, and varying the light neutrino oscillation parameters within their experimentally allowed limits leads to visible differences [27]. In the present analysis with \(n=3\) the impact of neutrino oscillation data on the likelihoods in the \(M_I\)-\(U_{\alpha I}^2\) planes is much smaller. This is primarily visible in the third generation and the total mixing, cf. Figs. 3 and 4, where the dependence on the light neutrino mass ordering is weak. The reason is that the larger dimensionality of the parameter space with \(n=3\) makes it easier to avoid conflicts with direct or other indirect bounds. With \(n=2\) neutrino oscillation data also imposes strong constraints on the flavour mixing pattern [27, 190, 191]. These are also visible in the present analysis, cf. Figs. 11, 12 and 13, but can be avoided by choosing a sufficiently large value for \(m_{\nu _0}\). The constraints on the flavour mixing pattern strongly depend on the light neutrino mass ordering, and varying the light neutrino oscillation parameters within their experimentally allowed range has a considerable impact on the predictions.

As mentioned above, EWPO (including \(\varGamma _\mathrm{{inv}}\), \(m_W\), W-decays and \(s_w\)), LFV and CKM constraints become relevant for very large couplings and are thus the dominant limit in the high mass region, as well as a small region at small masses for the \(\tau \) coupling (Fig. 3). Besides providing constraints, in particular \(\varGamma _\mathrm{{inv}}\) and CKM observables are also responsible for the slight excesses in the total likelihood, which we will discuss in Sect. 5.4. Other constraints included in the analysis have little to no effect on the profile likelihoods as shown above.

Fig. 12
figure 12

Upper limits on the coupling ratios \(U_{\alpha I}^2/U_I^2\) within \(2\sigma \) as a function of the lightest active neutrino mass \(m_{\nu _0}\), for normal (left) and inverted hierarchy (right). As discussed in footnote 12, these constraints apply to those heavy neutrinos that can be found experimentally

Fig. 13
figure 13

Profile likelihood for \(U_{\alpha I}^2/U_I^2\) (in percent) in the limit of \(n=2\) in the symmetry protected region for normal (left) and inverted (right) hierarchy. For the detailed cuts we refer to the text

Among the leptonic decays, only \(R^K_{e\mu }\) has some impact on the likelihood, with a negative contribution at masses below 0.45 GeV. Both \(R^K_{e\mu }\) and \(R^\tau _{e\mu }\) show minor excesses in the total likelihood, which are discussed in Sect. 5.4. Other lepton universality constraints have little effect on the likelihood.

Neutrinoless double beta decay sets strong upper bounds on \(U_{eI}^2\) for generic parameter choices, which strongly disfavours considerable regions of parameter space. However, in the limit where lepton number is approximately preserved the expected signal from \(0\nu \beta \beta \) is suppressed. Since many of our parameter points are in this symmetry protected scenario, particularly at high couplings, the impact of this constraint on the likelihoods in the projection on the mass-mixing plane is minimal. This is consistent with what was found in Refs. [27, 30, 192].

The effect of BBN can be seen in the lower limits of Figs. 3 and 4. The lower limit on \(U_I^2\) is a direct consequence of BBN, as lower couplings would mean that right-handed neutrinos would not decay before BBN and thus affect the abundance of primordial elements. Although no individual limits are imposed by the BBN constraint on the couplings, the strong upper limit on the e and \(\mu \) flavours at low masses has the side effect of setting a lower limit on \(U_{\tau I}^2\), as seen in Fig. 3.

For a more detailed explanation of the effect of each partial likelihood, and associated figures, we refer to Appendix D.

5.3 Lightest neutrino mass and flavour mixing

Oscillation data strongly constrains most of the active neutrino parameters, in particular the mass splittings \(\varDelta m^2_{21}\) and \(\varDelta m^2_{3l}\), the mixing angles \(\theta _{ij}\) and the CP phase \(\delta _{CP}\), whereas the lightest neutrino mass \(m_{\nu _0}\) remains unknown. There are upper bounds from cosmology on the sum \(\sum _i m_i\) that depend on the active neutrino mass hierarchy, the data set used and the underlying cosmological model. The value quoted by the Planck collaboration for a standard cosmological model is \(\sum _i m_i < 0.12\) eV [193]; a discussion of how this value changes with different assumptions can be found, e.g., in the Particle Data Group Report [124]. Using the best fit values for the mass splittings from the NuFit data [77] and the conservative value \(\sum _i m_i < 0.23\) eV, we can infer the upper limits \(m_{\nu _0} < 7.12 \times 10^{-2}\) eV for normal and \(m_{\nu _0} < 6.55 \times 10^{-2}\) eV for inverted hierarchy.
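The step from a bound on \(\sum _i m_i\) to a bound on \(m_{\nu _0}\) can be sketched numerically as follows. The mass splittings below are approximate best-fit values inserted here as assumptions (the values actually used are those of Ref. [77]), and the function names are illustrative. The sum of masses grows monotonically with \(m_{\nu _0}\), so a simple bisection suffices.

```python
import math

# Assumed, approximate best-fit mass splittings in eV^2 (not taken from the text).
DM2_21 = 7.39e-5       # m2^2 - m1^2
DM2_3L_NH = 2.525e-3   # m3^2 - m1^2 for normal hierarchy
DM2_3L_IH = -2.512e-3  # m3^2 - m2^2 for inverted hierarchy

def mass_sum(m0, hierarchy):
    """Sum of the three light neutrino masses (eV) for lightest mass m0 (eV)."""
    if hierarchy == "normal":
        m1 = m0
        m2 = math.sqrt(m0**2 + DM2_21)
        m3 = math.sqrt(m0**2 + DM2_3L_NH)
    elif hierarchy == "inverted":
        m3 = m0
        m2 = math.sqrt(m0**2 - DM2_3L_IH)
        m1 = math.sqrt(m2**2 - DM2_21)
    else:
        raise ValueError("hierarchy must be 'normal' or 'inverted'")
    return m1 + m2 + m3

def max_lightest_mass(sum_bound, hierarchy, tol=1e-12):
    """Largest m0 (eV) compatible with sum(m_i) < sum_bound, by bisection."""
    if mass_sum(0.0, hierarchy) > sum_bound:
        raise ValueError("bound is below the minimal sum allowed by the splittings")
    lo, hi = 0.0, sum_bound  # mass_sum(sum_bound) always exceeds sum_bound
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass_sum(mid, hierarchy) < sum_bound:
            lo = mid
        else:
            hi = mid
    return lo

limit_nh = max_lightest_mass(0.23, "normal")
limit_ih = max_lightest_mass(0.23, "inverted")
print(f"normal:   m_nu0 < {limit_nh:.4f} eV")
print(f"inverted: m_nu0 < {limit_ih:.4f} eV")
```

With these assumed inputs the sketch gives limits close to the values quoted in the text; small differences would be expected if the splittings of Ref. [77] differ from the ones assumed here.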

The value of \(m_{\nu _0}\) strongly affects the lower limit on \(U_I^2\). One can obtain a reliable estimate of the lower bounds on \(U_I^2\) by setting \(\mathcal {R}=1\) [42]. This makes the PMNS matrix unitary, and the lower limit on the smallest mixing can be estimated as \(U_I^2 \gtrsim m_{\nu _0}/M_I\). Using this approximation, we show in Fig. 4 the lower limits on \(U_I^2\) that we obtain in our scans for different values of \(m_{\nu _0} = (0.05, 10^{-2}, 10^{-3}, 10^{-4})\) eV. In the case of \(m_{\nu _0} = 0\) there is no absolute lower limit on the coupling from the seesaw mechanism, and the residual lower limit on \(U_I^2\) is due to the BBN constraint.
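To give a feeling for the orders of magnitude involved, the estimate \(U_I^2 \gtrsim m_{\nu _0}/M_I\) can be evaluated directly. This is only a sketch of the approximation stated above; the function name and the sample masses of 0.1, 1 and 10 GeV are illustrative choices.

```python
EV_PER_GEV = 1.0e9  # unit conversion: 1 GeV = 1e9 eV

def seesaw_lower_bound(m_nu0_ev, heavy_mass_gev):
    """Approximate lower bound on U_I^2 from U_I^2 >~ m_nu0 / M_I."""
    return m_nu0_ev / (heavy_mass_gev * EV_PER_GEV)

for m_nu0 in (0.05, 1e-2, 1e-3, 1e-4):    # lightest neutrino mass in eV
    for heavy_mass in (0.1, 1.0, 10.0):   # heavy neutrino mass in GeV
        bound = seesaw_lower_bound(m_nu0, heavy_mass)
        print(f"m_nu0 = {m_nu0:g} eV, M_I = {heavy_mass:g} GeV: U_I^2 >~ {bound:.1e}")
```

The bound scales linearly with \(m_{\nu _0}\) and inversely with \(M_I\), so each decade in \(m_{\nu _0}\) shifts the estimated lower limit by one decade.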

The lightest neutrino mass has an important effect on the pattern of flavour mixing. In the limit of large \(m_{\nu _0}\), there is almost no constraint on the allowed flavour ratios \(U_{\alpha I}^2/U_I^2\). This is shown in Fig. 11 by the black solid (dashed) contours, which indicate the allowed region within \(1\sigma \) (\(2\sigma \)) where the lightest neutrino mass is \(m_{\nu _0} < 10\, \hbox {meV}\) (close to the cosmological bound stated above). In this case, there is no visible upper limit on \(U_{\mu I}^2/U_I^2\) or \(U_{\tau I}^2/U_I^2\) for normal hierarchy, whereas \(U_{eI}^2/U_I^2\) is constrained to be \(\lesssim 0.95\). Conversely, for inverted hierarchy there is an upper limit for the \(\mu \) and \(\tau \) flavours, but none for the e flavour. However, for smaller values of \(m_{\nu _0}\), the allowed range for the flavour mixing pattern becomes significantly constrained (see footnote 12). This is shown by the lines for \(m_{\nu _0} < 1\) meV (blue), \(m_{\nu _0} < 0.1\) meV (green) and \(m_{\nu _0} < 0.01\) meV (red). For masses lower than 0.01 meV the constraints saturate and the size of the ellipse remains almost constant. This can also be seen in Fig. 12, where the largest coupling ratio is plotted for each flavour as a function of the lightest neutrino mass.

It is instructive to compare our results to the constraints on the flavour mixing pattern in the scenario with \(n=2\) that were found in Refs. [27, 190, 191]. For this purpose it is not sufficient to simply insert very small values for \(m_{\nu _0}\) in the parameterisation (29) because such values can also be achieved due to accidental cancellations in the light neutrino mass matrix (without decoupling of any of the heavy neutrinos), cf. Sect. 2.6. To remove such fine-tuned points we impose the following cuts:

$$\begin{aligned} \frac{|M_2-M_1|}{M_2+M_1}< \epsilon ,&\quad \frac{m_{\nu _0}}{\mu \text {eV}}< 1, \nonumber \\ |F_{\alpha 3}|< \epsilon ,&\quad \frac{|F_{\alpha 1} + i F_{\alpha 2}|}{|F_{\alpha 1}| + |F_{\alpha 2}|} < \epsilon . \end{aligned}$$
(75)

Here \(\epsilon \) is an arbitrarily small number, which we choose as \(\epsilon = 0.01\) for convenience. In addition, we work in the limits defined by \(|\mathrm {Im}\omega _{23}| \gg 1\) and \(\mathrm {Re}\omega _{13} \sim \pi /2\) for normal hierarchy, and \(|\mathrm {Im}\omega _{12}| \gg 1\) for inverted hierarchy (cf. Appendix C). Note that we randomised the order of the matrices \(\mathcal {R}^{ij}\), and hence for normal hierarchy we can only reproduce the true symmetry protected regime for the permutation \(\mathcal {R} = \mathcal {R}^{23} \mathcal {R}^{13} \mathcal {R}^{12}\). The inverted hierarchy limit is independent of permutations as two of the \(\omega _{ij}\) are zero. In Fig. 13, we show the triangle plots with 1\(\sigma \) and 2\(\sigma \) contours for NH and IH in the symmetry protected region after applying the aforementioned cuts to remove fine-tuned points. The results are consistent with what was found in Ref. [190] for \(n=2\) RHNs. It is worth noting that there is a sharp upper limit on \(U_{eI}^2/U_I^2\) where the contours do not show. This is due to the hard upper limit imposed on \(m_{\nu _0}\) in order to reach the \(n=2\) case and it is, as before, consistent with the results in [190].
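The cuts of Eq. (75) can be expressed as a simple filter on scan points. The following is a minimal sketch, not the implementation used in the analysis: the function name, the argument layout (a 3x3 nested list `yukawa[alpha][I]` of complex entries \(F_{\alpha I}\)) and the example point are assumptions made for illustration, and the additional requirements on \(\omega _{ij}\) are not included.

```python
def passes_symmetry_cuts(m1, m2, m_nu0_ev, yukawa, eps=0.01):
    """Return True if a scan point satisfies the four cuts of Eq. (75).

    m1, m2    : masses of the two quasi-degenerate heavy neutrinos (same unit)
    m_nu0_ev  : lightest active neutrino mass in eV
    yukawa    : 3x3 nested list, yukawa[alpha][I] = F_{alpha I} (complex)
    """
    # Cut 1: the two heavy neutrinos are almost degenerate in mass.
    if abs(m2 - m1) / (m2 + m1) >= eps:
        return False
    # Cut 2: lightest neutrino mass below 1 micro-eV.
    if m_nu0_ev / 1.0e-6 >= 1.0:
        return False
    for f1, f2, f3 in yukawa:
        # Cut 3: the third heavy neutrino is effectively decoupled.
        if abs(f3) >= eps:
            return False
        # Cut 4: F_{alpha 2} ~ i F_{alpha 1}, so F_{alpha 1} + i F_{alpha 2} ~ 0.
        # Rows with both entries zero leave the ratio undefined; this sketch
        # chooses to reject them.
        if not abs(f1 + 1j * f2) < eps * (abs(f1) + abs(f2)):
            return False
    return True

# Hypothetical example point with F_{alpha 2} = i F_{alpha 1}.
good = [[1e-5, 1e-5j, 0.0], [2e-5, 2e-5j, 0.0], [3e-5, 3e-5j, 0.0]]
# Same magnitudes but opposite relative phase, which violates the fourth cut.
bad = [[1e-5, -1e-5j, 0.0], [2e-5, -2e-5j, 0.0], [3e-5, -3e-5j, 0.0]]
print(passes_symmetry_cuts(1.0, 1.001, 1e-7, good))
print(passes_symmetry_cuts(1.0, 1.001, 1e-7, bad))
```

Writing the fourth cut as a product rather than a ratio avoids a division by zero while implementing the same inequality.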

5.4 Discussion of excess likelihoods

In the previous subsections, we have made use of an exclusion-only ‘capped’ profile likelihood to study the constraining effect of the various observables on the parameter space (for a justification see Sect. 4.5). The total likelihood, however, shows a pattern of excesses in some small regions of the parameter space. As discussed in Sect. 4.5, experimental results with a preference for specific heavy neutrino masses and mixings would in general not show up as localized excesses in the total profile likelihood. This is because for each value of \(M_I\) it would in general be possible to find a value of \(M_J\) (with \(J\ne I\)) and associated couplings that maximize the excess likelihood, irrespective of the value of \(M_I\). In order to extract the specific masses and couplings preferred by an excess likelihood, we adopt the following strategy throughout this subsection. We only allow one of the three RHNs (which we take to be \(I=1\)) to acquire the required masses and couplings, while preventing the other two RHNs from entering the preferred region. This is emphasized in the plots by specifying \(M_1\) and \(|U_{\alpha 1}|^2\) instead of \(M_I\) and \(U_{\alpha I}^2\). Note that these results would be identical for \(M_2\) and \(M_3\).
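The difference between a capped and an uncapped profile likelihood can be illustrated with a toy sketch. Since Sect. 4.5 is not reproduced here, the sketch assumes one common convention, namely that the log-likelihood of each point is capped at the SM value before profiling; the convention actually used may differ in detail. All names and numbers below are hypothetical.

```python
def capped_loglike(loglike, loglike_sm):
    """Exclusion-only log-likelihood: never better than the SM value (assumed convention)."""
    return min(loglike, loglike_sm)

def profile(samples, loglike_sm, capped=True):
    """Maximise the (optionally capped) log-likelihood in each mass bin.

    samples: iterable of (mass_bin, loglike) pairs from a parameter scan.
    """
    best = {}
    for mass_bin, loglike in samples:
        value = capped_loglike(loglike, loglike_sm) if capped else loglike
        best[mass_bin] = max(best.get(mass_bin, float("-inf")), value)
    return best

LOGLIKE_SM = -10.0  # toy SM log-likelihood
# Toy scan: bin 1 contains a point that fits better than the SM.
scan = [(0, -12.0), (0, -10.5), (1, -9.0), (1, -11.0)]

capped_profile = profile(scan, LOGLIKE_SM, capped=True)
full_profile = profile(scan, LOGLIKE_SM, capped=False)
print("capped:  ", capped_profile)
print("uncapped:", full_profile)
```

In this toy example the capped profile is flat at the SM value in the bin with the excess, whereas the uncapped profile retains the excess, which is the quantity studied in this subsection.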

Fig. 14
figure 14

Profile likelihood in \(M_I\) vs \(U_{\tau I}^2\) plane without likelihood cap showing the excesses due to the \(\varGamma _{inv}\), CKM and \(R_\tau \) constraints, for normal (left) and inverted hierarchy (right)

Fig. 15
figure 15

One-dimensional profile likelihood for \(U_{\tau 1}^2\), \(\mathcal {L}_{total}\), and partial likelihoods for \(\varGamma _{Z}\), CKM and combination of the rest of constraints, \(\mathcal {L}_0\), in the low mass region, \(M_1 < 1\) GeV, for normal (left) and inverted hierarchy (right)

The invisible width of the Z boson is modified by the presence of the right-handed neutrinos through their mixing, as described in Sect. 3.2.1. For very high \(\tau \) couplings, \(U_{\tau I}^2 > 10^{-3}\), the prediction from the RHN model is actually a better fit to the experimental measurement than the SM, and thus there is a slight (\( < 2\sigma \)) excess. A similar effect occurs for the CKM and \(R^\tau _{e\mu }\) constraints, where the modification of the neutrino mixing in the decay products of K-mesons and \(\tau \) leptons enhances the prediction with respect to that of the SM. Figure 14 shows the excesses in the total profile likelihood in the \(M_1\) vs \(U_{\tau I}^2\) plane, zoomed in at high couplings (as discussed above, we excluded \(M_2\) and \(M_3\) from entering the excess regions). Since there are no constraints from direct searches for \(M_1 > 80\) GeV or in the range \(0.3< M_1 < 0.5\) GeV, a combined excess of about \(2\sigma \) is visible there.

Fig. 16
figure 16

One-dimensional profile likelihood for \(U_{\tau 1}^2\), \(\mathcal {L}_{total}\), and partial likelihoods for \(\varGamma _{Z}\), CKM , \(R_\tau \) and combination of the rest of constraints, \(\mathcal {L}_0\), in the high mass region, \(M_1 > 60\) GeV, for normal (left) and inverted hierarchy (right)

In order to study the impact of the different partial likelihoods on the total likelihood excess, we show in Figs. 15 and 16 the partial one-dimensional likelihoods for \(\varGamma _\mathrm{{inv}}\) (blue), CKM (green) and \(R^\tau _{e\mu }\) (pink) with respect to the total likelihood (red) for \(M_1 < 1\) GeV and \(M_1 > 60\) GeV, respectively. All likelihoods are normalised so that they show up as a bump over the combination of all other likelihoods \(\mathcal {L}_0\) (grey). These plots show that the combination of excesses from all three sources amounts to a deviation of around (high mass) or above (low mass) \(2\sigma \) with respect to the background. As observed in the figures, the effect of \(R_{e\mu }^\tau \) is rather negligible compared to the other two relevant likelihoods. Even larger couplings are severely penalised by the steep drop in the \(\varGamma _\mathrm{{inv}}\) likelihood.

Fig. 17
figure 17

Profile likelihood in \(M_1\) vs \(U_{e 1}^2\) plane without likelihood cap showing the excesses due to the \(R_K\) constraint, for normal (left) and inverted hierarchy (right)

The excesses shown in Figs. 14, 15 and 16 in \(|U_{\tau 1}|^2\), for both low and high masses, are the most significant excesses arising in our three RHN scenario, but not the only ones. At masses around the K-meson resonance, there is an even weaker excess in \(|U_{e1}|^2\), arising from the constraint on fully leptonic decays of K-mesons, \(R^K_{e\mu }\). As seen in Fig. 17, for both normal and inverted hierarchy, there is a \(\sim 1\sigma \) excess at \(M_1 \sim 0.45\) GeV. As before, we show in Fig. 18 the one-dimensional likelihoods for \(R_{e\mu }^K\) (purple) with respect to the total likelihood (red), over the background of the combination of the rest of constraints (grey). Although the \(R^K_{e\mu }\) likelihood keeps increasing for larger values of \(|U_{e1}|^2\), the total likelihood drops at the limit shown in the figures due to the constraints from the CHARM experiment (orange).

Although the identified excesses provide interesting hints towards specific regions of the RHN parameter space, they should not be over-interpreted, since their significance remains rather small and is probably consistent with statistical fluctuations. Such excesses have been observed before in EWPOs [25] (cf. also [195]) and in CKM constraints, particularly in \(\tau \rightarrow s\) transitions [30].

6 Conclusions and outlook

We presented here the first frequentist global analysis of the extension of the Standard Model by three heavy right-handed Majorana neutrinos for a large range of their masses, from 60 MeV to 500 GeV, and for normal and inverted hierarchy of the active neutrino masses. As detailed in Sect. 1.2, our analysis improves on previous studies in numerous ways. Most notable is the inclusion of a larger number of experimental constraints than in previous studies, such as EWPOs, all LFV decay channels, active neutrino mixing and masses, as well as many direct searches. Furthermore, we have performed a proper statistical combination of all constraints using a composite likelihood approach, and studied the overall constraints on the parameter space using robust profile likelihood methods. To this end, we have used the advanced BSM inference tool GAMBIT [39], which we appropriately extended with the relevant model specifications and experimental constraints.

Fig. 18
figure 18

One-dimensional profile likelihood for \(U_{e 1}^2\), \(\mathcal {L}_{total}\), and partial likelihoods for \(R_K\), CHARM and combination of the rest of constraints, \(\mathcal {L}_0\), in the low mass region, \(M_1 < 1\) GeV, for normal (left) and inverted hierarchy (right)

The results shown in Sect. 5 cover the full studied mass range for all couplings down to \(U_I^2\sim 10^{-16}\). The profile likelihood contours are consistent with the results found in previous studies. The upper limits on the heavy neutrino mixing with electron and muon flavour mostly follow the confidence levels provided by direct search experiments. In the projection of the likelihoods on the \(M_I\)-\(U_{\alpha I}^2\) planes the interplay becomes visible only in the constraints on the third generation and for masses below a GeV. This is qualitatively different from the model with only two heavy neutrinos (\(n=2\)), where the combination of direct, indirect and cosmological bounds imposes stronger constraints than each of them individually on the mixing with all three SM flavours, and this interplay can rule out a considerable mass region below the kaon mass [27]. For the first time, we studied the global constraints on the heavy neutrino flavour mixing pattern, which strongly depends on the mass of the lightest SM neutrino \(m_{\nu _0}\). We explicitly studied the limit of vanishing lightest neutrino mass, where we have shown that the flavour mixing pattern becomes significantly constrained for small values of \(m_{\nu _0}\). For \(m_{\nu _0}<\) 0.01 meV these constraints become independent of the precise value of \(m_{\nu _0}\) in both mass hierarchies, which suggests that one heavy neutrino has effectively decoupled. In this regime we demonstrated that one can recover the results found in earlier works for the model with only two RHNs.

Furthermore, we identified a few excesses in the profile likelihood, which are due to the invisible decay width of the Z-boson, the CKM unitarity constraint and \(R_{e\mu }^K\). Our best fit has a significance (w.r.t. the SM) slightly above \(2\sigma \). Although these excesses are not significant enough to favour the \(n=3\) right-handed neutrino model over the SM at the moment, improved measurements of the relevant observables could increase or decrease their significance in the future. Future \(e^+e^-\) colliders, such as the ILC, FCC-ee or CEPC, might measure EW observables, including the Z decay width, with higher precision [196] than the current value from LEP [197]. The NA62 experiment, which targets kaon decays, might be able to improve the measurements of the CKM matrix elements \(V_{us}\) and \(V_{ud}\), as well as the lepton universality ratio \(R_{e\mu }^K\) through more precise measurement of the fully leptonic decays of kaons [198].

Since the strongest constraints on the absolute value of the couplings come from direct searches, it is expected that the results obtained in this analysis will change significantly with the next generation of direct search experiments. An overview of projected sensitivities can be found, e.g., in Refs. [18,19,20]. Many of these searches can be performed at existing facilities, including the LHC, NA62, T2K or the DUNE near detector. The sensitivity of the LHC will soon be upgraded with the recently approved FASER experiment [199] and other proposed dedicated detectors [200,201,202,203,204,205,206]. In the more distant future the SHiP experiment [207, 208] can search for heavy neutrinos in the GeV mass range [209], while future colliders such as FCC [210] or CEPC [211] can explore larger masses. These experimental perspectives make the study of right handed neutrinos an exciting topic for the years to come. Additional motivation for such searches comes from cosmology because the baryon asymmetry of the universe can be explained by low scale leptogenesis for all experimentally allowed values of the mixing angles in the model considered here if the heavy neutrino masses lie below the electroweak scale [9, 57]. If any heavy neutral leptons are found in experiments then our results for their properties, such as the flavour mixing pattern as a function of light neutrino parameters, provide a powerful test to assess whether these particles are responsible for the generation of light neutrino masses and/or the baryon asymmetry of the universe [212], and to distinguish the model with three heavy neutrinos considered here from the model with two heavy neutrinos or other extensions of the SM.