Right-handed Neutrinos and $R(D^{(*)})$

We explore scenarios where the $R(D^{(*)})$ anomalies arise from semitauonic decays to a right-handed sterile neutrino. We perform an EFT study of all five simplified models capable of generating at tree-level the lowest dimension electroweak operators that give rise to this decay. We analyze their compatibility with current $R(D^{(*)})$ data and other relevant hadronic branching ratios, and show that one simplified model is excluded by this analysis. The remainder are compatible with collider constraints on the mediator semileptonic branching ratios, provided the mediator mass is of order TeV. We also discuss the phenomenology of the sterile neutrino itself, which includes possibilities for displaced decays at colliders and direct searches, measurable dark radiation, and gamma ray signals.

As discussed in Refs.[16,17] (see also Refs.[18,19]), the observed enhancements of R(D ( * ) ) can be achieved not only through NP contributions to the b → cτ ντ decay, where ν τ is the SM left-handed τ neutrino, but also via a new decay channel, b → cτ NR , where N R is a sterile right-handed neutrino.The b → cτ ν decay becomes an incoherent sum of two contributions: To streamline notation we denote ν = N R or ν τ , so that Br[b → cτ ν] = Br[b → cτ ντ ] + Br[b → cτ NR ].Since the NP couples to right-handed neutrinos, this can relax many of the electroweak constraints from the τ processes mentioned above.
In the specific context of Refs.[16,17], the b → cτ NR decay is mediated by an SU (2) L singlet W , which can be UV completed in a '3221' model.In this paper we generalize the EFT studies of Refs.[16,17] to the full set of dimension-six operators involving N R (for earlier partial studies see [20][21][22]).Assuming that the NP corrections are due to a tree level exchange of a new mediator, there are five possible simplified models for b → cτ NR , whose mediators are: the SU (2) L -singlet vector boson -the W ; a scalar electroweak doublet; and three leptoquarks.
For each simplified model we identify which regions of parameter space are consistent with the R(D ( * ) ) anomaly, subject to exclusions from the B c → τ ν branching ratio [23][24][25].We further examine the variation in the signal differential distributions expected for each simplified model.While some electroweak constraints are relaxed, these simplified models nonetheless typically imply various sizeable semileptonic branching ratios for the tree-level mediators, for which moderately stringent collider bounds already exist.We show that, depending on the ratios of NP couplings in the simplified model, these in turn set lower bounds of O(TeV) on the mediator masses.We then proceed to examine the implications for neutrino phenomenology, such as bounds from radiative contributions to the SM neutrino masses, astrophysical constraints from sterile neutrino electromagnetic decays, plausible cosmological histories that admit these sterile neutrinos, and displaced decays at colliders and direct searches.In our analysis, we will require the N R to be light -m N R O(100) MeV -in order not to disrupt the measured missing invariant mass spectrum in the full B → D ( * ) τ ν decay chain.Whether heavier sterile neutrinos are compatible with data requires a dedicated forward-folded study, performed by the experimental collaborations.
The paper is structured as follows.Section 2 contains the EFT analysis of the R(D ( * ) ) data for the case of the right-handed neutrino and introduces the five possible tree-level mediators.Collider constraints on these simplified models are studied in Section 3, while Section 4 contains the related sterile neutrino phenomenology.Our conclusions follow in Section 5. Appendix A examines the structure of the b → cτ ν differential distributions for the simplified models.

EFTs and simplified models
We consider the extension of the SM field content by a single new state, a right handed, sterile neutrino transforming as N R ∼ (1, 1, 0) under SU (3) c × SU (2) L × U (1) Y .This state may couple to the SM quarks via higher dimensional operators.Above the electroweak scale, one therefore adds to the renormalizable SM Lagrangian the following effective interactions, where Q a are dimension-d operators, C ad are the corresponding dimensionless Wilson coefficients (WCs), and Λ eff is the effective scale defined to be TeV . (2. 2) The most general basis of dimension-6 operators that can generate the charged current b → cτ NR decay is given by Here a, b are SU (2) L indices, ab is an antisymmetric tensor with 12 = − 21 = 1, and we use the four-component notation, with Q L the SM quark doublet, u R and d R the up-and down-quark singlets, and L L the SM lepton doublet.(As usual, there is only one nonvanishing tensor operator, since σ µν P L ⊗ σ µν P R = 0, which immediately follows from the relation σ µν ⊗ σ µν γ 5 = σ µν γ 5 ⊗ σ µν .)One may also include the dimension-8 operator where H = H * , as well as the operators with the left-handed sterile neutrino field, N c R , that start at dimension-7, and the dimension-9 equivalent of Q VL , Each of the SM fields also carries a family index, i.e., , and similarly for the Wilson coefficients, C ijk ad , and the operators, Q ijk ad , in Eq. (2.1), which we have omitted for the sake of simplicity.Since we focus exclusively on the generation of b → cτ ν decays below, we drop the family indices hereafter, unless otherwise stated.Consistency with bounds from direct searches requires that the Wilson coefficients in Eq. (2.1) be at most O (1).
Below the electroweak scale, the top quark, the Higgs, and the W and Z bosons are integrated out.At the scale µ ∼ m c,b , the effective Lagrangian, including SM terms (see, e.g., [26]), can be written in which the NP contributions to b → cτ ν, induced by the dimension-6 operators in (2.3), are described by the following four-fermion operators, The scalar and tensor operators run under the Renormalization Group.The RG evolution from M > m t to µ < m b gives at one-loop order in the leading log approximation for the Wilson coefficients at the low scale [27,28], for X = SR, SL, T, and cR t L ντ N R .The Wilson coefficients of these operators, c s SR,T,SL , correspond to c SR,T,SL , respectively, up to one-loop or higher-order corrections.
Each of the dimension-six operators in Eq. (2.3) can arise from the tree level exchange of a new state, either a scalar or a vector.The possible mediators, together with the Wilson coefficients c X they can contribute to, are listed in Table 1.Two of these mediators are color singlets: the charged vector resonance W µ , discussed extensively in Refs.[16,17], and the weak doublet scalar Φ.The remaining mediators are leptoquarks, for which we use the notation from Ref. [29].In some cases the structure of the mediator Lagrangian, δL int , implies relations between the various Wilson coefficients, denoted by equalities in Table 1.In particular, for the R2 and S 1 models, c SR (Λ eff ) = ±4c T (Λ eff ), which evolves to at the B meson scale.
For completeness, we list the remaining b → cτ NR dimension-6 operators at µ ∼ m c,b , ) The generation of these operators from the electroweak scale four-Fermi operators (2.4)-(2.6)requires additional insertions of the Higgs vev, v EW , and, apart from O VL , also the left-handed sterile neutrino N c R .These O a operators are the same as those in Ref. [27],  The SM predictions, e.g.making use of the model-independent form factor fit 'L w≥1 +SR' of Ref. [8] (see also Refs.[9,10]), are R(D) th = 0.299 ± 0.003, R(D * ) th = 0.257 ± 0.003, corr.= +0.44 . (2.15) With the addition of a right-handed neutrino decay mode, the B → D ( * ) τ ν decays become an incoherent sum of two contributions: the SM decay b → cτ ντ and the new mode b → cτ NR .The N R contributions therefore increase both of the B → D ( * ) τ ν branching ratios above the SM predictions, as would be required to explain the experimental measurements of R(D ( * ) ).
In Fig. 1, we show for each simplified model of Table 1 the accessible contours or regions in the R(D) − R(D * ) plane, compared to the experimental data.The predictions for NP corrections to R(D ( * ) ) are obtained from the expressions in Ref. [30], making use of the form factor fit 'L w≥1 +SR' of Ref. [8].This fit was performed at next-to-leading order in the heavy quark expansion, with matching scale µ = √ m b m c and quark masses defined in the Υ(1S) scheme, relevant for a self-consistent treatment of the B c → τ ν constraints below.The W and R2 simplified models have only a single free Wilson coefficient and are constrained to a contour: Since the N R contributions add incoherently to the SM, the phase of each Wilson coefficient is unphysical.By contrast, Φ, U 1 , and S 1 have two free Wilson coefficients, corresponding to two free magnitudes and a physical relative phase, permitting them to span a region.
Assuming first that all Wilson coefficients are real, we show in Fig. 2 the 0.5σ, 1σ CLs (dark, light blue) and 1.5σ, 2σ CLs (dark, light green) in the relevant Wilson coefficient spaces for each simplified model.These CLs are generated by the χ 2 defined with respect to the R(D ( * ) ) experimental data and correlations (2.14), not including the possible effects of NP errors.That is, The χ 2 CLs (dof =2) in Fig 2 then correspond simply to projections of the CL ellipses in Fig. 1.We will hereafter refer to the minimal χ 2 points in the WC space for each simplified model as the model's 'best fit' points with respect to the R(D ( * ) ) results (2.14), though it should be emphasized that this is not the same as a NP WC fit to the experimental data, which would require inclusion of the NP errors in the underlying experimental fits.In Fig. 2 the best fit points are shown by black dots, with explicit values provided in Table 2.For the W and R2 models, we show the explicit χ 2 , as well as the intervals corresponding to 1σ and 2σ CLs (dof = 2).
The additional NP currents from the operators (2.8) also incoherently modify the τ ν decay rate with respect to the SM contribution (cf.Refs.[23,24]), such that ). Self-consistency with the form factor treatment of Ref. [8] requires these masses to be evaluated at µ = √ m b m c in the Υ(1S) quark mass scheme.In Fig. 2 we show the corresponding exclusion regions for the relevant Wilson coefficient spaces (shaded orange), requiring Br(B c → τ ν) < 10% [23,24].For a sense of scaling, we also include a more aggressive Br(B c → τ ν) < 5% exclusion demarcated by a dashed orange line.One sees that the Φ simplified model is excluded, while the R2 2σ CL is not quite excluded by the Br(B c → τ ν) < 10% constraint.The U 1 and S 1 best fit points are in mild tension with the aggressive Br(B c → τ ν) < 5% exclusion, but also exhibit allowed regions for their 1σ CLs.
Lifting the requirement of real Wilson coefficients, the Φ, U 1 , and S 1 models now have a physical phase and inhabit a three dimensional parameter space: two Wilson coefficient magnitudes, schematically denoted |c 1,2 |, and a relative phase ϕ.For the basis of Wilson coefficients defined by the N R operators (2.8), however, the amplitudes for the B → D ( * ) lν decay alone have no physical relative phases.(Physical phases do exist once the D * and τ decay amplitudes are included.)Consequently, for a given choice of |c 1,2 |, there may exist a nontrivial value for cos ϕ that minimizes the χ 2 for R(D ( * ) ) in Eq. (2.16).We refer to this scenario as the 'phase optimized' case, denoted ϕ = ϕ 0 (|c 1 |, |c 2 |).In explicit numerical terms, for the form factor and R(D ( * ) ) inputs described above, the Φ, U 1 , and S 1 models have non-trivial solutions

Real
valid only on the domain | cos(ϕ 0 )| < 1, and otherwise cos(ϕ 0 ) = ±1.These phaseoptimized CLs for the Φ, U 1 , and S 1 models are shown in Fig. 3, with the explicit best fit points listed in Table 2.The best fit points for U 1 and S 1 remain the same, and one sees that these models continue to have non-excluded 1σ CLs.An additional best fit point emerges for the Φ simplified model; however, this model remains excluded, and we therefore do not consider it further in this paper.
Finally, the exchange of mediators that generates the c SR,T Wilson coefficients also results in c s SR,T of similar size (see Eq. (2.11)).The two operators in Eq. (2.11) contribute to b → sν ν rates.This gives, for instance, for the B → Kν ν decay rate (far enough from the kinematic threshold so that we can neglect all the final state masses) [33] dΓ B→Kν with the three B → K form factors, f 0 (q 2 ), f + (q 2 ), f T (q 2 ), functions of q 2 , the invariant mass squared of the neutrino pair, and z = q 2 /m 2 B .The present experimental bound, Br(B + → K + ν ν) < 1.6 × 10 −5 [34], is only a factor of a few above the SM prediction, [35].This implies that c s SR and c s T are highly suppressed, to the level of O(10 −2 ), introducing tensions with the required size of c SR , c T to explain the R(D ( * ) ) anomaly.In the single mediator exchange models in Table 1, this means that the product α 3 Ld α 2 QN for R2 and the product z 3 d z 2 Q for S 1 (and y 32 d for Φ) need to be much smaller than what is required to explain R(D ( * ) ).This excludes the R2 as a simple one mediator solution to R(D ( * ) ): Additional operators coupling to the second generation of quark doublets must be introduced, whose couplings are tuned appropriately to suppress the contributions to b → sν ν.However, this approach would in turn induce large radiative contributions to the neutrino masses, which would also need to be tuned away (see Sec. 4).The S 1 model also generates too large a b → sν ν transition rate at the (non-excluded) best fit point, where c SR and c T are nonzero.The dangerous b → sν ν contribution can be suppressed by taking z 23  Q → 0 (see Table 1), which forces c SR = c T → 0. This c SR = c T = 0 point leads to only a small change in χ 2 , corresponding to a less than 0.5 σ shift in significance, see Fig. 2.

Differential distributions
The reliability of the above R(D ( * ) ) fit results turns upon the underlying assumption that the differential distributions, and hence experimental acceptances, of the B → D ( * ) τ ν de-cays are not significantly modified in the presence of the NP currents.The B → D ( * ) τ ν branching ratios are extracted from a simultaneous float of background and signal data, so that significant modification of the acceptances versus the SM template may alter the extracted values.
To estimate the size of these potential effects, we examine the cascades B → (D * → Dπ)(τ → ν ν τ )ν and B → D(τ → νν)ν, comparing the purely SM predictions with the predictions for the 2σ fit regions of the simplified models.We take N R to be massless, and include the phase space cuts, as an approximate simulation of the BaBar and Belle measurements performed in Refs.[2,3].These distributions are generated as in Ref. [30], using a preliminary version of the Hammer library [36].In Appendix A we show the variation of the normalized differential distributions over the 2σ fit regions in Fig. 2 -i.e.assuming real couplings, for simplicity -for the detector observables E D , E , m 2 miss , cos θ D and q 2 compared to the SM distributions.As already found in Ref. [17], the variation of the W model with respect to the SM is negligible.However, the R2 , U 1 and S 1 theories, since they include interfering scalar and/or tensor currents, may significantly modify the spectra, as seen also in Ref. [30] for the NP tensor current coupling to a SM neutrino.Thus, a fully self-consistent R(D ( * ) ) fit for these models will require a forward-folded analysis by the experimental collaborations: Our analysis above and CLs should be taken only as an approximate guide, within likely 1σ variations in the values of R(D ( * ) ).

Collider constraints on simplified models
The simplified models are subject to low energy flavor constraints as well as bounds from collider searches.These depend crucially on the assumed flavor structure of the couplings in Table 1.Furthermore, the sensitivity of the collider searches depend on other open decay channels of the mediators.In this section, we discuss these constraints for the simplified models.
For the S 1 and R2 models, the best fit points are naively excluded by bounds on b → sν ν transitions.These can be avoided by including higher dimensional operators, due to a new set of heavy states, inevitably introducing greater model dependence for LHC studies.To remain as model independent as possible, we study the collider signatures for these models using their (B c → τ ν consistent) best fit points for R(D ( * ) ) as a benchmark, assuming that any new fields required to ameliorate large b → sν ν (and/or large neutrino mass contributions) are sufficiently heavy that they do not affect mediator production or decay.

W coupling to right-handed SM fermions
The charged vector boson W µ couples to SU (2) L singlets only, and transforms as where i, j = 1, 2, 3 are generational indices.As in Table 1, the coefficients c ij q and c i N encode the flavor structure of the interactions, while g V is the overall coupling strength (in simple gauge models for W it can be identified with the gauge coupling constant [16,17]).A tree level exchange of W generates the operator O VR , cf. eqs.(2.8b) and (2.7), with The best fit values for c VR in Table 2 then imply [17] m W 540 c 23 GeV .
In Fig. 4 we show the minimal set of experimental constraints on such models, applicable to the simplified W model.For this plot we set c 23 q = c 3 N , take Eq.(3.3) to provide the W mass that fits the R(D ( * ) ) data, and set the W couplings to all other SM quarks to zero.For this scenario, the ATLAS search at 13 TeV with 36.1 fb −1 luminosity [37] converts to a 95 % CL bound on Br(W → τ ν) shown in Fig. 4 (blue line), see also Refs.[38,39].The dashed blue line denotes a naive extrapolation of the expected bound from Ref. [37] to the end of the high-luminosity LHC Run 5, assuming 3000 fb −1 integrated luminosity at 14 TeV.For c 23 q = c 3 N the two branching ratios of W are Br(W → τ ν) : Br(W → 2j) 1 : 3; the former is denoted by the horizontal grey dashed line in Fig. 4. The two branching ratios can be correspondingly smaller if other decay channels are open (for instance, to extra vector-like fermions, as contemplated in Refs.[16,17]).The grey shaded region is excluded by unitarity, which constrains 3(c 23 q ) 2 + (c 3 N ) 2 < 16π/g 2 V [40].Bounds on W from di-jet production [41][42][43][44][45] are less stringent and are not relevant for this simplified model.
Since the W µ couples to right-handed quarks, there is significant freedom in terms of the flavor structure of the c ij q and c i N couplings.We have limited the discussion to the minimal case, taking only c 23 q , c 3 N = 0, which is non-generic but possible, for instance, in flavor-locked models [17,46].In most flavor models all the c ij q , c i N are non-zero, leading to constraints from precision measurements.In UV completions (see Refs. [16,17]), the W boson is expected to be accompanied by a Z state.The Z can, however, be parametrically heavier than the W , in particular if additional sources of symmetry breaking are present.The collider constraints on W and Z are often comparable, while the flavor constraints from FCNCs are far more stringent for Z in the presence of any appreciable off-diagonal couplings [17]: Contributions from W exchange to flavor changing neutral currents only arise at one-loop and are significantly less constraining.

Vector leptoquark U µ 1
The interaction Lagrangian for the U µ 1 ∼ (3, 1) 2/3 vector leptoquark is while the kinetic term, following the notation in [47], is with U µν = D µ U 1ν − D ν U 1µ the field strength tensor, and κ a dimensionless coupling.When the leptoquark is integrated out, eq.(3.4) gives two four-fermion operators, relevant for R(D ( * ) ) anomalies, with the Wilson coefficients The best fit values for the U 1 WCs in Table 2 then imply TeV , with where we used the lower set of best fits for U 1 in Table 2 (the upper set is excluded by TeV . (3.9) At the LHC, the U 1 leptoquark can be singly or pair produced.The pair production, pp → U 1 U † 1 , proceeds through gluon fusion, via the color octet term in (3.5), for which we take κ = 1 following Ref.[48].The collider signatures of U 1 pair production depend on the U 1 decay channels.In the minimal set-up we switch on only three couplings, α 33  LQ , α where Here, for simplicity, we have neglected the final state masses and the small corrections due to the off-diagonal CKM matrix elements in the The presence of left-handed quark doublets also inevitably leads to CKM suppressed transitions U 1 → cν τ , uν τ , sτ, dτ .
The corresponding LHC bounds for U 1 are shown in Fig. 5, assuming no other decay channels are open.The most stringent bounds come from pp → U 1 U 1 pair production, with both leptoquarks decaying either as U 1 → cN R [48] (grey region) or U 1 → bτ [49] (brown region).Ref.
[48] also gives bounds for the decay channel U 1 → tν τ , which are not shown in Fig. 5 as they are always weaker in our setup.We see that direct searches still allow for m U 1 ≥ 1.5 TeV, where the parameters of the model are still perturbative, as an explanation for the R(D ( * ) ) anomalies.It is worth noting that a simultaneous fit to all three decay channels by the experiments would improve the sensitivity to U 1 ; such an analysis is likely the most optimal strategy for discovering a U 1 state responsible for the R(D ( * ) ) anomalies.

Scalar leptoquark S 1
The scalar leptoquark S 1 ∼ ( 3, 1) 1/3 has the following interaction Lagrangian, Integrating out the leptoquark generates the following interaction Lagrangian above the electroweak scale where the operators The two operators in the second line give rise to the b → cτ ν i decay for z 3i Q z 23 u = 0, where ν i are the SM neutrinos, which interfere with the SM contribution; for simplicity, we therefore only consider the b → cτ NR decay, setting z 3i Q = 0, so that only the operators in the first line in (3.13) are generated (alternatively, one may consider the regime z u , z Q z d , so that the contribution from the second line is negligible).In the analysis of collider constraints, we conservatively keep only the minimal set of S 1 couplings required for the R(D ( * ) ) anomaly nonzero: The best fit values for the S 1 WCs in Table 2 then imply TeV , where we have defined The resulting bounds from pp → S 1 S 1 pair production at the 13 TeV LHC are shown in Fig. 6.The grey shaded region is excluded by the CMS search [48] with 35.9 fb −1 integrated luminosity, assuming both S 1 decay as S 1 → bN R with the branching ratio in (3.17).The brown shaded region is excluded by the CMS search [49] using 12.9 fb −1 integrated luminosity, assuming pp → S 1 S 1 followed by S 1 → cτ decay, with the r du dependent branching ratio in (3.17).We have assumed the S 1 best fit mass relation (3.16) to R(D ( * ) ) data to derive these bounds.Our analysis indicates that the S 1 leptoquark can be consistent with the R(D ( * ) ) anomaly for m S 1 as low as 600 GeV, and with perturbative couplings (the required values of z 23  u are shown by dashed blue lines in Fig. 6).

Scalar leptoquark R2
The scalar leptoquark R2 ∼ (3, 2) 1/6 has the following interaction Lagrangian, Integrating out the R2 generates The best fit values for the R2 WC in Table 2 then imply TeV . (3.21) The leptoquark doublet R2 contains two states: the charge +2/3 state R2/3 .Keeping only the couplings relevant for the R(D ( * ) ) anomaly nonzero, α 33  Ld , α 2 QN = 0, the R2 states have two decay channels where we have neglected differences due to the masses of the final state particles.Assuming R2/3

Sterile Neutrino Phenomenology
In this section, we discuss the phenomenology associated with the right-handed (sterile) neutrino N R .As we will see below, the coupling of N R to the SM fermions through one of the higher dimension operators in Eq. (2.8), needed to explain R(D ( * ) ), carries interesting implications for neutrino masses, cosmology, and collider signatures.We will assume that N R is a Majorana fermion with mass O(100) MeV so that it remains compatible with the measured missing invariant mass spectrum in the B → D ( * ) τ ν decay chain.As in Sec. 3, we do not consider the Φ model as it is excluded by B c → τ ν constraints.

Neutrino masses
The effective operators (2.8) induce a N R -ν L Dirac mass at the two loop order via contributions of the form Here, the simplified model mediator has been integrated out, producing an effective fourfermion vertex, shown in gray.Depending on the chiral structure of the simplified model, various mass insertions are mandated on the internal quark and lepton lines.In particular, the O VR operator requires three mass insertions, while the scalar and tensor operators require only one.The corresponding Dirac masses can be estimated as ) ) In the above estimates, we have ignored O(1) prefactors and loop integral factors apart from those implied by naïve dimensional analysis.Note that for diagrams with a single mass insertion, the Wilson coefficients c SL , c SR appear without the 1/Λ 2 eff prefactor.In such cases, strictly speaking, it is the couplings of the mediators rather than the Wilson coefficients that should appear in the estimates.However, since the collider constraints require mediators to be heavy, with mass approximately equal to Λ eff , it is a reasonable approximation to use the Wilson coefficients everywhere in the above estimates.
Furthermore, for R2 , U 1 , and S 1 mediators, which couple to the left-handed τ L , there are additional two loop contributions to the neutrino mass matrix arising from the SU (2) L related operators involving ν L .A representative diagram is shown in Fig. 8.While such diagrams contain similar mass insertions and WC scalings as the corresponding c SL,SR terms in Eqs.(4.2), they are GIM suppressed and thus expected to produce only subleading corrections to the Dirac mass estimates in Eqs.(4.2).
Since N R is assumed to have a Majorana mass m N R 100 MeV, the contribution to the SM neutrino masses is ∼ m 2 D /m N R , which should not exceed the observed neutrino mass scale m ν ∼ 0.1 eV.From the best fit regions shown in Figs. 2 or 3 (and the best fit values from Table 2), it follows that the W -mediated diagram gives a Dirac mass m D ∼ 10 −3 eV, which is consistent with observed neutrino masses, whereas the R 2 mediated digram gives m D ∼ 100 eV, which is in some tension for m N R 10 MeV.Likewise, the U 1 and S 1 models produce similarly problematic contributions to the neutrino masses at their best fit points (see Table 2).However, from Figs 2 and 3 we also see that the 1σ CLs of the U 1 and S 1 models do contain regions with the scalar Wilson coefficients |c SL,SR | 1, corresponding to small couplings α LQ 1 and z Q 1 (cf.Eqs.(3.6) and (3.14)), which remain compatible with observed neutrino masses.
If additional operators are present, neutrino mass contributions can also be generated at one loop.For instance, as discussed in Sec.2.2, new operators coupling to second generation quark doublets can be introduced to cancel away large contributions to b → sν ν from the operators in Eq. (2.11).Such 1-loop neutrino mass contributions scale as m ∼1 16π 2 m f and, depending on whether the new operators couple to νν or νN R , contribute to the Majorana or Dirac mass terms for the neutrinos.Unless suppressed by small couplings in the diagram, such mass contributions are generally several orders of magnitude larger than what is allowed by the observed neutrino mass scale m ν ∼ 0.1 eV, and would need to be cancelled by fine-tuned values of bare neutrino masses.
Additional Dirac mass contributions beyond the diagrams considered above could worsen or improve the outlook.For instance, if the mediators also couple to other quarks, in particular the top quark, the corresponding two loop diagrams with a top quark mass insertion would lead to unacceptably large contributions to neutrino masses.On the other hand, additional Dirac mass terms that interfere destructively with the two loop contributions here could restore consistency in otherwise problematic regions of parameter space, albeit at the cost of some fine-tuning of parameters.

Sterile Neutrino Decay
The two loop diagrams considered above also give rise to the decay process N R → νγ via the emission of a photon from one of the internal propagator lines (a representative diagram is shown in Fig. 9 (left)).The approximate N R → νγ decay rates 1 for the simplified models, along with the corresponding decay lifetime estimates, are listed in Table 3 (for related calculations, see Ref. [50][51][52][53]).Note that for a given mediator and sterile neutrino mass m N R , the decay rate is completely fixed by the Wilson coefficients consistent with the R(D ( * ) ) anomaly.
For appreciable mixing between N R and the SM neutrinos, the leading tree-level decay is into three SM neutrinos (Fig. 9 center) and, if kinematically accessible, into charged leptons (Fig. 9 right).The N R → 3ν decay rate is where θ is the mixing angle between N R and the SM neutrino.The N R → 3ν decay width is in general subdominant to the N R → νγ decay width induced by the R(D ( * ) ) anomaly.For a direct comparison, one can rewrite the N R → νγ decay rate in Table 3 in terms of Sterile neutrino decay modes induced by the NP couplings (left) and by tree level sterile-active mixing (centre, right).
Table 3. Approximate N R → νγ decay rates (middle column) and lifetimes (final column) for the mediators listed in the first column.For U 1 (S 1 ), we only show the contribution from the c SL (c SR ) operators, which are expected to dominate; if these coefficients vanish, the decay rates and lifetimes get contributions from c VR of the same form as that for the W operator.
the Dirac mass from Eq. 4.2, then convert to the mixing angle via sin θ ≈ m D /m N .For instance, for S 1 this gives Γ(N → νγ)

Sterile Neutrino Cosmology
The above estimates imply that the sterile neutrino N R can be fairly long-lived.The interactions with SM fermions mandated by consistency with the R(D ( * ) ) anomaly also lead to copious production of N R in the early Universe.The cosmological aspects of the sterile neutrino therefore require careful treatment.
The interactions with SM fermions thermalize the N R population with the SM bath at high temperatures.These interactions are active until the temperature drops below the masses of the SM fermions involved in these interactions, i.e., around the GeV scale.Since we have assumed m N R 100 MeV, the N R abundance is not Boltzmann suppressed, and N R survives as an additional relativistic neutrino species in the early Universe.It then becomes crucial to determine the fate of this N R population.
For the R2 , U 1 , and S 1 mediated models, it follows from Table 3 that the N R lifetime is ∼ 10 14 (m N R /keV) −3 s.For m N R ∼ O(eV-keV), this implies a late decay of the N R population into the γν channel, which injects an unacceptable amount of photons into the diffuse photon background.The exception are masses close to the upper limit of the range we consider, m N R 100 MeV, for which the lifetime is reduced to 1 s.The decays then occur before big bang nucleosynthesis (BBN) and do not leave any visible imprints.
In contrast, for the W mediated case (or for U 1 , S 1 in the parts of the Wilson coefficient 1σ CL regions where c SL , c SR are vanishingly small), the lifetime is much longer because of the additional mass insertions in the decay diagrams, and a lifetime 1s cannot be achieved for any realistic choices of parameters.However, for m N R 100 keV, the sterile neutrino has a lifetime greater than the age of the Universe and could in principle form a component of dark matter or dark radiation.
The dark matter and dark radiation possibilities of N R in the W model have been extensively discussed in Ref. [17].In contrast to traditionally studied frameworks of sterile neutrino dark matter, where the relic abundance is produced via freeze-in mechanisms (see, e.g., [54][55][56][57][58][59]), the W model involves the sterile neutrino freezing out as a relativistic species, leading to too large of a relic abundance for masses greater than O(keV).This can be fixed with appropriate entropy dilution from, for instance, late decays of GeV scale sterile neutrinos [52,52,60,61], which also makes the dark matter colder, improving compatibility with warm dark matter constraints.The γ-ray bounds from various observations [62] rule out dark matter lifetimes of O(10 26−28 ) s in the keV-MeV window, ruling out the case that N R constitutes all of dark matter.This leaves us with the possibility that N R may constitutes a small fraction -at the sub-percent level -of dark matter.Future γ-ray observations will probe this possibility and could discover a line signal from the N R → γν decay.For masses m N R keV, N R can act as dark radiation and contribute to the effective number of relativistic degrees of freedom ∆N eff ≈ O(0.1) at BBN and/or CMB decoupling, which could be detected with future instruments such as CMB-S4 [63].Lifetimes shorter than the age of the Universe, however, are incompatible with current observational constraints.

Displaced Decays at Direct Searches and Colliders
As discussed in the previous section, in the R2 , U 1 , and S 1 models, cosmology favors the regime m N R ∼ 100 MeV, with a lifetime 1s.Since the dominant decay channel is N R → νγ, this would give rise to displaced decays into a photon+MET.Such displaced signals could provide an interesting, but challenging, target for proposed detectors such as SHiP [64], MATHUSLA [65], FASER [66], and CODEX-b [67].Displaced decays can also occur in the W UV completion of Refs.[16,17], where, as discussed earlier, GeV scale sterile neutrinos with lifetimes 1s might be needed to entropy dilute problematic overabundances of the N R ; these can also lead to several other observable signals at various direct and cosmological probes (see, e.g., the discussion in [68]).

Conclusions
We have performed an EFT study of the lowest dimension electroweak operators that can account for the R(D ( * ) ) anomalies, assuming they arise because of incoherent contributions from semitauonic decays involving a right-handed sterile neutrino N R .These dimension-six operators can arise from a tree-level mediator exchange in five possible simplified models.We examined the fits and constraints for each simplified model.While all five models have 1σ fit regions consistent with the R(D ( * ) ) data, the case of the scalar doublet mediator is conservatively in tension with constraints from Br[B c → τ ν], while the experimental bounds on b → sν ν rates are in tension with the predicted rates from the scalar leptoquark R2 .
The fit regions of remaining three simplified models imply sizeable semileptonic branching ratios for the tree-level mediators.We have shown that each model can be further compatible with already quite stringent collider constraints provided the mediator masses are O(TeV), while their couplings may still remain in the perturbative regime.This analysis indicates promising paths to future discovery at the LHC of the tree-level mediators, with couplings and masses consistent with the fit to the R(D ( * ) ) data.The W mediator can likely be probed in the W → τ ν channel at the high-luminosity LHC (see Fig. 4 and surrounding discussion).The vector leptoquark U µ 1 can best be probed at the LHC with simultaneous fits to the three decays U 1 → cN R , U 1 → bτ and U 1 → tν τ .Likewise, the scalar leptoquark S 1 can be probed via S 1 → bN R and S 1 → cτ decays.Since the mediators cannot be arbitrarily heavy if the couplings are to remain perturbative, prospects of detecting them at the LHC are quite encouraging.
We have also discussed the phenomenology associated with the sterile neutrino N R .In simplified models involving R2 , U 1 , and S 1 , constraints from contributions to neutrino masses as well as cosmology indicate a preference for m N R ∼ 10 -100 MeV with a decay lifetime 1s in the dominant channel N R → νγ.This opens up the potential for detecting displaced decays of N R at various detectors.It also implies potentially measurable distortions of the kinematical distributions in semileptonic B meson decays due to the heavy sterile neutrino in the final state.For the W simplified model, the predicted contribution to neutrino masses is much smaller and poses no constraints on the model.The predicted decay lifetime of N R is correspondingly much longer than the age of the Universe.Consequently, a significant relic abundance of N R is likely present in the universe, which can contribute to dark radiation and give measurable deviations to the effective number of relativistic degrees of freedom ∆N eff ≈ O(0.1) at BBN and/or CMB decoupling for m N R keV, or constitute a small fraction of dark matter for N R in the keV-MeV mass range with possible gamma ray signals at future probes.
The interpretation of the R(D ( * ) ) anomaly in terms of new physics coupling the SM fermions to a right-handed sterile neutrino is therefore an exciting possibility with testable predictions in multiple directions, spanning kinematic distributions of the measured B meson decays, searches for heavy TeV scale particles at the LHC, displaced decay signals at various detectors, as well as astrophysical and cosmological signatures.
for Physics, which is supported by National Science Foundation grant PHY-1066293.DR thanks Florian Bernlochner, Stephan Duell, Zoltan Ligeti and Michele Papucci for their ongoing collaboration in the development of Hammer, which was used for part of the analysis in this work.The work of DR was supported in part by NSF grant PHY-1720252.

A Differential distributions
In this appendix we collect the predictions for several normalized differential distributions for B → (D * → Dπ)(τ → ν ν τ )ν and B → D(τ → ν ν τ )ν decay chains, shown in the left and right columns in Figs.10-13, respectively.In each plot, the SM predictions (blue dashed curves) are compared with the predictions for the particular simplified model (grey bands), obtained by varying the relevant Wilson coefficients over the 2σ regions in Fig. 2 The comparison between the SM predictions (blue dashed curves) and the predictions for the W simplified model (grey bands) is shown in Fig. 10.The differences between the two predictions are small, below about 10% for (1/Γ)(dΓ/dE ) and well below this for the other distributions.Similarly small corrections from NP to the shapes of distributions are found for the R2 model, Fig. 11.In this case the largest deviation is found for the (1/Γ)(dΓ/dE D ) distribution for the B → D * → Dπ decay (Fig. 11, first row, right panel) and is at the level of about O(20%).The deviations are potentially sizable for the U 1 and S 1 models for at least some of the distributions, see Figs. 12 and 13, respectively.
) with anomalous dimensions γ SR,SL = −8, γ T = 8/3 and the one loop β-function coefficient β (n) 0 = 11 − 2n/3.The running of c SR,SL,T depends only weakly on the high scale M , and hereafter we set M = Λ eff .Fixing the scale low scale to µ = √ m c m b -anticipating the chosen matching scale of QCD onto HQET for the B → D ( * ) form factor parametrization -one finds ρ SR,SL 1.7 , ρ T 0.84 .(2.10) Assuming the flavor indices are given in the mass eigenstate basis, the NP operators (2.1) can be matched onto the operators (2.3) as c X (Λ eff ) = C 233 X , neglecting the tiny mixing of active neutrinos into N R .Note that the operators O SR,T,SL are accompanied by the SU (2) L related operators

Figure 1 .
Figure 1.The enhancements of R(D ( * ) ) from b → cτ NR decays for various simplified models.The world average experimental 1σ, 2σ, and 3σ fit regions are shown in decreasing shade of gray.The SM point is denoted by a black dot.

17
) in which f Bc 0.43 GeV[31] and τ Bc 0.507 ps[32], and m c,b are the MS quark masses, obeying m Q

Figure 4 .
Figure 4.The bound on Br(W → τ ν) as a function of W mass from the 13 TeV ATLAS search[37] (solid blue) and the projected reach at the end of the high-luminosity LHC run (dashed blue), for the case c 23 q = c 3 N , W mass given by Eq. (3.3) to fit to R(D ( * ) ) data, and the W couplings to all the other SM quarks set to zero.In this case Br(W → τ ν) = 0.25 (dashed grey line) if no other W decay channels are open.The region excluded by unitarity is shaded in grey.
2).If one instead sets c SL = 0, the best fit simply maps onto the W result (since both models then have the same non-zero coupling c VR ): |c VR | 0.46, and

Figure 6 .
Figure 6.The LHC bounds from pair production of S 1 leptoquarks followed by S 1 → bN R decays [48] (grey region) and S 1 → cτ [49] (brown region) as a function of m S1 and the ratio r S1 = (z 3 d /z 23 u ) 2 (3.18).The remaining ratio of coupling constants is fixed by the relation z 23 u 1.1z 23 Q , arising from the S 1 best fit WCs to the R(D ( * ) ) data (2.14).Contours satisfying the S 1 best fit mass relation (3.15) are shown by blue dashed lines for z 33 u = 0.25, 0.5, 1.0, and 2.0.

Figure 8 .
Figure 8. Dirac mass contribution by virtue of SU(2) counterparts of the four-Fermi operators that give rise to the R(D ( * ) ) enhancements.These diagrams are GIM suppressed and give subdominant contributions to the Dirac mass.
. In each of the figures the first row shows the normalized distribution (1/Γ)(dΓ/dE D ), where E D is the energy of the outgoing D meson in the B meson rest frame.The second row contains the (1/Γ)(dΓ/dE ) distribution, with E the energy of the final state charged lepton, while the third row shows the (1/Γ)(dΓ/dm 2 miss ) distribution, with m 2 miss the combined invariant mass of the system of three final state neutrinos.The final row in each Figure shows the (1/Γ)(dΓ/d cos θ D ) normalized distribution, where θ D is the angle between the three momenta of the D meson and the charged lepton, , in the rest frame of the B meson.

Figure 10 .
Figure 10.Gray bands show kinematic distributions for B → (D * → Dπ)(τ → ν ν τ )ν (left) and B → D(τ → ν ν τ )ν (right) in the B rest frame for the W simplified model in Table1, with the Wilson coefficient c VR ranging over 2σ best fit regions in Fig.2, and applying the phase space cuts (2.20).The blue dashed curves show the SM prediction.

Table 1 .
The tree-level mediators that can generate the four-fermion operators with right-handed neutrino, N R , in Eqs.(2.8).The relevant Wilson coefficients are shown in the final column, explicitly defined at scale µ where relevant, and including the factor r ≡ ρ SR /ρ T 2.0.but with N c R replacing the SM neutrino ν τ .Eqs. (2.8) and (2.13) together form a complete basis of b → cτ NR dimension-six four-fermion operators.Since the Wilson coefficients of the operators in Eq. (2.13) are suppressed by additional powers of v EW /Λ eff , we will only focus on the dimension-6 operators listed in Eq. (2.3) and (2.8) in the remainder of this paper.
The fit regions for Φ, U 1 , and S 1 models with respect to the R(D ( * ) ) results(2.14)in the relevant Wilson coefficient spaces, assuming that all Wilson coefficients are real.Shown are 0.5σ, 1σ CLs (dark, light blue) and 1.5σ, 2σ CLs (dark, light green).Best fit points are shown by black dots.Bottom: The χ 2 (dof = 2) for the W and R2 models in the relevant Wilson coefficient space.The 1σ and 2σ CLs are shown by blue and green dots, respectively.Also shown are B c → τ ν exclusion regions requiring Br[B c → τ ν] < 10% (dark orange).For a sense of scaling, a more aggressive Br[B c → τ ν] < 5% exclusion region is demarcated by a dashed orange line.

Table 2 .
Best fit points for each model with respect to the R(D ( * ) ) results (2.14), for real and phase-optimized Wilson coefficients.In the phase-optimized case, we show best fits up to an overall phase, by choosing the first WC to be real and positive definite.

Table 1
, with the Wilson coefficient c VR ranging over 2σ best fit regions in Fig.2, and applying the phase space cuts (2.20).The blue dashed curves show the SM prediction.

Table 1
, with the Wilson coefficients c SR = 4c T ranging over 2σ best fit regions in Fig.2, and applying the phase space cuts (2.20).The blue dashed curves show the SM prediction.

Table 1
, with the Wilson coefficients c SL , c VR ranging over 2σ best fit regions in Fig.2, and applying the phase space cuts (2.20).The blue dashed curves show the SM prediction.