1. The Standard Model (SM) contains three fermion families in the identical representations of the gauge symmetry \(SU(3)\times SU(2)\times U(1)\) of strong and electroweak interactions. One of its fundamental predictions is the unitarity of the Cabibbo–Kobayashi–Maskawa (CKM) matrix of quark mixing in charged current

$$\begin{aligned} V_\mathrm{CKM} = \left( \begin{array}{ccc}{V_{ud} }&{}{V_{us} }&{}{V_{ub} }\\ {V_{cd} }&{}{V_{cs} }&{}{V_{cb} }\\ {V_{td} }&{}{V_{ts} }&{}{V_{tb} }\end{array}\right) . \end{aligned}$$
(1)

Deviation from the CKM unitarity can be a signal of new physics beyond the Standard Model (BSM). The experimental precision and control of theoretical uncertainties in the determination of the elements in the first row of \(V_\mathrm{CKM}\) are becoming sufficient for testing the condition

$$\begin{aligned} \vert V_{ud} \vert ^2 + \vert V_{us} \vert ^2 + \vert V_{ub} \vert ^2 = 1 . \end{aligned}$$
(2)

Since \(\vert V_{ub} \vert \simeq 0.004\) is very small, its contribution is negligible and (2) reduces essentially to the check of the Cabibbo mixing: \(\vert V_{us} \vert = \sin \theta _C\), \(\vert V_{ud} \vert = \cos \theta _C\) and \(\vert V_{us}/V_{ud} \vert = \tan \theta _C\). In essence, this is the universality test for the W-boson coupling \((g/\sqrt{2}) W^+_\mu J^\mu _L + \mathrm{h.c.}\) to the relevant part of the charged left-handed current

$$\begin{aligned} J_L^\mu = V_{ud} \overline{u_L} \gamma ^\mu d_L + V_{us} \overline{u_L} \gamma ^\mu s_L + \overline{\nu _{e} } \gamma ^\mu e_L + \overline{\nu _{\mu }} \gamma ^\mu \mu _L \nonumber \\ \end{aligned}$$
(3)

For energies smaller than W-boson mass this coupling gives rise to the effective current \(\times \) current interactions

$$\begin{aligned} - \frac{4G_F}{\sqrt{2}} \, \overline{u_L} \big ( V_{ud} \gamma _\mu d_L + V_{us} \gamma _\mu s_L\big ) \big (\overline{e_L} \gamma ^\mu \nu _e + \overline{\mu _L} \gamma ^\mu \nu _\mu \big )\nonumber \\ \end{aligned}$$
(4)

which are responsible for leptonic decays of the neutron, pions, kaons etc., as well as to the interaction

$$\begin{aligned} - \frac{4G_F}{\sqrt{2}}\, \big (\overline{e_L} \gamma _\mu \nu _{e} \big ) \big (\overline{\nu _{\mu }} \gamma ^\mu \mu _L \big ) \end{aligned}$$
(5)

responsible for the muon decay. All these couplings contain the Fermi constant \(G_F/\sqrt{2}=g^2/8M_W^2\).

Precision experimental data on kaon decays, in combination with lattice QCD calculations of the decay constants and form-factors, provide accurate information about \(\vert V_{us}\vert \). On the other hand, recent calculations of short-distance radiative corrections in neutron decay make it possible to determine \(\vert V_{ud}\vert \) with remarkable precision.

In this paper we analyze the present individual determinations of \(V_{ud}\) and \(V_{us}\) and find a significant (about \(4\sigma \)) deviation from the CKM unitarity (2). We discuss two possible BSM scenarios which can explain this deviation. In the first one the three-family unitarity is extended to four species, by introducing a 4th non-sequential down-type quark \(b'\) in the form of a vector-like weak isosinglet with a mass of a few TeV. The second scenario assumes the existence of a horizontal gauge symmetry between the lepton families which is spontaneously broken at a scale of a few TeV. The corresponding flavor changing gauge bosons induce an effective four-lepton interaction having exactly the same form as (5), with a new Fermi-like constant \(G_{{\mathcal {F}}}\). In this case, the muon lifetime would determine \(G_\mu = G_F+G_{{\mathcal {F}}}\) rather than \(G_F\). In this way, one can restore the three-family unitarity (2) without introducing new quark species. We discuss implications of these scenarios for lepton flavor violation (LFV) and for the Standard Model precision tests. At the end, we also discuss the problem of the neutron lifetime related to the discrepancy between its values measured using two different (trap and beam) methods.

2. The most precise determination of \(\vert V_{ud} \vert \) is obtained from superallowed \(0^+-0^+\) nuclear \(\beta \)-decays which are pure Fermi transitions sensitive only to the vector coupling constant \(G_V=G_F \vert V_{ud} \vert \) [1]:

$$\begin{aligned} \vert V_{ud} \vert ^2 = \frac{K }{ 2 G_F^2 {{\mathcal {F}}}t\, (1+ \Delta _R^V)} = \frac{0.97147(20)}{ 1+ \Delta _R^V } \end{aligned}$$
(6)

where \(K = 2\pi ^3 \ln 2/m_e^5 = 8120.2776(9) \times 10^{-10}\) s/GeV\(^4\) and \({{\mathcal {F}}}t\) is the nucleus independent value obtained from the individual ft-values of different \(0^+-0^+\) nuclear transitions by absorbing in the latter all nucleus-dependent corrections, while \( \Delta _R^V\) accounts for short-distance (transition independent) radiative corrections. For the second equality, we take \({{\mathcal {F}}}t = 3072.07(72)\) s [2] obtained by averaging the individual \({{\mathcal {F}}}t\)-values for fourteen superallowed \(0^+-0^+\) transitions determined with the best experimental accuracy, and plug in the Fermi constant as \(G_F=G_\mu = 1.1663787(6) \times 10^{-5}\) GeV\(^{-2}\) determined from the muon decay [3]. The major uncertainty is related to the so-called inner radiative correction \(\Delta _R^V\).
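The arithmetic behind Eq. (6) can be checked with a minimal Python sketch. It uses only the central values quoted in the text; the function names are illustrative, and the two values of \(\Delta _R^V\) are the ones quoted in Section 3 (Refs. [7] and [8]).

```python
import math

# Central values quoted in the text (natural units, hbar = c = 1).
K = 8120.2776e-10    # s / GeV^4
FT = 3072.07         # s, averaged Ft value of the superallowed transitions
G_F = 1.1663787e-5   # GeV^-2, Fermi constant taken from muon decay


def vud_numerator(k=K, g_f=G_F, ft=FT):
    """Return K / (2 G_F^2 Ft), the numerator in Eq. (6)."""
    return k / (2.0 * g_f**2 * ft)


def vud(delta_r, k=K, g_f=G_F, ft=FT):
    """Return |V_ud| for a given inner radiative correction Delta_R^V."""
    return math.sqrt(vud_numerator(k, g_f, ft) / (1.0 + delta_r))


numerator = vud_numerator()
vud_ref7 = vud(0.02361)   # Delta_R^V of Ref. [7]
vud_ref8 = vud(0.02467)   # Delta_R^V of Ref. [8]

print(f"K / (2 G_F^2 Ft) = {numerator:.5f}")
print(f"|V_ud| (Ref. [7]) = {vud_ref7:.5f}")
print(f"|V_ud| (Ref. [8]) = {vud_ref8:.5f}")
```

Within rounding, the sketch returns the numerator 0.97147 of Eq. (6) and the central values of \(\vert V_{ud} \vert \) that enter the datasets (9) and (10). Uncertainties are not propagated here.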

The element \(\vert V_{us} \vert \) can be determined from the analysis of semileptonic \(K\ell 3\) decays (\(K_L \mu 3\), \(K_L e3\), \(K^{\pm }e3\), etc.) [4]:

$$\begin{aligned} f_+(0) \vert V_{us}\vert = 0.21654\pm 0.00041 \end{aligned}$$
(7)

where \(f_+(0)\) is the \(K\rightarrow \pi \ell \nu \) vector form-factor at zero momentum transfer. On the other hand, by comparing the kaon and pion inclusive radiative decay rates \(K\rightarrow \mu \nu (\gamma )\) and \(\pi \rightarrow \mu \nu (\gamma )\), one obtains [5]:

$$\begin{aligned} \vert V_{us}/V_{ud} \vert \times (f_{K^\pm }/f_{\pi ^\pm }) = 0.27599 \pm 0.00038 . \end{aligned}$$
(8)

Hence, the values \(\vert V_{us}\vert \) and \(\vert V_{us}/V_{ud} \vert \) can be independently determined using the lattice QCD results for the form-factor \(f_+(0)\) and the decay constant ratio \(f_K/f_\pi \).

3. Let us first consider the values of the CKM matrix elements \(\vert V_{us} \vert \), \(\vert V_{ud} \vert \) and their ratio \(\vert V_{us}/V_{ud} \vert \) as quoted by the Particle Data Group (PDG) review 2018 [5]:

$$\begin{aligned}&\vert V_{us} \vert = 0.2238(8) \nonumber \\&\vert V_{us}/V_{ud} \vert = 0.2315(10) \nonumber \\&\vert V_{ud} \vert = 0.97420(21) \end{aligned}$$
(9)

Here \(\vert V_{us} \vert \) and \( \vert V_{us}/V_{ud} \vert \) are obtained respectively from Eqs. (7) and (8) using the FLAG 2017 averages of 3-flavor lattice QCD simulations \(f_+(0) = 0.9677(27)\) and \(f_{K^\pm }/f_{\pi ^\pm }=1.192(5)\) [6]. \(\vert V_{ud} \vert \) is obtained from Eq. (6) by taking \(\Delta _R^V = 0.02361(38)\) as calculated in Ref. [7].
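As a cross-check of the first two entries of (9), the following sketch divides the experimental products of Eqs. (7) and (8) by the FLAG 2017 lattice inputs. It assumes uncorrelated Gaussian errors combined in quadrature, which is an assumption of this illustration and not necessarily the exact PDG procedure; the helper name is hypothetical.

```python
import math


def ratio_with_error(num, num_err, den, den_err):
    """Return num/den and its uncertainty for uncorrelated Gaussian errors."""
    value = num / den
    relative = math.hypot(num_err / num, den_err / den)
    return value, value * relative


# Eq. (7) divided by the FLAG 2017 form-factor f_+(0).
vus, vus_err = ratio_with_error(0.21654, 0.00041, 0.9677, 0.0027)

# Eq. (8) divided by the FLAG 2017 ratio f_K/f_pi.
ratio, ratio_err = ratio_with_error(0.27599, 0.00038, 1.192, 0.005)

print(f"|V_us|       = {vus:.4f} +- {vus_err:.4f}")
print(f"|V_us/V_ud|  = {ratio:.4f} +- {ratio_err:.4f}")
```

The lattice uncertainty dominates in both cases, which is why improved lattice inputs change the picture noticeably.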

Fig. 1

Upper panel: three independent \(\vert V_{us} \vert \) determinations A, B, C obtained from the PDG 2018 data (9) by assuming the CKM unitarity. The grey shaded band corresponds to the average A + B + C (with formal error not rescaled by a factor \(\sqrt{\chi ^2_\mathrm{dof}} \)). Pulls of C, B, A and A + B are shown. Lower panel: the same for A, B, C values obtained from the dataset (10)

By imposing the CKM unitarity (2), the three values in (9) reduce to three independent determinations of \(\vert V_{us} \vert \). These determinations, shown as A, B, C in the upper panel of Fig. 1 (see also Table 1 for numerical values), are compatible within their error bars. Throughout this paper A is the direct determination of \(\vert V_{us} \vert \) obtained from Eq. (7), while B and C are the values of \(\vert V_{us} \vert \) obtained respectively from \( \vert V_{us}/V_{ud} \vert \) and \(\vert V_{ud} \vert \) by assuming unitarity. B and C are almost equal, while there is a modest tension (\(1.4\sigma \)) between A and B. Their average \(\overline{A+B} =0.2245(6)\) practically coincides with the PDG 2018 average of \(\vert V_{us} \vert \) [5]. By averaging all three values we get \(\overline{A+B+C}= 0.2248(5)\) with \(\chi ^2_\mathrm{dof} =1.7\). Pulls of A, B and C relative to this average (given in Fig. 1) are of the order of one standard deviation. Summarizing, the dataset (9) adopted from PDG 2018 [5] is consistent with the CKM unitarity (2).

Table 1 The 1st column shows independent \(\vert V_{us} \vert \) determinations A, B, C from the PDG dataset (9) by assuming 3-family CKM unitarity (2), their averages and total \(\chi ^2\) value

However, recent progress in the determination of the CKM elements allows one to test the unitarity with improved precision. A significant redetermination of \(\vert V_{ud} \vert \) is related to a new calculation of inner radiative corrections with reduced hadronic uncertainties, \(\Delta _R^V = 0.02467(22)\) [8]. Employing also the recent result \(f_+(0) = 0.9696(18)\) from new 4-flavor \((N_f=2+1+1)\) lattice QCD simulations [9] and the FLAG 2019 four-flavor average \(f_{K^\pm }/f_{\pi ^\pm }=1.1932(19)\) [10], one arrives at the following:

$$\begin{aligned}&\vert V_{us} \vert = 0.22333(60) \nonumber \\&\vert V_{us}/V_{ud} \vert = 0.23130(50) \nonumber \\&\vert V_{ud} \vert = 0.97370(14) \end{aligned}$$
(10)

This dataset, again by imposing the CKM unitarity, reduces to independent \(\vert V_{us} \vert \) values A, B, C shown in lower panel of Fig. 1 (numerical values are given in Table 1).

Now we see that the values A, B, C are in tension with each other. Namely, there is a \(5.3\sigma \) discrepancy between A and C, and \(3.2\sigma \) between B and C. The tension between the determinations A and B, both from kaon physics, is \(2.7\sigma \). More conservatively, one can take their average \(\overline{A+B}\). The discrepancy of the latter with C is \(4.5\sigma \). Fitting these values, we get \(\overline{A+B+C} = 0.22546(31)\) but the fit is bad, \(\chi ^2_\mathrm{dof} =13.9\). C, A and A + B have large pulls, \(3.9\sigma \), \(-\,3.6\sigma \) and \(-\,2.3\sigma \).
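The tensions quoted above can be reproduced, up to rounding, with the following sketch. It converts the dataset (10) into the determinations A, B, C by imposing \(\vert V_{ud} \vert ^2 + \vert V_{us} \vert ^2 = 1- \vert V_{ub} \vert ^2\). The value \(\vert V_{ub} \vert = 0.004\) is the approximate one quoted in Section 1; linear error propagation and uncorrelated inputs are assumptions of this illustration, and all function names are hypothetical.

```python
import math

VUB = 0.004  # |V_ub|, approximate value quoted in the text

# Dataset (10) as (value, uncertainty) pairs.
VUS_DIRECT = (0.22333, 0.00060)   # from K l3 decays
RATIO = (0.23130, 0.00050)        # |V_us / V_ud|
VUD = (0.97370, 0.00014)          # from superallowed beta decays


def vus_from_ratio(ratio, deficit):
    """Determination B: |V_us| from r = |V_us/V_ud| and unitarity."""
    r, dr = ratio
    norm = math.sqrt(1.0 - deficit)
    value = norm * r / math.sqrt(1.0 + r * r)
    error = norm * dr / (1.0 + r * r) ** 1.5   # |d value / d r| * dr
    return value, error


def vus_from_vud(vud, deficit):
    """Determination C: |V_us| from |V_ud| and unitarity."""
    v, dv = vud
    value = math.sqrt(1.0 - deficit - v * v)
    return value, v * dv / value               # |d value / d v| * dv


def tension(x, y):
    """Difference of two uncorrelated values in units of the combined sigma."""
    return abs(x[0] - y[0]) / math.hypot(x[1], y[1])


def weighted_average(points):
    """Inverse-variance weighted mean and its formal uncertainty."""
    weights = [1.0 / err**2 for _, err in points]
    mean = sum(w * val for w, (val, _) in zip(weights, points)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


deficit = VUB**2
A = VUS_DIRECT
B = vus_from_ratio(RATIO, deficit)
C = vus_from_vud(VUD, deficit)
AB = weighted_average([A, B])
ABC = weighted_average([A, B, C])
# Three values and one fitted parameter leave two degrees of freedom.
chi2_dof = sum(((val - ABC[0]) / err) ** 2 for val, err in (A, B, C)) / 2

print(f"B = {B[0]:.5f}({B[1]:.5f}), C = {C[0]:.5f}({C[1]:.5f})")
print(f"A-C: {tension(A, C):.1f}, B-C: {tension(B, C):.1f}, "
      f"A-B: {tension(A, B):.1f}, (A+B)-C: {tension(AB, C):.1f} sigma")
print(f"A+B+C = {ABC[0]:.5f}({ABC[1]:.5f}), chi2/dof = {chi2_dof:.1f}")
```

Small differences in the last digit with respect to the quoted numbers can arise from the rounding of the inputs and from the simple error propagation used here.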

Let us remark that the chosen dataset (10) is rather indicative since there are tensions in various determinations of \(\vert V_{us} \vert \) which may disappear with more accurate lattice simulations. In particular, we have employed the latest and most precise result \(f_+(0) = 0.9696(18)\) from 4-flavor lattice QCD simulations [9] which is perfectly compatible with the FLAG 2019 4-flavor value \(f_+(0)=0.9706(27)\) [10] which does not include the result of Ref. [9]. Their average yields \(f_+(0)=0.9699(15)\) which would give \(\vert V_{us} \vert = 0.22326(55)\). However, this result from \(K\ell 3\) decays is discordant with the independent determination \(\vert V_{us} \vert = 0.22567(42)\) from \(K\mu 2\) decays recently reported in Ref. [11]. Therefore, for determination A one can take a conservative average between these two results, \(\vert V_{us} \vert = 0.22478(69)\), where the errors are quadratically combined because of their poor compatibility. Let us remark also that the latter determination A is well compatible with the determination B \(\vert V_{us} \vert = 0.22535(45)\) deduced from \(\vert V_{us}/V_{ud} \vert \) given in (10), so that for the average \(\overline{A+B}\) we get \(\vert V_{us} \vert = 0.22518(37)\).
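The averages in this paragraph can be cross-checked with a short sketch. It assumes that the "conservative average" takes the inverse-variance weighted mean as the central value while adding the two uncertainties in quadrature; this reading reproduces the quoted numbers within rounding but is an interpretation, and the helper names are hypothetical.

```python
import math


def weighted_average(points):
    """Inverse-variance weighted mean and its formal uncertainty."""
    weights = [1.0 / err**2 for _, err in points]
    mean = sum(w * val for w, (val, _) in zip(weights, points)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


def divide(num, den):
    """Quotient of two (value, error) pairs with uncorrelated errors."""
    value = num[0] / den[0]
    return value, value * math.hypot(num[1] / num[0], den[1] / den[0])


# Average of the two 4-flavor results for f_+(0), Refs. [9] and [10].
f_plus = weighted_average([(0.9696, 0.0018), (0.9706, 0.0027)])

# |V_us| from K l3 decays, Eq. (7), with the averaged form-factor.
vus_kl3 = divide((0.21654, 0.00041), f_plus)

# Independent K mu2 determination quoted from Ref. [11].
vus_kmu2 = (0.22567, 0.00042)

# Conservative combination: weighted central value, errors in quadrature.
central, _ = weighted_average([vus_kl3, vus_kmu2])
conservative_a = (central, math.hypot(vus_kl3[1], vus_kmu2[1]))

print(f"f_+(0)         = {f_plus[0]:.4f}({f_plus[1]:.4f})")
print(f"|V_us| (K l3)  = {vus_kl3[0]:.5f}({vus_kl3[1]:.5f})")
print(f"conservative A = {conservative_a[0]:.5f}({conservative_a[1]:.5f})")
```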

Regarding the determination of \(\vert V_{ud} \vert \), we adopted the result \(\Delta _R^V = 0.02467(22)\) of Ref. [8] which is mildly different from the value \(\Delta _R^V = 0.02426(32)\) deduced in Ref. [12]. To be more conservative, we can average these two results as \(\Delta _R^V = 0.02454(32)\), keeping the larger of the two uncertainties. In doing so, from Eq. (6) we obtain \(\vert V_{ud} \vert = 0.97376(10)_{{{\mathcal {F}}}t}(10)_{\Delta _R^V}= 0.97376(14)\) which in turn gives determination C as \(\vert V_{us} \vert = 0.22756(72)\). Therefore, a \(3\,\sigma \) tension between the determination C and \(\overline{A+B}\) remains even with this more conservative treatment.

Fig. 2

The horizontal, vertical and slightly bent bands correspond to \(\vert V_{ud} \vert \), \(\vert V_{us} \vert \) and \(\vert V_{us}/V_{ud} \vert \) from (10). The best fit point (red cross) and 1, 2 and \(3\sigma \) contours are shown. The red solid line corresponds to the three family unitarity condition (2), and the dashed red line corresponds to the “extended” unitarity (11) with \(\vert V_{ub'}\vert = 0.04\)

Let us analyze our dataset (10) also in a different way. Without imposing the unitarity condition (2), we perform a two parameter fit of the three independent values (10). In Fig. 2 we show the gaussian hill of the probability distribution with the confidence level (C.L.) contours around the best fit point (\(\vert V_{us} \vert = 0.22449\), \(\vert V_{ud} \vert = 0.97369\)), with \(\chi ^2_\mathrm{min} = 6.1\). (This \(\chi ^2\)–value seems large for a two parameter fit, but it is dominated by the tension between \(\vert V_{us}\vert \) determinations A and B from the kaon data which may be not real as we discussed above.) The red solid line corresponding to the three family unitarity condition \(\vert V_{ud} \vert ^2 + \vert V_{us} \vert ^2 = 1- \vert V_{ub} \vert ^2 = 1- O(10^{-5})\) is about \(4.3\sigma \) away from this hill top (\(\Delta \chi ^2 = 21.6\)) which means that the new (after 2018) dataset (10) disfavors the CKM unitarity at \(99.998\%\) C.L.
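The conversion between \(\Delta \chi ^2\), the confidence level and the number of standard deviations used above can be sketched as follows. For a fit with two parameters, \(\Delta \chi ^2\) follows a chi-square distribution with two degrees of freedom, whose tail probability is \(\exp (-\Delta \chi ^2/2)\). The two-sided Gaussian convention for the equivalent number of \(\sigma \) is an assumption of this illustration.

```python
import math
from statistics import NormalDist


def p_value_2dof(delta_chi2):
    """Tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-0.5 * delta_chi2)


def equivalent_sigma(p):
    """Two-sided Gaussian significance with the same tail probability."""
    return NormalDist().inv_cdf(1.0 - 0.5 * p)


p = p_value_2dof(21.6)
confidence_level = 100.0 * (1.0 - p)
sigma = equivalent_sigma(p)

print(f"p-value = {p:.2e}")
print(f"C.L.    = {confidence_level:.3f} %")
print(f"sigma   = {sigma:.2f}")
```

With \(\Delta \chi ^2 = 21.6\) this gives a confidence level that rounds to \(99.998\%\) and a significance close to \(4.3\sigma \), in line with the numbers in the text.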

4. “If the Hill does not come to the CKM, the CKM will go to the Hill.” The unitarity line can be moved down towards the probability distribution hill in Fig. 2 if the unitarity condition is extended to more families.

In fact, by introducing just one additional (fourth) family which is also involved in quark mixing, the first row unitarity condition (2) will be modified to

$$\begin{aligned} \vert V_{ud} \vert ^2 + \vert V_{us} \vert ^2 +\vert V_{ub} \vert ^2 = 1 - \vert V_{ub'} \vert ^2 . \end{aligned}$$
(11)

In particular, the red dashed line in Fig. 2 passing through the best fit point on the top of the probability hill corresponds to \(\vert V_{ub'} \vert = 0.04\) (at 95% C.L. this additional mixing is limited as \(\vert V_{ub'} \vert = 0.04 \pm 0.01\)). Plugging this value in Eq. (11), the dataset (10) gives the modified determinations of \(\vert V_{us} \vert \) for the three cases named above as A, B and C (for numerical values see the 3rd column of Table 1). Clearly, the case A in this list remains the same as in the 2nd column but B and especially C are shifted down. Figure 3 shows that the consistency between these values is significantly improved compared to the lower panel of Fig. 1. The fit for \(\overline{A+B+C}\) is acceptable, \(\chi ^2_\mathrm{dof} =3\). Pulls of C and A + B are practically vanishing. There remains a tension between A and B but it is softened to \(2.4\sigma \) from \(2.7\sigma \) of Fig. 1.
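The effect of the extended unitarity (11) on the determinations B and C can be illustrated with a short sketch. It takes the dataset (10), \(\vert V_{ub} \vert = 0.004\) as quoted in Section 1 and \(\vert V_{ub'} \vert = 0.04\), and assumes linear error propagation with uncorrelated inputs; the function name is hypothetical.

```python
import math


def determinations(vub_prime, vub=0.004):
    """Return A, B, C as (value, error) for |V_ud|^2 + |V_us|^2 = 1 - deficit."""
    deficit = vub**2 + vub_prime**2
    a = (0.22333, 0.00060)                      # direct, unaffected by Eq. (11)
    r, dr = 0.23130, 0.00050                    # |V_us / V_ud|
    norm = math.sqrt(1.0 - deficit)
    b = (norm * r / math.sqrt(1.0 + r * r), norm * dr / (1.0 + r * r) ** 1.5)
    v, dv = 0.97370, 0.00014                    # |V_ud|
    c_value = math.sqrt(1.0 - deficit - v * v)
    c = (c_value, v * dv / c_value)
    return a, b, c


a, b, c = determinations(vub_prime=0.04)
weights = [1.0 / err**2 for _, err in (a, b, c)]
mean = sum(w * val for w, (val, _) in zip(weights, (a, b, c))) / sum(weights)
chi2_dof = sum(((val - mean) / err) ** 2 for val, err in (a, b, c)) / 2
tension_ab = abs(a[0] - b[0]) / math.hypot(a[1], b[1])

print(f"B = {b[0]:.5f}, C = {c[0]:.5f}, average = {mean:.5f}")
print(f"chi2/dof = {chi2_dof:.1f}, A-B tension = {tension_ab:.1f} sigma")
```

Calling `determinations(0.0)` instead recovers the three-family values, which makes the downward shift of B and C explicit.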

In the SM the three families (\(i=1,2,3\) is the family index) of left-handed (LH) quarks \(Q_{Li}=(u_i,d_i)_L\) and leptons \(\ell _{Li}=(\nu _i,e_i)_L\) transform as weak isodoublets of \(SU(2)\times U(1)\) and the right-handed (RH) quarks \(u_{Ri},d_{Ri}\) and leptons \(e_{Ri}\) are isosinglets. Their masses emerge from the Yukawa couplings with the Higgs doublet \(\phi \):

$$\begin{aligned} Y_u^{ij} {\tilde{\phi }} \, \overline{Q_{Li}} u_{Rj} + Y_d^{ij} \phi \, \overline{Q_{Li}} d_{Rj} + Y_e^{ij} \, \phi \, \overline{\ell _{Li}} e_{Rj} \, + \, \mathrm{h.c.} \end{aligned}$$
(12)

\(Y_{u,d,e}\) being the Yukawa constant matrices and \({\tilde{\phi }}=i\tau _2 \phi ^*\).

The existence of a fourth sequential family is excluded by the SM precision tests in combination with the lower limits on their masses from the LHC, as well as by the LHC data on the Higgs production via gluon fusion and its decay into \(2\gamma \) [13]. However, one can introduce additional vector-like fermions.

Fig. 3

Determinations of \(\vert V_{us} \vert \) obtained from the dataset (10) using Eq. (11) with \(\vert V_{ub'} \vert = 0.04\)

Let us briefly sketch a simple picture of this type introducing just an additional vector-like couple of isosinglet down-type quarks \(d_{4L},d_{4R}\). Since 4 species of RH quarks \(d_{1R}, d_{2R},d_{3R},d_{4R}\) have identical quantum numbers, \(d_{4R}\) can be identified as their combination which makes a mass term M with isosinglet LH state \(d_{4L}\). Then, besides the standard Yukawa terms (12) the Lagrangian should contain the additional terms

$$\begin{aligned} Y_d^{i4} \phi \, \overline{Q_{Li}} d_{4R} + M \overline{d_{4L}} d_{4R} \, + \, \mathrm{h.c.} \end{aligned}$$
(13)

Fermion masses of the three normal (chiral) families emerge from the vacuum expectation value (VEV) of the Higgs, \(\langle \phi ^0\rangle = v_\mathrm{w} = 174\) GeV (for convenience, we use this normalization of the Higgs VEV instead of the “standard” normalization \(\langle \phi \rangle = v/\sqrt{2}\), i.e. \(v=\sqrt{2} v_\mathrm{w}\)). Without loss of generality, the \(3\times 3\) Yukawa matrix \(Y_u\) of up quarks can be chosen diagonal, \(Y_u = Y_u^\mathrm{diag}= \mathrm{diag}(y_u,y_c,y_t)\), in which basis the states \(u_{1,2,3}\) coincide with the mass eigenstates \(u, c, t\) so that \(m_t = y_t v_\mathrm{w}\), etc. In this basis the Yukawa matrix \(Y_d^{ij}\) is non-diagonal, and in addition the terms in (13) induce the mixing of the three known down quarks with the 4th species. Thus, the \(4\times 4\) mass matrix of all down-type quarks has the form:

$$\begin{aligned} {{\mathcal {M}}} = \left( \begin{array}{cc}{Y_d^{ij} v_\mathrm{w}}&{}{\quad Y_d^{i4} v_\mathrm{w}}\\ {0}&{}{\quad M} \end{array}\right) . \end{aligned}$$
(14)

It can be diagonalized by bi-unitary transformation \({{\mathcal {M}}} \rightarrow {{\mathcal {M}}}_\mathrm{diag} = V_L^\dagger {{\mathcal {M}}} V_R\) where \(4\times 4\) unitary matrices \(V_{L,R}\) connect \(d_{1,2,3,4}\) with the mass eigenstates \(d,s,b,b'\):

$$\begin{aligned} \left( \begin{array}{c} d_1 \\ d_2 \\ d_3 \\ d_4 \end{array}\right) _{L,R} = \left( \begin{array}{cccc} V_{1d}&{}V_{1s}&{}V_{1b} &{} V_{1b'}\\ V_{2d}&{}V_{2s}&{}V_{2b} &{} V_{2b'}\\ V_{3d}&{}V_{3s}&{}V_{3b} &{} V_{3b'} \\ V_{4d}&{}V_{4s}&{}V_{4b} &{} V_{4b'} \end{array}\right) _{L,R} \left( \begin{array}{c} d \\ s \\ b \\ b' \end{array}\right) _{L,R} \end{aligned}$$
(15)

In the context of the SM, the mixing \(V_R\) of the RH quarks is not of interest. As for the left-handed charged current we obtain the modified \(3\times 4\) CKM mixing matrix describing W-boson interactions between three up quarks \(u, c, t\) and four down quarks \(d,s,b,b'\):

$$\begin{aligned} \tilde{V}_\mathrm{CKM}=\left( \begin{array}{cccc} V_{ud}&{}V_{us}&{}V_{ub} &{} V_{ub'}\\ V_{cd}&{}V_{cs}&{}V_{cb} &{} V_{cb'}\\ V_{td}&{}V_{ts}&{}V_{tb} &{} V_{tb'} \end{array}\right) \end{aligned}$$
(16)

which in fact consists of three upper rows in \(4\times 4\) unitary matrix \(V_L\) (15), i.e. \(V_{ud} = V_{L1d}\), \(V_{us} = V_{L1s}\) etc. Then the condition (11) regards the first row of this matrix and it stems from the unitarity of \(V_L\). As for the first column, we have \(\vert V_{ud} \vert ^2 + \vert V_{cd} \vert ^2 + \vert V_{td} \vert ^2 = 1 - \vert V_{4d} \vert ^2\) where \(V_{4d}\) is the 4th row element in matrix \(V_L\) which is “deleted” for transforming the latter into \( \tilde{V}_\mathrm{CKM}\).

Let us discuss now under which conditions one could obtain large enough mixing with the 4th species, \(\vert V_{ub'} \vert = 0.04\) or so. From the structure of the mass matrix (14), for \(M\gg v_\mathrm{w}\) we have \(\vert V_{ub'}\vert \approx Y_d^{14} v_\mathrm{w}/M \). Then the LHC limit on the extra \(b'\) mass \(M> 880\) GeV [5] implies that \(\vert V_{ub'}\vert \simeq 0.04\) can be obtained if the coupling constant \(Y_d^{14} \) in (13) is larger than 0.2 or so. In other words, it should be much larger than the Yukawa constant \(y_b\) of the bottom quark. In turn, by taking \(\vert V_{ub'} \vert > 0.03\) and assuming (for perturbativity) \(Y_d^{14} < y_t \simeq 1\), we get an upper limit on the extra quark mass, \(M < 6\) TeV or so. Thus, the mass of the extra state \(b'\) should be in the range of a few TeV.
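The two numerical estimates in this paragraph follow directly from the approximate relation \(\vert V_{ub'}\vert \approx Y_d^{14} v_\mathrm{w}/M \). A minimal sketch, with hypothetical function names and all inputs taken from the text:

```python
V_W = 174.0  # GeV, Higgs VEV in the normalization used in the text


def yukawa_needed(vub_prime, mass_gev, v_w=V_W):
    """Coupling Y_d^14 required for a given mixing and b' mass (M >> v_w)."""
    return vub_prime * mass_gev / v_w


def max_mass(vub_prime_min, yukawa_max=1.0, v_w=V_W):
    """Largest b' mass (GeV) compatible with a minimal mixing and a maximal coupling."""
    return yukawa_max * v_w / vub_prime_min


y_min = yukawa_needed(0.04, 880.0)   # at the LHC lower limit on M
m_max = max_mass(0.03)               # perturbativity bound Y_d^14 < 1

print(f"Y_d^14 needed at M = 880 GeV: {y_min:.2f}")
print(f"upper limit on M: {m_max / 1000.0:.1f} TeV")
```

The leading-order formula holds only for \(M\gg v_\mathrm{w}\); the exact mixing would follow from diagonalizing the mass matrix (14).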

The CKM unitarity (2) can be also corrected by introducing a 4th up quark \(t'\) instead of the 4th down quark \(b'\), or more generically by introducing both \(b'\) and \(t'\) forming in some sense a complete vector-like 4th family. In the latter case, the mixing matrix (16) would become a \(4\times 4\) matrix, however it will not be unitary as far as \(b'\) and \(t'\) states are weak isosinglets. Interestingly, it can be shown that introduction of a fourth vector-like isodoublet family \(Q'_{L,R} = (t',b')_{L,R}\) can also have large enough effect for smoothing the discrepancies between the mixing angle determinations [14].

One has to remark, however, that the mixing of ordinary quarks with the 4th species induces quark flavor changing couplings of the Z-boson at tree level. In the case of an extra isosinglet down quark \(b'\) this question was discussed in Ref. [15]. In fact, the Z boson couples to the neutral current of fermions \(J_\mathrm{nc}^\mu = \sum _f \overline{f_{L,R}} [ I_{3}(f) - \sin ^2\theta _W Q(f) ] \gamma ^\mu f \) where \(I_{3}\) and Q are respectively the weak isospin and electric charge of a fermion f, and \(\theta _W\) is the Weinberg angle. In the case of down quarks \(d_{1,2,3,4}\) the second part proportional to \(Q=-1/3\) is uniform for all four states of the LH and RH chirality, and thus it reduces to a flavor-diagonal (and vector) current also in the mass basis \(d,s,b,b'\), as a result of the unitarity of the matrices \(V_L\) and \(V_R\). For the RH states also the isospin dependent part is uniform since \(d_{iR}\), \(i=1, \ldots 4\), all have the same isospin \(I_3=0\) and thus their mixing \(V_R\) cannot induce any flavor-changing couplings of Z in the mass basis \((d,s,b,b')_R\).

However, the four LH states have different isospins, namely \(d_{1L},d_{2L},d_{3L}\) have \(I_3=-1/2\) while \(d_{4L}\) has \(I_3=0\). Therefore, in the initial basis \((d_{1},d_{2},d_{3},d_4)_L\) the isospin part in the neutral current is represented by the matrix \({\tilde{I}} = -\frac{1}{2} \mathrm{diag}(1,1,1,0)\), which differs from the flavor-universal matrix \(-\frac{1}{2} \mathrm{diag}(1,1,1,1)\) by the term \(\frac{1}{2} \mathrm{diag}(0,0,0,1)\). Hence, after rotating to the mass basis by the matrix \(V_L\), the isospin part induces non-standard couplings of the Z boson to the LH states described by the matrix \( V_\mathrm{NS} = V_L^\dagger \mathrm{diag}(0,0,0,1) V_L\), or explicitly

$$\begin{aligned} V_\mathrm{NS} = \left( \begin{array}{cccc} \vert V_{4d} \vert ^2 &{} V_{4d}^*V_{4s} &{} V_{4d}^*V_{4b} &{} V_{4d}^*V_{4b'} \\ V^*_{4s} V_{4d} &{} \vert V_{4s} \vert ^2 &{}V_{4s}^*V_{4b} &{} V_{4s}^*V_{4b'}\\ V^*_{4b} V_{4d} &{} V^*_{4b} V_{4s} &{}\vert V_{4b} \vert ^2 &{} V^*_{4b} V_{4b'} \\ V^*_{4b'} V_{4d} &{} V^*_{4b'} V_{4s} &{} V^*_{4b'} V_{4b} &{} \vert V_{4b'} \vert ^2 \end{array}\right) _L \end{aligned}$$
(17)
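The structure of \(V_\mathrm{NS} = V_L^\dagger \mathrm{diag}(0,0,0,1) V_L\) can be made concrete with a toy example: its elements are \((V_\mathrm{NS})_{jk} = V_{4j}^* V_{4k}\), built from the 4th row of \(V_L\) only. The sketch below uses a purely illustrative \(V_L\), a real rotation mixing only the 1st and 4th states with sine 0.04; this choice is an assumption made for demonstration and not a realistic mixing matrix.

```python
import math


def rotation_14(s):
    """Toy V_L: real rotation mixing the 1st and 4th states with sine s.

    Rows label the initial states d_1..d_4, columns the mass states d, s, b, b'.
    """
    c = math.sqrt(1.0 - s * s)
    return [
        [c, 0.0, 0.0, s],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-s, 0.0, 0.0, c],
    ]


def v_ns(v_l):
    """(V_NS)_{jk} = conj(V_4j) * V_4k, i.e. V_L^dagger diag(0,0,0,1) V_L."""
    row4 = v_l[3]
    return [[complex(a).conjugate() * b for b in row4] for a in row4]


ns = v_ns(rotation_14(0.04))
trace = sum(ns[j][j] for j in range(4))

print(f"(d, d)  element: {ns[0][0].real:.4f}")
print(f"(d, b') element: {ns[0][3].real:.5f}")
print(f"trace          : {trace.real:.4f}")
```

The matrix is Hermitian and has unit trace, as follows from the unitarity of \(V_L\). In this toy case only the \(d\) and \(b'\) entries are non-zero, so no flavor-changing coupling among the three light states appears, which illustrates why small \(V_{4s}\) and \(V_{4b}\) suppress such effects.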

These couplings can induce strong flavor-changing and CP-violating effects in the \(K^0-\overline{K}^0\) system, as well as too large decay rates for \(K_L \rightarrow \mu ^+\mu ^-\) etc. [15]. In fact, they can be suppressed if \( V_{cb'} \) and \( V_{tb'} \) are much less than \(V_{ub'}\), or at least have rather small complex parts. (Incidentally, \(\vert V_{ub'} \vert \simeq 0.04 \) is comparable to \(\vert V_{cb} \vert \) and ten times larger than \(\vert V_{ub} \vert \).) The picture with the 4th state \(b'\) having a larger mixing with the first family than with the (heavier) 2nd and 3rd families looks somewhat ad hoc, but it is not excluded by the present experimental limits. The implications of TeV scale extra vector-like quarks \(b'\) or \(t'\) with significant mixing with the three normal families deserve a careful analysis which will be given in detail elsewhere [14].

Fig. 4

The SM contribution to the muon decay mediated by W-boson (left), and the BSM contribution mediated by the flavor-changing \({{\mathcal {F}}}\)–boson (right)

5. “But what if the Hill comes to the CKM?”  Here we discuss just the opposite possibility: instead of moving the unitarity line to the probability distribution Hill in Fig. 2, we move the Hill towards the unitarity line.

Namely, we consider that the Fermi constant \(G_F\) in the effective interaction (4) which is responsible for leptonic decays of hadrons can be different from the effective constant \(G_\mu \) determined from the muon lifetime. We assume that besides the SM interaction (5) mediated by charged W–boson, there is also a new operator

$$\begin{aligned} -\frac{4G_{{\mathcal {F}}}}{\sqrt{2}} (\overline{e_L} \gamma _\mu \mu _L)(\overline{\nu _\mu } \gamma ^\mu \nu _e) \end{aligned}$$
(18)

mediated by a hypothetical lepton flavor changing neutral gauge boson \({{\mathcal {F}}}\). The respective diagrams, shown in Fig. 4, have positive interference for the muon decay. Namely, by Fierz transformation this new operator can be brought to the form (5), so that the sum of these two diagrams effectively gives the operator

$$\begin{aligned} - \frac{4G_\mu }{\sqrt{2}}\, \big (\overline{e_L} \gamma _\mu \nu _{e} \big ) \big (\overline{\nu _{\mu }} \gamma ^\mu \mu _L \big ) , \end{aligned}$$
(19)

the same as (5) but with the coupling constant

$$\begin{aligned} G_\mu = G_F + G_{{\mathcal {F}}}=G_F(1+ \delta _\mu ), \quad \frac{G_{{\mathcal {F}}}}{G_F} \equiv \delta _\mu >0 . \end{aligned}$$
(20)

Constant \(G_\mu =1.1663787(6) \times 10^{-5}\) GeV\(^{-2}\) is determined with great precision from the muon decay [3]. Now Eqs. (6) and (7), instead of \(\vert V_{ud}\vert \) and \(\vert V_{us}\vert \), are determining respectively the values \(\vert V_{ud} \vert \times G_F/G_\mu \) and \(\vert V_{us} \vert \times G_F/G_\mu \). Instead the value of \(\vert V_{us}/V_{ud}\vert \) determined from (8) remains unchanged since the Fermi constant cancels out. Thus, under our hypothesis, the dataset (10) should be modified to the following:

$$\begin{aligned}&\vert V_{us} \vert = 0.22333(60) \times (1+\delta _\mu ) \nonumber \\&\vert V_{us}/V_{ud} \vert = 0.23130(50) \nonumber \\&\vert V_{ud} \vert = 0.97370(14) \times (1+\delta _\mu ) \end{aligned}$$
(21)

Now, involving the extra parameter \(\delta _\mu \) but assuming the 3-family unitarity (2), the fit of the above dataset has acceptable quality, \(\chi ^2 = 6.1\), and the best fit point corresponds to \(\delta _\mu =0.00076\). This situation is shown in Fig. 5 in which the values of \(\vert V_{ud} \vert \) and \(\vert V_{us} \vert \) are determined by taking \(\delta _\mu =0.00076\). By this choice of the extra parameter the fit becomes perfectly compatible with the unitarity (2). The probability distribution Hill is moved up so that its top now lies on the unitarity line.
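A rough numerical check of this fit can be sketched as follows. For each trial \(\delta _\mu \) the dataset (21) is converted into three unitarity-constrained determinations of \(\vert V_{us} \vert \), whose weighted average plays the role of the second fit parameter, and \(\chi ^2\) is scanned on a grid. The value \(\vert V_{ub} \vert = 0.004\), the linear error propagation and the grid itself are assumptions of this illustration, which need not coincide with the fit procedure actually used.

```python
import math

VUB2 = 0.004**2  # |V_ub|^2 with the approximate |V_ub| quoted in the text


def determinations(delta):
    """A, B, C as (value, error) from dataset (21) with 3-family unitarity."""
    scale = 1.0 + delta
    a = (0.22333 * scale, 0.00060 * scale)
    r, dr = 0.23130, 0.00050                 # ratio is independent of delta
    norm = math.sqrt(1.0 - VUB2)
    b = (norm * r / math.sqrt(1.0 + r * r), norm * dr / (1.0 + r * r) ** 1.5)
    vud, dvud = 0.97370 * scale, 0.00014 * scale
    c_value = math.sqrt(1.0 - VUB2 - vud * vud)
    c = (c_value, vud * dvud / c_value)
    return [a, b, c]


def chi2(delta):
    """chi^2 minimized over |V_us| at fixed delta (weighted-average profile)."""
    points = determinations(delta)
    weights = [1.0 / err**2 for _, err in points]
    mean = sum(w * val for w, (val, _) in zip(weights, points)) / sum(weights)
    return sum(((val - mean) / err) ** 2 for val, err in points)


# Grid scan of delta_mu from 0 to 2e-3 in steps of 1e-6.
chi2_min, delta_best = min((chi2(n * 1e-6), n * 1e-6) for n in range(2001))

print(f"best delta_mu = {delta_best:.5f}, chi2_min = {chi2_min:.1f}")
print(f"chi2 at delta_mu = 0: {chi2(0.0):.1f}")
```

The scan finds a minimum in the vicinity of the quoted \(\delta _\mu \), with \(\chi ^2\) close to the value of the unconstrained two-parameter fit, while \(\delta _\mu =0\) gives back the poor three-family fit.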

By imposing the unitarity condition \(\vert V_{ud} \vert ^2 + \vert V_{us} \vert ^2 = 1- \vert V_{ub} \vert ^2\), the list (21) can be transformed in \(\delta _\mu \) dependent determinations A, B, C of \(\vert V_{us} \vert \). Figure 6 shows these determinations for \(\delta _\mu =0.00076\). Taking into account that \(G_F/\sqrt{2} = g^2/8M_W^2 = 1/4v_\mathrm{w}^2\), where \(v_\mathrm{w} = 174\) GeV is the weak scale, and parametrizing similarly \(G_{{\mathcal {F}}}/\sqrt{2} = 1/4v_{{\mathcal {F}}}^2\), we see that \(\delta _\mu = G_{{\mathcal {F}}}/G_F = 0.00076\) corresponds to \(v_{{\mathcal {F}}}/v_\mathrm{w} = 36.3\), or to the flavor symmetry breaking scale \(v_{{\mathcal {F}}}= 6.3\) TeV. More widely, the \(1\sigma \) interval of the parameter \(\delta _\mu \) consistent with unitarity at the 68% C.L. is \(\delta _\mu = (7.6 \pm 1.6) \times 10^{-4}\) which corresponds to the new scale in the interval \(v_{{\mathcal {F}}}= [5.7 \div 7.1]\) TeV.
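Since \(G_F/\sqrt{2} = 1/4v_\mathrm{w}^2\) and \(G_{{\mathcal {F}}}/\sqrt{2} = 1/4v_{{\mathcal {F}}}^2\), one has \(\delta _\mu = v_\mathrm{w}^2/v_{{\mathcal {F}}}^2\), i.e. \(v_{{\mathcal {F}}}= v_\mathrm{w}/\sqrt{\delta _\mu }\). The scales quoted above follow from this relation, as the short sketch below shows (the function name is illustrative).

```python
import math

V_W = 174.0  # GeV, weak scale in the normalization used in the text


def flavor_scale(delta_mu, v_w=V_W):
    """Flavor symmetry breaking scale v_F (GeV) for a given delta_mu."""
    return v_w / math.sqrt(delta_mu)


central = flavor_scale(7.6e-4)
upper = flavor_scale((7.6 - 1.6) * 1e-4)   # smaller delta_mu, larger scale
lower = flavor_scale((7.6 + 1.6) * 1e-4)   # larger delta_mu, smaller scale

print(f"v_F / v_w = {central / V_W:.1f}")
print(f"v_F = {central / 1000.0:.1f} TeV, "
      f"1 sigma interval [{lower / 1000.0:.1f}, {upper / 1000.0:.1f}] TeV")
```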

Fig. 5

The same as in Fig. 2 but with the bands of \(\vert V_{ud}\vert \), \(\vert V_{us}\vert \) and \(\vert V_{us}/V_{ud} \vert \) taken as in (21) with \(1+\delta _\mu =1.00076\). The red line corresponds to 3-family unitarity (2) as in Fig. 2

Fig. 6

Determinations of \(\vert V_{us}\vert \) obtained from (21)

6. The non-abelian gauge horizontal flavor symmetry \(G_H\) between the fermion families can be the key for understanding the quark and lepton mass and mixing pattern [16,17,18,19,20]. Namely, the form of the Yukawa matrices \(Y_{u,d,e}\) in (12) can be determined by the \(G_H\) symmetry breaking pattern, i.e. by the VEV structure of the horizontal scalar fields (flavons) responsible for this breaking. Then the fermion mass hierarchy is related to the hierarchy between these VEVs. In Refs. [16, 17] this conjecture was coined as hypothesis of horizontal hierarchies (HHH). In this picture the fermion masses emerge from the higher order operators involving, besides the Higgs doublet \(\phi \), also flavon scalars which transfer their VEV structure to the Yukawa matrices \(Y_{u,d,e}\). These so called “projective” operators in the UV-complete renormalizable theory can be obtained via integrating out some extra heavy fields, scalars [18,19,20] or vector-like fermions [16, 17]. In particular, this concept implies that the fermion masses cannot emerge if \(G_H\) symmetry is unbroken. Thus, \(G_H\) cannot be a vector-like symmetry but it should have a chiral character transforming the LH and RH particle species in different representations. In particular, in Refs. [16,17,18,19,20,21,22,23,24,25,26] the horizontal symmetry \(G_H\) was considered as \(SU(3)_H\) with the LH fermions of the three families transforming as triplets and the RH ones as anti-triplets, as it is motivated by the grand unification.

However, in the Standard Model framework one has more possibilities. Namely, in the limit of vanishing Yukawa couplings \(Y_{u,d,e} \rightarrow 0\) in (12), the SM Lagrangian acquires a maximal global chiral symmetry \(U(3)_Q \times U(3)_u \times U(3)_d \times U(3)_\ell \times U(3)_e\) under which fermion species Q, u etc. transform as triplets of independent U(3) groups. It is tempting to consider that the non-abelian SU(3) factors of this maximal flavor symmetry are related to gauge symmetries (see Footnote 1).

Let us concentrate on the lepton sector and discuss the gauge symmetry \(SU(3)_\ell \times SU(3)_e\) [31] under which the LH and RH lepton fields transform as

$$\begin{aligned} \ell _{L\alpha } = \left( \begin{array}{c}{\nu _\alpha }\\ {e_\alpha } \end{array}\right) _L \sim (\mathbf {3}_\ell ,1), \quad e_{R\gamma } \sim (1,\mathbf {3}_e) \end{aligned}$$
(22)

where \(\alpha =1,2,3\) and \(\gamma =1,2,3\) are the indices of \(SU(3)_\ell \) and \(SU(3)_e\) respectively. This set of fermions is not anomaly free. The ways of the anomaly cancellation were discussed in Ref. [31] and in this letter we shall not concentrate on this issue (see Footnote 2).

For breaking \(SU(3)_\ell \times SU(3)_e\) we introduce flavon fields, three triplets \(\eta _{i\alpha }\) of \(SU(3)_\ell \) and three triplets \(\xi _{i\gamma }\) of \(SU(3)_e\), \(i=1,2,3\). Then the charged lepton masses emerge from the gauge invariant dimension–6 operator

$$\begin{aligned} \frac{y_{ij} }{{{\mathcal {M}}}^2}\, \eta _{i\alpha } \overline{\xi }_{j}^{\gamma } \phi \, \overline{ \ell _{L\alpha } } e_{R\gamma } \, + \, \mathrm{h.c.} \end{aligned}$$
(23)

where \(y_{ij}\) are order one constants, \(\phi \) is the Higgs doublet and \({{\mathcal {M}}}\) is a cutoff scale. In a UV-complete theory such operators can be induced via a seesaw-like mechanism by integrating out some heavy scalar or fermion states [16,17,18,19,20]. However, concrete model building is beyond the scope of this paper, and for our demonstration an effective operator analysis is sufficient. As for the neutrinos, their Majorana masses are induced by the higher order operator

$$\begin{aligned} \frac{h_{ij} }{{{\mathcal {M}}}_\nu ^3} \,\overline{\eta }_{i}^{\alpha }\overline{\eta }_{j}^{\beta } \, \phi \phi \, \ell ^T_{L\alpha } C \ell _{L\beta } \, + \, \mathrm{h.c.} \, \end{aligned}$$
(24)

where \(h_{ij} = h_{ji}\). The cutoff scale \({{\mathcal {M}}}_\nu \) of this operator is not necessarily the same as the scale \({{\mathcal {M}}}\) of operator (23).

In order to generate non-zero masses of all three leptons \(e,\mu ,\tau \), all three \(SU(3)_\ell \) flavons \(\eta _i\) as well as all three \(SU(3)_e\) flavons \(\xi _i\) should have non-zero VEVs pointing in different (non-aligned) directions. This means that the VEVs \(\langle \eta _{i\alpha } \rangle \) should form a rank-3 matrix. Without loss of generality, the flavon basis can be chosen so that the matrix \(\langle \eta _{i\alpha } \rangle \) is diagonal, \(\langle \eta _{i\alpha } \rangle = w_i \delta _{i\alpha }\), i.e. the flavon VEVs are orthogonal:

$$\begin{aligned} \langle \eta _1 \rangle = \left( \begin{array}{c} w_1 \\ 0 \\ 0 \end{array}\right) , \quad \langle \eta _2 \rangle = \left( \begin{array}{c} 0 \\ w_2 \\ 0 \end{array}\right) , \quad \langle \eta _3 \rangle = \left( \begin{array}{c} 0 \\ 0 \\ w_3 \end{array}\right) \end{aligned}$$
(25)

Analogously, for \(\xi \)-flavons we take \(\langle \xi _{i\gamma } \rangle = v_i \delta _{i\gamma }\). After plugging these VEVs into (23) we obtain the leptonic Yukawa matrices in the SM Lagrangian (12) as

$$\begin{aligned} Y_e^{ij} = y_{ij} \frac{w_i v_j}{{{\mathcal {M}}}^2} \end{aligned}$$
(26)

Since the couplings (23) should give the lepton mass hierarchy, we consider that the latter emerges due to the VEV hierarchy \(v_3 \gg v_2 \gg v_1\) in \(SU(3)_e\) symmetry breaking, i.e. \(v_3 : v_2 : v_1 \sim m_\tau : m_\mu : m_e\) as it is described in Ref. [31]. On the other hand, operator (24) should give the observed neutrino mass pattern, \(m_\nu ^{ij} = h_{ij} w_i w_j v_\mathrm{w}^2/{{\mathcal {M}}}_\nu ^3\), and in particular the large neutrino mixing. This implies that \(SU(3)_\ell \) breaking flavons \(\eta \) should have comparable VEVs, \(w_3\sim w_2 \sim w_1\).

Gauge bosons \({{\mathcal {F}}}^\mu _a\) of \(SU(3)_\ell \) associated to the Gell–Mann matrices \(\lambda _a\), \(a=1,2, \ldots 8\), interact as \(g {{\mathcal {F}}}^\mu _a J_{a\mu }\) with the respective currents \(J_{a\mu } = J_{a\mu }^{(e)} + J_{a\mu }^{(\nu )} = \frac{1}{2} \overline{\varvec{e}_L} \gamma _\mu \lambda _a {\varvec{e}_L} + \frac{1}{2} \overline{\varvec{\nu }_L} \gamma _\mu \lambda _a {\varvec{\nu }_L} \), where g is the gauge coupling constant, \({\varvec{e}_L} = (e_1,e_2,e_3)_L^T\) and \({\varvec{\nu }_L} = (\nu _1,\nu _2,\nu _3)_L^T\) respectively denote the family triplets of the LH charged leptons and neutrinos.

At low energies these couplings induce four-fermion (current \(\times \) current) interactions:

$$\begin{aligned} \mathcal {L}_\mathrm{eff} = - \frac{g^2}{2} J_{a}^{\mu }\, \left( M^2\right) ^{-1}_{ab} \, J_{b\mu } \end{aligned}$$
(27)

where \(M^2_{ab}\) is the squared mass matrix of gauge bosons \({{\mathcal {F}}}^\mu _a\) which in the flavon VEV basis (25) is essentially diagonal apart from a non-diagonal \(2\times 2\) block related to \({{\mathcal {F}}}^\mu _3\) - \({{\mathcal {F}}}^\mu _8\) mixing. Namely, the masses of \({{\mathcal {F}}}^\mu _{1,2}\), \({{\mathcal {F}}}^\mu _{4,5}\) and \({{\mathcal {F}}}^\mu _{6,7}\) are

$$\begin{aligned}&M_{1,2}^2 = \frac{g^2}{2}(w_2^2 + w_1^2) = \frac{g^2}{2}v_{{\mathcal {F}}}^2 , \nonumber \\&M_{4,5}^2 = \frac{g^2}{2}(w_3^2 + w_1^2), \quad M_{6,7}^2 = \frac{g^2}{2}(w_3^2 + w_2^2) . \end{aligned}$$
(28)

As for \({{\mathcal {F}}}^\mu _3\) and \({{\mathcal {F}}}^\mu _8\) they have a mass mixing and their mass matrix reads

$$\begin{aligned} M_{38}^2 = \frac{g^2}{2} \left( \begin{array}{cc}{ w_2^2 + w_1^2}&{}{\frac{1}{\sqrt{3}}(w_1^2 - w_2^2)}\\ {\frac{1}{\sqrt{3}}(w_1^2 - w_2^2)}&{}{\frac{1}{3}(4 w_3^2 + w_1^2 + w_2^2) } \end{array}\right) . \end{aligned}$$
(29)

Notice that if \(w_1=w_2=v_{{\mathcal {F}}}/\sqrt{2}\), this matrix becomes diagonal. In the following, for the simplicity of our demonstration, we analyze this case.Footnote 3 Then for the gauge boson masses we have \(M_a^2 = (g^2/2) (x_a v_{{\mathcal {F}}})^2\), where

$$\begin{aligned} x^2_{1,2,3} = 1, \quad x^2_{4,5,6,7} = \frac{r+1}{2} , \quad x^2_8 = \frac{2r+1}{3} \end{aligned}$$
(30)

and \(r = 2 w_3^2/v_{{\mathcal {F}}}^2\). Then operators (27) can be rewritten as \(\mathcal {L}_\mathrm{eff} = \mathcal {L}_\mathrm{eff}^{e\nu } + \mathcal {L}_\mathrm{eff}^{ee} + \mathcal {L}_\mathrm{eff}^{\nu \nu }\) where

$$\begin{aligned}&\mathcal {L}_\mathrm{eff}^{e\nu } = - \frac{2G_{{\mathcal {F}}}}{\sqrt{2} }\, \sum _{a=1}^8 \big (\overline{\varvec{e}_L} \, \gamma ^\mu \frac{\lambda _a}{x_a} \, {\varvec{e}_L}\big ) \big (\overline{\varvec{\nu }_L} \, \gamma _\mu \frac{\lambda _a}{x_a} \, {\varvec{\nu }_L} \big ) \nonumber \\&\mathcal {L}_\mathrm{eff}^{ee} = - \frac{G_{{\mathcal {F}}}}{\sqrt{2}}\, \sum _{a=1}^8 \big (\overline{\varvec{e}_L} \, \gamma _\mu \frac{\lambda _a}{x_a} \, {\varvec{e}_L} \big )^2 \nonumber \\&\mathcal {L}_\mathrm{eff}^{\nu \nu } = - \frac{G_{{\mathcal {F}}}}{\sqrt{2}}\, \sum _{a=1}^8 \big (\overline{\varvec{\nu }_L}\, \gamma _\mu \frac{\lambda _a}{x_a} \, {\varvec{\nu }_L} \big )^2 \end{aligned}$$
(31)

where \(4G_{{\mathcal {F}}}/\sqrt{2} = 1/v_{{\mathcal {F}}}^2\). Obviously, the factor \(g^2/2\) in operators cancels out and the strength of these operators is determined solely by the VEVs (25).
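As a numerical cross-check of Eqs. (28)–(30), the following Python sketch evaluates the squared gauge boson masses in units of \(g^2/2\) for a given set of flavon VEVs and compares them with \(x_a^2 v_{{\mathcal {F}}}^2\). The function names and the sample VEVs are illustrative assumptions, not values taken from the model.

```python
import math


def squared_masses(w1, w2, w3):
    """Squared masses of the SU(3)_l gauge bosons in units of g^2/2,
    following Eqs. (28) and (29). Returns the diagonal entries and the
    F_3 - F_8 mixing term."""
    return {
        "M12": w1**2 + w2**2,
        "M45": w3**2 + w1**2,
        "M67": w3**2 + w2**2,
        "M33": w1**2 + w2**2,
        "M38": (w1**2 - w2**2) / math.sqrt(3),
        "M88": (4 * w3**2 + w1**2 + w2**2) / 3,
    }


def x_squared(r):
    """Mass ratios x_a^2 of Eq. (30), valid for w1 = w2 = v_F / sqrt(2)."""
    return {"x123": 1.0, "x4567": (r + 1) / 2, "x8": (2 * r + 1) / 3}


# Illustrative (hypothetical) VEVs in arbitrary units with v_F = 1.
v_f = 1.0
w1 = w2 = v_f / math.sqrt(2)
w3 = 1.2
r = 2 * w3**2 / v_f**2

masses = squared_masses(w1, w2, w3)
ratios = x_squared(r)
print("F3-F8 mixing term:", masses["M38"])
print("M45^2 vs x^2 v_F^2:", masses["M45"], ratios["x4567"] * v_f**2)
print("M88^2 vs x^2 v_F^2:", masses["M88"], ratios["x8"] * v_f**2)
```

For \(w_1=w_2\) the mixing term vanishes and the diagonal entries coincide with Eq. (30), which is the simplification used in the text.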

The first term \(\mathcal {L}_\mathrm{eff}^{e\nu }\) contains operator (18) which contributes to the muon decay \(\mu \rightarrow e \nu _\mu \bar{\nu }_e\) as \(G_\mu = G_F + G_{{\mathcal {F}}}\). It is induced by exchange of gauge bosons \({{\mathcal {F}}}_1^\mu \) and \({{\mathcal {F}}}_2^\mu \), or more precisely by the combination \(({{\mathcal {F}}}_1^\mu \pm i{{\mathcal {F}}}_2^\mu )/\sqrt{2}\), as in second diagram of Fig. 4. As it was pointed out in the previous section, for restoring the CKM unitarity one needs \(\delta _\mu = G_{{\mathcal {F}}}/G_F = (v_\mathrm{w}/v_{{\mathcal {F}}})^2\) to be around \( 7\times 10^{-4}\) which corresponds to the flavor scale \(v_{{\mathcal {F}}}= 6\div 7\) TeV.
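The conversion between \(\delta _\mu \) and the flavor scale can be sketched numerically as follows. The sketch assumes the normalization \(4G_F/\sqrt{2} = 1/v_\mathrm{w}^2\) (the analogue of the relation for \(G_{{\mathcal {F}}}\) above) and uses the muon-decay value of the Fermi constant as input; the sub-per-mille difference between \(G_F\) and \(G_\mu \) is neglected here. Function names are illustrative.

```python
import math

G_MU = 1.1663787e-5  # GeV^-2, Fermi constant from muon decay (assumed input)


def electroweak_vev(g_fermi=G_MU):
    """v_w in GeV from 4 G_F / sqrt(2) = 1 / v_w^2."""
    return 1.0 / math.sqrt(2.0 * math.sqrt(2.0) * g_fermi)


def flavor_scale(delta_mu, g_fermi=G_MU):
    """v_F in GeV from delta_mu = (v_w / v_F)^2."""
    return electroweak_vev(g_fermi) / math.sqrt(delta_mu)


print("v_w [GeV]:", electroweak_vev())
print("v_F [TeV] for delta_mu = 7e-4:", flavor_scale(7e-4) / 1000.0)
```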

Similar operators in \(\mathcal {L}_\mathrm{eff}^{e\nu }\) mediated by the gauge bosons \({{\mathcal {F}}}_{4,5}^\mu \) and \({{\mathcal {F}}}_{6,7}^\mu \) contribute to the leptonic tau decays \(\tau \rightarrow e \nu _\tau \bar{\nu }_e\) and \(\tau \rightarrow \mu \nu _\tau \bar{\nu }_\mu \), whose rates are well consistent with the SM predictions [33]. In the case \(w_{1,2,3}\sim v_{{\mathcal {F}}}\) but \(w_1\ne w_2\), the ratio \(\Gamma (\tau \rightarrow \mu \nu _\tau \bar{\nu }_\mu )/\Gamma (\tau \rightarrow e \nu _\tau \bar{\nu }_e)\) can deviate from the SM prediction at order \(G_{{\mathcal {F}}}/G_F \sim \delta _\mu \), which can be experimentally testable. For comparison, the present experimental value of this ratio is 0.9762(28) [5], which is \(1.3\sigma \) larger than the SM predicted value 0.9726. In addition, the terms in \(\mathcal {L}_\mathrm{eff}^{e\nu }\) with the diagonal generators \(\lambda _3\) and \(\lambda _8\) give rise to non-standard neutrino interactions with leptons. However, the respective coupling constants are of the order of \(G_{{\mathcal {F}}}= \delta _\mu G_F\), and hence well below the experimental constraints.

The last term \(\mathcal {L}_\mathrm{eff}^{\nu \nu }\) in (31) contains the non-standard interactions between neutrinos, but present experimental limits on the neutrino self-interactions are very weak. However, the second term \(\mathcal {L}_\mathrm{eff}^{ee}\) in (31), which contains charged leptons, is in principle testable for the scale \(v_{{\mathcal {F}}}\) of a few TeV.

Interestingly, if the flavor eigenstates \(e_{1},e_{2},e_{3}\) are the mass eigenstates \(e,\mu ,\tau \), the terms (31) do not contain any LFV operators inducing processes like \(\mu \rightarrow 3e\), \(\tau \rightarrow 3\mu \) etc. However, the lepton flavor-conserving contact operators \(-\frac{4\pi }{\Lambda _L^2}(\overline{e_L}\gamma _\mu e_L)^2\), \(-\frac{2\pi }{\Lambda _L^2}(\overline{e_L}\gamma ^\mu e_L)(\overline{\mu _L}\gamma _\mu \mu _L)\), etc. are restricted by the ‘compositeness’ limits \(\Lambda ^-_L(eeee) > 10.3\) TeV and \(\Lambda ^-_L(ee\mu \mu ) > 9.5\) TeV. Comparing these operators with the corresponding terms in (31) and taking into account the relations (30), the ‘compositeness’ scales can be expressed in terms of the scale \(v_{{\mathcal {F}}}\). Hence, we obtain the limit

$$\begin{aligned} v_{{\mathcal {F}}}> \left( \frac{r+1}{r+0.5} \right) ^{1/2} \times 2.1 ~ \mathrm{TeV} . \end{aligned}$$
(32)

Here the r–dependent pre-factor approaches 1 when \(r\gg 1\) and it becomes \(\sqrt{2}\) in the opposite limit \(r \ll 1\). Thus, the strongest limit emerges in the latter case, \(v_{{\mathcal {F}}}> 3\) TeV or so, which is anyway fulfilled for our benchmark range \(v_{{\mathcal {F}}}\simeq (6\div 7)\) TeV.
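The behaviour of the bound (32) in the two limits can be checked with a short Python sketch. It simply evaluates the right-hand side of (32) as written; the function name is an illustrative assumption.

```python
import math


def v_f_lower_bound_tev(r, base_tev=2.1):
    """Right-hand side of Eq. (32) in TeV for a given VEV ratio r."""
    return math.sqrt((r + 1.0) / (r + 0.5)) * base_tev


# Scan from r << 1 to r >> 1.
for r in (1e-3, 0.5, 1.0, 2.0, 1e3):
    print(f"r = {r:g}: v_F > {v_f_lower_bound_tev(r):.2f} TeV")
```

The bound interpolates monotonically between about \(\sqrt{2}\times 2.1\) TeV at small r and 2.1 TeV at large r.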

The flavor eigenstates \(e_{1},e_{2},e_{3}\) coincide with the mass eigenstates \(e,\mu ,\tau \), if the Yukawa matrix \(Y_e^{ij}\) in (26) is diagonal. This can be achieved by imposing some additional discrete symmetries between the flavons \(\eta _i\) and \(\xi _i\) of \(SU(3)_\ell \) and \(SU(3)_e\) sectors which would forbid the non-diagonal terms \(y_{ij}\) in operator (23). However, in the general case the initial flavor basis of the LH leptons is related to the mass basis by the unitary transformation

$$\begin{aligned} \left( \begin{array}{c} e_1\\ e_2\\ e_3 \end{array}\right) _{L} = U_L \left( \begin{array}{c} e\\ \mu \\ \tau \end{array}\right) _{L} =\left( \begin{array}{ccc} U_{1e} &{} U_{1\mu } &{} U_{1\tau } \\ U_{2e} &{}U_{2\mu }&{} U_{2\tau }\\ U_{3e}&{} U_{3\mu }&{} U_{3\tau } \end{array}\right) \left( \begin{array}{c} e\\ \mu \\ \tau \end{array}\right) _{L} \end{aligned}$$
(33)

Then, in the basis of mass eigenstates, the operators \(\mathcal {L}_\mathrm{eff}^{ee}\) read as in (31) but with the substitution \(\lambda _a/x_a \rightarrow U^\dagger (\lambda _a/x_a) U\). Interestingly, in the limit \(r=1\), i.e. when the VEVs \(w_{1,2,3}\) are equal and so \(x_a =1\), all flavor bosons \({{\mathcal {F}}}_a^\mu \) have equal masses, and the substitution \(\lambda _a \rightarrow U^\dagger \lambda _a U\) is simply a basis redefinition of the Gell-Mann matrices. Therefore, no LFV effects will emerge in this case since the global \(SO(8)_\ell \) symmetry acts as a custodial symmetry. Namely, by Fierz transformations, using also the Fierz identities for the Gell-Mann matrices, we obtain

$$\begin{aligned} - \frac{G_{{\mathcal {F}}}}{\sqrt{2}}\, \sum _{a=1}^8 \big (\overline{\varvec{e}_L} \gamma _\mu \lambda _a {\varvec{e}_L} \big )^2 = - \frac{4}{3} \frac{G_{{\mathcal {F}}}}{\sqrt{2}}\, \big (\overline{\varvec{e}_L} \gamma _\mu {\varvec{e}_L} \big )^2 \end{aligned}$$
(34)

Obviously, the latter expression is invariant under the unitary transformation (33).
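The Gell-Mann matrix identity underlying (34) is the completeness relation \(\sum _a (\lambda _a)_{ij} (\lambda _a)_{kl} = 2\delta _{il}\delta _{jk} - \frac{2}{3}\delta _{ij}\delta _{kl}\). A minimal Python sketch verifying it numerically is given below; the Fierz rearrangement of the left-handed currents, which turns the first term into \(2(\overline{\varvec{e}_L} \gamma _\mu {\varvec{e}_L})^2\) and thus yields the factor \(2 - 2/3 = 4/3\), is not checked by the code. Function names are illustrative.

```python
import math


def gell_mann_matrices():
    """The eight Gell-Mann matrices as nested lists of complex numbers."""
    s = 1.0 / math.sqrt(3.0)
    I = 1j
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -I], [0, I, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]


def completeness_residual():
    """Largest deviation from
    sum_a (l_a)_ij (l_a)_kl = 2 d_il d_jk - (2/3) d_ij d_kl."""
    lams = gell_mann_matrices()
    worst = 0.0
    for i in range(3):
        for j in range(3):
            for k in range(3):
                for l in range(3):
                    lhs = sum(lam[i][j] * lam[k][l] for lam in lams)
                    rhs = 2.0 * (i == l) * (j == k) - (2.0 / 3.0) * (i == j) * (k == l)
                    worst = max(worst, abs(lhs - rhs))
    return worst


print("max residual:", completeness_residual())
```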

In the general case \(r\ne 1\), the mixing (33) gives rise to LFV operators such as the one inducing \(\mu \rightarrow 3e\) decay:

$$\begin{aligned}&-\frac{4G_{\mu eee} }{\sqrt{2}} \big (\overline{e_L} \gamma ^\mu \mu _L\big )\big (\overline{e_L} \gamma _\mu e_L\big ) \, + \, \mathrm{h.c.} , \nonumber \\&\frac{4G_{\mu eee} }{\sqrt{2}} = \frac{C(r) }{2v_{{\mathcal {F}}}^2} \left[ 1 + \frac{1-r}{r} \vert U_{3e} \vert ^2 \right] U^*_{3e} U_{3\mu } , \end{aligned}$$
(35)

where the function \(C(r) = (r-1) r \big [(r+1)(r+0.5)\big ]^{-1}\) is limited as \(\vert C(r) \vert < 1\), reaching the maximal value at \(r\gg 1\), and it vanishes at \(r=1\). Then, taking \(\vert U_{3e} \vert \ll 1\), we obtain for the branching ratio of \(\mu \rightarrow 3e\) decay

$$\begin{aligned} \frac{\Gamma (\mu \rightarrow ee\bar{e})}{\Gamma (\mu \rightarrow e\nu _\mu \bar{\nu }_e)} = \frac{1}{2} \left| \frac{G_{\mu eee}}{G_F} \right| ^2 = \frac{1}{8} \left( \delta _\mu C(r) \vert U^*_{3e} U_{3\mu } \vert \right) ^2\nonumber \\ \end{aligned}$$
(36)

The experimental upper bound on this branching ratio is \(10^{-12}\) [5]. Taking \(\delta _\mu = (v_\mathrm{w}/v_{{\mathcal {F}}})^2 = 7\times 10^{-4}\), the limit \(\delta _\mu \vert C U^*_{3e} U_{3\mu } \vert /\sqrt{8} < 10^{-6}\) translates into \(\vert C U^*_{3e} U_{3\mu } \vert < 0.4 \times 10^{-2}\) which is nicely satisfied if the lepton mixing angles in (33) are comparable with the CKM mixing angles in (1) or even larger. E.g. if the VEV ratio is in between \(r=0.5\div 1.5\), then \(\vert C(r) \vert < 1/7\) so that \(\vert U^*_{3e} U_{3\mu } \vert < (1/6)^2\) or so would suffice for properly suppressing the \(\mu \rightarrow 3e\) decay rate. This means that in this case the matrix elements \(\vert U_{3\mu } \vert \) and \(\vert U_{3e} \vert \) can be almost as large as the Cabibbo angle \(\sin \theta _C = \vert V_{us} \vert \). The experimental limits on other LFV effects such as \(\tau \rightarrow 3\mu \) are weaker, and following the lines of Ref. [31] one can show that in our model with \(v_{{\mathcal {F}}}\simeq 6\) TeV or so, they are fulfilled even for arbitrarily large mixings in (33). Once again, for \(r=1\) all LFV effects vanish owing to the custodial symmetry, see Eq. (34).
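The translation of the branching ratio limit into a bound on the mixing can be reproduced with the sketch below, which inverts Eq. (36) and evaluates the function \(C(r)\) defined after Eq. (35). The function names are illustrative, and the approximation \(\vert U_{3e} \vert \ll 1\) used in (36) is assumed.

```python
import math


def c_of_r(r):
    """C(r) = (r - 1) r / [(r + 1)(r + 0.5)], as defined after Eq. (35)."""
    return (r - 1.0) * r / ((r + 1.0) * (r + 0.5))


def max_c_times_mixing(br_limit=1e-12, delta_mu=7e-4):
    """Upper bound on |C U*_3e U_3mu| from
    BR = (1/8) (delta_mu |C U*_3e U_3mu|)^2 < br_limit."""
    return math.sqrt(8.0 * br_limit) / delta_mu


def max_mixing(r, br_limit=1e-12, delta_mu=7e-4):
    """Upper bound on |U*_3e U_3mu| for a given r (unbounded at r = 1)."""
    c = abs(c_of_r(r))
    return math.inf if c == 0.0 else max_c_times_mixing(br_limit, delta_mu) / c


print("bound on |C U U|:", max_c_times_mixing())
for r in (0.5, 1.0, 1.5, 10.0):
    print(f"r = {r}: C = {c_of_r(r):+.3f}, |U*_3e U_3mu| < {max_mixing(r):.3g}")
```

At \(r=1\) the coefficient vanishes and no bound on the mixing follows, in line with the custodial symmetry argument.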

7. Let us discuss briefly how the hypothesis \(G_\mu \ne G_F\) could affect the SM precision tests. In the SM, at tree level, the weak gauge boson masses are \(M_W= gv_\mathrm{w}/\sqrt{2}=e v_\mathrm{w}/\sqrt{2} \sin \theta _W\) and \(M_Z=M_W/\cos \theta _W\) where \(\theta _W\) is the weak angle. For precision tests, the radiative corrections, which depend also on the top quark and Higgs masses, are important.

The world averages of experimentally measured masses of Z and W reported by PDG 2018 are [5]:

$$\begin{aligned}&M_Z^\mathrm{exp} = 91.1876(21)\,\mathrm{GeV}, \nonumber \\&M_W^\mathrm{exp} = 80.379(12)\,\mathrm{GeV}, \end{aligned}$$
(37)

while the SM global fit yields the following values:

$$\begin{aligned}&M_Z^\mathrm{SM} = 91.1884(20)~\mathrm{GeV}, \nonumber \\&M_W^\mathrm{SM} = 80.358(4)~\mathrm{GeV}. \end{aligned}$$
(38)

Hence, the theoretical and experimental values of Z-mass are in perfect agreement while for W-boson the two values have about \(1.6\sigma \) discrepancy:

$$\begin{aligned} M_W^\mathrm{exp} - M_W^\mathrm{SM}= (21 \pm 13)~\mathrm{MeV} \end{aligned}$$
(39)

In the SM the mass of W-boson, including radiative corrections, is determined as

$$\begin{aligned} M_W = \frac{A_0}{\hat{s}_Z (1-\Delta {\hat{r}}_W)^{1/2} } \end{aligned}$$
(40)

where \(A_0 = (\pi \alpha /\sqrt{2} G_F)^{1/2} = 37.28039(1)\) GeV taking \(G_F=G_\mu \), the factor \(1-\Delta {\hat{r}}_W=0.93084(8)\) includes the main radiative corrections and \(\hat{s}_Z^2 =1.0348(2) s_W^2\) is the corrected value of \(\sin ^2 \theta _W(M_Z)\) by including the top and Higgs mass dependent corrections. The theoretical mass \(M_W=80.358(4)\) GeV (38) is then obtained by substituting in (40) the value \(\hat{s}_Z^2 = 0.23122(3)\) obtained from the SM global fit [5]. In our scenario, however, \(G_F \ne G_\mu \). Should we just set in \(A_0\) instead of \(G_F=G_\mu \) the “corrected” value \(G_F = (1+ \delta _\mu )^{-1} G_\mu \), then \(A_0\) should be rescaled by a factor \((1+\delta _\mu )^{1/2}\), and correspondingly the “theoretical” value of \(M_W\) (40) too. In particular, for \(\delta _\mu = 7\times 10^{-4}\) we would get \(M_W=80.386\) GeV, right in the ballpark of the experimental value (37). However, this is not the right thing to do.
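The naive estimate just described can be reproduced with a few lines of Python. The sketch evaluates Eq. (40) with the inputs quoted above, once with \(G_F=G_\mu \) and once with \(A_0\) rescaled by \((1+\delta _\mu )^{1/2}\) at fixed \(\hat{s}_Z^2\); the function name is an illustrative assumption.

```python
import math

A0 = 37.28039             # GeV, (pi alpha / (sqrt(2) G_F))^(1/2) with G_F = G_mu
ONE_MINUS_DR_W = 0.93084  # 1 - Delta r_W
S2_Z_FIT = 0.23122        # s_Z^2 from the SM global fit


def m_w_eq40(a0, s2_z=S2_Z_FIT, one_minus_dr=ONE_MINUS_DR_W):
    """W mass in GeV from Eq. (40)."""
    return a0 / math.sqrt(s2_z * one_minus_dr)


delta_mu = 7e-4
m_w_sm = m_w_eq40(A0)
# Naive step: rescale A0 only, keeping s_Z^2 at its SM fit value.
m_w_naive = m_w_eq40(A0 * math.sqrt(1.0 + delta_mu))

print(f"M_W (G_F = G_mu):    {m_w_sm:.3f} GeV")
print(f"M_W (naive rescale): {m_w_naive:.3f} GeV")
```

The shift of about 28 MeV arises only because \(\hat{s}_Z^2\) is held fixed in this estimate.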

In the global fit of the SM, \(M_Z\) is one of the input parameters with the smallest experimental errors, along with the fine structure constant \(\alpha \) and the “muon” Fermi constant \(G_\mu \). Essentially, this is the main reason for the good agreement between \(M_Z^\mathrm{exp}\) and \(M_Z^\mathrm{SM}\). In fact, the SM implies the relation

$$\begin{aligned} M_Z = \frac{M_W}{ \hat{c}_Z \hat{\rho }^{1/2}} = \frac{A_0}{\hat{s}_Z \hat{c}_Z (1-\Delta {\hat{r}}_W)^{1/2} \hat{\rho }^{1/2} } \end{aligned}$$
(41)

where \(\hat{\rho }=1+\rho _t + \delta \rho = 1.01013(5)\) includes the weak isospin breaking effects, dominantly from the quadratic \(m_t\) dependent corrections \(\rho _t = 3G_Fm_t^2/8\sqrt{2}\pi ^2\). Therefore, taking the experimental value of Z-mass (37), Eq. (41) can be used for determination of \(\hat{s}_Z^2\) parameter, \(\hat{s}_Z^2=0.23123(3)\). This, in turn, from \(M_W= M_Z \hat{\rho }^{1/2} \hat{c}_Z\) gives \(M_W=80.357(4)_\mathrm{SM}\) GeV, i.e. practically the same as the global fit result (38). This is because the determination of the parameter \(\hat{s}_Z^2\) in the SM global fit is dominated by the results of Z-pole measurements.

However, in our scenario rescaling \(A_0 \rightarrow A_0(1+\delta _\mu )^{1/2}\) changes the value of \(\hat{s}_Z^2\). In particular, taking \(\delta _\mu = (7.6\pm 1.6)\times 10^{-4}\), we get \(\hat{s}_Z^2=0.23148(3)_\mathrm{SM}(5)_{\delta _\mu }\). Then, again from \(M_W= M_Z \hat{\rho }^{1/2} \hat{c}_Z\), we get \(M_W=80.344(4)_\mathrm{SM}(3)_\mathrm{\delta _\mu }\) GeV. Thus, unfortunately, while the effect is there, in reality it goes in the opposite direction. So, our determination of \(M_W\) differs from \(M_W^\mathrm{SM}\), \(M_W^\mathrm{SM} - M_W^\mathrm{our} = (13 \pm 3)\) MeV. Thus, with \(M_W^\mathrm{SM}\) already being in tension with the experimental value (37), our result has more tension: \(M_W^\mathrm{exp} - M_W^\mathrm{our}= (35 \pm 13)\) MeV (\(2.7\sigma \)).Footnote 4 If the tension increases with future precision, this would mean that one has to admit at least some minimal step beyond the SM. The relation between W and Z masses can be improved by increasing the \(\rho \)-parameter via e.g. the VEV \(\sim 1\) GeV of a scalar triplet of the electroweak \(SU(2) \times U(1)\), or by diminishing the Z mass by a few MeV e.g. via its mixing with some extra gauge bosons like \(Z'\) at the TeV scale or perhaps also with the flavor gauge bosons considered in the previous section.
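The Z-pole-constrained determination can be sketched in Python as follows. The sketch solves Eq. (41) for \(\hat{s}_Z^2\) with the measured Z mass as input and then evaluates \(M_W= M_Z \hat{\rho }^{1/2} \hat{c}_Z\). It assumes, as a simplification, that \(1-\Delta {\hat{r}}_W\) and \(\hat{\rho }\) keep the values quoted in the text when \(A_0\) is rescaled; function names are illustrative.

```python
import math

M_Z_EXP = 91.1876         # GeV, Eq. (37)
A0 = 37.28039             # GeV, evaluated with G_F = G_mu
ONE_MINUS_DR_W = 0.93084  # 1 - Delta r_W, held fixed (assumption)
RHO_HAT = 1.01013         # rho parameter of Eq. (41), held fixed (assumption)


def s2_from_z_mass(delta_mu=0.0):
    """Solve Eq. (41) for s_Z^2 with A0 -> A0 (1 + delta_mu)^(1/2)."""
    a0 = A0 * math.sqrt(1.0 + delta_mu)
    s_times_c = a0 / (M_Z_EXP * math.sqrt(ONE_MINUS_DR_W * RHO_HAT))
    # s^2 (1 - s^2) = (s c)^2; take the root with s^2 < 1/2.
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * s_times_c**2))


def m_w_from_z_mass(delta_mu=0.0):
    """M_W = M_Z rho^(1/2) c_Z in GeV."""
    return M_Z_EXP * math.sqrt(RHO_HAT * (1.0 - s2_from_z_mass(delta_mu)))


for d in (0.0, 7.6e-4):
    print(f"delta_mu = {d:g}: s_Z^2 = {s2_from_z_mass(d):.5f}, "
          f"M_W = {m_w_from_z_mass(d):.3f} GeV")
```

Because a larger \(A_0\) at fixed \(M_Z\) requires a larger \(\hat{s}_Z^2\), the resulting \(M_W\) decreases, which is the sign of the effect stated in the text.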

8. The value \(\vert V_{ud} \vert \) can be extracted also from free neutron decay by combining the results on the measurements of the neutron lifetime \(\tau _n\) with those of the axial current coupling constant \(g_A\). The master formula reads (see e.g. the recent review [34]):

$$\begin{aligned} \vert V_{ud} \vert ^2= & {} \frac{K/ \ln 2 }{ G_F^2 {{\mathcal {F}}}_n \tau _n \, (1+3g_A^2) (1+ \Delta _R^V)} \nonumber \\= & {} \frac{5024.46(30)~\mathrm{s}}{\tau _n (1+3g_A^2)(1+ \Delta _R^V)} \end{aligned}$$
(42)

where \({{\mathcal {F}}}_n = f_n(1+\delta '_R)\) is the neutron f-value \(f_n= 1.6887(1)\) corrected by the long-distance QED correction \(\delta '_R = 0.01402(2)\) [35]. This equation, taking the values \(\tau _n=880.2\pm 1.0\) s, \(g_A=1.2724\pm 0.0023\) adopted in PDG 2018 [5], and \(\Delta _R^V = 0.02361(38)\) [7], would give the value

$$\begin{aligned} \vert V_{ud} \vert = 0.97577(55)_{\tau _n} (146)_{g_A}(18)_{\Delta _R^V} = 0.97577(157) \end{aligned}$$
(43)

which has an order of magnitude larger error than \( \vert V_{ud} \vert = 0.97420(10)_{{{\mathcal {F}}}t} (18)_{\Delta _R^V} = 0.97420(21)\) obtained from (6) and used in (9), due to large uncertainties in \(\tau _n\) and \(g_A\).
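The value (43) and its error budget can be reproduced from the second line of Eq. (42) with the sketch below. Each partial uncertainty is estimated by shifting one input by its quoted \(1\sigma \) error, which is our simplifying assumption (inputs treated as uncorrelated); function names are illustrative.

```python
import math


def v_ud_from_neutron(tau_n, g_a, delta_rv, k_s=5024.46):
    """|V_ud| from the second line of Eq. (42); tau_n and k_s in seconds."""
    return math.sqrt(k_s / (tau_n * (1.0 + 3.0 * g_a**2) * (1.0 + delta_rv)))


def v_ud_error_budget(tau_n, g_a, delta_rv):
    """Each argument is a (central, sigma) pair. Returns the central value,
    the three partial errors and their quadrature sum."""
    central = v_ud_from_neutron(tau_n[0], g_a[0], delta_rv[0])
    d_tau = abs(v_ud_from_neutron(tau_n[0] + tau_n[1], g_a[0], delta_rv[0]) - central)
    d_ga = abs(v_ud_from_neutron(tau_n[0], g_a[0] + g_a[1], delta_rv[0]) - central)
    d_rv = abs(v_ud_from_neutron(tau_n[0], g_a[0], delta_rv[0] + delta_rv[1]) - central)
    total = math.sqrt(d_tau**2 + d_ga**2 + d_rv**2)
    return central, d_tau, d_ga, d_rv, total


# PDG 2018 inputs quoted in the text.
result = v_ud_error_budget((880.2, 1.0), (1.2724, 0.0023), (0.02361, 0.00038))
print("|V_ud| = %.5f, errors: tau_n %.5f, g_A %.5f, Delta_R^V %.5f, total %.5f" % result)
```

The budget shows that the \(g_A\) uncertainty dominates for these inputs.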

However, rather than for determination of \(\vert V_{ud} \vert \), Eq. (42) can be used for a consistency check. Namely, by comparing it with Eq. (6) we get a relation between \(\tau _n\) and \(g_A\) [36]:

$$\begin{aligned} \tau _n = \frac{2 {{\mathcal {F}}}t}{\ln 2 \, {{\mathcal {F}}}_n (1+3g_A^2)} = \frac{5172.0(1.1) ~\mathrm{s} }{1+3g_A^2} \end{aligned}$$
(44)

In Fig. 7 this relation is shown by the red band. This formula is very accurate since the common factors in Eqs. (6) and (42) including the Fermi constant and radiative corrections \(\Delta _R^V\) cancel out.

Fig. 7. The red band shows the precision relation (44) between \(g_A\) and \(\tau _n\). Black triangles with horizontal error bars show values of \(g_A\) reported in Refs. [37,38,39] and the vertical grey band corresponds to their average (45). Green circles show values of \(\tau _n\) reported by trap experiments [42,43,44,45,46,47,48,49,50] with respective error bars and the horizontal green band shows their average (47). Blue squares and the blue horizontal band show the same for beam experiments [51, 52].

For the axial current coupling \(g_A\), the PDG 2018 quotes the value \(g_A=1.2724\pm 0.0023\). However, the results of the latest and most accurate experiments [37,38,39] which measured the \(\beta \)-asymmetry parameter using different techniques (the cold neutrons in PERKEO II and PERKEO III experiments [37, 39] and ultra-cold neutrons in the UCNA experiment [38]), are in perfect agreement with each other, and their average determines the axial current coupling \(g_A\) with impressive (better than one per mille) precision:

$$\begin{aligned} g_A = 1.27625 \pm 0.00050 . \end{aligned}$$
(45)

Figure 7 shows the results of Refs. [37,38,39] and their average (vertical grey band). For \(g_A\) in this range Eq. (44) gives the Standard Model prediction for the neutron lifetime

$$\begin{aligned} \tau _n^\mathrm{SM} = 878.7 \pm 0.6~\mathrm{s} \end{aligned}$$
(46)
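A minimal Python sketch of this prediction evaluates the second form of Eq. (44) for the average (45) and propagates the two quoted uncertainties linearly, treating them as uncorrelated (our assumption). The function name is illustrative; the central value obtained this way agrees with (46) within the quoted error.

```python
import math


def tau_n_sm(g_a=1.27625, sigma_g_a=0.00050, numerator_s=5172.0, sigma_numerator_s=1.1):
    """Neutron lifetime in seconds from Eq. (44), with linear error propagation."""
    denom = 1.0 + 3.0 * g_a**2
    tau = numerator_s / denom
    rel_g = 6.0 * g_a * sigma_g_a / denom      # relative error from g_A
    rel_n = sigma_numerator_s / numerator_s    # relative error from the numerator
    return tau, tau * math.hypot(rel_g, rel_n)


tau, sigma = tau_n_sm()
print(f"tau_n(SM) = {tau:.1f} +/- {sigma:.1f} s")
```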

From the experimental side, the neutron lifetime is measured in two types of experiments. The trap experiments measure the disappearance rate of the ultra-cold neutrons (UCN) by counting the survived neutrons after storing them for different times in the UCN traps and determine the neutron decay width \(\Gamma _n = \tau _n^{-1}\). The beam experiments are the appearance experiments, measuring the width of \(\beta \)-decay \(n\rightarrow pe{\bar{\nu }}_e\), \(\Gamma _\beta =\tau _\beta ^{-1}\), by counting the produced protons in the monitored beam of cold neutrons. In the Standard Model the neutron decay should always produce a proton, and so both methods should measure the same value \(\Gamma _n = \Gamma _\beta \).

However, there is tension between the results obtained using these two methods, which was pointed out in Refs. [40, 41]. Figure 7 clearly demonstrates the discrepancy. Namely, by averaging the presently available results of eight trap experiments [42,43,44,45,46,47,48,49,50] one obtains:Footnote 5

$$\begin{aligned} \tau _\mathrm{trap} = 879.4 \pm 0.6~\mathrm{s} , \end{aligned}$$
(47)

which is compatible with \(\tau _n^\mathrm{SM} \) (46). In particular, this value of \(\tau _n\) together with new \(g_A\) (45) and new value \(\Delta _R^V = 0.02467(22)\) [8], determines \(\vert V_{ud} \vert \) with the precision more than 3 times better than in (43):

$$\begin{aligned} \vert V_{ud} \vert = 0.97327(33)_{\tau _n} (32)_{g_A}(10)_{\Delta _R^V} = 0.97327(47). \end{aligned}$$
(48)

This is compatible with \(\vert V_{ud} \vert = 0.97370(10)_{{{\mathcal {F}}}t} (10)_{\Delta _R^V} = 0.97370(14)\) from superallowed \(0^+-0^+\) decays used in (10) but has a 3 times larger error than the latter. For making it competitive with the latter determination, the neutron lifetime should be measured with a precision of 0.1 s and \(g_A\) with a precision 3 times better than in (45), which can be realistic in future experiments.

On the other hand, the beam experiments [51, 52] yield the value

$$\begin{aligned} \tau _\mathrm{beam} = 888.0 \pm 2.0~\mathrm{s} \, \end{aligned}$$
(49)

which is \(4.4\sigma \) away from the SM prediction (46). Therefore, it is more likely that the true value of the neutron lifetime is the one measured by trap experiments (47) which is consistent with the SM prediction (46).
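The quoted significance can be cross-checked with the sketch below, which combines the two uncertainties in quadrature under the assumption of uncorrelated Gaussian errors. The function name is illustrative.

```python
import math


def tension_sigma(x1, s1, x2, s2):
    """Difference of two values in units of their combined uncertainty."""
    return abs(x1 - x2) / math.hypot(s1, s2)


TAU_SM = (878.7, 0.6)    # Eq. (46), seconds
TAU_TRAP = (879.4, 0.6)  # Eq. (47)
TAU_BEAM = (888.0, 2.0)  # Eq. (49)

print("beam vs SM:  ", round(tension_sigma(*TAU_BEAM, *TAU_SM), 2), "sigma")
print("trap vs SM:  ", round(tension_sigma(*TAU_TRAP, *TAU_SM), 2), "sigma")
print("beam vs trap:", round(tension_sigma(*TAU_BEAM, *TAU_TRAP), 2), "sigma")
```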

The deficit of about 1 per cent of produced protons in the beam experiments [51, 52] might be due to some unaccounted systematic errors. Alternatively, barring the possibility of uncontrolled systematics and considering the problem as real, new physics must be invoked which could explain the deficit of about one per cent of protons produced in the beam experiments. One interesting possibility can be related to the neutron–mirror neutron (\(n-n'\)) oscillation [53,54,55], provided that ordinary and mirror neutrons have a tiny mass difference of 300 neV or so [56]. Then in the large magnetic fields (5 Tesla) used in beam experiments the \(n-n'\) conversion probability can be resonantly enhanced to about 0.01, and the corresponding fraction of neutrons converted into mirror neutrons will decay in an invisible (mirror) channel without producing ordinary protons.

Concluding this section, let us remark that the recent accurate calculations of the short-range radiative corrections \(\Delta _R^V\) [8, 12] and the respective redetermination of \(V_{ud}\) have no influence on the determination of \(\tau _n\) (46) obtained from Eq. (44). In fact, the latter equation directly relates the neutron lifetime to the value \({{\mathcal {F}}}t\) accurately measured in superallowed \(0^+-0^+\) nuclear transitions and to the value \(g_A\) obtained from accurate measurements of \(\beta \)-asymmetry. Notice that the relation (44) remains valid also in the presence of non-standard vector \(G_V\) or axial \(G_A\) coupling constants (which can be the case if some non-standard interactions mediated by new vector bosons also contribute to the neutron decay) since the value \(G_V\) (independently of whether it is equal to \(G_F \vert V_{ud} \vert \) or not) anyway cancels out [57]. Hence, only the ratio \(g_A=G_A/G_V\) remains relevant, which is accurately determined from the measurements of \(\beta \)-asymmetry. In particular, Eq. (44) remains valid in our model with \(G_F\ne G_\mu \) discussed in the previous section.

9. In this paper we discussed the CKM unitarity problem. The present experimental and theoretical accuracy in the determination of the first row elements in the CKM matrix (1) points to a \(4.3 \sigma \) deviation from the unitarity (2). We investigated two new physics scenarios which could settle the problem. The respective results are summarized in Table 1.

The first, rather straightforward possibility is related to the existence of an extra weak isosinglet down-type quark \(b'\) with a mass of a few TeV which should have a rather large mixing with the first family, \(\vert V_{ub'} \vert \simeq 0.04\). However, apart from the persistent question “who has ordered that?”, this scenario has some unnatural features related to the flavor-changing phenomena. In particular, given that \(\vert V_{ub'} \vert \simeq 0.04\), \(b'\) will induce too large effects in the \(K^0-\overline{K}^0\) system etc. unless its mixings with the 2nd and 3rd families, \(V_{cb'}\) and \(V_{tb'}\), are rather small. Perhaps such a situation is possible by some conspiracies, but a priori it looks rather weird.

As another possibility for restoring unitarity, one can introduce an additional effective operator contributing to the muon decay in positive interference with the Standard Model contribution. In this case the Fermi constant would be slightly different from the muon decay constant, \(G_F = G_\mu /(1+\delta _\mu )\), where \(\delta _\mu \simeq 7\times 10^{-4}\) would suffice for unitarizing the CKM matrix. Namely, the values of \(V_{us}\) and \(V_{ud}\) (which are normally extracted by assuming \(G_F=G_\mu \)) are shifted by a factor \(1+\delta _\mu \) while their ratio is not affected. The needed effective operator can be mediated by a flavor changing boson of a gauge horizontal symmetry \(SU(3)_\ell \) between the three lepton families which is spontaneously broken at the scale of a few TeV.

The scenario with gauge inter-family symmetry \(SU(3)_\ell \times SU(3)_e\) acting on left-handed and right-handed leptons can give a natural understanding of the mass hierarchy among charged leptons and large mixing of neutrinos as a consequence of spontaneous breaking pattern of this symmetry. Interestingly, despite the fact that these gauge bosons have flavor-changing couplings with the leptons, their exchanges do not induce dramatic LFV effects as e.g. \(\mu \rightarrow 3e\), \(\tau \rightarrow 3\mu \) decays etc. since these effects can be kept under control thanks to approximate custodial symmetry.

Analogously, one can consider the inter-family gauge symmetry \(SU(3)_Q \times SU(3)_u \times SU(3)_d\) between the quarks. Its breaking pattern can be at the origin of the quark mass and mixing spectrum. Interestingly, the flavor-changing gauge bosons of \(SU(3)_Q\) can contribute to the hadronic decays of kaons, hyperons, etc. In a supersymmetric extension of the SM, the chiral gauge symmetries \(SU(3)_\ell \times SU(3)_e\) for leptons and \(SU(3)_Q \times SU(3)_u \times SU(3)_d\) for quarks can be also motivated as a natural tool for realizing the minimal flavor violation scenario [58,59,60].

There is the interesting possibility that these flavor gauge symmetries are common symmetries between particles of ordinary and mirror sectors, which is also motivated by the possibility of cancellation of triangle anomalies of gauge SU(3) factors between the ordinary and mirror fermions [61]. Mirror matter is also a viable candidate for dark matter (see e.g. reviews [62,63,64]). Since flavor gauge bosons are messengers between the two sectors, they can mediate new flavor violating phenomena such as muonium–mirror muonium, kaon–mirror kaon oscillations, etc. [31] and also can give a possible portal for direct detection of mirror matter in dark matter detectors [65, 66]. Cosmological implications and limits were discussed in Ref. [31].