1 Introduction

The neutral current B-anomalies (NCBAs) consist of various measurements in hadronic particle decays which, collectively, are in tension with Standard Model (SM) predictions. The particular observables displaying such tension often involve an effective vertex with an anti-bottom quark, a strange quark, a muon and an anti-muon, i.e. \(({\bar{b}} s)(\mu ^+ \mu ^-)\), plus the charge conjugated version. Observables such as the ratios of branching ratios \(R_K^{(*)} = BR(B\rightarrow K^{(*)} \mu ^+ \mu ^-) / BR(B\rightarrow K^{(*)} e^+ e^-)\) are not displaying the lepton flavour universality (LFU) property expected of the SM [1,2,3]. Such observables are of particular interest because much of the theoretical uncertainty in the prediction cancels in the ratio, leaving the prediction rather precise. Other NCBA observables display some disparity with SM predictions even when their larger theoretical uncertainties are taken into account, for example \(BR(B_s \rightarrow \mu ^+ \mu ^-)\) [4,5,6,7,8], \(BR(B_s \rightarrow \phi \mu ^+ \mu ^-)\) [9, 10] and angular distributions of \(B\rightarrow K^{(*)} \mu ^+ \mu ^-\) decays [11,12,13,14,15,16]. Global fits find that new physics contributions to the \(({\bar{b}} s)(\mu ^+ \mu ^-)\) effective vertex can fit the NCBAs much better than the SM can [17,18,19,20,21,22,23].

A popular option for beyond the SM (BSM) explanations of the NCBAs is that of a \(Z^\prime \) vector boson with family dependent interactions [24,25,26,27,28]. Such a particle is predicted by models with a BSM spontaneously broken U(1) gauged flavour symmetry. The additional quantum numbers of the SM fermions are constrained by the need to cancel local anomalies [29,30,31], for example muon minus tau lepton number [32,33,34,35,36,37], third family baryon number minus second family lepton number [38,39,40], third family hypercharge [41,42,43] or other assignments [44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62].

The current paper is about the third family hypercharge option. The Third Family Hypercharge Model [41] (henceforth abbreviated as the ‘\(Y_3\) model’) explains the hierarchical heaviness of the third family and the smallness of quark mixing. It was shown to successfully fit NCBAs, along with constraints from \(B_s-{\bar{B}}_s\) mixing and LFU constraints on \(Z^0\) boson interactions. The ATLAS experiment at the LHC has searched for the production of BSM resonances that yield a peak in the di-muon invariant mass (\(m_{\mu \mu }\)) spectrum, but has yet to find a significant one [64]. This implies a lower bound upon the mass of the \(Z^\prime \) in the \(Y_3\) model, \(M_X>1.2\) TeV [65], but plenty of viable parameter space remains which successfully explains the NCBAs. A variant, the Deformed Third Family Hypercharge Model (D\(Y_3\) model), was subsequently introduced [43] in order to remedy a somewhat ugly feature (ugly from a naturalness point of view) in the construction of the original \(Y_3\) model: namely, that a Yukawa coupling allowed at the renormalisable level was assumed to be tiny in order to agree with strict lepton flavour violation constraints. The D\(Y_3\) model can simultaneously fit the NCBAs and be consistent with the ATLAS di-muon direct search constraint for \(1.2~\text {TeV}< M_X<12~\text {TeV}\). We will also present results for a third variant, the D\(Y_3^\prime \) model, which is identical to the D\(Y_3\) model but with charges for second and third family leptons interchanged. As we show later on, this results in a better fit to data due to the different helicity structure of the couplings of the \(Z^\prime \) boson to muons (see Sect. 4.3 for details).

In each of these third family hypercharge models, the local gauge symmetry of the SM is extended to \(SU(3)\times SU(2) \times U(1)_Y \times U(1)_X\). This is spontaneously broken to the SM gauge group by the non-zero vacuum expectation value (VEV) of a SM-singlet ‘flavon’ field \(\theta \) that has a non-zero \(U(1)_X\) charge. In each model, the third family quarks’ \(U(1)_X\) charges are equal to their hypercharges whereas the first two family quarks are chargeless under \(U(1)_X\). Since the top Yukawa coupling is experimentally determined to be \({\mathcal {O}}(1)\), and is therefore inconsistent with a suppressed, non-renormalisable coupling, we must ensure that a renormalisable top Yukawa coupling is allowed by \(U(1)_X\); this implies that the SM Higgs doublet field should have \(U(1)_X\) charge equal to its hypercharge. Consequently, when the Higgs doublet acquires a VEV to break the electroweak symmetry, this gives rise to \(Z^0-Z^\prime \) mixing [41]. Such mixing is subject to stringent constraints from electroweak precision observables (EWPOs), in particular from the \(\rho \)-parameter, which encodes the ratio of the masses of the \(Z^0\) boson and the W boson [66].

Third family hypercharge models can fit the NCBAs for a range of the ratio of the \(Z^\prime \) gauge coupling to its mass \(g_{X}/M_{X}\) which does not contain zero. This means that it is not possible to ‘tune the \(Z-Z^\prime \) mixing away’ if one wishes the model to fit the NCBAs. As a consequence, it is not clear whether the EWPOs will strongly preclude the (D)\(Y_3\) models from explaining the NCBAs or not.

The purpose of this paper is to perform a global fit to a combined set of electroweak and NCBA-type data, along with other relevant constraints on flavour changing neutral currents (FCNCs). It is clear that the SM provides a poor fit to this combined set, as Table 1 shows. A p-value of 0.0007 corresponds to ‘tension at the \(3.4\sigma \) level’.
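The correspondence between a p-value and a number of standard deviations can be sketched as follows. This is a minimal illustration that assumes the two-sided Gaussian convention (an assumption on our part; the function name `p_to_sigma` is hypothetical and is not part of smelli).

```python
from statistics import NormalDist


def p_to_sigma(p_value: float) -> float:
    """Two-sided Gaussian significance equivalent to a given p-value.

    Assumes the convention p = 2 * (1 - Phi(z)), where Phi is the
    standard normal cumulative distribution function.
    """
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)


# p-value quoted in the text for the SM global fit
print(f"{p_to_sigma(0.0007):.1f} sigma")
```

Under this convention, the quoted p-value of 0.0007 maps onto a tension of about 3.4 standard deviations.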

The (D)\(Y_3\) models are of particular interest as plausible models of new physics if they fit the data significantly better than the SM, a question which can best be settled by performing appropriate global fits.

Our paper proceeds as follows: we introduce the models and define their parameter spaces in Sect. 2. At renormalisation scales at or below \(M_{X}\) but above \(M_W\), we encode the new physics effects in each model via the Standard Model Effective Theory (SMEFT). We calculate the leading (dimension-6) SMEFT operators predicted by our models at the scale \(M_{X}\) in Sect. 3. These provide the input to the calculation of observables by smelli-2.2.0 [67], which we describe at the beginning of Sect. 4. The results of the fits are presented in the remainder of Sect. 4, before a discussion in Sect. 5.

Table 1 SM goodness of fit for the different data sets we consider, as calculated by smelli-2.2.0. We display the total Pearson’s chi-squared \(\chi ^2\) for each data set along with the number of observables n and the data set’s p-value. The set named ‘quarks’ contains \(BR(B_s \rightarrow \phi \mu ^+ \mu ^-)\), \(BR(B_s\rightarrow \mu ^+\mu ^-)\), \(\Delta m_s\) and various differential distributions in \(B\rightarrow K^{(*)} \mu ^+\mu ^-\) decays among others, whereas ‘LFU FCNCs’ contains \(R_{K^{(*)}}\) and some B meson decay branching ratios into di-taus. Our sets are identical to those defined by smelli-2.2.0 and we refer the curious reader to its manual [67], where the observables are enumerated. We have updated \(R_K\) and \(BR(B_{s,d}\rightarrow \mu ^+ \mu ^-)\) with the latest LHCb measurements as detailed in the text.

2 Models

In this section we review the models of interest to this study, in sufficient detail so as to proceed with the calculation of the SMEFT Wilson coefficients (WCs) in the following section. Under \(SU(3)\times SU(2) \times U(1)_Y\), we define the fermionic fields such that they transform in the following representations: \({Q_L}_i:=({u_L}_i\ {d_L}_i)^T \sim (\varvec{3},\ \varvec{2},\ +1/6)\), \({L_L}_i:=({\nu _L}_i\ {e_L}_i)^T \sim (\varvec{1},\ \varvec{2},\ -1/2)\), \({e_R}_i \sim (\varvec{1},\ \varvec{1},\ -1)\), \({d_R}_i \sim (\varvec{3},\ \varvec{1},\ -1/3)\), \({u_R}_i \sim (\varvec{3},\ \varvec{1},\ +2/3)\), where \(i\in \{1,2,3\}\) is a family index ordered by increasing mass. Implicit in the definition of these fields is that we have performed a flavour rotation so that \({d_L}_i, {e_L}_i, {e_R}_i, {d_R}_i, {u_R}_i\) are mass basis fields. In what follows, we denote 3-component column vectors in family space with bold font, for example \(\mathbf{u_L}:=(u_L,\ c_L,\ t_L)^T\). The Higgs doublet is a complex scalar \(\phi \sim (\varvec{1},\ \varvec{2},\ +1/2)\), and all three models which we consider (the \(Y_3\) model, the D\(Y_3\) model and the D\(Y_3^\prime \) model) incorporate a complex scalar flavon with SM quantum numbers \(\theta \sim (\varvec{1}, \varvec{1}, 0)\), which has a \(U(1)_X\) charge \(X_\theta \ne 0\) and is used to Higgs the \(U(1)_X\) symmetry, such that its gauge boson acquires a mass at the TeV scale or higher.

In the following we will present our results for three variants of third family hypercharge models, which differ in the charge assignment for the SM fields:

  • The \(Y_3\) model, introduced in [41]. Only third generation fermions have non-zero \(U(1)_X\) charges. The charge assignments can be read in Table 2.

  • The D\(Y_3\) model as introduced in [43]. It differs from the \(Y_3\) model in that charges have been assigned to the second generation leptons as well, while still being anomaly free.

  • The D\(Y_3^\prime \) model, which differs from the D\(Y_3\) model in that the charges for third and second generation left-handed leptons are interchanged. The charge assignments can be read in Table 3.

All three of these gauge symmetries have identical couplings to quarks, coupling only to the third family via hypercharge quantum numbers. This choice means that, of the quark Yukawa couplings, only the top and bottom Yukawa couplings are present at the renormalisable level. Of course, the light quarks are not massless in reality; their masses, as well as the small quark mixing angles, must be encoded in higher-dimensional operators that come from a further layer of heavy physics, such as a suite of heavy vector-like fermions at a mass scale \(\Lambda > M_X/g_X\), where \(g_X\) is the \(U(1)_X\) gauge coupling.

Table 2 \(U(1)_X\) charges of the gauge eigenbasis fields in the \(Y_3\) model, where \(i \in \{1, 2\}\). The flavon charge \(X_\theta \) is left undetermined
Table 3 \(U(1)_X\) charges of the gauge eigenbasis fields in the D\(Y_3^\prime \) model. The original D\(Y_3\) model charges (as introduced in Ref. [43]) can be obtained by interchanging \(X_{L_3}\) and \(X_{L_2}\). The flavon charge \(X_\theta \) is left undetermined

Whatever this heavy physics might be, the structure of the light quark Yukawa couplings will be governed by the size of parameters that break the \(U(2)^3_{\mathrm {global}}:=U(2)_q \times U(2)_u \times U(2)_d\) accidental global symmetry [68,69,70,71,72] of the renormalisable third family hypercharge Lagrangians. For example, a minimal set of spurions charged under both \(U(2)^3_{\mathrm {global}}\) and \(U(1)_X\) was considered in Ref. [73], which reproduces the observed hierarchies in quark masses and mixing angles when the scale \(\Lambda \) of new physics is a factor of 15 or so larger than \(M_X/g_X\). Taking this hierarchy of scales as a general guide, and observing that the global fits to electroweak and flavour data that we perform in this paper prefer \(M_X/g_X\approx 10\) TeV, we expect the new physics scale to be around \(\Lambda \approx 150\) TeV. This scale is high enough to suppress most contributions of the heavy physics, about which we remain agnostic, to low energy phenomenology including precise flavour bounds. For this reason, we feel safe in neglecting the contributions of the \(\Lambda \) scale physics to the SMEFT coefficients that we calculate in Sect. 3, and shall not consider it in any further detail.

Continuing, we will first detail the scalar sector, which is common to (and identical in) all of the third family hypercharge models, before going on to discuss aspects of each model that are different (most importantly, the couplings to leptons).

2.1 The scalar sector

The coupling of the flavon to the \(U(1)_X\) gauge field is encoded in the covariant derivative

$$\begin{aligned} D_\mu \theta = (\partial _\mu + i X_\theta g_{X} X_\mu ) \theta , \end{aligned}$$
(1)

where \(X_\mu \) is the \(U(1)_X\) gauge boson in the unbroken phase and \(g_{X}\) is its gauge coupling. The flavon \(\theta \) is assumed to acquire a VEV \(\langle \theta \rangle \) at (or above) the TeV scale, which spontaneously breaks \(U(1)_X\). Expanding \(\theta = (\langle \theta \rangle + \vartheta )/\sqrt{2}\), its kinetic terms \((D_\mu \theta )^\dag D^\mu \theta \) in the Lagrangian density give the gauge boson a mass \(M_{X}=X_\theta g_{X} \langle \theta \rangle \) through the Higgs mechanism. After electroweak symmetry breaking, the electrically-neutral gauge bosons X, \(W^3\) and B mix, giving rise to \(\gamma \), \(Z^0\) and \(Z^\prime \) as the physical mass eigenstates [41]. To terms of order \((M_Z^2 / M_{X}^2)\), the mass and the couplings of the X boson are equivalent to those of the \(Z^\prime \) boson. Because we take \(M_X \gg M_Z\), the matching to the SMEFT (Sect. 3) should be done in the unbroken electroweak phase, where it is the X boson that is properly integrated out. In the rest of this section we therefore specify the \(U(1)_X\) sector via the X boson and its couplings. Throughout this paper, we entreat the reader to bear in mind that in terms of searches and several other aspects of their phenomenology, to a decent approximation the X boson and the \(Z^\prime \) boson are synonymous.

The covariant derivative of the Higgs doublet is

$$\begin{aligned} D_\mu \phi = \left( \partial _\mu + i \frac{g}{2} \sigma ^a W_\mu ^a + i \frac{g^\prime }{2} B_\mu + i \frac{g_X}{2} X_\mu \right) \phi , \end{aligned}$$
(2)

where \(W_\mu ^a\) (\(a=1,2,3\)) are unbroken SU(2) gauge bosons, \(\sigma ^a\) are the Pauli matrices, g is the SU(2) gauge coupling, \(B_\mu \) is the hypercharge gauge boson and \(g^\prime \) is the hypercharge gauge coupling. The kinetic term for the Higgs field, \((D_\mu \phi )^\dagger (D^\mu \phi )\), contains terms both linear and quadratic in \(X_\mu \). It is the linear terms

$$\begin{aligned} {\mathcal {L}} \supset -i \frac{g_X}{2} X^\mu \phi ^\dagger \left( \partial _\mu + i \frac{g}{2} \sigma ^a W_\mu ^a + i \frac{g^\prime }{2} B_\mu \right) \phi + \text {h.c.} \end{aligned}$$
(3)

that, upon integrating out the \(X_\mu \) boson, will give the leading contribution to the SMEFT in the form of dimension-6 operators involving the Higgs, as we describe in Sect. 3.

The charges of the fermion fields differ between the \(Y_3\) model and the D\(Y_3^\prime \) model, as follows.

2.2 Fermion couplings: the \(Y_3\) model

The \(Y_3\) model has fermion charges as listed in Table 2 (in the gauge eigenbasis), leading to the following Lagrangian density describing the X boson-SM fermion couplings [41]:

(4)

where

$$\begin{aligned} \Lambda ^{(I)}_P := V_{I}^\dagger P V_{I} \end{aligned}$$
(5)

are Hermitian 3-by-3 matrices. The index \(I \in \{u_L, d_L, e_L, \nu _L, u_R, d_R, e_R \}\) and the matrix \(P \in \{\xi , \Omega , \Psi \}\), where

$$\begin{aligned} \xi = \left( \begin{array}{ccc} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 \\ \end{array}\right) , \end{aligned}$$
(6)

and \(\Omega \) and \(\Psi \) are described in Sect. 2.3. The \(V_I\) are 3-by-3 unitary matrices describing the mixing between fermionic gauge eigenstates and their mass eigenstates. Note that the quark doublets have been family rotated so that the \(d_{L_i}\) fields (but not the \(u_{L_i}\) fields) correspond to their mass eigenstates. Similarly, we have rotated \(L_i\) such that the \(e_{L_i}\) align with the charged lepton mass eigenstates, whereas the \(\nu _{L_i}\) do not align with the neutrino mass eigenstates. This will simplify the matching to the SMEFT operators that we perform in Sect. 3. We now go on to cover the X boson couplings in the D\(Y_3^\prime \) model before detailing the fermion mixing ansatz (which is common to all three models).

2.3 Fermion couplings: the D\(Y_3^\prime \) model

For the D\(Y_3^\prime \) model with the charge assignments listed in Table 3, the Lagrangian contains the following X boson-SM fermion couplings [43]:

(7)

The matrices \(\Lambda ^{(I)}_{\Omega }\) and \(\Lambda ^{(I)}_{\Psi }\) are defined in (5), where

$$\begin{aligned} \Omega = \left( \begin{array}{ccc} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} -\frac{5}{8} \\ \end{array}\right) , \quad \Psi = \left( \begin{array}{ccc} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} -\frac{5}{2} \\ \end{array}\right) . \end{aligned}$$
(8)

2.4 Fermion mixing ansatz

The CKM matrix and the PMNS matrix are predicted to be

$$\begin{aligned} V=V_{u_L}^\dag V_{d_L}, \quad U = V_{\nu _L}^\dag V_{e_L}, \end{aligned}$$
(9)

respectively. For all of the third family hypercharge models that we address here, the matrix element \((V_{d_L})_{23}\) must be non-zero in order to obtain new physics contributions of the sort required to explain the NCBAs. Moreover, in the \(Y_3\) model we need \((V_{e_L})_{23} \ne 0\) in order to generate a coupling (here left-handed) to muons. These will lead to a BSM contribution to a Lagrangian density term in the weak effective theory proportional to \(({\bar{b}} \gamma ^\mu P_L s)(\bar{\mu }\gamma _\mu P_L \mu )\), where \(P_L\) is the left-handed projection operator, which previous fits to the weak effective theory indicate is essential in order to fit the NCBAs [17,18,19,20,21,22,23].

In order to investigate the model further phenomenologically, we must assume a particular ansatz for the unitary fermion mixing matrices \(V_I\). Here, for \(V_{d_L}\), we choose the ‘standard parameterisation’ often used for the CKM matrix [74]. This is a parameterisation of a family of unitary 3 by 3 matrices that depends only upon one complex phase and three mixing angles (a more general parameterisation would also depend upon five additional complex phases):

$$\begin{aligned} \left( \begin{array}{ccc} c_{12}c_{13} &{} s_{12}c_{13} &{} s_{13} e^{-i\delta } \\ -s_{12}c_{23}-c_{12}s_{23}s_{13}e^{i \delta } &{} c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta } &{} s_{23}c_{13} \\ s_{12}s_{23}-c_{12}c_{23}s_{13}e^{i \delta } &{} -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta } &{} c_{23}c_{13} \\ \end{array} \right) , \end{aligned}$$
(10)

where \(s_{ij}:=\sin \theta _{ij}\) and \(c_{ij}:=\cos \theta _{ij}\), for angles \(\theta _{ij},\ \delta \in \mathbb {R}/(2 \pi \mathbb {Z})\). To define our particular ansatz, we choose angles such that \(\left( V_{d_L}\right) _{ij} = V_{ij}\) for all \(ij\ne 23\), i.e. we insert the current world-averaged measured central values of \(\theta _{ij}\) and \(\delta \) [74], except for the crucial mixing angle \(\theta _{23}\), upon which the NCBAs sensitively depend. Thus, we fix the angles and phase such that \(s_{12}=0.22650\), \(s_{13}=0.00361\) and \(\delta =1.196\) but allow \(\theta _{23}\) to float as a free parameter in our global fits. Following Refs. [41, 43], we choose simple forms for the other mixing matrices which are likely to evade strict FCNC bounds. Specifically, we choose \(V_{d_R}=1\), \(V_{u_R}=1\) and

$$\begin{aligned} V_{e_L} = \left( \begin{array}{ccc} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 \\ 0 &{} 1 &{} 0 \\ \end{array}\right) \end{aligned}$$
(11)

in the \(Y_3\) model, and \(V_{e_L}=1\) in the D\(Y_3^\prime \) model. Finally, \(V_{u_L}\) and \(V_{\nu _L}\) are then fixed by (9) and the measured CKM/PMNS matrix entries. For the remainder of this paper, when referring to the \(Y_3\) model or the D\(Y_3^\prime \) model, we shall implicitly refer to the versions given by this mixing ansatz (which, we emphasise, is taken to be the same for all third family hypercharge models aside from the assignment of \(V_{e_L}\)).
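To make the mixing ansatz concrete, the following sketch builds \(V_{d_L}\) from the standard parameterisation (10), checks its unitarity numerically, and evaluates \(\Lambda ^{(I)}_\xi = V_I^\dagger \xi V_I\) from (5) and (6) for the \(Y_3\) model choices of \(V_{d_L}\) and \(V_{e_L}\). It uses only the Python standard library; all function and variable names are hypothetical, and the value chosen for \(\theta _{23}\) is purely illustrative since this angle floats in the fits.

```python
import cmath
import math


def standard_parameterisation(theta12, theta13, theta23, delta):
    """Unitary 3x3 matrix of Eq. (10), returned as nested lists."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    ep, em = cmath.exp(1j * delta), cmath.exp(-1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep,
         c12 * c23 - s12 * s23 * s13 * ep,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep,
         -c12 * s23 - s12 * c23 * s13 * ep,
         c23 * c13],
    ]


def dagger(m):
    """Hermitian conjugate of a 3x3 matrix."""
    return [[m[j][i].conjugate() for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def lambda_matrix(v, p):
    """Eq. (5): Lambda = V^dagger P V."""
    return matmul(dagger(v), matmul(p, v))


XI = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]        # Eq. (6)
V_EL_Y3 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]   # Eq. (11)

THETA23 = -0.145  # illustrative value only; a free parameter in the fits
V_DL = standard_parameterisation(
    math.asin(0.22650), math.asin(0.00361), THETA23, 1.196)

# Largest deviation of V^dagger V from the identity matrix
product = matmul(dagger(V_DL), V_DL)
UNITARITY_DEFECT = max(abs(product[i][j] - (1 if i == j else 0))
                       for i in range(3) for j in range(3))

LAMBDA_EL = lambda_matrix(V_EL_Y3, XI)  # lepton coupling matrix
LAMBDA_DL = lambda_matrix(V_DL, XI)     # down-type quark coupling matrix

print(LAMBDA_EL)
print(abs(LAMBDA_DL[1][2]))
```

With \(V_{e_L}\) as in (11), the only non-zero entry of \(\Lambda ^{(e_L)}_\xi \) is the (2,2) element, so the left-handed charged lepton coupling is transferred entirely to the muon. The (2,3) element of \(\Lambda ^{(d_L)}_\xi \) is the flavour-changing \(b\)-\(s\) entry, which vanishes when \((V_{d_L})_{32}=0\).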

Next, we turn to calculating the complete set of dimension-6 WCs in the SMEFT that result from integrating the X boson out of the theory.

3 SMEFT coefficients

So far, particle physicists have found scant direct evidence of new physics below the TeV scale. This motivates the study of BSM models whose new degrees of freedom reside at the TeV scale or higher. In such scenarios, it makes sense to consider the Standard Model as an effective field theory realisation of the underlying high energy model. If one wishes to remain agnostic about further details of the high energy theory, this amounts to including all possible operators consistent with the SM gauge symmetries and performing an expansion in powers of the ratio of the electroweak and new physics scales.

The Standard Model Effective Field Theory [75,76,77] is such a parameterisation of the effects of heavy fields beyond the SM (such as a heavy X boson field of interest to us here) through \(d>4\) operators built out of the SM fields. In this paper we will work with operators up to dimension 6 (i.e. we will go to second order in the power expansion). Expanding the SMEFT to this order gives us a very good approximation to all of the observables we consider; the relevant expansion parameter for the EWPOs is \((M_Z / M_{X})^2 \ll 1\), and in the case of observables derived from the decay of a meson of mass m, the relevant expansion parameter is \((m / M_{X})^2 \ll 1\). By restricting to the \(M_{X}>1\) TeV region, we ensure that both of these mass ratios are small enough to yield a good approximation.

The SMEFT Lagrangian can be expanded as

$$\begin{aligned} {\mathcal {L}}_{\text {SMEFT}} = {\mathcal {L}}_{SM} + C_{\text {5}} O_{\text {5}} + \sum _{\text {dim 6}} C_i O_i + \cdots , \end{aligned}$$
(12)

where \(O_{\text {5}}\) schematically indicates Weinberg operators with various flavour indices, which result in neutrino masses and may be obtained by adding heavy gauge singlet chiral fermions to play the role of right-handed neutrinos. The sum that is explicitly notated then runs over all mass dimension-6 SMEFT operators, and the ellipsis represents terms which are of mass dimension (in the fields) greater than 6. The WCs \(C_i\) have units of \([\text{ mass}]^{-2}\). In the following we shall work in the Warsaw basis, which defines a basis in terms of a set of independent baryon-number-conserving operators [78]. By performing the matching between our models and the SMEFT, we shall obtain the set of WCs \(C_i\) at the scale \(M_{X}\), which can then be used to calculate predictions for observables.

Fig. 1 X-boson mediated process responsible for the effective vertex between 4 fermionic fields \(\{\psi _i\}\)

To see where these dimension-6 operators come from, let us first consider the origin of four-fermion operators. We may write the fermionic couplings of the underlying theory of the X boson, given in (4) and (7) for the \(Y_3\) model and D\(Y_3^\prime \) model respectively, as

$$\begin{aligned} {\mathcal L}^\psi = - J^\mu _\psi X_\mu , \end{aligned}$$
(13)

where

$$\begin{aligned} J^\mu _\psi = \sum _{\psi _{{i,j}}} {\kappa _{ij}} \overline{\psi _i} \gamma ^\mu \psi _{{j}} \end{aligned}$$
(14)

is the fermionic current that the X boson couples to, where the sum runs over all pairs of SM Weyl fermions \(\psi _i\). The couplings \(\kappa _{ij}\) are identified from (4) or (7), depending upon the model. After integrating out the X boson in processes such as the one in Fig. 1, one obtains the following terms in the effective Lagrangian:

$$\begin{aligned} {\mathcal L}_\text {SMEFT} \supset - \frac{{J_\psi }_\mu {J_\psi }^\mu }{2 M_{X}^2}. \end{aligned}$$
(15)

We match the terms thus obtained with the four-fermion operators in the Warsaw basis [78] in order to identify the 4-fermion SMEFT WCs in that basis.

These 4-fermion operators are not the only SMEFT operators that are produced at dimension-6 by integrating out the X bosons of our models. Due to the tree-level \(U(1)_X\) charge of the SM Higgs, there are also various operators in the Higgs sector of the SMEFT, as follows. The (linear) couplings of the X boson to the Higgs, as recorded in (3), can again be written as the coupling of \(X_\mu \) to a current, viz.

$$\begin{aligned} {\mathcal L}^\phi = - {J_\phi }_\mu X^\mu , \end{aligned}$$
(16)

where this time

$$\begin{aligned} {J_\phi }_\mu = i \frac{g_X}{2} \phi ^\dagger D_\mu ^\text {SM} \phi + \text {h.c.} \end{aligned}$$
(17)

Fig. 2 X-boson mediated process responsible for the effective vertex between two Higgs fields, one SM gauge boson, and a fermion bi-linear operator

is the bosonic current to which the X boson couples, where \(D_\mu ^\text {SM}=\partial _\mu + i \frac{g}{2} \sigma ^a W_\mu ^a + i \frac{g^\prime }{2} B_\mu \). Due to the presence of X boson couplings to operators which are bi-linear in both the fermion fields (\(J_\psi ^\mu \)) and the Higgs field (\(J_\phi ^\mu \)), integrating out the X bosons gives rise to cross-terms

$$\begin{aligned} {\mathcal L}_\text {SMEFT} \supset - \frac{{J_\phi }_\mu {J_\psi }^\mu }{M_{X}^2} \, , \end{aligned}$$
(18)

which encode dimension-6 operators involving two Higgs fields, one SM gauge boson, and a fermion bi-linear current. Diagrammatically, these operators are generated by integrating out the X boson from Feynman diagrams such as that depicted in Fig. 2.

Finally, there are terms that are quadratic in the bosonic current \(J_\phi ^\mu \),

$$\begin{aligned} {\mathcal L}_\text {SMEFT} \supset - \frac{{J_\phi }_\mu {J_\phi }^\mu }{2M_{X}^2} \, , \end{aligned}$$
(19)

which encode dimension-6 operators involving four Higgs fields and two SM covariant derivatives. The corresponding Feynman diagram is given in Fig. 3.
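As a hedged cross-check of (19), one may sketch how it decomposes onto Warsaw basis operators. We assume here the standard definitions \(O_{\phi D} = (\phi ^\dagger D^\mu \phi )^* (\phi ^\dagger D_\mu \phi )\) and \(O_{\phi \Box } = (\phi ^\dagger \phi ) \Box (\phi ^\dagger \phi )\), which are not restated in this section. Writing \(A_\mu := \phi ^\dagger D_\mu ^\text {SM} \phi \), (17) gives \({J_\phi }_\mu = i \frac{g_X}{2} (A_\mu - A_\mu ^*)\), while \(\partial _\mu (\phi ^\dagger \phi ) = A_\mu + A_\mu ^*\). Then

$$\begin{aligned} - \frac{{J_\phi }_\mu {J_\phi }^\mu }{2M_{X}^2} &= \frac{g_X^2}{8 M_X^2} (A_\mu - A_\mu ^*)(A^\mu - A^{\mu *}) \\ &= \frac{g_X^2}{8 M_X^2} \left[ \partial _\mu (\phi ^\dagger \phi ) \partial ^\mu (\phi ^\dagger \phi ) - 4 A_\mu ^* A^\mu \right] \\ &\simeq - \frac{g_X^2}{8 M_X^2} O_{\phi \Box } - \frac{g_X^2}{2 M_X^2} O_{\phi D}, \end{aligned}$$

where \(\simeq \) denotes equality up to a total derivative. Under these assumptions one reads off \(C_{\phi D} = -g_X^2/(2M_X^2)\), which agrees with the shift of the \(\rho \)-parameter quoted in Sect. 4.1.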

This accounts, schematically, for the complete set of dimension-6 WCs generated by either the \(Y_3\) model or the D\(Y_3^\prime \) model. We tabulate all the non-zero WCs generated in this way in Table 4 for the \(Y_3\) model and in Table 5 for the D\(Y_3^\prime \) model.

Fig. 3 X-boson mediated process responsible for dimension-6 operators involving four Higgs insertions and either two derivatives or two SM gauge boson insertions

Table 4 Non-zero dimension-6 SMEFT WCs predicted by the \(Y_3\) model, in units of \(g_{X}^2/M_{X}^2\), in the Warsaw basis [78]. We have highlighted the coefficient (for \(i=2,\ j=3\)) that is primarily responsible for the NCBAs in bold font
Table 5 Non-zero dimension-6 SMEFT WCs predicted by the D\(Y_3^\prime \) model, in units of \(g_{X}^2/M_{X}^2\), in the Warsaw basis [78]. We have highlighted the coefficient (for \(i=2,\ j=3\)) that is primarily responsible for the NCBAs in bold font. WCs for the original D\(Y_3\) model may be obtained by switching the l indices \(2 \leftrightarrow 3\) everywhere

4 Global fits

Given the complete sets of dimension-6 WCs (Tables 4 and 5) as inputs at the renormalisation scale \(M_{X}\), we use the smelli-2.2.0 program to calculate hundreds of observables and the resulting likelihoods. The smelli-2.2.0 program is based upon the observable calculator flavio-2.2.0 [79], using Wilson-2.1 [80] for running and matching WCs using the WCxf exchange format [81].

In a particular third family hypercharge model, for given values of our three input parameters \(\theta _{23}\), \(g_{X}\), \(M_{X}\), the WCs in the tables are converted to the non-redundant basis [67] assumed by smelli-2.2.0 (a subset of the Warsaw basis). The renormalisation group equations are then solved in order to run the WCs down to the weak scale, at which the EWPOs are calculated. Most of the EWPOs and correlations are taken from Ref. [82] by smelli-2.2.0, which neglects the relatively small theoretical uncertainties in EWPOs. The EWPOs have not been averaged over lepton flavour, since lepton flavour non-universality is a key feature of any model of the NCBAs, including those built on third family hypercharge which we consider.

We have updated the data used by flavio-2.2.0 with 2021 LHCb measurements of \(BR(B_{d,s} \rightarrow \mu ^+ \mu ^-)\) taken on 9 fb\(^{-1}\) of LHC Run II data [8] by using the two dimensional Gaussian fit to current CMS, ATLAS and LHCb measurements presented in Ref. [83]:

$$\begin{aligned} BR(B_s \rightarrow \mu ^+ \mu ^-)= & {} (2.93 \pm 0.35) \times 10^{-9}, \nonumber \\ BR(B^0 \rightarrow \mu ^+ \mu ^-)= & {} (0.56 \pm 0.70) \times 10^{-10}, \end{aligned}$$
(20)

with an error correlation of \(\rho = -0.27\). The most recent measurement by LHCb in the di-lepton invariant mass squared bin \(1.1<Q^2/\text {GeV}^2 < 6.0\) is

$$\begin{aligned} R_K= 0.846^{+0.042}_{-0.039}{}^{+0.013}_{-0.012}, \end{aligned}$$
(21)

where the first error is statistical and the second systematic [3] (this measurement alone has a 3.1\(\sigma \) tension with the SM prediction of 1.00). We incorporate this new measurement by fitting the log likelihood function presented in Ref. [3] with a quartic polynomial.

The SMEFT weak scale WCs are then matched to the weak effective theory and renormalised down to the scale of bottom mesons using QCD\(\times \)QED. Observables relevant to the NCBAs are calculated at this scale. smelli-2.2.0 then organises the calculation of the \(\chi ^2\) statistic to quantify a distance (squared) between the theoretical prediction and experimental observables in units of the uncertainty. In calculating the \(\chi ^2\) value, experimental correlations between different observables are parameterised and taken into account. Theoretical uncertainties are modelled as being multi-variate Gaussians; they include the effects of varying nuisance parameters and are approximated to be independent of new physics. Theory uncertainties and experimental uncertainties are then combined in quadrature.
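The role of correlations in the \(\chi ^2\) statistic can be illustrated with the two correlated branching ratios of (20). The sketch below evaluates \(\chi ^2 = d^T C^{-1} d\) in closed form for a two-dimensional Gaussian with correlation \(\rho \). It is not the smelli implementation: the function name is hypothetical, the uncertainties are taken to be already combined, and the 'prediction' is a placeholder chosen purely for illustration.

```python
def chi2_two_observables(pred, meas, sigma, rho):
    """chi^2 = d^T C^{-1} d for two observables with correlation rho.

    C = [[s1^2, rho*s1*s2], [rho*s1*s2, s2^2]] and d = pred - meas,
    where s1 and s2 are the (already combined) uncertainties.
    """
    z1 = (pred[0] - meas[0]) / sigma[0]
    z2 = (pred[1] - meas[1]) / sigma[1]
    return (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)


# Central values, uncertainties and correlation from Eq. (20)
MEASURED = (2.93e-9, 0.56e-10)
SIGMA = (0.35e-9, 0.70e-10)
RHO = -0.27

# Placeholder theory prediction (hypothetical numbers, for illustration only)
PREDICTED = (3.6e-9, 1.0e-10)

print(chi2_two_observables(PREDICTED, MEASURED, SIGMA, RHO))
```

Setting `RHO = 0` reduces the expression to the familiar sum of squared pulls, which makes the effect of the correlation easy to isolate.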

Fig. 4 An X-boson mediated contribution to \(B_s - \overline{B_s}\) mixing

We note that an important constraint on \(Z^\prime \) models that explain the NCBAs is that from \(\Delta m_s\) (included by smelli-2.2.0 in the category of ‘quarks’ observables), deriving from measurements of \(B_s-\overline{B_s}\) mixing, because of the tree-level BSM contribution to the process depicted in Fig. 4. The impact of this constraint has significantly varied over the last decade, to a large degree because of numerically rather different lattice or theory inputs used to extract the measurement [84,85,86]. Here, we are wedded to the calculation and inputs used by smelli-2.2.0, allowing some tension in \(\Delta m_s\) to be traded against tension present in the NCBAs.

As we shall see, in all the models that we consider the global fit is fairly insensitive to \(M_{X}\), provided we specify \(M_{X} > 2\) TeV or so in order to be sure to not contravene ATLAS di-muon searches [43, 64, 65]. We will demonstrate this insensitivity to \(M_X\) below (see Figs. 9 and 15), but for now we shall pick \(M_{X}=3\) TeV and scan over the pair \((g_{X} \times 3~\text {TeV}/M_{X})\) and \(\theta _{23}\). Since the WCs at \(M_{X}\) depend upon \(g_{X}\) and \(M_{X}\) only through the ratio \(g_{X}/M_{X}\) (they are all proportional to \(g_{X}^2/M_{X}^2\)), the results will approximately hold at different values of \(M_{X}\) provided that \(g_{X}\) is scaled linearly with \(M_{X}\). The running between \(M_{X}\) and the weak scale breaks this scaling, but such effects derive from loop corrections \(\propto (1/16 \pi ^2)\ln (M_{X}/M_Z)\) and are thus negligible to a good approximation.
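The rescaling argument can be made concrete with a short sketch. It computes the coupling that keeps \(g_X/M_X\) fixed when moving away from the 3 TeV reference mass, and the size of the loop factor \((1/16\pi ^2)\ln (M_X/M_Z)\) that controls the breaking of the scaling. The function names are hypothetical, and \(M_Z = 91.19\) GeV is an input that we assume here.

```python
import math

M_Z_GEV = 91.19        # assumed Z boson mass in GeV
M_X_REF_GEV = 3000.0   # reference mass used in the scans


def rescaled_coupling(g_x_ref, m_x_gev, m_x_ref_gev=M_X_REF_GEV):
    """Coupling at mass m_x that keeps the ratio g_X / M_X fixed."""
    return g_x_ref * m_x_gev / m_x_ref_gev


def loop_factor(m_x_gev, m_z_gev=M_Z_GEV):
    """Typical one-loop running factor (1/16 pi^2) ln(M_X / M_Z)."""
    return math.log(m_x_gev / m_z_gev) / (16.0 * math.pi ** 2)


for m_x in (3000.0, 6000.0, 12000.0):
    print(m_x, rescaled_coupling(0.426, m_x), loop_factor(m_x))
```

Over this mass range the loop factor changes only logarithmically, which is why results obtained at \(M_X = 3\) TeV carry over approximately to other masses at fixed \(g_X/M_X\).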

4.1 \(Y_3\) model fit results

Table 6 Goodness of fit for the different data sets we consider for the \(Y_3\) model as calculated by smelli-2.2.0 for \(M_{X}=3\) TeV. We display the total \(\chi ^2\) for each data set along with the number of observables n and the data set’s p-value. The data set named ‘quarks’ contains \(BR(B_s \rightarrow \phi \mu ^+ \mu ^-)\), \(BR(B_s\rightarrow \mu ^+\mu ^-)\), \(\Delta m_s\) and various differential distributions in \(B\rightarrow K^{(*)} \mu ^+\mu ^-\) decays among others, whereas ‘LFU FCNCs’ contains \(R_{K^{(*)}}\) and some B meson decay branching ratios into di-taus. Our data sets are identical to those defined by smelli-2.2.0 and we refer the curious reader to its manual [67], where the observables are enumerated. We have updated \(R_K\) and \(BR(B_{s,d}\rightarrow \mu ^+ \mu ^-)\) with the latest LHCb measurements as detailed in Sect. 4. Two free parameters of the model were fitted: \(\theta _{23}=-0.145\) and \(g_{X}=0.426\)

The result of fitting \(\theta _{23}\) and \(g_{X}\) for \(M_{X}=3\) TeV is shown in Table 6 for the \(Y_3\) model. The ‘global’ p-value is calculated by assuming a \(\chi ^2\) distribution with \(n-2\) degrees of freedom, since two parameters were optimised. Encouragingly, the fit is of much better quality than that of the SM. We see that the fits to the EWPOs and NCBAs are simultaneously reasonable.
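The p-value prescription just described can be sketched using only the Python standard library. This is not the smelli-2.2.0 implementation: the function names are hypothetical, and the \(\chi ^2\) value in the usage line is a made-up input rather than a number from Table 6.

```python
import math


def chi2_sf(chi2: float, dof: int) -> float:
    """Survival function P(X >= chi2) of a chi^2 distribution with integer dof >= 1.

    Uses the recurrence Q(x; k+2) = Q(x; k) + (x/2)^(k/2) exp(-x/2) / Gamma(k/2 + 1),
    starting from Q(x; 1) = erfc(sqrt(x/2)) or Q(x; 2) = exp(-x/2).
    """
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if chi2 <= 0.0:
        return 1.0
    half_x = 0.5 * chi2
    if dof % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half_x))
    else:
        k, q = 2, math.exp(-half_x)
    while k < dof:
        # Evaluate each term through its logarithm to avoid overflow at large dof.
        log_term = 0.5 * k * math.log(half_x) - half_x - math.lgamma(0.5 * k + 1.0)
        q += math.exp(log_term)
        k += 2
    return min(q, 1.0)


def global_p_value(chi2_total: float, n_obs: int, n_fitted: int = 2) -> float:
    """p-value assuming a chi^2 distribution with n_obs - n_fitted degrees of freedom."""
    return chi2_sf(chi2_total, n_obs - n_fitted)


if __name__ == "__main__":
    # Illustrative chi^2 only; the fitted values are those listed in Table 6.
    print(global_p_value(chi2_total=230.0, n_obs=219))
```

The recurrence is used here only to keep the sketch free of external dependencies; a statistics library would serve equally well.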

Fig. 5
Pulls in the EWPOs for the \(Y_3\) model \(M_{X}=3\) TeV best-fit point: \(\theta _{23}=-0.145{}\), \(g_{X}=0.426{}\). The pull is defined to be the theory prediction minus the central value of the observation, divided by the combined theoretical and experimental uncertainty, neglecting any correlations with other observables

The EWPOs are shown in more detail in Fig. 5, in which we compare some pulls in the SM fit versus the \(Y_3\) model best-fit point. We see that there is some improvement in the prediction of the W-boson mass, which the SM fit predicts is almost 2\(\sigma \) too low (as manifest in the \(\rho \)-parameter being measured to be slightly larger than one [74], for \(M_Z\) taken to be fixed to its SM value). The easing of this tension in \(M_W\) is due precisely to the \(Z-Z^\prime \) mixing in the (D)\(Y_3\) models. The non-zero value of the SMEFT coefficient \(C_{\phi D}\) breaks custodial symmetry, resulting in a shift of the \(\rho \)-parameter away from its tree-level SM value of one, to [66]

$$\begin{aligned} (\rho _0)_{Y_3} = 1 - C_{\phi D} v^2 / 2 =1 + v^2 g_X^2/(4M^2_X), \end{aligned}$$
(22)

where v is the SM Higgs VEV. Rather than being dangerous, as might reasonably have been guessed, it turns out that this BSM contribution to \(\rho _0\) is in large part responsible for the \(Y_3\) model fitting the EWPOs approximately as well as the SM does.
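As a rough numerical illustration of Eq. (22), the sketch below evaluates the shift in \(\rho _0\) at the \(Y_3\) model best-fit point quoted in Table 6. The value \(v=246.22\) GeV and the function name are assumptions made here for illustration; they are not taken from the fit itself.

```python
V_HIGGS_GEV = 246.22  # assumed value of the SM Higgs VEV in GeV


def rho0_y3(g_x: float, m_x_gev: float, v_gev: float = V_HIGGS_GEV) -> float:
    """Evaluate rho_0 = 1 + v^2 g_X^2 / (4 M_X^2), following Eq. (22)."""
    return 1.0 + (v_gev * g_x) ** 2 / (4.0 * m_x_gev ** 2)


if __name__ == "__main__":
    # Best-fit point of the Y_3 model for M_X = 3 TeV.
    rho0 = rho0_y3(g_x=0.426, m_x_gev=3000.0)
    print(f"rho_0 - 1 = {rho0 - 1.0:.2e}")
```

With these inputs the shift is of order \(3\times 10^{-4}\) and positive, which is the direction needed to raise the predicted \(M_W\).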

We also see that \(\sigma _0^{had}\), the \(e^+ e^-\) scattering cross-section to hadrons at a centre-of-mass energy of \(M_Z\), is better fit by the \(Y_3\) model than the SM. Although the other EWPOs have some small deviations from their SM fits, the overall picture is that the \(Y_3\) model best-fit point has an electroweak fit similar to that of the SM.
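The pull shown in the figures can be computed as in the following minimal sketch. It assumes that the theoretical and experimental uncertainties are combined in quadrature, which is an assumption about what 'combined' means here; the function name and the numbers are made up for illustration.

```python
import math


def pull(theory: float, observed: float, sigma_theory: float, sigma_exp: float) -> float:
    """(theory - observed) / combined uncertainty, with uncertainties added in quadrature."""
    sigma = math.hypot(sigma_theory, sigma_exp)
    if sigma == 0.0:
        raise ValueError("combined uncertainty must be non-zero")
    return (theory - observed) / sigma


if __name__ == "__main__":
    # Made-up numbers, chosen only to illustrate the sign convention.
    print(pull(theory=1.0, observed=1.3, sigma_theory=0.3, sigma_exp=0.4))
```

With this sign convention a negative pull means that the prediction lies below the measured central value.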

In order to see which areas of parameter space are favoured by the different sets of constraints, we provide Fig. 6.

Fig. 6
Two parameter fit to the \(Y_3\) model for \(M_{X}=3\) TeV. Shaded regions are those preferred by the data set in the legend at the 95\(\%\) confidence level (CL). The global fit is shown by the solid curves, where the inner (outer) curves show the 70\(\%\)(95\(\%\)) CL regions, respectively. The set named ‘quarks’ contains \(BR(B_s \rightarrow \phi \mu ^+ \mu ^-)\), \(BR(B_s\rightarrow \mu ^+\mu ^-)\), \(\Delta m_s\) and various differential distributions in \(B\rightarrow K^{(*)} \mu ^+\mu ^-\) decays among others, whereas ‘LFU FCNCs’ contains \(R_{K^{(*)}}\) and some B meson decay branching ratios into di-taus. Our sets are identical to those defined by smelli-2.2.0 and we refer the curious reader to its manual [67], where the observables are enumerated. We have updated \(R_K\) and \(BR(B_{s,d}\rightarrow \mu ^+ \mu ^-)\) with the latest LHCb measurements as detailed in Sect. 4. The black dot marks the locus of the best-fit point

The figure shows that the EWPOs and different sets of NCBA data all overlap at the 95\(\%\) CL. The best-fit point has a total \(\chi ^2\) of 43 less than that of the SM and is marked by a black dot. The separate data set contributions to \(\chi ^2\) at this point are listed in Table 6. In order to calculate 70\(\%\) (95\(\%\)) CL bounds in the 2-dimensional parameter plane, we draw contours of \(\chi ^2\) equal to the best-fit value plus 2.41 (5.99) respectively, using the combined \(\chi ^2\) incorporating all the datasets.
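The thresholds 2.41 and 5.99 follow from the \(\chi ^2\) distribution with two degrees of freedom, whose cumulative distribution function is \(1-e^{-x/2}\). A minimal sketch (the function name is hypothetical):

```python
import math


def delta_chi2_2d(confidence_level: float) -> float:
    """Delta chi^2 threshold for a joint region in two fitted parameters.

    For two degrees of freedom the chi^2 CDF is 1 - exp(-x/2), so x = -2 ln(1 - CL).
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence level must lie strictly between 0 and 1")
    return -2.0 * math.log(1.0 - confidence_level)


if __name__ == "__main__":
    for cl in (0.70, 0.95):
        print(f"{100 * cl:.0f}% CL: delta chi^2 = {delta_chi2_2d(cl):.2f}")
```

The closed form is specific to two fitted parameters; a different number of parameters would require the general \(\chi ^2\) quantile function.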

Fig. 7
\(Y_3\) model \(\Delta \chi ^2\) contributions in the vicinity of the best-fit point as a function of \(g_{X}\)

Fig. 8
\(Y_3\) model \(\Delta \chi ^2\) contributions in the vicinity of the best-fit point as a function of \(\theta _{23}\)

Fig. 9
\(Y_3\) model \(\Delta \chi ^2\) contributions in the vicinity of the best-fit point as a function of \(M_{X}\), where \(g_{X}\) has been scaled linearly

We further study the different \(\chi ^2\) contributions for the \(Y_3\) model in the vicinity of the best-fit point in Figs. 7, 8 and 9. From Fig. 7, we see that large couplings \(g_{X}>0.6\) are disfavoured by EWPOs as well as the NCBAs. From Fig. 8 we see that the EWPOs are insensitive to the value of \(\theta _{23}\) in the vicinity of the best-fit point but the NCBAs are not. At large \(-\theta _{23}\) the \(Y_3\) model suffers due to a bad fit to the \(B_s-\overline{B}_s\) mixing observable \(\Delta m_s\). In Fig. 9, we demonstrate the approximate insensitivity of \(\chi ^2\) near the best-fit point to \(M_{X}\), provided that \(g_{X}\) is scaled linearly with it.
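The linear scaling of \(g_{X}\) with \(M_{X}\) used in Fig. 9 amounts to holding \(g_{X}/M_{X}\) fixed at its value for \(M_{X}=3\) TeV. The sketch below illustrates this for the \(Y_3\) model best-fit coupling from Table 6; the function name and the chosen masses are assumptions made for illustration.

```python
def rescaled_coupling(g_x_ref: float, m_x_tev: float, m_x_ref_tev: float = 3.0) -> float:
    """Scale g_X linearly with M_X so that g_X / M_X is held fixed."""
    if m_x_ref_tev <= 0.0:
        raise ValueError("reference mass must be positive")
    return g_x_ref * m_x_tev / m_x_ref_tev


if __name__ == "__main__":
    g_best_fit = 0.426  # Y_3 model best-fit value at M_X = 3 TeV (Table 6)
    for m_x_tev in (2.0, 3.0, 6.0, 10.0):
        g_x = rescaled_coupling(g_best_fit, m_x_tev)
        print(f"M_X = {m_x_tev:4.1f} TeV -> g_X = {g_x:.3f}")
```

This reproduces only the tree-level scaling; the small running effects between \(M_{X}\) and the weak scale are not included.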

Fig. 10
Pulls of interest for the \(Y_3\) model \(M_{X}=3\) TeV best-fit point: \(\theta _{23}=-0.145{}\), \(g_{X}=0.426{}\). The pull is defined to be the theory prediction minus the central value of the observation, divided by the combined theoretical and experimental uncertainty, neglecting any correlations with other observables. Numbers in brackets after the observable name refer to minimum and maximum values of \(m_{\mu \mu }^2\) of the bin in GeV\(^2\), respectively (many other bins and observables are also used to compute the global likelihood)

Finally, we display some individual observables of interest in Fig. 10 at the \(Y_3\) model best-fit point. While some of the prominent NCBA measurements (for example \(R_K\) in the bin of \(m_{\mu \mu }^2\) between 1.1 GeV\(^{2}\) and 6 GeV\(^{2}\)) fit considerably better than the SM, we see that this is partly compensated by a worse fit in \(\Delta m_s\), as is the case for many \(Z^\prime \) models for the NCBAs. The \(P_5^\prime \) observable (derived in terms of angular distributions of \(B^0\rightarrow K^*\mu ^+\mu ^-\) decays [87, 88]) shows no significant change from the SM prediction in the bin that deviates the most significantly from experiment: \(m_{\mu \mu }^2 \in (4,\ 6)\) GeV\(^2\), as measured by LHCb [12] and ATLAS [13]. The fit to \(BR(\Lambda _b \rightarrow \Lambda \mu ^+ \mu ^-)\) is slightly worse than that of the SM in one particular bin, as shown in the figure. Some other observables in the flavour sector, notably various bins of \(BR(B\rightarrow K^{(*)}\mu ^+ \mu ^-)\), show some small differences in pulls between the SM and the \(Y_3\) model. Whilst there are many of these and in aggregate they make a difference to the overall \(\chi ^2\), there is no small set of observables that provides the driving force and so we neglect to show them.Footnote 12 We shall now turn to the D\(Y_3^\prime \) model fit results, where these comments about flavour observables also apply.

4.2 D\(Y_3^\prime \) model fit results

Table 7 Goodness of fit for the different data sets we consider for the D\(Y_3^\prime \) model, as calculated by smelli-2.2.0 for \(M_{X}=3\) TeV. We display the total \(\chi ^2\) for each data set along with the number of observables n and the data set’s p-value. The set named ‘quarks’ contains \(BR(B_s \rightarrow \phi \mu ^+ \mu ^-)\), \(BR(B_s\rightarrow \mu ^+\mu ^-)\), \(\Delta m_s\) and various differential distributions in \(B\rightarrow K^{(*)} \mu ^+\mu ^-\) decays among others, whereas ‘LFU FCNCs’ contains \(R_{K^{(*)}}\) and some B meson decay branching ratios into di-taus. Our sets are identical to those defined by smelli-2.2.0 and we refer the curious reader to its manual [67], where the observables are enumerated. We have updated \(R_K\) and \(BR(B_{s,d}\rightarrow \mu ^+ \mu ^-)\) with the latest LHCb measurements as detailed in Sect. 4. Two free parameters of the model were fitted: \(\theta _{23}=-0.181{}\) and \(g_{X}=0.253{}\)

We summarise the quality of the fit for the D\(Y_3^\prime \) model at the best-fit point, for \(M_{X}=3\) TeV, in Table 7. We see a much improved fit as compared to the SM (by a \(\Delta \chi ^2=39\)) and a similar (but slightly worse) quality of fit compared to the \(Y_3\) model, as a comparison with Table 6 shows.

Fig. 11
Two parameter fit to the D\(Y_3^\prime \) model for \(M_{X}=3\) TeV. Shaded regions are those preferred by the data set in the legend at the 95\(\%\) confidence level (CL). The global fit is shown by the solid curves, where the inner (outer) curves show the 70\(\%\) (95\(\%\)) CL regions, respectively. The set named ‘quarks’ contains \(BR(B_s \rightarrow \phi \mu ^+ \mu ^-)\), \(BR(B_s\rightarrow \mu ^+\mu ^-)\), \(\Delta m_s\) and various differential distributions in \(B\rightarrow K^{(*)} \mu ^+\mu ^-\) decays among others, whereas ‘LFU FCNCs’ contains \(R_{K^{(*)}}\) and some B meson decay branching ratios into di-taus. Our sets are identical to those defined by smelli-2.2.0 and we refer the curious reader to its manual [67], where the observables are enumerated. We have updated \(R_K\) and \(BR(B_{s,d}\rightarrow \mu ^+ \mu ^-)\) with the latest LHCb measurements as detailed in Sect. 4. The black dot marks the locus of the best-fit point

The constraints upon the parameters \(\theta _{23}\) and \(g_{X}\) are shown in Fig. 11. Although the figure is for \(M_{X}=3\) TeV, the picture remains approximately the same for \(2< M_{X}/\text {TeV} < 10\). We see that, as is the case for the \(Y_3\) model, there is a region of overlap of the 95\(\%\) CL regions of all of the constraints.

Fig. 12
Pulls in the EWPOs for the \(M_{X}=3\) TeV D\(Y_3^\prime \) model best-fit point: \(g_{X}=0.253{}\), \(\theta _{23}=-0.181{}\). The pull is defined to be the theory prediction minus the central value of the observation, divided by the combined theoretical and experimental uncertainty, neglecting any correlations with other observables

The pulls in the EWPOs for the best-fit point of the D\(Y_3^\prime \) model are shown in Fig. 12. As for the \(Y_3\) model above, we see a fit comparable in quality to that of the SM. Again, the D\(Y_3^\prime \) model predicts \(M_W\) to be a little higher than in the SM, agreeing slightly better with the experimental measurement.

Fig. 13
\(\Delta \chi ^2\) contributions in the vicinity of the D\(Y_3^\prime \) model best-fit point as a function of \(g_{X}\)

Fig. 14
\(\Delta \chi ^2\) contributions in the vicinity of the D\(Y_3^\prime \) model best-fit point as a function of \(\theta _{23}\)

Fig. 15
\(\Delta \chi ^2\) contributions in the vicinity of the D\(Y_3^\prime \) model best-fit point as a function of \(M_{X}\), where \(g_{X}\) has been scaled linearly

The behaviour of the fit in various directions in parameter space around the best-fit point is shown in Figs. 13, 14 and 15. Qualitatively, this behaviour is similar to that of the \(Y_3\) model: the EWPOs and NCBAs imply that \(g_{X}\) should not become too large. The mixing observable \(\Delta m_s\) prevents \(-\theta _{23}\) from becoming too large, and the fits are insensitive to \(M_{X}\) varied between 2 TeV and 10 TeV so long as \(g_{X}\) is scaled linearly with \(M_{X}\).

Fig. 16
Pulls of interest for the \(M_{X}=3\) TeV D\(Y_3^\prime \) model best-fit point: \(g_{X}=0.253{}\), \(\theta _{23}=-0.181{}\). The pull is defined to be the theory prediction minus the central value of the observation, divided by the combined theoretical and experimental uncertainty, neglecting any correlations with other observables. Numbers in brackets after the observable name refer to minimum and maximum values of \(m_{\mu \mu }^2\) in GeV\(^2\), respectively (many other bins and observables are also used to compute the global likelihood)

Figure 16 shows various pulls of interest at the best-fit point of the D\(Y_3^\prime \) model. Better fits (than the SM) to several NCBA observables are partially counteracted by a worse fit to the \(\Delta m_s\) observable.

4.3 Original D\(Y_3\) model fit results

We display the overall fit quality of the original D\(Y_3\) model in Table 8. By comparison with Tables 1 and 6 we see that although its predictions still fit the data significantly better than the SM (\(\Delta \chi ^2\) is 32), the original D\(Y_3\) model does not achieve as good fits as the other models. For the sake of brevity, we have refrained from including plots for it.Footnote 13 Instead, it is more enlightening to understand the reason behind this slightly worse fit, which is roughly as follows. The coupling of the X boson to muons in the original D\(Y_3\) model is close to vector-like, viz. (where \(P_L,P_R\) are left-handed and right-handed projection operators, respectively), which is slightly less preferred by the smelli-2.2.0 fits than an X boson coupled more strongly to left-handed muons [67]. This preference is in large part due to the experimentally measured value of \(BR(B_s\rightarrow \mu ^+ \mu ^-)\), which is somewhat lower than the SM prediction [4,5,6,7], and is sensitive only to the axial component of the coupling to muons. Compared to the \(Y_3\) model and the D\(Y_3^\prime \) model, the fit to \(BR(B_s\rightarrow \mu ^+ \mu ^-)\) is worse when the D\(Y_3\) model fits other observables well.

Table 8 Goodness of fit for the different data sets we consider for the original D\(Y_3\) model, as calculated by smelli-2.2.0 for \(M_{X}=3\) TeV. We display the total \(\chi ^2\) for each data set along with the number of observables n and the data set’s p-value. The set named ‘quarks’ contains \(BR(B_s \rightarrow \phi \mu ^+ \mu ^-)\), \(BR(B_s\rightarrow \mu ^+\mu ^-)\), \(\Delta m_s\) and various differential distributions in \(B\rightarrow K^{(*)} \mu ^+\mu ^-\) decays among others, whereas ‘LFU FCNCs’ contains \(R_{K^{(*)}}\) and some B meson decay branching ratios into di-taus. Our sets are identical to those defined by smelli-2.2.0 and we refer the curious reader to its manual [67], where the observables are enumerated. We have updated \(R_K\) and \(BR(B_{s,d}\rightarrow \mu ^+ \mu ^-)\) with the latest LHCb measurements as detailed in Sect. 4. Two free parameters of the model were fitted: \(\theta _{23}=0.122{}\) and \(g_{X}=0.428{}\)

The p-value is significantly lower than the canonical lower bound of 0.05, indicating a somewhat poor fit.

4.4 The \(\theta _{23}=0\) limit of our models

Fig. 17
Example of the W-loop process dominating the SMEFT contribution to the NCBAs in the \(\theta _{23}=0\) case. The filled disc marks the location of the BSM operator

The NCBAs can receive sizeable contributions even when the tree-level coupling of the X boson to \({{\bar{b}}} s\) vanishes. For example, non-zero and \({\mathcal {O}}\)(1) \(\text {TeV}^{-2}\) values for \(C_{lu}^{2233}\) (as well as non-zero \(C_{eu}^{2233}\) in the case of the D\(Y_3\) model and the D\(Y_3^\prime \) model) can give a reasonable fit to the NCBA data [89] via a W-boson loop as in Fig. 17.Footnote 14 Such a scenario would require that \(V_{ts}\) originates from mixing entirely within the up-quark sector. This qualitatively different quark mixing ansatz therefore provides a motivation to consider the \(\theta _{23}=0\) scenario separately. In the \(\theta _{23}=0\) limit, we have that \({\Lambda _{\xi \, 23}^{(d_L)}}\) is proportional to \(s_{13} \ll 1\), meaning that we also predict a negligible \((C^{(1)}_{lq})^{2223}\) at \(M_X\). Meanwhile, \(C_{lu}^{2233}\) in the \(Y_3\) model (as well as \(C_{eu}^{2233}\) for the D\(Y_3^\prime \) model and the D\(Y_3\) model) remains the same as in the \(\theta _{23}\ne 0\) case shown in Tables 4 and 5. We note that contributions to the NCBAs arising from W loops such as the one in Fig. 17 are nevertheless always included (through renormalisation group running) by smelli-2.2.0, even for \(\theta _{23}\ne 0\).

While it is true that much of the tension with the NCBAs can be ameliorated by such W loop contributions, we find from our global fits that the corresponding values for \(g^2_{X}/M^2_{X}\) are far too large to simultaneously give a good fit to the EWPOs in this \(\theta _{23}=0\) limit. As our results with a floating \(\theta _{23}\) suggest (see e.g. Fig. 6 along \(\theta _{23}=0\)), as far as the EWPOs go, SM-like scenarios are strongly preferred, since EWPOs quickly exclude any region that might resolve the tension with NCBAs. This is even more so than for the simplified model studied in [89], where already a significant tension with \(Z \rightarrow \mu \mu \) was pointed out. In our case, besides several stringent bounds from other observables measured at the \(Z^0\)-pole for \(g_{X} \approx 1\) and \(M_{X} \approx O(3)~\text{ TeV }\) (which predicts \(C_{lq}^{2223}\) just large enough to give a slightly better fit to \(R_K\) and \(R_{K^*}\) than the SM), the predicted W-boson mass is more than 5\(\sigma \) away from its measured value. This stands in stark contrast with the overall better-than-SM fit we find for \(M_W\) when \(\theta _{23}\ne 0\).

5 Discussion

Previous explorations of the parameter spaces of third family hypercharge models [41, 43] capable of explaining the neutral current B-anomalies showed the 95\(\%\) confidence level exclusion regions from various important constraints, but these analyses did not include electroweak precision observables. Collectively, the electroweak precision observables were potentially a model-killing constraint because, through the \(Z^0-Z^\prime \) mixing predicted in the models, the prediction of \(M_W\) in terms of \(M_Z\) is significantly altered from the SM prediction. This was noticed in Ref. [66], where rough estimates of the absolute sizes of deviations were made. However, the severity of this constraint on the original Third Family Hypercharge (\(Y_3\)) Model parameter space was found to depend greatly upon which estimateFootnote 15 of the constraint was used. In the present paper, we use the smelli-2.2.0 [67] computer program to robustly and accurately predict the electroweak precision observables and provide a comparison with empirical measurements. We have thence carried out global fits of third family hypercharge models to data pertinent to the neutral current B-anomalies as well as the electroweak precision observables. This is more sophisticated than the previous efforts because it allows tensions between 219 different observables to be traded off against one another in a statistically sound way. In fact, at the best-fit points of the third family hypercharge models, \(M_W\) fits better than in the SM, whose prediction is some 2\(\sigma \) too low.

One ingredient of our fits was the assumption of a fermion mixing ansatz. The precise details of fermion mixing are expected to be fixed in third family hypercharge models by a more complete ultra-violet model. This could lead to suppressed non-renormalisable operators in the third family hypercharge model effective field theory which, when the flavon acquires its VEV, lead to small mixing effects. Such detailed model building seems premature in the absence of additional information coming from the direct observation of a flavour-violating \(Z^\prime \), or indeed independent precise confirmation of NCBAs from the Belle II [93] experiment. Reining in any urge to delve into the underlying model building, we prefer simply to assume a non-trivial structure in the fermion mixing matrix which changes the observables we consider most: those involving the left-handed down quarks. Since the neutral current B-anomalies are most sensitive to the mixing angle between left-handed bottom and strange quarks, we have allowed this angle to float. But the other mixing angles and complex phase in the matrix have been set to some (roughly mandated but ultimately arbitrary) values equal to those in the CKM matrix. We have checked that changing these arbitrary values somewhat (e.g. setting them to zero) does not change the fit qualitatively: a change in \(\chi ^2\) of up to 2 units was observed. It is clear that a more thorough investigation of such variations may become interesting in the future, particularly if the NCBAs strengthen because of new measurements.

Table 9 Comparison of p-values resulting from our global fits of the SM and various third family hypercharge models (with \(M_{X}=3\) TeV) to a combination of 219 neutral current B-anomaly and electroweak data

We summarise the punch line of the global fits in Table 9. We see that, while the SM suffers from a poor fit to the combined data set, the various third family hypercharge models fare considerably better. The model with the best fit is the original Third Family Hypercharge Model (\(Y_3\)). We have presented the constraints upon the parameter spaces of the \(Y_3\) model and the D\(Y_3^\prime \) model in detail in Sect. 4. The qualitative behaviour of the \(Y_3\) model and the D\(Y_3^\prime \) model in the global fit is similar, although the regions of preferred parameters are different.

It is well known that \(\Delta m_s\) provides a strong constraint on models which fit the NCBAs and ours are no exception: in fact, we see in Figs. 10 and 16 that this variable has a pull of \(2.7\sigma \) (\(2.1\sigma \)) in the \(Y_3\) model (D\(Y_3^\prime \) model), whereas the SM pull is only \(1.1\sigma \), according to the smelli-2.2.0 calculation. The dominant beyond the SM contribution to \(\Delta m_s\) from our models is proportional to the \(Z^\prime \) coupling to \(\bar{s} b\) quarks squared, i.e. \([g_X (\Lambda _\xi ^{(d_L)})_{23} / 6]^2\). The coupling \((\Lambda _\xi ^{(d_L)})_{23}\) is adjustable because \(\theta _{23}\) is allowed to vary over the fit, and \(\Delta m_s\) provides an upper bound upon \(|g_X (\Lambda _\xi ^{(d_L)})_{23}|\). On the other hand, in order to produce a large enough effect in the lepton flavour universality violating observables to fit data, the product of the \(Z^\prime \) couplings to \(\bar{s} b\) quarks and to \(\mu ^+ \mu ^-\) must be at least a certain size. Thus, models where the \(Z^\prime \) couples more strongly to muons because their \(U(1)_X\) charges are larger fare better when fitting the combination of the LFU FCNCs and \(\Delta m_s\). The \(Z^\prime \) coupling to muons is 1/2 for the \(Y_3\) model and 2/3 for the D\(Y_3^\prime \) model, favouring the D\(Y_3^\prime \) model in this regard.
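The trade-off described above can be made concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that the product of the \(\bar{s} b\) coupling and the muon coupling is held fixed at the size needed by the LFU data, at common \(g_X\) and \(M_X\), and it ignores the different helicity structure of the muon couplings in the two models. Under these assumptions the \(\bar{s} b\) coupling scales like the inverse of the muon coupling, and the contribution to \(\Delta m_s\) like its inverse square. The function name is hypothetical.

```python
from fractions import Fraction


def relative_delta_ms(muon_coupling: Fraction, reference_muon_coupling: Fraction) -> Fraction:
    """Ratio of BSM contributions to Delta m_s at fixed (bs coupling) x (muon coupling).

    If the product c_bs * c_mu is held fixed, then c_bs scales like 1 / c_mu and the
    Delta m_s contribution, which goes like c_bs^2, scales like 1 / c_mu^2.
    """
    if muon_coupling == 0:
        raise ValueError("muon coupling must be non-zero")
    return (reference_muon_coupling / muon_coupling) ** 2


if __name__ == "__main__":
    y3, dy3_prime = Fraction(1, 2), Fraction(2, 3)  # muon couplings quoted in the text
    print(relative_delta_ms(dy3_prime, y3))  # DY3' contribution relative to Y3
```

In this simplified picture the larger muon coupling of the D\(Y_3^\prime \) model reduces the required contribution to \(\Delta m_s\) by a factor of 9/16 relative to the \(Y_3\) model; the global fit itself floats \(g_X\) and \(\theta _{23}\) and need not sit exactly on this relation.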

Despite the somewhat worse fit to \(\Delta m_s\) for the \(Y_3\) model as compared to the D\(Y_3^\prime \) model, Table 9 shows that, overall, the \(Y_3\) model is a better fit. Looking at the flavour observables in detail, it is hard to divine a single cause for this: it appears to be the accumulated effect of many flavour observables in tandem. The difference in \(\chi ^2\) between the \(Y_3\) model and the D\(Y_3^\prime \) model of 4.4 is not large and might merely be the result of statistical fluctuations in the 219 data; indeed 1.2 of this comes from the difference of quality of fit to the EWPOs.

All of the usual caveats levelled at interpreting p-values apply. In particular, p-values change depending upon exactly which observables are included or excluded. We have stuck to pre-defined sets of observables in smelli-2.2.0 in an attempt to reduce bias. However, we note that there are other data that are in tension with SM predictions which we have not included, namely the anomalous magnetic moment of the muon \((g-2)_\mu \) and charged current B-anomalies. If we were to include these observables, the p-values of all models in Table 9 would be lowered. Since third family hypercharge models give the same prediction for these observables as the SM, each model would receive the same \(\chi ^2\) increase as well as the same increase in the number of fitted data. However, since our models have essentially nothing to add to these observables compared with the SM, we feel justified in leaving them out from the beginning. We could have excluded some of the observables that smelli-2.2.0 includes in our data sets (obvious choices include those that do not involve bottom quarks, e.g. \(\epsilon _K\)), further changing our calculation of the p-values of the various models.

As noted above, as far as the third family hypercharge models currently stand, the \(Z^\prime \) contribution to \((g-2)_\mu \) is small [28]. In order to explain an inferred beyond the SM contribution to \((g-2)_\mu \) compatible with current measurements \(\Delta (g-2)_\mu /2 \approx (28\pm 8) \times 10^{-10}\), one may simply add a heavy (TeV-scale) vector-like lepton representation that couples to the muon and the \(Z^\prime \) at one vertex. In that case, a one-loop diagram with the heavy leptons and \(Z^\prime \) running in the loop is sufficient and is simultaneously compatible with the neutral current B-anomalies and measurements of \((g-2)_\mu \) [28].

Independent corroboration from other experiments and future B-anomaly measurements are eagerly awaited and, depending upon them, a re-visiting of global fits to flavour and electroweak data may well become desirable. We also note that, since electroweak precision observables play a key rôle in our fits, an increase in their precision resulting from LHC or future \(e^+e^-\) collider data could also prove to be of great utility in testing third family hypercharge models indirectly. Direct production of the predicted \(Z^\prime \) [43, 65] (and a measurement of its couplings) would, along with an observation of flavonstrahlung [40], ultimately provide a ‘smoking gun’ signal.