1 Introduction

With the discovery of the Higgs boson at the LHC the Standard Model (SM) has become a complete theory describing all known phenomena at energies around and below the electroweak scale. The quest for physics beyond the SM (BSM) is of major importance in order to solve the hierarchy and flavor problems. In that respect the processes mediated by the flavor changing neutral currents (FCNC) are particularly interesting because they provide us with a window to BSM physics through low energy experiments. Among those, most attention in recent years has been devoted to the exclusive \(b\rightarrow s\) transitions because of the detection and measurement of \(\mathcal{B}(B_s\rightarrow \mu ^+\mu ^-)\) at the LHC [1], in addition to the detailed angular distributions of the \(B\rightarrow K^{(*)} \ell ^+\ell ^-\) [2, 3] and \(B_s\rightarrow \phi \ell ^+\ell ^-\) [4] decays, which gave us access to a number of observables, including those that are only mildly sensitive to hadronic uncertainties while being highly sensitive to the potential effects of BSM physics [5–9]. Currently a couple of discrepancies have been observed [10, 11], but their interpretation is still controversial, mostly because of the various sources of hadronic uncertainties, cf. e.g. [12].

Although the significant effects of BSM physics were expected to affect the hadronic part of the \(b\rightarrow s\ell ^+\ell ^-\) processes, it turned out that the most surprising effect came from the ratio

$$\begin{aligned} R_K= { \mathcal{B}(B\rightarrow K \mu ^+ \mu ^- )_{q^2\in [1,6]\ \mathrm{GeV}^2}\over \mathcal{B}(B\rightarrow K e^+ e^-)_{q^2\in [1,6]\ \mathrm{GeV}^2}}, \end{aligned}$$
(1)

the measured value of which, \(R_K =0.745\left( ^{+90}_{-74}\right) (36)\) [13], turned out to be \(2.6\sigma \) lower than the one predicted in the SM [14]. Importantly, in this ratio the hadronic uncertainties cancel to a very large extent and the discrepancy is then naturally attributed to the violation of the lepton flavor universality. There have been several attempts to describe this discrepancy in terms of various BSM models [15–28]. Most of the models that accommodate the lepton flavor universality violation also allow for the lepton flavor violation (LFV). Although the LFV exclusive decays based on \(b\rightarrow s\ell _1\ell _2\) (\(\ell _{1,2} \in \{e,\mu ,\tau \}\)) have not been studied at the LHC so far (Footnote 1), a recent report by CMS on the observation of a \(2\sigma \) excess of \(h\rightarrow \mu \tau \) decay [30] boosted the interest in studying \(B_s\rightarrow \ell _1\ell _2\), \(B\rightarrow K^{(*)} \ell _1\ell _2\), and \(B_s\rightarrow \phi \ell _1\ell _2\) [31–39].

In this paper we provide the explicit expressions for the angular distributions and decay rates of the above exclusive processes, in the case of \(\ell _1 \ne \ell _2\). As we shall see, some of the operators that do not contribute to the lepton flavor conserving (LFC) processes can significantly contribute to the LFV ones. By taking the limit \(m_1=m_2\) we retrieve the known expressions for the LFC processes. Our formulas are applicable to any similar process and are written in terms of hadronic matrix elements of the relevant operators and the associated Wilson coefficients. To get the Wilson coefficients in the LFV case we will proceed in two ways: (i) We will first assume LFV to be generated through the scalar operator, via coupling to the Higgs boson, estimate the size of the Wilson coefficients \(C_{S,P}^{\mu \tau }\) from the experimental information on \(\mathcal{B}(h\rightarrow \tau \mu )\), and then predict the decay rates of the above-mentioned processes; (ii) We use a model with a \(Z^\prime \)-boson in which the LFV is generated by the vector interaction, and estimate the Wilson coefficients \(C_{9,10}^{\mu \tau }\) from the known information on the \(B_s\)–\(\overline{B}_s\) mixing and the other low energy observables. This latter option has been discussed in Ref. [39], which we briefly revisit.

The remainder of this paper is organized as follows: In Sect. 2 we set the definitions of the effective Hamiltonian, recall the standard parametrization of hadronic matrix elements and derive the formulas for all three types of the exclusive \(b\rightarrow s \ell _1\ell _2\) decay modes. In Sect. 3 we discuss the case of the LFV contributions arising from the scalar operator and derive the upper bounds on the specific decay modes using \(C_{S,P}\) extracted from \(\mathcal{B}(h\rightarrow \tau \mu )\). In Sect. 4 we revisit the upper bounds on the same processes derived in the framework of the \(Z^\prime \) model. We briefly summarize in Sect. 5.

2 Exclusive \(b\rightarrow s \ell _1\ell _2\) decays

As a starting point we will extend the usual effective Hamiltonian for the \(b\rightarrow s\) transitions by including the LFV operators

$$\begin{aligned} \mathcal {H}_{\mathrm {eff}}= & {} -\frac{4 G_F}{\sqrt{2}}V_{tb}V_{ts}^*\nonumber \\&\times \sum _{i=7,9,10,S,P} \left( C_i(\mu )\mathcal {O}_i(\mu )+C_i^\prime (\mu )\mathcal {O}_i^\prime (\mu )\right) , \end{aligned}$$
(2)

where the relevant operators are defined by

$$\begin{aligned} \mathcal {O}_{9}&=\frac{e^2}{g^2}(\bar{s}\gamma _\mu P_L b)(\bar{\ell }_1\gamma ^\mu \ell _{2}),\quad \mathcal {O}_{10} = \frac{e^2}{g^2}(\bar{s}\gamma _\mu P_L b)(\bar{\ell }_1\gamma ^\mu \gamma _5\ell _{2}),\nonumber \\ \mathcal {O}_{S}&= \frac{e^2}{(4\pi )^2}(\bar{s}P_R b)(\bar{\ell }_1\ell _{2}),\quad \mathcal {O}_{P} = \frac{e^2}{(4\pi )^2}(\bar{s}P_R b)(\bar{\ell }_1\gamma _5\ell _{2}), \end{aligned}$$
(3)

and the operators with flipped chirality \(\mathcal {O}^\prime _{9,10,S,P}\) are obtained from \(\mathcal {O}_{9,10,S,P}\) by replacing \(P_L \leftrightarrow P_R\), where \(P_{L/R}=\frac{1}{2}(1\mp \gamma _5)\). In the SM the operators \(\mathcal{O}_{9,10}\) play the major role, together with the electromagnetic penguin operator \(\mathcal{O}_7=(e/g^2) m_b(\bar{s}\sigma _{\mu \nu }P_R b)F^{\mu \nu } \). The corresponding Wilson coefficients are obtained through a perturbative matching between the full and effective theories at the weak interaction scale \(\mu \simeq m_W\) and then run down to the scale at which the process takes place, namely \(\mu =m_b\). After appropriately absorbing the effects of \(\mathcal{O}_{1-6}\) in the effective Wilson coefficients, one finally has \(C_7=-0.304\), \(C_9=4.211\), \(C_{10}=-4.103\) [40]. Other Wilson coefficients in the SM are zero, \(C_{7,9,10}^\prime =0\), \(C_{S,P}^{(\prime )}=0\). Of course, if \(m_1 \ne m_2\) all the Wilson coefficients are zero in the SM and in order to generate their non-zero values one needs to work in a specific framework of BSM physics. Before embarking on that part of the problem, we will now derive the expressions for the decay rates and angular distributions (when possible) starting from the Hamiltonian (2).

2.1 Leptonic decay \(B_s\rightarrow \ell _1 \ell _2\)

We first focus on the simplest exclusive \(b\rightarrow s\ell _1\ell _2\) mode, \(B_s \rightarrow \ell _1\ell _2\), which is also instructive as far as the operators contributing to the process are concerned. Of course, and after the trivial replacements, the same expressions will be valid for \(B_d\rightarrow \ell _1\ell _2\). We use the standard decomposition of the hadronic matrix element,

$$\begin{aligned} \langle 0 | \bar{b} \gamma _\mu \gamma _5 s | B_s(p) \rangle = i p_\mu f_{B_s}, \end{aligned}$$
(4)

where \(f_{B_s}\) is the \(B_s\)-meson decay constant, and we obtain

$$\begin{aligned}&\mathcal{B}(B_s\rightarrow \ell _1^- \ell _2^+)^\mathrm{theo} \nonumber \\&\quad =\dfrac{\tau _{B_s}}{64 \pi ^3}\frac{\alpha ^2 G_F^2}{m_{B_s}^3} f_{B_s}^2 |V_{tb}V_{ts}^*|^2\lambda ^{1/2}(m_{B_s},m_1,m_2)\nonumber \\&\qquad \times \left\{ [\phantom {\left. \frac{m_{B_s}^2}{m_b+m_s}\right| ^2}m_{B_s}^2-(m_1+m_2)^2]\cdot \left| \phantom {\frac{m_{B_s}^2}{m_b+m_s}}(C_9-C_9')(m_1-m_2)\right. \right. \nonumber \\&\qquad \left. +(C_S-C_S')\frac{m_{B_s}^2}{m_b+m_s}\right| ^2 \nonumber \\&\qquad \left. +[m_{B_s}^2-(m_1-m_2)^2]\cdot \left| \phantom {\frac{m_{B_s}^2}{m_b+m_s}}(C_{10}-C_{10}')(m_1+m_2)\right. \right. \nonumber \\&\qquad \left. \left. +(C_P-C_P')\frac{m_{B_s}^2}{m_b+m_s}\right| ^2\right\} , \end{aligned}$$
(5)

where \(\lambda (a,b,c)=[a^2-(b-c)^2][a^2-(b+c)^2]\). What immediately becomes evident from Eq. (5) is that in the LFV channel the lepton vector current is not conserved, \(i\partial _\mu (\bar{\ell }_1 \gamma ^\mu \ell _2)=(m_2-m_1)\bar{\ell }_1 \ell _2\ne 0\), and the contribution of \(C_9^{(\prime )}\) cannot be neglected. Quite obviously, in the limit \(m_1 =m_2\) one finds the usual expression for \(\mathcal{B}(B_s\rightarrow \ell ^+ \ell ^-)\). Finally, when confronting theory with the experimental measurements one needs to account for the effect of oscillations in the \(B_s-\overline{B}_s\) system because the time dependence of the \(B_s\)-decay rate has been integrated in experiment. Therefore, and to a good approximation, one can identify [41]

$$\begin{aligned} \mathcal{B}(B_s\rightarrow \ell _1 \ell _2 )_\mathrm{exp}\approx {1\over 1-y_s} \mathcal{B}(B_s\rightarrow \ell _1 \ell _2 )^\mathrm{theo}\,, \end{aligned}$$
(6)

where \(y_s=\Delta \Gamma _{B_s}/(2 \Gamma _{B_s}) =0.061(9)\), as measured at LHCb [42]. Notice that the non-conservation of the vector current induces the term in Eq. (5) proportional to \(C_9-C_9^\prime \), which involves the difference of the lepton masses, and therefore the decay modes \(B_s\rightarrow \ell _1^- \ell _2^+\) and \(B_s\rightarrow \ell _1^+ \ell _2^-\) should be studied separately, unless there is a reason that \((C_9-C_9')_{12} = - (C_9-C_9')_{21}\). One should therefore be careful in relating the LFV with the LFC contributions via a multiplicative factor. For that to be plausible, one should make sure the contribution proportional to \(C_9-C_9'\) in the LFV case is absent.
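As a numerical illustration of Eqs. (5) and (6), the following minimal Python sketch evaluates the branching fraction for arbitrary lepton masses. All numerical inputs except \(y_s\) and the SM Wilson coefficients (the lifetime, decay constant, CKM factor, coupling and quark masses) are assumed values chosen only for illustration, and the function and variable names are hypothetical.

```python
import math

# Illustrative inputs in GeV units. Apart from Y_S (quoted in the text),
# every number below is an assumption made for this sketch.
G_F = 1.1663787e-5              # Fermi constant [GeV^-2]
ALPHA_EM = 1.0 / 127.9          # electromagnetic coupling (assumed)
VTB_VTS = 0.0405                # |V_tb V_ts^*| (assumed)
M_BS = 5.3668                   # B_s mass [GeV]
F_BS = 0.228                    # B_s decay constant [GeV] (assumed)
TAU_BS = 1.51e-12 / 6.582e-25   # lifetime: 1.51 ps converted to GeV^-1 (assumed)
M_B_QUARK, M_S_QUARK = 4.18, 0.095   # quark masses [GeV] (assumed)
Y_S = 0.061                     # y_s as quoted in the text
M_MU, M_TAU = 0.10566, 1.77686  # lepton masses [GeV]


def kallen(a, b, c):
    """lambda(a,b,c) = [a^2-(b-c)^2][a^2-(b+c)^2]; the arguments are masses."""
    return (a**2 - (b - c)**2) * (a**2 - (b + c)**2)


def br_bs_ll(m1, m2, d9=0.0, d10=0.0, dS=0.0, dP=0.0):
    """Eq. (5). The arguments d9, d10, dS, dP stand for the differences
    C9-C9', C10-C10', CS-CS', CP-CP' (real or complex)."""
    prefactor = (TAU_BS / (64 * math.pi**3) * ALPHA_EM**2 * G_F**2 / M_BS**3
                 * F_BS**2 * VTB_VTS**2 * math.sqrt(kallen(M_BS, m1, m2)))
    r = M_BS**2 / (M_B_QUARK + M_S_QUARK)
    vector_scalar = (M_BS**2 - (m1 + m2)**2) * abs(d9 * (m1 - m2) + dS * r)**2
    axial_pseudo = (M_BS**2 - (m1 - m2)**2) * abs(d10 * (m1 + m2) + dP * r)**2
    return prefactor * (vector_scalar + axial_pseudo)


def br_time_integrated(br_theo):
    """Eq. (6): approximate correction for B_s oscillations."""
    return br_theo / (1.0 - Y_S)


# The C9 term vanishes for equal masses but not in the LFV channel.
lfc_c9_only = br_bs_ll(M_MU, M_MU, d9=1.0)
lfv_c9_only = br_bs_ll(M_MU, M_TAU, d9=1.0)
```

With these assumed inputs the equal-mass case reproduces the familiar helicity-suppressed expression, while a non-zero `d9` contributes only when `m1` differs from `m2`.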

2.2 \(B\rightarrow K \ell _1 \ell _2\)

Throughout this paper we will use the kinematics of Ref. [43], which for the case of \(B\rightarrow K \ell _1^- \ell _2^+\) means that the main decay axis z is defined in the rest frame of B, so that K and the lepton pair travel in the opposite directions. The angle between the negatively charged lepton and the decay axis (opposite to the direction of flight of the kaon) is denoted by \(\theta _\ell \) and is defined in the lepton-pair rest frame. Concerning the hadronic matrix elements we use the following (standard) parametrizations:

$$\begin{aligned} \langle \bar{K}(k)|\bar{s}\gamma _\mu b|\bar{B}(p)\rangle&= \left[ (p+k)_\mu - \frac{m_B^2-m_K^2}{q^2}q_\mu \right] f_+(q^2)\nonumber \\&\quad +\frac{m_B^2-m_K^2}{q^2} q_\mu f_0(q^2),\end{aligned}$$
(7)
$$\begin{aligned} \langle \bar{K}(k)|\bar{s}\sigma _{\mu \nu } b|\bar{B}(p)\rangle&= -i (p_\mu k_\nu -p_\nu k_\mu )\frac{2 f_T(q^2,\mu )}{m_B+m_K},\nonumber \\ \end{aligned}$$
(8)

where \(f_{+,0,T}(q^2)\) are the hadronic form factors, functions of \(q^2 = (p-k)^2=(p_1+p_2)^2\), with \((m_1 +m_2)^2 \le q^2 \le (m_B-m_K)^2\). In the following the scale \(\mu =m_b\) will be assumed. Using the above definitions we can then write the differential decay rate in the following form:

$$\begin{aligned}&\dfrac{\mathrm {d}\mathcal{B}}{\mathrm {d}q^2}(\bar{B} \rightarrow \bar{K} \ell _1^- \ell _2^+) = \vert \mathcal{N}_{K}(q^2)\vert ^2\nonumber \\&\quad \times \left\{ \varphi _7(q^2) |C_7+C_{7}'|^2 + \varphi _{9}(q^2) |C_{9}+C_{9}'|^2\right. \nonumber \\&\quad + \varphi _{79}(q^2) \mathrm {Re}[C_7 C_9^*] + \varphi _{10}(q^2) |C_{10}+C_{10}'|^2 \nonumber \\&\quad + \varphi _S(q^2) |C_S+C_{S}'|^2+ \varphi _P(q^2) |C_P+C_{P}'|^2 \nonumber \\&\quad \left. + \varphi _{9S}(q^2) \mathrm {Re}[C_{9} C_S^*]+ \varphi _{10P}(q^2) \mathrm {Re}[C_{10} C_P^*] \right\} , \end{aligned}$$
(9)

where the \(\varphi _{i}(q^2)\) depend on kinematical quantities and on the form factors, or more explicitly (Footnote 2):

$$\begin{aligned} \varphi _{7}(q^2)= & {} \frac{2 m_b^2|f_T(q^2)|^2}{(m_B+m_K)^2} \lambda (m_B,m_K,\sqrt{q^2})\nonumber \\&\times \left[ 1-\frac{(m_1-m_2)^2}{q^2}-\frac{\lambda (\sqrt{q^2},m_1,m_2)}{3 q^4}\right] , \nonumber \\ \varphi _{9(10)}(q^2)= & {} \frac{1}{2}|f_0(q^2)|^2(m_1\mp m_2)^2 \frac{(m_B^2-m_K^2)^2}{q^2}\nonumber \\&\times \left[ 1-\frac{(m_1\pm m_2)^2}{q^2}\right] \nonumber \\&\quad +\frac{1}{2}|f_+(q^2)|^2 \lambda (m_B,m_K,\sqrt{q^2})\nonumber \\&\quad \times \left[ 1-\frac{(m_1\mp m_2)^2}{q^2}-\frac{\lambda (\sqrt{q^2},m_1,m_2)}{3 q^4}\right] \nonumber ,\\ \varphi _{79}(q^2)= & {} \frac{2 m_b f_+(q^2)f_T(q^2)}{m_B+m_K} \lambda (m_B,m_K,\sqrt{q^2})\nonumber \\&\times \left[ 1-\frac{(m_1-m_2)^2}{q^2}-\frac{\lambda (\sqrt{q^2},m_1,m_2)}{3 q^4}\right] ,\nonumber \\ \varphi _{S (P)}(q^2)= & {} \frac{q^2 |f_0(q^2)|^2}{2(m_b-m_s)^2}(m_B^2-m_K^2)^2 \nonumber \\&\times \left[ 1-\frac{(m_1\pm m_2)^2}{q^2}\right] , \nonumber \\ \varphi _{10P (9S)}(q^2)= & {} \frac{|f_0(q^2)|^2}{m_b-m_s}(m_1\pm m_2)(m_B^2-m_K^2)^2\nonumber \\&\times \left[ 1-\frac{(m_1 \mp m_2)^2}{q^2}\right] . \end{aligned}$$
(10)

Finally, the normalization factor in Eq. (9) reads

$$\begin{aligned} \vert \mathcal{N}_{K}(q^2)\vert ^2= & {} \tau _{B_d}\dfrac{\alpha ^2 G_F^2 |V_{tb}V_{ts}^*|^2}{512 \pi ^5 m_B^3}\dfrac{\lambda ^{1/2}(\sqrt{q^2},m_1,m_2)}{q^2}\nonumber \\&\times \lambda ^{1/2}(\sqrt{q^2},m_B,m_K). \end{aligned}$$
(11)

As in the previous subsection, the non-conservation of the leptonic vector current gives rise to new pieces in the functions \(\varphi _i(q^2)\). By taking the limit \(m_1 = m_2\) in Eq. (10) we retrieve the known expressions for the LFC case. We should also emphasize that the interference term \(\varphi _{9S}(q^2)\) changes sign depending on the charge of the heavier lepton. In other words, if one assumes that the Wilson coefficients satisfy \((C_i)_{12}= (C_i)_{21}\), then the difference between \(\mathcal{B}(B \rightarrow K \ell _1^- \ell _2^+ )\) and \(\mathcal{B}(B \rightarrow K \ell _2^- \ell _1^+)\) will be a measure of the interference term proportional to \(\mathrm {Re}[C_{9} C_S^*]\).
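The behavior of the functions in Eq. (10) under \(m_1\leftrightarrow m_2\) can be checked with a short Python sketch. The form factor values passed to the functions are hypothetical placeholders (no parametrization from Refs. [46, 47] is implemented here), and the meson and quark masses are assumed inputs.

```python
import math

# Assumed inputs in GeV; form factor values used below are placeholders.
M_B, M_K = 5.2796, 0.4976
M_B_QUARK, M_S_QUARK = 4.18, 0.095
M_MU, M_TAU = 0.10566, 1.77686


def kallen(a, b, c):
    """lambda(a,b,c) = [a^2-(b-c)^2][a^2-(b+c)^2]; the arguments are masses."""
    return (a**2 - (b - c)**2) * (a**2 - (b + c)**2)


def phi_9_10(q2, m1, m2, f_plus, f_zero, which):
    """phi_9 (which=9) or phi_10 (which=10) of Eq. (10)."""
    s = -1.0 if which == 9 else 1.0   # upper sign (m1 - m2) belongs to phi_9
    scalar_part = (0.5 * abs(f_zero)**2 * (m1 + s * m2)**2
                   * (M_B**2 - M_K**2)**2 / q2 * (1.0 - (m1 - s * m2)**2 / q2))
    vector_part = (0.5 * abs(f_plus)**2 * kallen(M_B, M_K, math.sqrt(q2))
                   * (1.0 - (m1 + s * m2)**2 / q2
                      - kallen(math.sqrt(q2), m1, m2) / (3.0 * q2**2)))
    return scalar_part + vector_part


def phi_9S(q2, m1, m2, f_zero):
    """Interference function multiplying Re[C9 CS^*] in Eq. (9)."""
    return (abs(f_zero)**2 / (M_B_QUARK - M_S_QUARK) * (m1 - m2)
            * (M_B**2 - M_K**2)**2 * (1.0 - (m1 + m2)**2 / q2))


def phi_10P(q2, m1, m2, f_zero):
    """Interference function multiplying Re[C10 CP^*] in Eq. (9)."""
    return (abs(f_zero)**2 / (M_B_QUARK - M_S_QUARK) * (m1 + m2)
            * (M_B**2 - M_K**2)**2 * (1.0 - (m1 - m2)**2 / q2))


q2 = 15.0          # GeV^2, inside the allowed range for the mu-tau channel
f0_example = 0.6   # hypothetical value of f_0(q2)
flip = phi_9S(q2, M_MU, M_TAU, f0_example) + phi_9S(q2, M_TAU, M_MU, f0_example)
```

Here `flip` sums the two lepton-charge assignments of \(\varphi _{9S}\), which is odd under the exchange of the masses, whereas `phi_10P` is even.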

2.3 \(B\rightarrow K^*\ell _1 \ell _2\) and \(B_s\rightarrow \phi \ell _1 \ell _2\)

These processes proceed via \(B\rightarrow K^*(\rightarrow K\pi ) \ell _1 \ell _2\) and \(B_s\rightarrow \phi (\rightarrow K\bar{K}) \ell _1 \ell _2\). Since the expression for the angular distribution of the latter decay can be obtained by the trivial replacements in the expression for the former decay mode, we will focus on \(\bar{B}\rightarrow \bar{K}^*(\rightarrow K^-\pi ^+) \ell _1^- \ell _2^+\). As already stated in the previous subsection, we adopt the kinematics of Ref. [43], which is even more explicitly specified in Ref. [44] and fixed in such a way that it coincides with the conventions adopted in experiments at the LHC [2, 3]. In the appendix of the present paper we give the necessary details concerning the kinematics of this process. Besides \(\theta _\ell \) we also need \(\theta _K\), the angle between the decay axis \(-z\) and the direction of flight of \(K^-\) in the rest frame of \(\bar{K}^*\) (cf. Fig. 4). The angle between the planes spanned by \(K\pi \) and \( \ell _1^- \ell _2^+\), respectively, is denoted by \(\phi \). In this case there are many more form factors parametrizing the hadronic matrix elements, namely,

$$\begin{aligned}&\langle \bar{K}^*(k)|\bar{s}\gamma ^\mu (1-\gamma _5) b|\bar{B}(p)\rangle \nonumber \\&\quad = \varepsilon _{\mu \nu \rho \sigma }\varepsilon ^{*\nu }p^\rho k^\sigma \frac{2 V(q^2)}{m_B+m_{K^*}}-i \varepsilon _\mu ^*(m_B+m_{K^*})A_1(q^2)\nonumber \\&\qquad +i(p+k)_\mu (\varepsilon ^*\cdot q)\frac{A_2(q^2)}{m_B+m_{K^*}}\nonumber \\&\qquad + i q_\mu (\varepsilon ^*\cdot q) \frac{2 m_{K^*}}{q^2}[A_3(q^2)-A_0(q^2)], \end{aligned}$$
(12)
$$\begin{aligned}&\langle \bar{K}^*(k)|\bar{s}\sigma _{\mu \nu } q^\nu (1-\gamma _5) b|\bar{B}(p)\rangle \nonumber \\&\quad = 2 i \varepsilon _{\mu \nu \rho \sigma } \varepsilon ^{*\nu }p^\rho k^\sigma T_1(q^2)+[\varepsilon _\mu ^*(m_B^2-m_{K^*}^2)\nonumber \\&\qquad -(\varepsilon ^*\cdot q)(2p-q)_\mu ]T_2(q^2)\nonumber \\&\qquad +(\varepsilon ^*\cdot q) \left[ q_\mu - \frac{q^2}{m_B^2-m_{K^*}^2}(p+k)_\mu \right] T_3(q^2), \end{aligned}$$
(13)

where \(\varepsilon _\mu \) is the polarization vector of \(K^*\), and the form factor \(A_3(q^2)\) is not independent but related to \(A_{1,2}(q^2)\) as \(2 m_{K^*} A_3(q^2)=(m_B+m_{K^*})A_1(q^2)-(m_B-m_{K^*})A_2(q^2)\). The full angular distribution of the above decay reads (Footnote 3)

$$\begin{aligned} \dfrac{\mathrm {d}^4 \mathcal{B} (\bar{B}\rightarrow \bar{K}^{*}(\rightarrow K\pi ) \ell _1^-\ell _2^+)}{\mathrm {d}q^2\mathrm {d}\cos \theta _\ell \mathrm {d}\cos \theta _K \mathrm {d}\phi } = \dfrac{9}{32\pi }I(q^2,\theta _\ell ,\theta _K,\phi ), \end{aligned}$$
(14)

with

$$\begin{aligned}&I(q^2,\theta _\ell ,\theta _K,\phi ) \nonumber \\&\quad = I_1^s(q^2)\sin ^2\theta _K + I_1^c(q^2)\cos ^2\theta _K+[I_2^s(q^2)\sin ^2\theta _K\nonumber \\ {}&\qquad +I_2^c(q^2)\cos ^2\theta _K]\cos 2\theta _\ell \nonumber \\&\qquad +I_3(q^2)\sin ^2\theta _K \sin ^2\theta _\ell \cos 2\phi \nonumber \\&\qquad +I_4(q^2)\sin 2\theta _K \sin 2\theta _\ell \cos \phi \nonumber \\&\qquad + I_5(q^2) \sin 2\theta _K\sin \theta _\ell \cos \phi \nonumber \\&\qquad +[I_6^s(q^2)\sin ^2\theta _K+I_6^c(q^2)\cos ^2\theta _K]\cos \theta _\ell \nonumber \\&\qquad +I_7(q^2)\sin 2\theta _K \sin \theta _\ell \sin \phi \nonumber \\&\qquad + I_8(q^2)\sin 2\theta _K \sin 2\theta _\ell \sin \phi \nonumber \\&\qquad +I_9(q^2) \sin ^2\theta _K \sin ^2\theta _\ell \sin 2 \phi . \end{aligned}$$
(15)

After integrating over angles the differential decay rate is simply

$$\begin{aligned} {\mathrm {d}\mathcal{B}\over \mathrm {d}q^2}=\frac{1}{4}\left[ 3 I_1^c(q^2)+6 I_1^s(q^2)-I_2^c(q^2)-2I_2^s(q^2)\right] \ . \end{aligned}$$
(16)
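As a cross-check of Eq. (16), the angular integration of Eqs. (14) and (15) can be carried out explicitly. The integration ranges \(\phi \in [0,2\pi ]\) and \(\cos \theta _{\ell ,K}\in [-1,1]\) are assumed here, since they are not spelled out above. With these ranges every \(\phi \)-dependent term and the term linear in \(\cos \theta _\ell \) integrate to zero, and the remaining elementary integrals give

$$\begin{aligned} \int _{-1}^{1}\mathrm {d}\cos \theta _K\,\sin ^2\theta _K&=\frac{4}{3},\qquad \int _{-1}^{1}\mathrm {d}\cos \theta _K\,\cos ^2\theta _K=\frac{2}{3},\qquad \int _{-1}^{1}\mathrm {d}\cos \theta _\ell \,\cos 2\theta _\ell =-\frac{2}{3},\nonumber \\ \frac{\mathrm {d}\mathcal{B}}{\mathrm {d}q^2}&=\frac{9}{32\pi }\cdot 2\pi \left[ 2\left( \frac{4}{3}I_1^s+\frac{2}{3}I_1^c\right) -\frac{2}{3}\left( \frac{4}{3}I_2^s+\frac{2}{3}I_2^c\right) \right] \nonumber \\&=\frac{1}{4}\left[ 3 I_1^c+6 I_1^s-I_2^c-2I_2^s\right] . \end{aligned}$$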

The \(q^2\)-dependent angular coefficients are combinations of the decay’s helicity amplitudes, which can also be expressed in terms of the transversity amplitudes \(A_{\perp ,\parallel ,0,t}^{L(R)}\equiv A_{\perp ,\parallel ,0,t}^{L(R)}(q^2)\) as follows:

$$\begin{aligned} A_{\perp }^{L(R)}&= \mathcal{N}_{K^*} \sqrt{2} \lambda _B^{1/2}\\&\quad \times \left[ [(C_9+C_9')\mp (C_{10}+C_{10}')]\frac{V(q^2)}{m_B+m_{K^*}}\right. \\ {}&\quad \left. +\frac{2 m_b}{q^2}(C_7+C_7') T_1(q^2)\right] , \\ A_{\parallel }^{L(R)}&= -\mathcal{N}_{K^*} \sqrt{2}(m_B^2-m_{K^*}^2)\\&\quad \times \left[ [(C_9-C_9')\mp (C_{10}-C_{10}')]\frac{A_1(q^2)}{m_B-m_{K^*}}\right. \\ {}&\quad \left. +\frac{2 m_b}{q^2}(C_7-C_7')T_2(q^2)\right] , \end{aligned}$$
$$\begin{aligned} A_0^{L(R)}&=-\frac{\mathcal{N}_{K^*}}{2 m_{K^*} \sqrt{q^2}}\left\{ \phantom {\left[ (m_B^2+3m_{K^*}^2-q^2)T_2(q^2)-\frac{ \lambda _B T_3(q^2)}{m_B^2-m_{K^*}^2} \right] } 2 m_b (C_7-C_7')\right. \nonumber \\&\quad \times \left[ \phantom {\left. \frac{ \lambda _B T_3(q^2)}{m_B^2-m_{K^*}^2} \right] }(m_B^2+3m_{K^*}^2-q^2)T_2(q^2)-\frac{ \lambda _B T_3(q^2)}{m_B^2-m_{K^*}^2} \right] \nonumber \\&\quad +[(C_9-C_9')\mp (C_{10}-C_{10}')]\nonumber \\&\quad \cdot \left. \left[ (m_B^2-m_{K^*}^2-q^2)(m_B+m_{K^*})A_1(q^2)\right. \right. \nonumber \\&\qquad \left. \left. -\frac{ \lambda _B A_2(q^2)}{m_B+m_{K^*}}\right] \right\} \nonumber \\ A_{t}^{L(R)} \!&=\! -\mathcal{N}_{K^*} \frac{\lambda _B^{1/2}}{\sqrt{q^2}}\left[ (C_9\!-\!C_{9}') \mp (C_{10}\!-\!C_{10}') \!+\!\frac{q^2}{m_b+m_s}\right. \nonumber \\ {}&\quad \times \left. \left( \frac{C_S-C_S'}{m_1-m_2}\mp \frac{C_P-C_P'}{m_1+m_2}\right) \right] A_0(q^2), \end{aligned}$$
(17)

where, for brevity, \(\lambda _B=\lambda (m_B,m_{K^*},\sqrt{q^2})\), \(\lambda _q=\lambda (m_1,m_2,\sqrt{q^2})\), and

$$\begin{aligned} \mathcal{N}_{K^*}=V_{tb}V_{ts}^*\left[ \frac{\tau _{B_d} G_F^2 \alpha ^2}{3 \times 2^{10} \pi ^5 m_B^3} \lambda _B^{1/2} \lambda _q^{1/2} \right] ^{1/2}. \end{aligned}$$
(18)

The upper signs in the above formulas correspond to \(A_i^L\) and the lower ones to \(A_i^R\). Notice that \(A_{t}\) also has the superscript L(R), referring to the chirality of the lepton pair, which may appear unusual when compared to the lepton flavor conserving case, and which we now explain. When \(\ell _1=\ell _2\) the pseudoscalar density can be rewritten as

$$\begin{aligned} \bar{\ell } \gamma _5 \ell = \frac{q^\mu }{2 m_\ell } (\bar{\ell } \gamma _\mu \gamma _5 \ell ), \end{aligned}$$
(19)

so that the contributions coming from the operator \(\mathcal {O}_P^{(\prime )}\) can be absorbed in the amplitude \(A_t\), which is associated to the timelike polarization vector of the virtual vector boson, \(\epsilon _{V}^\mu (t)=q^\mu /\sqrt{q^2}\). A similar approach cannot be applied to the scalar operator \(\mathcal {O}_S^{(\prime )}\), because \(q^\mu (\bar{\ell } \gamma _\mu \ell ) = 0\), and one must define a new amplitude \(A_S\) to accommodate for the residual scalar contribution. In the LFV case, \(m_1\ne m_2\), one can use the Ward identities to absorb both the scalar and the pseudoscalar densities in the vector and axial currents, respectively. Therefore, in the LFV case the amplitudes \(A_t\) and \(A_S\) are replaced by \(A_t^{L(R)}\). Although the expressions for \(A_t^{L,R}\) are ill defined in the limit \(m_1 = m_2\) we have checked that the angular coefficients are very well defined and one retrieves the standard formulas of Ref. [45].
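The Ward identities invoked above can be written out at the level of the lepton spinors. The following is a sketch under the assumption of the standard Dirac equations for the outgoing lepton \(\ell _1^-(p_1)\) and antilepton \(\ell _2^+(p_2)\), namely \(\bar{u}(p_1)\,\gamma _\mu p_1^\mu = m_1 \bar{u}(p_1)\) and \(\gamma _\mu p_2^\mu \, v(p_2)=-m_2 v(p_2)\), with \(q=p_1+p_2\):

$$\begin{aligned} q_\mu \,\bar{u}(p_1)\gamma ^\mu v(p_2)&=(m_1-m_2)\,\bar{u}(p_1) v(p_2),\nonumber \\ q_\mu \,\bar{u}(p_1)\gamma ^\mu \gamma _5 v(p_2)&=(m_1+m_2)\,\bar{u}(p_1)\gamma _5 v(p_2). \end{aligned}$$

For \(m_1\ne m_2\) the scalar (pseudoscalar) density can therefore be traded for \(q_\mu /(m_1-m_2)\) [\(q_\mu /(m_1+m_2)\)] times the vector (axial) current, which is the origin of the factors \(1/(m_1\mp m_2)\) multiplying \(C_S-C_S'\) and \(C_P-C_P'\) in \(A_t^{L(R)}\) of Eq. (17). For \(m_1=m_2=m_\ell \) the second relation reduces to Eq. (19), while the first one gives \(q^\mu (\bar{\ell } \gamma _\mu \ell ) = 0\).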

Finally, in terms of the transversity amplitudes (17), the angular coefficients \(I_{1-9}(q^2)\) are given by

$$\begin{aligned} I_1^s(q^2)&=\left[ |A_{\perp }^L|^2+|A_{\parallel }^L|^2+ (L\rightarrow R) \right] \\&\quad \times \frac{\lambda _q +2 [q^4-(m_1^2-m_2^2)^2]}{4 q^4}\\&\quad +\frac{4 m_1 m_2}{q^2}\mathrm {Re}\left( A_{\parallel }^L A_{\parallel }^{R*}+A_{\perp }^L A_{\perp }^{R*}\right) , \\ I_1^c(q^2)&= \left[ |A_0^L|^2+|A_0^R|^2 \right] \frac{q^4-(m_1^2-m_2^2)^2}{q^4}\\&\quad +\frac{8 m_1 m_2}{q^2} \mathrm {Re}(A_0^L A_0^{R*}-A_t^L A_t^{R*}) \\&\quad -2\frac{(m_1^2-m_2^2)^2-q^2 (m_1^2+m_2^2)}{q^4}\left( |A_t^L|^2+|A_t^R|^2 \right) ,\\ I_2^s(q^2)&= \frac{\lambda _q}{4 q^4}[|A_\perp ^L|^2+|A_\parallel ^L|^2+(L\rightarrow R)], \\ I_2^c(q^2)&= - \frac{\lambda _q}{q^4}(|A_0^L|^2+|A_0^R|^2), \\ I_3(q^2)&= \frac{\lambda _q}{2 q^4} [|A_\perp ^L|^2-|A_\parallel ^L|^2+(L\rightarrow R)],\\ I_4(q^2)&= - \frac{\lambda _q}{\sqrt{2} q^4} \mathrm {Re}(A_\parallel ^L A_0^{L*}+(L\rightarrow R)),\\ I_5(q^2)&= \frac{\sqrt{2}\lambda _q^{1/2}}{q^2}\left[ \phantom {\frac{m_1^2-m_2^2}{q^2}} \mathrm {Re}(A_0^L A_\perp ^{L*}-(L\rightarrow R))\right. \\&\quad \left. -\frac{m_1^2-m_2^2}{q^2} \mathrm {Re}(A_t^L A_\parallel ^{L*}+(L\rightarrow R))\right] , \\ I_6^s(q^2)&=- \frac{2 \lambda _q^{1/2}}{q^2}[\mathrm {Re}(A_\parallel ^L A_\perp ^{L*}-(L\rightarrow R))], \end{aligned}$$
$$\begin{aligned} I_6^c(q^2)&= - \frac{4\lambda _q^{1/2}}{q^2}\frac{m_1^2-m_2^2}{q^2} \mathrm {Re}(A_0^L A_t^{L*}+(L\rightarrow R)),\nonumber \\ I_7(q^2)&= - \frac{\sqrt{2}\lambda _q^{1/2}}{q^2}\times \left[ \phantom {\frac{m_1^2-m_2^2}{q^2}} \mathrm {Im}(A_0^L A_\parallel ^{L*}-(L\rightarrow R))\right. \nonumber \\ {}&\quad +\left. \frac{m_1^2-m_2^2}{q^2} \mathrm {Im}(A_\perp ^{L}A_t^{L*} +(L\rightarrow R))\right] , \nonumber \\ I_8(q^2)&= \frac{\lambda _q}{\sqrt{2}q^4}\mathrm {Im}(A_0^{L}A_\perp ^{L*} +(L\rightarrow R)), \nonumber \\ I_9(q^2)&=- \frac{\lambda _q}{q^4}\mathrm {Im}(A_\perp ^L A_\parallel ^{L*} +A_\perp ^R A_\parallel ^{R*} ). \end{aligned}$$
(20)

Once again, by taking the limit \(m_1 \rightarrow m_2\), one retrieves the usual expressions for the coefficients of the angular distribution of \(\bar{B}\rightarrow \bar{K}^*\ell ^+\ell ^-\). Our expressions agree with those recently presented in Ref. [44], and they are related to those given in Ref. [45] via \(I_{4,6,7,9}\rightarrow - I_{4,6,7,9}\). In order to compare with the expressions for \(A_t\) and \(A_S\) from Ref. [45] one needs to identify

$$\begin{aligned}&A_t = \lim _{m_1\rightarrow m_2}\left( A_t^L-A_t^R \right) ,\qquad \nonumber \\&A_S = \lim _{m_1\rightarrow m_2}\left[ {m_1-m_2 \over \sqrt{q^2} }\left( A_t^L+A_t^R\right) \right] \,. \end{aligned}$$
(21)
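The identification in Eq. (21) can be made explicit by inserting \(A_t^{L(R)}\) from Eq. (17). A short worked version, using only the expressions given above and writing \(m_\ell \) for the common lepton mass in the limit, reads

$$\begin{aligned} A_t^L-A_t^R&=\frac{2\,\mathcal{N}_{K^*}\lambda _B^{1/2}}{\sqrt{q^2}}\left[ (C_{10}-C_{10}')+\frac{q^2}{(m_b+m_s)(m_1+m_2)}(C_P-C_P')\right] A_0(q^2),\nonumber \\ \frac{m_1-m_2}{\sqrt{q^2}}\left( A_t^L+A_t^R\right)&=-\frac{2\,\mathcal{N}_{K^*}\lambda _B^{1/2}}{q^2}\left[ (m_1-m_2)(C_9-C_9')+\frac{q^2}{m_b+m_s}(C_S-C_S')\right] A_0(q^2), \end{aligned}$$

so that in the limit \(m_1\rightarrow m_2\) the term proportional to \(C_9-C_9'\) drops out and

$$\begin{aligned} A_t&=\frac{2\,\mathcal{N}_{K^*}\lambda _B^{1/2}}{\sqrt{q^2}}\left[ (C_{10}-C_{10}')+\frac{q^2}{2 m_\ell (m_b+m_s)}(C_P-C_P')\right] A_0(q^2),\nonumber \\ A_S&=-\frac{2\,\mathcal{N}_{K^*}\lambda _B^{1/2}}{m_b+m_s}(C_S-C_S')A_0(q^2). \end{aligned}$$

Both limits are finite, even though \(A_t^{L}\) and \(A_t^{R}\) are separately ill defined at \(m_1=m_2\).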

2.4 Numerical significance

To illustrate numerically the significance of the factors multiplying the Wilson coefficients, we use the form factors of Refs. [46, 47] and distinguish the case of LFV arising from the vector operators, i.e.

$$\begin{aligned}&\mathcal{B}(\bar{B}\rightarrow \bar{K}^{(*)} \ell _1\ell _2) = 10^{-9} \left( a_{K^{(*)}}^{12} {|}C_9+ C_9^{\prime } {|}^2 + b_{K^{(*)}}^{12} {|}C_{10}\right. \nonumber \\&\quad \left. +C_{10}^\prime {|}^2 + c_{K^{(*)}}^{12} {|}C_{9} -C_{9}^{\prime } {|}^2+d_{K^{(*)}}^{12} {|}C_{10}-C_{10}^\prime {|}^2\right) ,\nonumber \\ \end{aligned}$$
(22)

from the case in which the LFV comes from the scalar operators,

$$\begin{aligned}&\mathcal{B}(\bar{B}\rightarrow \bar{K}^{(*)} \ell _1\ell _2) = 10^{-9} \left( e_{K^{(*)}}^{12} {|}C_S+ C_S^{\prime } {|}^2 + f_{K^{(*)}}^{12} {|}C_{P} \right. \nonumber \\&\quad \left. +C_{P}^\prime {|}^2+ g_{K^{(*)}}^{12} {|}C_{S} -C_{S}^{\prime } {|}^2+h_{K^{(*)}}^{12} {|}C_{P}-C_{P}^\prime {|}^2\right) . \end{aligned}$$
(23)

The values of the factors multiplying the Wilson coefficients are obtained after integrating over all available \(q^2\)’s and are listed in Tables 1 and 2.

Table 1 Values for the multiplicative factors defined in Eq. (22). The quoted uncertainties are at the \(1\sigma \) level
Table 2 Values for the multiplicative factors defined in Eq. (23) to \(1\sigma \) accuracy

Notice also that the integrands that yield those factors have a peculiar feature: those which multiply \(|C_{9,10} \pm C_{9,10}^\prime |^2\) are more pronounced in the intermediate \(q^2\) region, whereas those multiplying \(|C_{S,P} \pm C_{S,P}^\prime |^2\) receive contributions mostly from the large \(q^2\) region. To illustrate this feature, we show in Fig. 1 the coefficient functions \(\varphi _{9,10}(q^2)\) [\(\varphi _{S,P}(q^2)\)], which upon integration amount to \(a_{K}^{\mu \tau }\) and \(b_K^{\mu \tau }\) [\(e_{K}^{\mu \tau }\) and \(f_K^{\mu \tau }\)] (Footnote 4).

Furthermore in the case of LFV generated by the scalar operators the lifted helicity suppression of the leptonic decay (5) leads to the following hierarchy among different modes:

$$\begin{aligned} C_{S,P}^{(\prime )}\ne 0, C_{9,10}^{(\prime )}= & {} 0: \mathcal {B}(B_s\rightarrow \ell _1\ell _2)>\mathcal {B}(B\rightarrow K \ell _1\ell _2)\nonumber \\> & {} \mathcal {B}(B\rightarrow K^*\ell _1\ell _2). \end{aligned}$$
(24)

That hierarchy is inverted for the LFV processes generated by the vector operators, namely

$$\begin{aligned} C_{S,P}^{(\prime )}=0, C_{9,10}^{(\prime )}\ne & {} 0: \mathcal {B}(B_s\rightarrow \ell _1\ell _2)<\mathcal {B}(B\rightarrow K \ell _1\ell _2)\nonumber \\< & {} \mathcal {B}(B\rightarrow K^*\ell _1\ell _2). \end{aligned}$$
(25)

Of course the above discussion is valid as long as we do not consider the case of LFV generated by both the scalar and the vector operators, which we will not discuss in the following anyway.

Fig. 1

Coefficient functions \(\phi _{9,10}(q^2) =\vert \mathcal{N}_{K}(q^2)\vert ^2 \varphi _{9,10}(q^2)\) and \(\phi _{S,P}(q^2) = \vert \mathcal{N}_{K}(q^2)\vert ^2 \varphi _{S,P}(q^2)\) appearing in Eq. (10), which after integration over \(q^2\) give the factors \(a_K^{\mu \tau }\), \(b_K^{\mu \tau }\), \(e_K^{\mu \tau }\), and \(f_K^{\mu \tau }\) in Eqs. (22, 23). Full curves correspond to \(\phi _{9}(q^2)\) and \(\phi _{S}(q^2)\), while the dashed ones to \(\phi _{10}(q^2)\) and \(\phi _{P}(q^2)\)

3 A case of \(C_{S,P}\ne 0\): coupling to Higgs

In this section we focus on the specific example of a scenario in which the LFV is generated through the scalar operators. We will relate the \(2.2\sigma \) excess of \(h\rightarrow \mu \tau \) observed by CMS [30] to the decays \(B_s \rightarrow \mu \tau \) and \(B\rightarrow K^{(*)}\mu \tau \) (Footnote 5).

In the scenarios in which the physics BSM comes solely from the modification of the Higgs sector, the decay \(h\rightarrow \mu \tau \) can be described by the Yukawa Lagrangian,

$$\begin{aligned} \mathcal {L}_Y^{\mathrm {eff}} = - y_{ij} \bar{\ell }^i_L \ell ^j_R h + \mathrm {h.c.} \end{aligned}$$
(26)

The non-diagonal couplings \(y_{ij}\) can originate in the mixing of the Higgs doublet with additional scalar doublets, and the above Lagrangian is fully adequate if the masses of other Higgs states are larger than \(m_h\) [49]. The only Wilson coefficients in Eq. (2) that receive non-negligible contributions through the scalar penguin diagrams are [50, 51]

$$\begin{aligned} C_{S,P}&= - \frac{y_{\mu \tau }\pm y_{\tau \mu }^*}{2}\frac{m_b v}{16 m_W^2\sin ^2\theta _W} \nonumber \\&\quad \times \left( \frac{6 x_t}{x_h} - \frac{2 x_t^3}{ (1-x_t)^3}\ln x_t +\frac{4 x_t^2}{ (1-x_t)^3}\ln x_t\right. \nonumber \\&\quad \left. - \frac{x_t^2}{ (1-x_t)^2} + \frac{3 x_t}{ (1-x_t)^2} \right) , \end{aligned}$$
(27)

where \(x_{t,h}= m_{t,h}^2/m_W^2\), \(v= 246\ \mathrm{GeV}\), and the upper (lower) signs correspond to \(C_S\) (\(C_P\)) (Footnote 6). Using the CMS result, \(\mathcal{B}(h\rightarrow \mu \tau ) = 0.84^{+0.39}_{-0.37}\,\%\), and the formula \(\Gamma (h\rightarrow \mu \tau )= \left( |y_{\mu \tau }|^2+|y_{\tau \mu }|^2 \right) m_h/(8\pi )\), one obtains

$$\begin{aligned}&1.9 [0.8] < 10^3 \times \sqrt{|y_{\mu \tau }|^2+|y_{\tau \mu }|^2 }< 3.2 [3.6]\nonumber \\&\quad \quad \text {at }\,68\,\%\, [95\,\%]\,\text {CL}, \end{aligned}$$
(28)

which then amounts to

$$\begin{aligned}&8.4 [3.5]<10^4\times \sqrt{|C_S|^2+|C_P|^2}< 14.2 [16.0]\nonumber \\&\quad \quad \text {at }\,68\,\%\, [95\,\%]\,\text {CL}. \end{aligned}$$
(29)
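The conversion from \(\mathcal{B}(h\rightarrow \mu \tau )\) to the Yukawa couplings and then to \(C_{S,P}\) via Eq. (27) can be sketched numerically as follows. The particle masses, \(\sin ^2\theta _W\), and in particular the SM Higgs total width used to turn the branching fraction into a partial width are assumptions of this sketch (they are not quoted in the text), and the function names are hypothetical.

```python
import math

# Assumed numerical inputs (GeV); none of these values is quoted in the text
# except v = 246 GeV.
M_W, M_T, M_H = 80.385, 173.0, 125.0
M_B_QUARK = 4.18
V_EW = 246.0
SIN2_THETA_W = 0.2312
GAMMA_H_SM = 4.07e-3   # assumed SM Higgs total width [GeV]


def loop_function(x_t, x_h):
    """Bracket of Eq. (27) as a function of x_t and x_h."""
    log_xt = math.log(x_t)
    return (6.0 * x_t / x_h
            - 2.0 * x_t**3 / (1.0 - x_t)**3 * log_xt
            + 4.0 * x_t**2 / (1.0 - x_t)**3 * log_xt
            - x_t**2 / (1.0 - x_t)**2
            + 3.0 * x_t / (1.0 - x_t)**2)


def wilson_cs_cp(y_mutau, y_taumu):
    """Eq. (27): returns (C_S, C_P) for given, possibly complex, couplings."""
    x_t, x_h = (M_T / M_W)**2, (M_H / M_W)**2
    k = M_B_QUARK * V_EW / (16.0 * M_W**2 * SIN2_THETA_W) * loop_function(x_t, x_h)
    y1, y2c = complex(y_mutau), complex(y_taumu).conjugate()
    return -0.5 * (y1 + y2c) * k, -0.5 * (y1 - y2c) * k


def yukawa_norm_from_br(br):
    """sqrt(|y_mutau|^2 + |y_taumu|^2) from B(h -> mu tau), assuming
    B = Gamma / (Gamma_SM + Gamma) and Gamma = y^2 m_h / (8 pi)."""
    gamma = GAMMA_H_SM * br / (1.0 - br)
    return math.sqrt(8.0 * math.pi * gamma / M_H)


y_norm = yukawa_norm_from_br(0.0084)            # CMS central value
c_s, c_p = wilson_cs_cp(y_norm / math.sqrt(2), y_norm / math.sqrt(2))
c_norm = math.sqrt(abs(c_s)**2 + abs(c_p)**2)   # to be compared with Eq. (29)
```

Because \(|C_S|^2+|C_P|^2\) depends only on \(|y_{\mu \tau }|^2+|y_{\tau \mu }|^2\), the ratio `c_norm / y_norm` is independent of how the norm is shared between the two couplings; the equal real couplings used above are just one convenient choice.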

Notice that the couplings \(y_{\mu \tau }\) and \(y_{\tau \mu }\) are tacitly assumed to be complex, in which case the quantity \(|y_{\mu \tau }|^2+|y_{\tau \mu }|^2\) is not enough to completely determine the decay amplitudes of the processes described here. One possibility to tackle this issue is to use Eqs. (5) and (27), write

$$\begin{aligned}&\mathcal {B}(B_s \rightarrow \mu \tau ) \propto (m_{B_s}^2-m_\mu ^2-m_\tau ^2)(|y_{\mu \tau }|^2+|y_{\tau \mu }|^2)\nonumber \\&\quad -2m_\mu m_\tau \mathrm {Re}(y_{\mu \tau } y_{\tau \mu }^*), \end{aligned}$$
(30)

and combine it with the constraint coming from \(\mathcal {B}(\tau \rightarrow \mu \gamma )\), namely (Footnote 7)

$$\begin{aligned} \mathcal {B}(\tau \rightarrow \mu \gamma ) = {\alpha m_\tau ^5\over 64 \pi ^4 \Gamma _\tau } \left( |C_L^\gamma |^2+|C_R^\gamma |^2\right) \end{aligned}$$
(31)

and \(\mathcal {B}(\tau \rightarrow \mu \gamma )_\mathrm{exp.}<4.4\times 10^{-8}\) [52]. As of now nothing can be said about the complex phases of these couplings and in the following we assume them to be zero.

As indicated in Eq. (24), the channel most sensitive to \(C_S\ne 0\) is the leptonic decay mode \(B_s\rightarrow \ell _1\ell _2\). To illustrate the effect we focus on the \(\mu \tau \) channel and show in Fig. 2 the dependence of its branching fraction on the coupling \(y_{\mu \tau }=y_{\tau \mu }\). We also plot \(\mathcal{B}(h\rightarrow \tau \mu )\) versus the branching fractions of the modes of interest for increasing values of \(y_{\tau \mu }\). The horizontal stripes correspond to the \(1\sigma \) (darker) and \(2\sigma \) (brighter) results reported by CMS.

Fig. 2

In the left panel we plot the branching fraction of \(B_s\rightarrow \tau \mu \) decay as a function of \(y_{\tau \mu }=y_{\mu \tau }\). The segment shown in full line is the one corresponding to \(y_{\tau \mu }\) extracted from \(\mathcal{B}(h\rightarrow \tau \mu )\) to \(2\sigma \). In the right panel we show \(\mathcal{B}(h\rightarrow \tau \mu )\) by the brighter (darker) stripe to \(1 (2) \sigma \) as reported by CMS, versus \(\mathcal{B}(B_s\rightarrow \tau \mu )\) [blue], \(\mathcal{B}(B\rightarrow K\tau \mu )\) [red], \(\mathcal{B}(B\rightarrow K^*\tau \mu )\) [orange]

Finally, the bounds on the LFV modes obtained in this way are:

$$\begin{aligned}&\mathcal{B}(B_s \rightarrow \mu \tau ) <3.9\times 10^{-13}\,,\nonumber \\&\mathcal{B}(B \rightarrow K \mu \tau )< 3.8 \times 10^{-14}\,,\nonumber \\&\mathcal{B}(B \rightarrow K^*\mu \tau ) < 1.2 \times 10^{-14}\,. \end{aligned}$$
(32)

These bounds lie well below the reach of current experimental searches. The purpose of this section, however, was only to illustrate the effect of LFV generated through the scalar couplings extracted from the experimental bound on \(\mathcal{B}(h\rightarrow \mu \tau )\). If the origin of such a coupling is different from the one discussed here, the above bounds could be larger (less stringent) but the hierarchy given in Eq. (24) will still hold true.

4 A case of \(C_{9,10}\ne 0\): coupling to \(Z^\prime \)

In this section we revisit a \(Z^\prime \) model, already discussed in the context of this problem in Ref. [39]. It illustrates the case in which the LFV is generated by the (axial-)vector operators. Furthermore, since our bounds differ somewhat from those reported in Ref. [39], we believe the model is worth discussing in more detail. The most general Lagrangian involving \(Z^\prime \) reads

$$\begin{aligned} \mathcal {L}_{Z^\prime } \supset g_{\ell _i \ell _j}^L \bar{\ell _i }\gamma ^\mu P_L \ell _j \ Z_\mu ^\prime + g_{sb}^L \bar{s}\gamma ^\mu P_L b \ Z_\mu ^\prime + (L\rightarrow R),\nonumber \\ \end{aligned}$$
(33)

where we assume that the \(Z^\prime \) boson couples only to the second and third generations of quarks and leptons. Since the scale of new physics is assumed to be well above the electroweak one, the \(SU(2)_L\) gauge invariance has to be preserved, which then implies that, for example, \(g_{\ell _i \ell _j}^L=g_{\nu _i\nu _j}^L\) and \(g_{sb}^L=g_{ct}^L\).

After integrating out the \(Z^\prime \), the relevant Wilson coefficients read

$$\begin{aligned} C_9^{(\prime )\mu \tau }&= -\frac{\pi }{\sqrt{2}m_{Z'}^2}\frac{1}{\alpha G_F V_{tb} V_{ts}^*} g_{sb}^{L(R)}(g_{\mu \tau }^R+g_{\mu \tau }^L), \nonumber \\ C_{10}^{(\prime )\mu \tau }&= -\frac{\pi }{\sqrt{2}m_{Z'}^2}\frac{1}{\alpha G_F V_{tb} V_{ts}^*} g_{sb}^{L(R)}(g_{\mu \tau }^R-g_{\mu \tau }^L), \end{aligned}$$
(34)

where the primed Wilson coefficients are proportional to \(g_{sb}^{R}\). To get the value of \(g_{sb}^{L(R)}\) we use the information on the \(B_s\)-\(\overline{B}_s\) mixing amplitude and add the contribution coming from the couplings to \(Z^\prime \). More specifically, we add

$$\begin{aligned} \mathcal{H}_\mathrm{eff}^{Z^\prime }= & {} C_1 (\bar{b} \gamma _\mu P_L s) (\bar{b} \gamma ^\mu P_L s) \nonumber \\&+ C_1^\prime (\bar{b} \gamma _\mu P_R s) (\bar{b} \gamma ^\mu P_R s) + C_5 (\bar{b}_i P_L s_j) (\bar{b}_j P_R s_i) ,\nonumber \\ \end{aligned}$$
(35)

to the SM contribution. The Wilson coefficients are easily computed at \(\mu \approx m_{Z^\prime }\) and read

$$\begin{aligned} C_1^{(\prime )} = { \left( g_{sb}^{L(R)} \right) ^2 \over 2 m_{Z^\prime }^2}\,, \quad C_5 = - {2 g_{sb}^{L}g_{sb}^R \over m_{Z^\prime }^2}\,, \end{aligned}$$
(36)

which then, combined with

$$\begin{aligned}&\langle \bar{B}_s^0\vert \bar{b} \gamma _\mu (1-\gamma _5) s\, \bar{b} \gamma ^\mu (1-\gamma _5) s \vert B_s^0\rangle = \frac{8}{3} f_{B_s}^2 m_{B_s}^2 B_1(\mu )\,,\nonumber \\&\langle \bar{B}_s^0\vert \bar{b}_i (1-\gamma _5) s_j\, \bar{b}_j (1+\gamma _5) s_i \vert B_s^0\rangle \nonumber \\&\quad =\frac{2}{3} f_{B_s}^2 m_{B_s}^2 \left( \frac{m_{B_s}}{m_b(\mu )+m_s(\mu )} \right) ^2 B_5(\mu )\,, \end{aligned}$$
(37)

lead to

$$\begin{aligned}&{\Delta m_{B_s}^\mathrm{exp.} \over \Delta m_{B_s}^\mathrm{SM}} = 1 + {2\pi ^2 \over G_F^2 m_W^2 |V_{tb}V_{ts}^*|^2 \eta _B S_0(x_t) m_{Z^\prime }^2} \nonumber \\&\quad \times \left[ \eta _1 (g_{sb}^{L})^2 + \eta _1(g_{sb}^R)^2 - \eta _5{B_5(m_b)\over B_1(m_b)} \left( {m_{B_s}\over m_b + m_s} \right) ^2 g_{sb}^{L}g_{sb}^R\right] , \end{aligned}$$
(38)

where \(\eta _{1,5}\) account for the evolution of the Wilson coefficients from the scale \(\mu = m_{Z^\prime }\) down to \(\mu =m_b\), which we evaluate using the two-loop QCD anomalous dimensions to find [53–56]

$$\begin{aligned}&\eta _1= 0.79 [0.80], \quad \eta _5= 0.89 [0.90]\quad \mathrm {for}\; m_{Z^\prime }=1\ \mathrm{TeV},\nonumber \\&\eta _1= 0.77 [0.78], \quad \eta _5= 0.88 [0.89]\quad \mathrm {for}\; m_{Z^\prime }=2\ \mathrm{TeV},\nonumber \\ \end{aligned}$$
(39)

where in the square brackets we quote the values obtained to leading order in QCD. The hadronic quantities entering Eq. (38) have been computed by means of numerical simulations of QCD on the lattice in Ref. [57] and read

$$\begin{aligned}&f_{B_s}= 228(8)\ \mathrm{MeV}, \quad B_1^{\overline{\mathrm{MS}}}(m_b) = 0.86(3),\nonumber \\&B_5^{\overline{\mathrm{MS}}}(m_b) = 1.57(11)\,. \end{aligned}$$
(40)

Since we consider here only the scenarios in which either \(g_{sb}^{L}\ne 0\), \(g_{sb}^{R}=0\), or \(g_{sb}^{R}\ne 0\), \(g_{sb}^{L}=0\), the last term in Eq. (38) will always be zero for us. Therefore, keeping in mind that \(({\Delta m_{B_s}^\mathrm{exp.}/\Delta m_{B_s}^\mathrm{SM}})= 1.02(10)\), and by using the above ingredients we find, to \(2\sigma \) accuracy,

$$\begin{aligned} {\vert g_{sb}^{L(R)}\vert \over m_{Z^\prime }}\le 1.6(8) \times 10^{-3}\ \mathrm{TeV}^{-1}\ . \end{aligned}$$
(41)
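As a rough numerical illustration of how Eq. (38) turns an allowed shift in \(\Delta m_{B_s}\) into a bound on the quark coupling, the following Python sketch inverts the single-chirality version of that expression (last term in the brackets set to zero). The values of \(|V_{tb}V_{ts}^*|\), \(\eta _B\) and \(S_0(x_t)\) are not quoted in this section and are placeholder assumptions, as are the function and constant names. The output is therefore only indicative and is not meant to reproduce Eq. (41).

```python
import math

G_F = 1.1663787e-5   # Fermi constant, GeV^-2
M_W = 80.379         # W boson mass, GeV
VTB_VTS = 0.040      # ASSUMPTION: magnitude of V_tb V_ts^*, placeholder value
ETA_B = 0.55         # ASSUMPTION: short-distance QCD factor, placeholder value
S0_XT = 2.3          # ASSUMPTION: Inami-Lim function S_0(x_t), placeholder value


def max_gsb_over_mzp(delta_max, eta1):
    """Largest |g_sb|/m_Z' (in TeV^-1) allowed by Eq. (38) with one chirality.

    delta_max : maximal allowed value of (Delta m_exp / Delta m_SM) - 1
    eta1      : QCD running factor eta_1 of Eq. (39)
    """
    # Coefficient multiplying eta1 * (g_sb / m_Z')^2 in Eq. (38), in GeV^2.
    k = 2.0 * math.pi**2 / (G_F**2 * M_W**2 * VTB_VTS**2 * ETA_B * S0_XT)
    g_over_m_gev = math.sqrt(delta_max / (k * eta1))  # GeV^-1
    return 1.0e3 * g_over_m_gev                       # convert to TeV^-1


if __name__ == "__main__":
    # Example call: shift of 0.22 (central value 0.02 plus two standard
    # deviations of 0.10), with eta_1 for m_Z' = 1 TeV from Eq. (39).
    print(max_gsb_over_mzp(0.22, 0.79))
```

Because the new-physics term is quadratic in \(g_{sb}^{L(R)}/m_{Z^\prime }\), the bound scales with the square root of the allowed shift and only mildly with \(\eta _1\).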

Another coupling needed in Eq. (34) is \(g_{\mu \tau }^L\) which can be extracted from the deviation of the measured \(\mathcal{B}(\tau \rightarrow \mu \bar{\nu }_\mu \nu _\tau )_\mathrm{exp.}= 17.33(5)\,\%\) [52] with respect to its Standard Model prediction \(\mathcal{B}(\tau \rightarrow \mu \bar{\nu }_\mu \nu _\tau )_\mathrm{theo.}^\mathrm{SM}= 17.29(3)\,\%\), namely [39, 58] (Footnote 8):

$$\begin{aligned} \delta \mathcal{B}_{\tau \mu }&=\mathcal{B}(\tau \rightarrow \mu \bar{\nu }_\mu \nu _\tau )_\mathrm{exp.}-\mathcal{B}(\tau \rightarrow \mu \bar{\nu }_\mu \nu _\tau )_\mathrm{theo.}^\mathrm{SM} \nonumber \\&=-{m_\tau ^5\over 1536 \pi ^3 \Gamma _\tau m_{Z^\prime }^2} {8 G_F\over \sqrt{2}} \left( g_{\mu \tau }^L\right) ^2 +{\mathcal O}(1/m_{Z^\prime }^4). \end{aligned}$$
(43)

Finally, the last coupling needed in Eq. (34) is \(g_{\mu \tau }^R\), which can be bounded from \(\mathcal{B}(\tau \rightarrow \mu \mu \mu )_\mathrm{exp.} < 2.1\times 10^{-8}\) [52], by using the expression [39]

$$\begin{aligned} \mathcal{B}(\tau \rightarrow 3 \mu ) = {m_\tau ^5\over 1536 \pi ^3 \Gamma _\tau m_{Z^\prime }^4} \left( g_{\mu \mu }^L\right) ^2 \left[ 2 \left( g_{\mu \tau }^L\right) ^2 + \left( g_{\mu \tau }^R\right) ^2 \right] .\nonumber \\ \end{aligned}$$
(44)
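The way the \(\tau \rightarrow 3\mu \) expression above constrains \(g_{\mu \tau }^R\) can be sketched numerically as follows. This is a minimal Python illustration, not the procedure actually used for the results quoted in this paper: the function names are hypothetical, the \(\tau \) mass and width are standard reference values inserted by hand (assumptions with respect to this text), and the cap \(|g_{\mu \tau }^R|\le 1\) mirrors the perturbativity requirement discussed in the text.

```python
import math

M_TAU = 1.77686          # tau mass in GeV (reference value, ASSUMPTION here)
GAMMA_TAU = 2.267e-12    # tau width in GeV, from a lifetime of about 290.3 fs (ASSUMPTION)
BR_TAU_3MU_MAX = 2.1e-8  # experimental upper bound quoted in the text


def br_tau_3mu(g_mumu_l, g_mutau_l, g_mutau_r, m_zp_gev):
    """Branching fraction of tau -> 3 mu from the tree-level Z' expression."""
    pref = M_TAU**5 / (1536.0 * math.pi**3 * GAMMA_TAU * m_zp_gev**4)
    return pref * g_mumu_l**2 * (2.0 * g_mutau_l**2 + g_mutau_r**2)


def max_g_mutau_r(g_mumu_l, g_mutau_l, m_zp_gev, cap=1.0):
    """Largest |g_mutau^R| that saturates the bound, capped by perturbativity."""
    if g_mumu_l == 0.0:
        return cap  # the decay rate vanishes, only the cap applies
    pref = M_TAU**5 / (1536.0 * math.pi**3 * GAMMA_TAU * m_zp_gev**4)
    g_r_sq = BR_TAU_3MU_MAX / (pref * g_mumu_l**2) - 2.0 * g_mutau_l**2
    if g_r_sq <= 0.0:
        return 0.0  # the left-handed coupling alone already saturates the bound
    return min(math.sqrt(g_r_sq), cap)


if __name__ == "__main__":
    # Illustrative inputs only: g_mumu^L = 0.1, g_mutau^L = 0, m_Z' = 1 TeV.
    print(max_g_mutau_r(0.1, 0.0, 1000.0))
```

For very small \(g_{\mu \mu }^L\) the saturating value of \(g_{\mu \tau }^R\) grows without limit, which is why the cap is needed in the sketch.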

Besides \(g_{\mu \tau }^L\), which we discussed above, we need the value of \(g_{\mu \mu }^L\), which can be obtained from a fit to the \(b\rightarrow s\mu \mu \) data. To that end we consider two scenarios: the one in which the new physics contribution to the lepton flavor conserving channel comes entirely from \(g_{sb}^L\), i.e. \(C_{9}^{\mu \mu } = -C_{10}^{\mu \mu }\), and the case in which the coupling to quarks is entirely right-handed, \(g_{sb}^R\), and the Wilson coefficients satisfy \(C_{9}^{\prime \mu \mu } = -C_{10}^{\prime \mu \mu }\). Concerning the value of \(C_9^{(\prime )}\) we can derive it as in Ref. [28], by relying on the safest quantities as far as hadronic uncertainties are concerned, which to \(2\sigma \) accuracy results in

$$\begin{aligned}&C_9^{\mu \mu }\in [-0.52,-0.19],\qquad C_9^{\prime \ \mu \mu } \in [-0.41,-0.08], \end{aligned}$$
(45)

and this makes \(R_K\) consistent with experiment (Footnote 9).

The value of \(g_{\mu \mu }^L\) obtained in this way is then used to get \(g_{\mu \tau }^R\) by means of Eq. (44). Notice, however, that for very small values of \(g_{\mu \mu }^L\) the value of \(g_{\mu \tau }^R\) can be excessively large if we require the saturation of the experimental bound. In those cases we invoke the perturbativity requirement and set the bound to \(|g_{\mu \tau }^R| \le 1\). With all the above ingredients in hand we can compute \(C_{9,10}^{\mu \tau (\prime )}\) by means of Eq. (34), and then use the obtained values to predict the upper bounds for the rates of the decay modes we discuss here. In Fig. 3 we show such bounds for both scenarios: (i) in Scenario I we use \(C_9^{\mu \mu }= - C_{10}^{\mu \mu }\) to determine \(g_{\mu \mu }^L\), while (ii) in Scenario II we use the condition \(C_9^{\prime \ \mu \mu }= - C_{10}^{\prime \ \mu \mu }\). The resulting bounds satisfy the hierarchy noted in Eq. (25). We focus on the values of \(C_9^{(\prime ) \mu \mu }= - C_{10}^{(\prime ) \mu \mu } \ne 0\) (and \(C_9^{(\prime ) ee}=0\)), which give \(R_K\) consistent with the one measured at LHCb. That range of values corresponds to the shaded regions in the plots in Fig. 3. The resulting bounds are:

$$\begin{array}{l|cc} \mathrm{Scenario} & \mathrm{I} & \mathrm{II} \\ \hline \mathcal{B}(B \rightarrow K^*\mu \tau ) \le & 1.6 \times 10^{-8} & 9.3 \times 10^{-8} \\ \mathcal{B}(B \rightarrow K \mu \tau ) \le & 0.9 \times 10^{-8} & 5.2 \times 10^{-8} \\ \mathcal{B}(B_s \rightarrow \mu \tau ) \le & 0.8 \times 10^{-8} & 4.6 \times 10^{-8} \end{array}$$
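The step from the couplings to the Wilson coefficients of Eq. (34) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the value of \(\alpha \) and the magnitude used for \(V_{tb}V_{ts}^*\) are placeholders not fixed in this section, and the sign and phase of the CKM factor are ignored.

```python
import math

G_F = 1.1663787e-5        # Fermi constant, GeV^-2
ALPHA_EM = 1.0 / 137.036  # ASSUMPTION: alpha at zero momentum; the scale is not fixed in this section
VTB_VTS = 0.040           # ASSUMPTION: magnitude of V_tb V_ts^*, sign and phase ignored


def wilson_mu_tau(g_sb, g_mutau_l, g_mutau_r, m_zp_gev):
    """Return (C9, C10) of Eq. (34) for the mu-tau channel.

    Passing g_sb^L gives the unprimed coefficients, passing g_sb^R
    gives the primed ones; m_zp_gev is the Z' mass in GeV.
    """
    prefactor = -math.pi / (math.sqrt(2.0) * m_zp_gev**2 * ALPHA_EM * G_F * VTB_VTS)
    c9 = prefactor * g_sb * (g_mutau_r + g_mutau_l)
    c10 = prefactor * g_sb * (g_mutau_r - g_mutau_l)
    return c9, c10


if __name__ == "__main__":
    # Illustrative inputs only: purely left-handed lepton coupling, m_Z' = 1 TeV.
    print(wilson_mu_tau(1.0e-3, 0.1, 0.0, 1000.0))
```

With a purely left-handed lepton coupling the sketch returns \(C_9=-C_{10}\), and both coefficients fall as \(1/m_{Z^\prime }^2\) when the couplings are held fixed.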

Fig. 3

Upper bounds on the branching fractions \(\mathcal {B}(B\rightarrow K^*\mu \tau ) > \mathcal {B}(B\rightarrow K \mu \tau )>\mathcal {B}(B_s\rightarrow \mu \tau )\), as a function of the BSM contribution to \(C_{10}^{(\prime )\mu \mu }\) extracted from the LFC decay modes in two setups: in Scenario I we use \(C_{10}^{\mu \mu }=-C_{9}^{\mu \mu }\) while keeping \(C_{9,10}^{\prime \ \mu \mu }=0\), and in Scenario II we take \(C_{10}^{\prime \ \mu \mu }=-C_{9}^{\prime \ \mu \mu }\) with \(C_{9,10}^{\mu \mu }=0\). Shaded regions correspond to the values given in Eq. (45), obtained by combining \(\mathcal{B}(B_s\rightarrow \mu \mu )\) with the high \(q^2\) bin of \(d\mathcal{B}(B\rightarrow K \mu \mu )/dq^2\), which result in \(R_K\) consistent with experiment, cf. Ref. [28]. See text for discussion concerning the couplings \(g_{sb,\mu \tau }^{L(R)}\)

We stress once again that the above bounds are obtained by assuming that the BSM physics effects arise in scenarios with either \(C_9=-C_{10}\) or \(C_9^\prime =-C_{10}^\prime \), in other words with either \(g_{sb}^R=0\) or \(g_{sb}^L=0\). If no such assumption about the BSM physics were made, and both \(g_{sb}^{L}\) and \(g_{sb}^{R}\) were left free, then the third term in the brackets of Eq. (38) would play an important role and the resulting bounds on the above decay modes would be weaker.

5 Summary

In the present paper we discussed the possibility of observing the LFV modes in exclusive decays based on \(b\rightarrow s\ell _1^\pm \ell _2^\mp \). Starting from the low energy effective hamiltonian, we derived the expressions for decay rates for \(B_s\rightarrow \ell _1\ell _2\), \(B\rightarrow K \ell _1\ell _2\), \(B\rightarrow K^*(\rightarrow K\pi ) \ell _1\ell _2\), and similar modes. We showed that extra contributions proportional to the difference between lepton masses arise in the case of LFV modes, thus requiring particular care when trying to average the (lepton) charge-conjugated modes. We then examined the situation in which the LFV is generated by the (pseudo-)scalar operators, to distinguish it from the one in which the LFV comes from the coupling to (axial-)vector operators. In the former case we find that most of the events would occur at larger values of \(q^2\), while in the latter case the events are expected to be equidistributed over a large window of \(q^2\). Furthermore, we find that the hierarchy of the branching fractions of our modes changes: while in the case of coupling to the (axial-)vector operators we find \(\mathcal {B}(B_s\rightarrow \ell _1\ell _2)<\mathcal {B}(B\rightarrow K \ell _1\ell _2)<\mathcal {B}(B\rightarrow K^*\ell _1\ell _2)\), in the case of coupling to the (pseudo-)scalar operators we get \(\mathcal {B}(B_s\rightarrow \ell _1\ell _2)>\mathcal {B}(B\rightarrow K \ell _1\ell _2)>\mathcal {B}(B\rightarrow K^*\ell _1\ell _2)\). To illustrate both cases we first used a phenomenological Lagrangian that encodes \(\mathcal{B}(h\rightarrow \mu \tau )\ne 0\), as recently suggested by CMS, to derive bounds that appear too low for these decay modes to be probed experimentally. In the second case we revisited a \(Z^\prime \) model in which (small) tree-level flavor changing neutral couplings are allowed. After a short discussion of the specific scenarios and of the channels used to bound the relevant LFV couplings, we derived bounds which generically suggest that the branching fractions of all the modes we consider are less than a few times \(10^{-8}\), and are thus more likely to be probed experimentally.

As a by-product of our analysis we revisited the computation of the angular distribution of \(B\rightarrow K^*(\rightarrow K\pi ) \ell _1\ell _2\), which is often a source of confusion in the lepton flavor conserving case, due to incomplete information given in most of the papers on the subject. We were able to confirm the results of Ref. [44], where full and unambiguous information was provided, by an independent explicit calculation.