1 Introduction

The radiative decay \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \) has been extensively studied in the context of QCD factorization (QCDF) [1,2,3,4,5] when the energy of the photon \(E_\gamma \) is large compared to the scale of the strong interaction \(\Lambda _{\text {QCD}}\). At leading power in a simultaneous expansion in \(\Lambda _{\text {QCD}}/E_{\gamma }\) and \(\Lambda _{\text {QCD}}/m_b\), and at leading order (LO) in the strong coupling \(\alpha _s\), the relevant \(B\rightarrow \gamma \) transition form factor can be expressed in terms of only two hadronic parameters: the accurately known B meson decay constant \(f_B\), and the poorly constrained first inverse moment \(1/\lambda _B = \int _0^{\infty } d\omega \,\phi ^B_+(\omega )/\omega \) of \(\phi ^B_+(\omega )\), the leading-twist B meson light-cone distribution amplitude (LCDA). This hadronic parameter was introduced in the theoretical description of charmless hadronic decays [6] and appears in the QCD calculation of almost any other exclusive B decay to light particles. The radiative decay \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \) has been advocated as a means to determine \(\lambda _B\) from data [5]. First significant measurements can be expected from the BELLE II experiment (see [7] for the most recent BELLE result).

This strategy is difficult to implement in the hadronic B experiment LHCb, since the photon in the radiative decay cannot be easily reconstructed. In this paper we investigate whether the four-lepton decay \(B^-\rightarrow \mathrm {\ell }\bar{\nu }_{\mathrm {\ell }} \gamma ^* \rightarrow \ell \bar{\nu }_\ell \ell ^{(\prime )} \bar{\ell }^{(\prime )}\), in which the real photon is replaced by a virtual one that decays into a lepton-antilepton pair (\(\mathrm {\ell },\mathrm {\ell }'= e,\mu \)), retains sensitivity to \(\lambda _B\), and hence could provide an alternative measurement. We focus on the kinematic region where the \(\gamma ^*\), or equivalently the lepton pair, has large energy but small invariant mass \(q^2\lesssim 6\,\text{ GeV}^2\) (see footnote 1). The four-lepton decays have not been observed up to now, but the LHCb experiment [8] established an upper bound of \(\text{ Br }\,(B^+ \rightarrow \mu ^+ \bar{\nu }_{\mu } \mu ^- \mu ^+) \, < 1.6 \cdot 10^{-8} \) on the branching fraction of the muonic mode under the assumption that the smaller of the two possible \(\mu ^+ \mu ^-\) invariant masses is below 980 MeV. This bound is close to, in fact somewhat below, theoretical expectations [9, 10].

The factorization theorem for the \(B \rightarrow \gamma \) form factors in the regime where the photon is energetic, \(n_+q=2 E_\gamma \gg \Lambda _\mathrm{QCD}\), was established long ago [3, 4]. Its generalization to \(B \rightarrow \gamma ^*\) form factors is straightforward when \(q^2\) is away from light-meson resonances. The present treatment follows the strategy applied to \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \) [5] and \(B_s\rightarrow \mu ^+\mu ^-\gamma \) [11]: we compute the form factor in QCD factorization at leading power (LP) including \(\mathcal {O}(\alpha _s)\) QCD corrections, and include next-to-leading power (NLP) corrections at \(\mathcal {O}(\alpha _s^0)\). The light-meson resonance contribution is included in the same fashion as for the “type-B” contribution to \(B_s\rightarrow \mu ^+\mu ^-\gamma \) [11]. Since the four-lepton final state is produced from a virtual W boson and photon, previous calculations must be extended to \(B \rightarrow \gamma ^*\) form factors that depend on two non-vanishing virtualities. We note that a previous computation [12] of these \(B \rightarrow \gamma ^*\) form factors either includes only the resonance contribution at small \(q^2\), or employs QCD sum rules that apply only to large \(q^2\sim m_b^2\). No attempts have so far been undertaken to estimate the form factors for intermediate and small \(q^2\) with factorization methods, as done here, which apply when \(n_+q\gg \Lambda _\mathrm{QCD}\). With these kinematic restrictions the differential branching fraction of the four-lepton decay is expressed, at LP, in terms of generalized inverse moments of the B meson LCDA, which can be related to \(\lambda _B\).

We consider the case of non-identical lepton flavours, \(\ell ^\prime \not =\ell \), and identical ones, which requires additional kinematic considerations.

2 Basic definitions

Following the conventions of [5], we write the \(B^- \rightarrow \ell \bar{\nu }_\ell \ell ' \bar{\ell }'\) decay amplitude to lowest non-vanishing order in the electromagnetic coupling as

$$\begin{aligned}&\mathcal {A}\left( B^- \rightarrow \ell \bar{\nu }_\ell \ell ' \bar{\ell }'\right) \nonumber \\&\quad =\frac{G_F V_{u b}}{\sqrt{2}} \big \langle {\ell (p_\ell ) \, \bar{\nu }_\ell (p_\nu ) \, \ell ' (q_1)\, \bar{\ell }'(q_2)}\big \vert \bar{\ell } \gamma ^{\mu }\left( 1-\gamma _{5}\right) \nu _{\ell }\nonumber \\&\qquad \cdot \bar{u} \gamma _{\mu }\left( 1-\gamma _{5}\right) b \big \vert {B^{-}(p)}\big \rangle \nonumber \\&\quad = \frac{G_F V_{u b}}{\sqrt{2}}\frac{ie^2}{q^2}\,Q_{\ell '} \left[ T^{\mu \nu }(p,q)+Q_\ell f_B\, g^{\mu \nu } \right] \nonumber \\&\qquad \times \left( \bar{u}_{\ell '} \gamma _{\mu } v_{\bar{\ell '}} \right) \left( \bar{u}_{\ell } \gamma _{\nu }(1-\gamma _5) v_{\bar{\nu }} \right) , \end{aligned}$$
(2.1)

where \(q\equiv q_1+q_2\) and \(p=m_B v = q + k\), such that \(k = p_\ell + p_\nu \) is the momentum of the virtual W boson. In addition, we use the convention \(iD^\mu = i \partial ^\mu - Q_\ell e A_{\mathrm{em}}^\mu \) for the electromagnetic covariant derivative, with \(Q_\ell =-1\) for the lepton fields. The hadronic tensor

$$\begin{aligned} T^{\mu \nu }(p,q)= & {} \int d^4 x\, e^{i q \cdot x} \big \langle {0}\big \vert {{\,\mathrm{T}\,}}\Big \{j^{\mu }_{ \mathrm em}(x),\nonumber \\&\left( \bar{u}\gamma ^{\nu } \left( 1-\gamma ^5\right) b\right) (0) \Big \} \big \vert {B^-}\big \rangle , \end{aligned}$$
(2.2)

with the electromagnetic current \(j^\mu _{\mathrm{em}} = \sum _q Q_q \bar{q}\gamma ^\mu q + Q_\ell \bar{\ell }\gamma ^\mu \ell \), accounts for the emission of the virtual photon from the B meson constituents. The second term in the square brackets in (2.1) corresponds to the emission from the final-state lepton (see Fig. 2 below). It can be expressed in terms of the B meson decay constant \(\big \langle {0}\big \vert \bar{u}\gamma ^{\nu } \left( 1-\gamma ^5\right) b \big \vert {B^-(p)}\big \rangle = -i f_B p^{\nu }\) and constitutes a power correction relative to the \(T^{\mu \nu }\) term in the kinematic region of interest.

The hadronic tensor \(T^{\mu \nu }\) can be decomposed into six form factors \(F_i(q^2,k^2)\) of two kinematic invariants. Applying the Ward identity \(q_{\mu }T^{\mu \nu }= f_B p^{\nu }\) leaves four form factors and a contact term (see Appendix A for details). We write

$$\begin{aligned} T^{\mu \nu }(p,q)&= \left( g^{\mu \nu }v\cdot q {-} v^{\mu } q^ {\nu }\right) \hat{F}_{A_\perp }+i\, \epsilon ^{\mu \nu \alpha \beta }\, v_{\alpha } q_{\beta } F_{V}\nonumber \\&\quad -\hat{F}_{A_\parallel }v^\mu q^\nu + (q^\mu , k^\nu ) \text{ terms }. \end{aligned}$$
(2.3)

We neglect the lepton masses, in which case the \(q^\mu , k^\nu \) terms drop out after contracting \(T^{\mu \nu }(p,q)\) with the lepton tensor. The contact term is fixed by the Ward identity to \((f_B m_B)/(v\cdot q) v^\mu v^\nu \). This can be rewritten as \(f_B/(v\cdot q) \,v^\mu (k^\nu +q^\nu )\) and has been absorbed into \(\hat{F}_{A_\parallel }\) and the \(k^\nu \) terms in (2.3). The convention for the totally anti-symmetric tensor is \(\epsilon ^{0123}=1\). The virtual photon emission from the final-state lepton \(\ell \) in (2.1) is exactly cancelled by the redefinition

$$\begin{aligned} F_{A_\perp }=\hat{F}_{A_\perp }+ \frac{Q_\ell f_B}{v \cdot q}, \quad \quad \tilde{F}_{A_\parallel }=\hat{F}_{A_\parallel }- \frac{Q_\ell f_B}{v \cdot q}. \end{aligned}$$
(2.4)

Therefore, the term in square brackets in (2.1) can be expressed in terms of three form factors. To separate amplitudes corresponding to the different polarization states of the virtual photon, we shall use the decomposition

$$\begin{aligned} T^{\mu \nu }(p,q)+Q_\ell f_B g^{\mu \nu }= & {} F_{A_\perp }\, g^{\mu \nu }_{\perp } v\cdot q +i\, F_{V}\epsilon ^{\mu \nu \alpha \beta }\, v_{\alpha } q_{\beta }\nonumber \\&- F_{A_\parallel }\,v^{\mu } q^{\nu }, \end{aligned}$$
(2.5)

which implies

$$\begin{aligned} F_{A_\parallel }= \tilde{F}_{A_{\parallel }}- \frac{2q^2(m_B^2-q^2+k^2)}{\lambda }F_{A_\perp }\,. \end{aligned}$$
(2.6)

Here \(\lambda \equiv \lambda (m_B^2,q^2,k^2) = m_B^4 - 2m_B^2 (k^2 + q^2) + (k^2-q^2)^2\) is the Källén function. The form factor \(F_{A_\parallel }\) arises from a longitudinally polarized virtual photon and vanishes in the real-photon limit \(q^2 \rightarrow 0\). Without loss of generality we choose the three-momentum \(\vec {q}\) to point in the positive z direction, such that its decomposition into light-cone vectors \(n_\pm ^\mu \) reads

$$\begin{aligned} q^\mu = n_+q\,\frac{n_-^\mu }{2} + n_-q\,\frac{n_+^\mu }{2} \ , \end{aligned}$$
(2.7)

with \(n_\pm ^\mu = (1,0,0,\mp 1)\) and \(q^2 = n_+q\,n_-q\). The transverse metric tensor is then \(g^{\mu \nu }_{\perp }=g^{\mu \nu }- (n_+^{\mu }n_-^{\nu }+n_+^{\nu }n_-^{\mu })/2\). The large component \(n_+q\) of \(q^\mu \) is related to the invariant masses \(q^2\) and \(k^2\) via

$$\begin{aligned} n_+q= \frac{m_B^2 - k^2 + q^2 + \sqrt{\lambda }}{2m_B} \ . \end{aligned}$$
(2.8)
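The kinematic relations above are easy to check numerically. The following is a minimal sketch, not part of the original analysis: the function names and the sample values (in GeV units) are illustrative assumptions. It evaluates the Källén function, the large component \(n_+q\) from (2.8), and the small component from \(q^2 = n_+q\,n_-q\).

```python
import math


def kallen(a, b, c):
    """Kallen function lambda(a, b, c) = a^2 + b^2 + c^2 - 2(ab + ac + bc)."""
    return a * a + b * b + c * c - 2.0 * (a * b + a * c + b * c)


def light_cone_components(m_B, q2, k2):
    """Return (n_+q, n_-q) using Eq. (2.8) and q^2 = n_+q * n_-q."""
    lam = kallen(m_B**2, q2, k2)
    if lam < 0.0:
        raise ValueError("outside the physical region: lambda < 0")
    n_plus_q = (m_B**2 - k2 + q2 + math.sqrt(lam)) / (2.0 * m_B)
    n_minus_q = q2 / n_plus_q
    return n_plus_q, n_minus_q


# Illustrative point only (placeholder numbers, GeV units).
n_plus_q, n_minus_q = light_cone_components(m_B=5.28, q2=1.0, k2=4.0)
two_v_dot_q = n_plus_q + n_minus_q  # equals (m_B^2 + q^2 - k^2) / m_B
```

The sum \(n_+q + n_-q = 2 v\cdot q\) provides a simple consistency check, since \(2 m_B\, v\cdot q = m_B^2 + q^2 - k^2\) follows from \(p = q + k\).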

Finally, we define the left- and right-helicity form factors

$$\begin{aligned} F_{L} = \frac{1}{2} \left( F_{V}+ F_{A_\perp }\right) , \quad \quad F_{R}= \frac{1}{2} \left( F_{V}- F_{A_\perp }\right) . \end{aligned}$$
(2.9)

Since helicity is conserved in high-energy QCD processes, \(F_R\) is power-suppressed relative to \(F_L\) in the heavy-quark / large \(n_+q\) limit.
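As a bookkeeping aid, the changes of basis in (2.6) and (2.9) can be sketched as follows. The helper names are hypothetical and the form-factor values are supplied by the caller; nothing here goes beyond the two equations quoted.

```python
def helicity_from_va(F_V, F_A_perp):
    """Eq. (2.9): return (F_L, F_R) from the vector and axial form factors."""
    return 0.5 * (F_V + F_A_perp), 0.5 * (F_V - F_A_perp)


def va_from_helicity(F_L, F_R):
    """Inverse of Eq. (2.9): return (F_V, F_A_perp)."""
    return F_L + F_R, F_L - F_R


def f_a_parallel(F_A_par_tilde, F_A_perp, m_B, q2, k2):
    """Eq. (2.6): longitudinal form factor in the decomposition (2.5)."""
    lam = m_B**4 - 2.0 * m_B**2 * (k2 + q2) + (k2 - q2) ** 2
    return F_A_par_tilde - 2.0 * q2 * (m_B**2 - q2 + k2) / lam * F_A_perp
```

Form factors may be complex, so the functions avoid any operation that assumes real input. At \(q^2 = 0\) the second term in `f_a_parallel` vanishes and \(F_{A_\parallel }\) coincides with \(\tilde{F}_{A_\parallel }\).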

For non-identical lepton flavours \(\ell \ne \ell '\), the differential decay width can be obtained analytically by a straightforward calculation. The full angular distribution, i.e. the fivefold differential rate, is given in Appendix B. Here we quote the double differential rate in the invariant masses \(q^2\) and \(k^2\), for which we obtain the simple expression

$$\begin{aligned} \frac{d^2 \mathrm{Br}\left( B^{-} \rightarrow \ell \, \bar{\nu }_\ell \, \ell ' \bar{\ell '}\right) }{dq^2\,dk^2}&=\frac{\tau _BG_F^2|V_{ub}|^2\alpha _{\mathrm{em}}^2}{2^{8}3^2\pi ^3m_B^5} \frac{\sqrt{\lambda }}{q^2} \sqrt{1-\frac{4m_{\mathrm {\ell }'}^2}{q^2}}\nonumber \\&\quad \times \left( 1-\frac{m_{\mathrm {\ell }}^2}{k^2}\right) \left( 8 k^2\left( m_B^2+q^2-k^2\right) ^2 \left| F_{A_\perp }\right| ^2\right. \nonumber \\&\quad \left. +8 k^2\lambda \left| F_{V}\right| ^2 + \frac{\lambda ^2}{q^2} \,|F_{A_{\parallel }}|^2 \right) , \end{aligned}$$
(2.10)

keeping the lepton masses \(m_{\ell ^{(\prime )}}\) in the phase space integration (implying \(q^2 > 4 m_{\mathrm {\ell }'}^2\)), which is relevant for muons.
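A direct transcription of (2.10) into code may be useful for numerical studies. The sketch below is hedged: the function name and argument names are our own, all inputs (including the form factors) must be supplied by the caller, and it assumes natural units in which \(\tau _B\) is given in GeV\(^{-1}\). The thresholds \(q^2 > 4m_{\ell '}^2\) and \(k^2 > m_\ell ^2\) are enforced by returning zero outside the allowed region.

```python
import math


def d2br_dq2_dk2(q2, k2, F_A_perp, F_V, F_A_par, *,
                 m_B, tau_B, G_F, V_ub, alpha_em, m_l=0.0, m_lp=0.0):
    """Double differential branching fraction of Eq. (2.10).

    Assumes natural units (tau_B in GeV^-1); form factors may be complex.
    """
    lam = m_B**4 - 2.0 * m_B**2 * (k2 + q2) + (k2 - q2) ** 2
    if q2 <= 4.0 * m_lp**2 or k2 <= m_l**2 or lam <= 0.0:
        return 0.0  # outside the kinematically allowed region
    prefactor = (tau_B * G_F**2 * abs(V_ub) ** 2 * alpha_em**2
                 / (2**8 * 3**2 * math.pi**3 * m_B**5))
    phase_space = (math.sqrt(lam) / q2
                   * math.sqrt(1.0 - 4.0 * m_lp**2 / q2)
                   * (1.0 - m_l**2 / k2))
    dynamics = (8.0 * k2 * (m_B**2 + q2 - k2) ** 2 * abs(F_A_perp) ** 2
                + 8.0 * k2 * lam * abs(F_V) ** 2
                + lam**2 / q2 * abs(F_A_par) ** 2)
    return prefactor * phase_space * dynamics
```

Keeping the physical constants as keyword arguments avoids hard-coding input values, which in the paper are taken from Table 1.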

The case of identical lepton flavours \(\ell ^\prime =\ell \) is more complicated, as an additional contribution from the interchange of the two final-state leptons arises. To clarify this point, let us define \(B^-\rightarrow \ell ^-(p_1) \ell ^+(p_2)\ell ^-(p_3) \bar{\nu }_\ell (p_\nu )\), with \(q^2=(p_1+p_2)^2\) and \(\tilde{q}^2=(p_2+p_3)^2\). At the amplitude level, we have \(\mathcal {M}_\mathrm{tot} = \mathcal {M}_a - \mathcal {M}_b\) where \(\mathcal {M}_b = \mathcal {M}_a(p_1 \rightarrow p_3, p_3 \rightarrow p_1)\). For the decay rate, this results in an additional interference term between \(\mathcal {M}_a\) and \(\mathcal {M}_b\), while the rates \(\Gamma _{a,b} \propto |\mathcal {M}_{a,b}|^2\) from the squares of the individual diagrams are equal (as depicted in Fig. 1). Since \(\Gamma _a + \Gamma _b\) is equal to the rate for non-identical lepton flavours, we find

$$\begin{aligned} \mathrm{Br}\left( B^{-} \rightarrow \ell \, \bar{\nu }_\ell \, \ell \bar{\ell }\right)&= \mathrm{Br}\left( B^{-} \rightarrow \ell \, \bar{\nu }_\ell \, \ell ' \bar{\ell '}\right) \nonumber \\&\quad +\mathrm{Br}_\mathrm{int}\left( B^{-} \rightarrow \ell \, \bar{\nu }_\ell \, \ell \bar{\ell }\right) . \end{aligned}$$
(2.11)

The interference term \(d^2 \mathrm{Br}_\mathrm{int}\left( B^{-} \rightarrow \ell \, \bar{\nu }_\ell \, \ell \bar{\ell }\right) /(dq^2\,dk^2)\) can only be obtained numerically.

Fig. 1 Graphical representation of the squared amplitude for the non-identical-lepton final-state (left) and of the interference term for the case of identical lepton flavours (right)

3 Calculation of the form factors

The amplitude can be factorized through an expansion in \(\Lambda _\mathrm{QCD}/m_b\) and \(\Lambda _\mathrm{QCD}/n_+q\), if the quark propagator that connects the electromagnetic and the weak current is far off-shell. This happens for very large \(q^2\) of order \(m_B^2\), in which case the amplitude can be reduced to a hard matching coefficient times the B meson decay constant defined as a local matrix element in heavy-quark effective theory (HQET). The decay rate for such large \(q^2\) is highly suppressed. The situation is more interesting when \(q^2 \ll m_B^2\), but \(q^\mu \) still has a large component \(n_+q\sim \mathcal {O}(m_B)\), while \(n_-q\sim \mathcal {O}(\Lambda _{\mathrm{QCD}})\), or even smaller. In this case the intermediate quark propagator becomes hard-collinear and the \(\gamma ^*\) probes the light-cone structure of the B meson. A factorization formula, which expresses the form factors as a convolution of the B meson LCDA with a perturbative scattering kernel, can then be derived for the LP contribution using soft-collinear effective theory [13,14,15,16] by matching QCD \(\rightarrow \) SCET \(_{\mathrm{I}}\) \(\rightarrow \) HQET. Since the derivation is very similar to the one for \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \) and \(B_s\rightarrow \mu ^+\mu ^-\gamma \) decays [5, 11], we only sketch the main steps in the following.

Upon integrating out the hard scales \(m_b, n_+q\), the flavour-changing weak current is represented in SCET\(_\mathrm{I}\) by

$$\begin{aligned} \bar{q} \gamma ^\mu (1- \gamma _5) b&\rightarrow C_V^{(A0)} \, [\bar{q}_\mathrm{hc} \gamma _\perp ^\mu (1-\gamma _5) h_v] \nonumber \\&\quad + \left( C_4 n_-^\mu + C_5 v^\mu \right) [\bar{q}_\mathrm{hc} (1+\gamma _5) h_v], \end{aligned}$$
(3.1)

with hard matching coefficients \(C_i = C_i(n_+q; \mu )\). Here \(q_\mathrm{hc} = W^\dagger \xi _\mathrm{hc}\) is the hard-collinear quark field in SCET, multiplied with a hard-collinear Wilson line to ensure SCET collinear gauge-invariance. Fields without arguments live at space-time point \(x = 0\). At LP, the index \(\mu \) is transverse, since the LP SCET\(_\mathrm{I}\) electromagnetic current \(j^\mu _{q, \mathrm{SCET}_\mathrm{I}}(x)\) [3] contains only the transverse polarization of the virtual photon. We therefore only need \(C_V^{(A0)}\), which to \(\mathcal {O}(\alpha _s)\) reads [13]

$$\begin{aligned} C_V^{(A0)}(n_+q; \mu )= & {} 1 + \frac{\alpha _sC_F}{4\pi } \bigg ( - 2 \ln ^2 \frac{m_B z}{\mu } + 5 \ln \frac{m_B z}{\mu } \nonumber \\&- \frac{3 - 2z}{1-z} \ln z -2 \,\mathrm{Li}_2(1-z) - \frac{\pi ^2}{12} - 6 \bigg ),\nonumber \\ \end{aligned}$$
(3.2)

with \(\alpha _s\equiv \alpha _s(\mu )\) in the \(\overline{\mathrm{MS}}\) scheme, and \(z = n_+q/m_B = 1 - k^2/m_B^2 + \mathcal {O}(\Lambda _\mathrm{QCD}/m_B)\). The hadronic tensor is then expressed as

$$\begin{aligned} T^{\mu \nu }(p,q) = 2 C_V^{(A0)} \, \mathcal{T}^{\mu \nu }(q) \end{aligned}$$
(3.3)

in terms of the matching coefficient and the SCET\(_\mathrm{I}\) correlation function

$$\begin{aligned} \mathcal{T}^{\mu \nu }(q)&= \int d^4 x \, e^{i q x} \big \langle {0}\big \vert T \Big \{ j^\mu _{q, \mathrm{SCET}_\mathrm{I}}(x),\nonumber \\ {}&\qquad [\bar{q}_\mathrm{hc} \gamma _\perp ^\nu P_L h_v](0) \Big \} \big \vert {B^-_v}\big \rangle . \end{aligned}$$
(3.4)

A discussion of the precise power counting of the individual terms in \(j^\mu _{q, \mathrm{SCET}_\mathrm{I}}\) and the possibility of extrapolating the above expressions to the large \(q^2\) region with tree-level accuracy can be found in [11].
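For orientation, the hard matching coefficient (3.2) can be evaluated with a few lines of code. This is a sketch under stated assumptions: the function names are ours, \(C_F = 4/3\) is the standard SU(3) value, the dilogarithm is implemented by its power series together with the Euler reflection formula, and the formula is only evaluated for \(0< z < 1\), where it is manifestly real.

```python
import math


def li2(x):
    """Real dilogarithm Li2(x) for 0 <= x <= 1."""
    if x < 0.0 or x > 1.0:
        raise ValueError("li2 is implemented for 0 <= x <= 1 only")
    if x == 1.0:
        return math.pi**2 / 6.0
    if x > 0.5:
        # Euler reflection: Li2(x) + Li2(1 - x) = pi^2/6 - ln(x) ln(1 - x)
        return math.pi**2 / 6.0 - math.log(x) * math.log(1.0 - x) - li2(1.0 - x)
    total, power = 0.0, 1.0
    for n in range(1, 200):
        power *= x                  # x^n
        total += power / (n * n)    # sum of x^n / n^2
    return total


def c_v_a0(n_plus_q, m_B, mu, alpha_s):
    """Hard matching coefficient C_V^(A0) of Eq. (3.2) to O(alpha_s)."""
    C_F = 4.0 / 3.0
    z = n_plus_q / m_B
    if not 0.0 < z < 1.0:
        raise ValueError("sketch assumes 0 < z = n_+q / m_B < 1")
    L = math.log(m_B * z / mu)
    bracket = (-2.0 * L**2 + 5.0 * L
               - (3.0 - 2.0 * z) / (1.0 - z) * math.log(z)
               - 2.0 * li2(1.0 - z)
               - math.pi**2 / 12.0 - 6.0)
    return 1.0 + alpha_s * C_F / (4.0 * math.pi) * bracket
```

The value of \(\alpha _s(\mu )\) is an input here; running it to the scale \(\mu \) is outside the scope of this sketch.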

The SCET\(_\mathrm{I}\) correlation function is then matched at LP to HQET. This results in

$$\begin{aligned} \mathcal{T}^{\mu \nu }(q)&= \left( g_\perp ^{\mu \nu } + \frac{i}{2} \epsilon ^{\mu \nu \rho \sigma } n_{+ \rho } n_{- \sigma }\right) \frac{Q_u F_B m_B}{4} \nonumber \\&\quad \times \int _0^\infty d \omega \, \phi ^B_+(\omega ) \, \frac{J(n_+q, q^2, \omega ; \mu )}{\omega - n_-q- i 0^+}, \end{aligned}$$
(3.5)

where \(n_-q= q^2/n_+q\). The hard-collinear matching function [17]

$$\begin{aligned}&J(n_+q, q^2,\omega ;\mu ) = 1 + \frac{\alpha _s C_F}{4\pi } \,\Bigg \{ \ln ^2 \frac{\mu ^2}{n_+q\,(\omega - n_-q)} - \frac{\pi ^2}{6} -1 \nonumber \\&\quad - \frac{n_-q}{\omega } \ln \frac{n_-q-\omega }{n_-q} \left[ \ln \frac{\mu ^2}{-q^2} + \ln \frac{\mu ^2}{n_+q\,(\omega - n_-q)} +3 \right] \Bigg \} \end{aligned}$$
(3.6)

is convoluted with the leading-twist B meson LCDA \(\phi ^B_+(\omega )\) defined through

(3.7)

which contains the scale-dependent HQET B meson decay constant \(F_B = F_B(\mu )\). Similar to the denominator in (3.5), \(n_-q\) in (3.6) must be understood as \(n_-q+ i 0^+\).

As a consequence of helicity conservation of QCD in the high-energy limit, the Lorentz structure in (3.5) gives a non-vanishing contribution only to the left-helicity form factor \(F_L\), which can be expressed as

$$\begin{aligned} F_L^{\mathrm{LP}}&= C_V^{(A0)}(\mu ) \, \frac{Q_u F_B(\mu ) m_B}{n_+q} \, \int _0^\infty d \omega \, \phi ^B_+(\omega ; \mu )\nonumber \\&\quad \times \frac{J(n_+q, q^2, \omega ; \mu )}{\omega - n_-q- i 0^+}. \end{aligned}$$
(3.8)

The form factors \(F_R\) and \(F_{A_\parallel }\) vanish at leading power, \(F_R^{\mathrm{LP}} = F_{A_\parallel }^{\mathrm{LP}} = 0\). The \( - i 0^+\) prescription in (3.8) generates a rescattering phase of the form factor \(F_L\) for \(q^2>0\).

Fig. 2 Photon emissions that contribute to the tree-level \(B^- \rightarrow \ell \bar{\nu }_\ell \ell ' \bar{\ell }'\) amplitude. The emission from the spectator quark (a) contributes at leading power, whereas the emissions from the heavy quark (b) and the lepton (c) are power suppressed

A factorization formula for NLP corrections is presently not known. Following [5, 11] we infer the leading NLP \(\mathcal {O}(\alpha _s^0)\) contributions by a diagrammatic analysis of the tree diagrams of Fig. 2. In the hard-collinear region, diagrams (b) and (c) vanish at LP. Their NLP contribution can be expressed in terms of \(f_B\). In diagram (a), which gives the \(\mathcal {O}(\alpha _s^0)\) term in the above LP factorization result, we now expand the light-quark propagator to NLP, and obtain

(3.9)

where l denotes the spectator-quark momentum of order \(\Lambda _\mathrm{QCD}\) and terms suppressed by two powers of \(\Lambda _\mathrm{QCD}/\{n_+q,m_b\}\) are neglected. This expression reduces to the one [5] for real photons, when \(n_-q= 0\) and \(n_+q= 2 E_\gamma \). Proceeding as in [5], we find

$$\begin{aligned} F_L^{\mathrm{NLP}}&= \xi (q^2,v\cdot q) + \frac{Q_{\ell }f_B}{2v \cdot q}\,, \end{aligned}$$
(3.10)
$$\begin{aligned} F_R^{\mathrm{NLP}}&= \frac{F_B}{n_+q}\frac{m_B Q_u}{n_+q} \left( 1+ \frac{n_-q}{\lambda _B^+(n_-q)}\right) - \frac{F_B m_B Q_b}{q^2-2 m_b v \cdot q}\nonumber \\&\quad - \frac{Q_{\mathrm {\ell }}f_B}{2v \cdot q} \end{aligned}$$
(3.11)

for the \(\mathcal {O}(\alpha _s^0)\) NLP terms of the form factors. These expressions are written in a form such that the complete expression for \(F_{L,R}\) including \(F_L^{\mathrm{LP}}\) is valid for both hard-collinear and hard \(q^2\) (see footnote 2). Setting \(2v \cdot q =n_+q+n_-q\rightarrow n_+q\) and neglecting \(q^2\) in the denominator of the second term in (3.11), one recovers the strict NLP expressions in the hard-collinear region. The \(q^2\)-dependent inverse moment of the B LCDA is defined as

$$\begin{aligned} \frac{1}{\lambda _B^+(n_-q)} \equiv \int _{0}^{\infty }d\omega \, \frac{\phi ^B_{+}(\omega )}{\omega -n_-q-i 0^+} \ . \end{aligned}$$
(3.12)
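The \(i0^+\) prescription in (3.12) can be made explicit with the standard distributional identity \(1/(x - i0^+) = \mathrm{P}(1/x) + i\pi \delta (x)\). As a short worked step, assuming \(n_-q > 0\) and that \(\phi ^B_+\) is smooth at \(\omega = n_-q\), one obtains

$$\begin{aligned} \frac{1}{\lambda _B^+(n_-q)} = \mathrm{P}\!\!\int _{0}^{\infty }d\omega \, \frac{\phi ^B_{+}(\omega )}{\omega -n_-q} + i\pi \,\phi ^B_{+}(n_-q) \ , \end{aligned}$$

where \(\mathrm{P}\) denotes the principal value. The inverse moment is therefore complex for time-like \(q^2\), with imaginary part \(\pi \phi ^B_+(n_-q)\).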

Power corrections to \(F_L\) from the photon emission off the spectator quark cannot be factorized and are parametrized by the “symmetry-preserving”, power-suppressed form factor \(\xi (q^2,v\cdot q)\). For \(q^2 \rightarrow 0\) we recover the result for the on-shell \(B^- \rightarrow \ell \bar{\nu }_\ell \gamma \) form factors [5].
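The tree-level NLP expressions (3.10) and (3.11) can be transcribed as follows. This is a sketch rather than the code used for the numerical results: the function and argument names are assumptions, the quark charges \(Q_u = 2/3\), \(Q_b = -1/3\) are the standard values, \(Q_\ell = -1\) as stated in Sect. 2, and the inverse moment (3.12) and the soft form factor \(\xi \) are passed in by the caller.

```python
def f_l_nlp(xi, f_B, v_dot_q, Q_l=-1.0):
    """Eq. (3.10): soft form factor xi plus the lepton-emission term."""
    return xi + Q_l * f_B / (2.0 * v_dot_q)


def f_r_nlp(inv_lambda_plus, n_plus_q, n_minus_q, *, F_B, f_B, m_B, m_b,
            Q_u=2.0 / 3.0, Q_b=-1.0 / 3.0, Q_l=-1.0):
    """Eq. (3.11) with q^2 = n_+q n_-q and 2 v.q = n_+q + n_-q."""
    q2 = n_plus_q * n_minus_q
    v_dot_q = 0.5 * (n_plus_q + n_minus_q)
    spectator = (F_B / n_plus_q) * (m_B * Q_u / n_plus_q) \
        * (1.0 + n_minus_q * inv_lambda_plus)
    heavy_quark = -F_B * m_B * Q_b / (q2 - 2.0 * m_b * v_dot_q)
    lepton = -Q_l * f_B / (2.0 * v_dot_q)
    return spectator + heavy_quark + lepton
```

The lepton-emission terms enter \(F_L^{\mathrm{NLP}}\) and \(F_R^{\mathrm{NLP}}\) with opposite signs, so they cancel in \(F_V = F_L + F_R\) and only affect \(F_{A_\perp }\), in line with the redefinition (2.4).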

One important difference between the virtual and on-shell photon case concerns the second term in the square brackets in (3.9), which matches onto a hadronic matrix element with a transverse derivative acting on the spectator-quark field. This term contributes only to the longitudinal form factor \(F_{A_\parallel }\) and is hence irrelevant in \(B \rightarrow \gamma \ell \nu \). Since we do not consider explicitly the tree-level contributions proportional to the three-particle LCDAs of the B meson, we can compute this term in the so-called Wandzura–Wilczek (WW) approximation [18], in which case only the subleading two-particle LCDA \(\phi ^B_-(\omega )\) appears. We then find, for hard-collinear \(q^2\),

$$\begin{aligned} F_{A_\parallel }^{\mathrm{NLP}}&= -\frac{2F_B Q_u}{n_+q} \left( 2m_B\frac{n_-q}{n_+q}\frac{1}{\lambda _B^-(n_-q)} +1\right) \nonumber \\&\quad + \frac{2F_B (Q_b-Q_\ell )}{n_+q} + \xi '(q^2, v\cdot q)\nonumber \\&=-\frac{4 F_Bm_B Q_u}{(n_+q)^2} \frac{n_-q}{\lambda _B^-(n_-q)} + \xi '(q^2, v\cdot q)\,. \end{aligned}$$
(3.13)

In the numerical analysis, we employ an expression for \(F_{A_{\parallel }}\), which is accurate in both the hard and hard-collinear \(q^2\) region. To this end, we use (2.6) together with

$$\begin{aligned} \tilde{F}_{A_{\parallel }}^{\mathrm{NLP}}&= \frac{4 F_B m_B Q_u}{n_+q}\frac{n_-q}{n_+q} \left( \frac{1}{\lambda _B^+(n_-q)}-\frac{1}{\lambda _B^-(n_-q)}\right) \nonumber \\&\quad -\frac{2 F_B Q_u}{n_+q}\left( 1 + \frac{n_-q}{\lambda _B^+(n_-q)} \right) \nonumber \\&\quad +\frac{2 F_B m_b Q_b}{2v\cdot q m_b - q^2} -\frac{2f_B Q_{\mathrm {\ell }}}{2v\cdot q}+\xi '(q^2, v\cdot q)\,, \end{aligned}$$
(3.14)

and \(F_{A_\perp }^{\mathrm{NLP}}\) computed from (3.10), (3.11). The inverse moment \(\lambda _B^-(n_-q)\) of the subleading-twist LCDA \(\phi ^B_-(\omega )\), which was already introduced for \(B\rightarrow K^*\ell \ell \) [19], is defined in analogy to (3.12). The finite invariant mass of the virtual photon regulates its endpoint divergence at \(\omega \rightarrow 0\). Nevertheless, in the limit \(q^2 \rightarrow 0\) we find \(F_{A_\parallel } \rightarrow 0\) due to the additional \(n_-q\) in the numerator, as it should be, since an on-shell photon has no longitudinal polarization. As in the case of \(F_L\) we allow for a possible non-factorizable contribution by adding an unknown form factor \(\xi '(q^2, v\cdot q)\), which must also vanish as \(q^2\rightarrow 0\).

The power-suppressed form factor \(\xi \) that parameterizes the contribution from soft distances \(x\sim 1/\Lambda _\mathrm{QCD}\) between the currents in \(T^{\mu \nu }\) as well as the three-particle B LCDA contributions have been calculated with light-cone QCD sum rules [17, 20], but this method can only be used for \(q^2=0\) or space-like. We therefore follow the simple ansatz [11]

$$\begin{aligned} \xi (q^2,v\cdot q)&= - r_\mathrm{LP} \frac{F_B m_B Q_u}{n_+q} \frac{1}{\lambda _B^+(n_-q)} \nonumber \\ \xi ^\prime (q^2,v\cdot q)&= 0 \,, \end{aligned}$$
(3.15)

which incorporates the observation that the power-suppressed form factors appear to reduce the LP ones by setting them to a fraction \(r_\mathrm{LP}=0.2\) of \(F_L\) at tree level. Since there is no LP contribution to \(F_{A_\parallel }\), \(\xi ^\prime \) is set to 0 in this model. The branching fraction of the four-lepton decay is quite sensitive to the value of \(r_\mathrm{LP}\). For the \(B_s\rightarrow \mu ^+\mu ^-\gamma \) decay the conservative estimate \(r_\mathrm{LP}=0.2\pm 0.2\) was adopted [11]. Below we also present results for \(r_\mathrm{LP}=0.2\pm 0.1\).
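The ansatz (3.15) amounts to \(\xi = - r_\mathrm{LP}\, F_L^{\mathrm{LP}}\) with \(F_L^{\mathrm{LP}}\) taken at tree level, i.e. (3.8) with \(C_V^{(A0)} = J = 1\). A minimal sketch, with hypothetical function names and the inverse moment (3.12) supplied by the caller:

```python
def f_l_lp_tree(inv_lambda_plus, n_plus_q, F_B, m_B, Q_u=2.0 / 3.0):
    """Tree-level limit of Eq. (3.8): C_V^(A0) = 1 and J = 1."""
    return Q_u * F_B * m_B / n_plus_q * inv_lambda_plus


def xi_ansatz(inv_lambda_plus, n_plus_q, F_B, m_B, r_LP=0.2, Q_u=2.0 / 3.0):
    """Eq. (3.15): soft form factor as a fixed fraction of tree-level F_L."""
    return -r_LP * F_B * m_B * Q_u / n_plus_q * inv_lambda_plus
```

Because the inverse moment is complex for \(q^2 > 0\), \(\xi \) inherits the phase of the tree-level LP form factor in this model.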

Since the \(B\rightarrow \gamma ^*\) form factors are time-like, the heavy-quark/large-energy expansion is certainly upset locally by the lowest light-meson resonances, \(\rho \) and \(\omega \). However, as shown in [11], quark–hadron duality is also violated globally, such that for any \(q^2\) bin that contains these resonances, the resonance contribution will be dominant. In order to describe the form factors in the entire region \(q^2\lesssim 6\,\text{ GeV}^2\), we add the resonant process \(B^- \rightarrow \ell \bar{\nu }_\ell V \rightarrow \ell \bar{\nu }_{\ell } \ell ^{(\prime )} \bar{\ell }^{(\prime )}\), where \(V=\rho ,\omega \), to the factorization expressions (3.10) and (3.11). As discussed in [11], this procedure can be justified parametrically, as the averaged resonance contribution is formally a power correction. Nevertheless, the existence of resonances implies that the QCD factorization calculation of the time-like form factors is not on as solid ground as at \(q^2=0\) for \(B^-\rightarrow \ell ^-\bar{\nu }_\ell \gamma \). Writing the dispersion relation in \(q^2\) at fixed \(n_+ q\) for the hadronic tensor \(T^{\mu \nu }\), and including only the \(\rho \) and \(\omega \) resonances in the spectral function in the Breit–Wigner approximation, we find

$$\begin{aligned} F_{L (R)}^{\text {res}}&= \sum _{V=\rho ^0\,,\,\omega }\mathrm{BW}_V \,\frac{1}{2}\left( \frac{2m_B}{m_B+m_V}V^{B\rightarrow V}(k^2) \right. \nonumber \\&\quad \left. \pm \frac{m_B+m_V}{v\cdot q} A_1^{B\rightarrow V}(k^2)\right) , \end{aligned}$$
(3.16)

where the upper (lower) sign applies to \(F_L \,(F_R)\). In addition,

$$\begin{aligned} \mathrm{BW}_V\equiv c_\mathrm{V} \frac{f_{V}m_V}{m_V^2-q^2-i m_V \Gamma _V} \,, \end{aligned}$$
(3.17)

with \(c_{\rho }=1/2\) and \(c_\omega =1/6\) (see footnote 3). For the \(B\rightarrow V\) transition form factors \(V, A_1\) and \(A_2\) we use the definition and numerical results of [21]. It follows from the heavy-quark symmetry relations for the heavy-to-light \(B\rightarrow V\) form factors [18] (applicable since \(k^2 = k^2(n_+q, q^2)\) via (2.8) and \(n_+q \gg \Lambda _\mathrm{QCD}\) and \(q^2=m_V^2\) in the argument of the form factors in (3.16)) that \(F_R^{\mathrm{res}}\) is power-suppressed relative to \(F_L^{\mathrm{res}}\); hence it formally counts as a next-to-next-to-leading power correction. We do not add a resonance contribution to the form factor \(F_{A_\parallel }\) for the longitudinal intermediate polarization state, since a simple Breit–Wigner ansatz as above would lead to a non-vanishing form factor at \(q^2=0\), resulting in an unphysical \(1/q^4\) singularity in the rate (see footnote 4).
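The resonance model of (3.16) and (3.17) is simple enough to sketch in code. The names below are illustrative assumptions, and the resonance parameters \(m_V\), \(\Gamma _V\), \(f_V\) as well as the values of the \(B\rightarrow V\) form factors \(V(k^2)\), \(A_1(k^2)\) are inputs to be taken from the references cited in the text; no numerical values are implied here.

```python
def breit_wigner(q2, m_V, Gamma_V, f_V, c_V):
    """Eq. (3.17): Breit-Wigner factor BW_V (complex)."""
    return c_V * f_V * m_V / (m_V**2 - q2 - 1j * m_V * Gamma_V)


def f_res(q2, v_dot_q, m_B, resonances, helicity="L"):
    """Eq. (3.16): sum over resonances for F_L (upper sign) or F_R (lower sign).

    `resonances` is a list of dicts with keys
    m_V, Gamma_V, f_V, c_V, V (= V(k^2)) and A1 (= A_1(k^2)).
    """
    sign = 1.0 if helicity == "L" else -1.0
    total = 0.0 + 0.0j
    for r in resonances:
        bw = breit_wigner(q2, r["m_V"], r["Gamma_V"], r["f_V"], r["c_V"])
        vector = 2.0 * m_B / (m_B + r["m_V"]) * r["V"]
        axial = (m_B + r["m_V"]) / v_dot_q * r["A1"]
        total += bw * 0.5 * (vector + sign * axial)
    return total
```

On the resonance peak, \(q^2 = m_V^2\), the factor \(\mathrm{BW}_V\) is purely imaginary and equal to \(i\,c_V f_V/\Gamma _V\), which gives a quick check of the implementation.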

4 Numerical results

We combine the form factors at leading-power (LP) and next-to-leading power (NLP) calculated with QCD factorization with the resonance contribution into the final result

$$\begin{aligned} F_X = F_X^{\mathrm{LP}} +F_X^{\mathrm{NLP}} + F_X^{\mathrm{res}} \ . \end{aligned}$$
(4.1)

We include renormalization group evolution to sum logarithms of the ratio of the hard, hard-collinear and soft scales in the LP term following [5], but not in \(F_X^{\mathrm{NLP}}\) where we set \(F_B=f_B\). Contrary to [11], we do not re-expand products of series expansions in \(\alpha _s\).

Table 1 Input parameters from PDG [24] unless stated otherwise. The value of \(f_B\) is taken from FLAG [23] using inputs from [25,26,27,28]. Here we quote the exclusive \(|V_{ub}|\) value from HFLAV [22], which uses lattice inputs from [29, 30]. Here \(\alpha _{\mathrm{em}}=\alpha _{\mathrm{em}}^{(5)}(5 \;\mathrm {GeV})\)

We use the inputs specified in Table 1 and the exponential model

$$\begin{aligned} \phi ^B_{+}(\omega ) =\frac{\omega }{\omega _0^2}e^{-\omega /\omega _0} \ , \quad \quad \phi ^B_{-}(\omega ) =\frac{1}{\omega _0}e^{-\omega /\omega _0} \ , \end{aligned}$$
(4.2)

for the B LCDA. We put \(\omega _0 = \lambda _B \equiv \lambda _B^+(n_-q=0) = 0.35\pm 0.15\) GeV at the scale 1 GeV as our default value. In the LP terms, we evolve \(\phi ^B_+(\omega )\) to the hard-collinear scale \(\mu _{hc}\) employing the analytic expression given in [20]. Previous analyses of \(B\rightarrow \gamma \ell \nu \) showed that the shape of the B LCDA is also important when including power corrections [20]. For the time-like virtual photon form factors, there is less control over power corrections and we therefore content ourselves with the exponential model to present our main results and conclusions. We further study the dependence of \(\lambda _B^\pm (n_-q)\) and the branching fraction of the four-lepton decay on the shape of the B meson LCDA in Sec. 4.4 using three two-parameter models [20] for the B LCDA. In addition, as \(|V_{ub}|\) is an overall factor we do not include its uncertainty in our error estimates, nor do we include the negligible uncertainties on the other input parameters in Table 1. We expect that eventual measurements of the four-lepton final states will be normalized to the decay rate of another, accurately known, exclusive \(b\rightarrow u\) transition.
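For the exponential model (4.2) the generalized inverse moments can be written in closed form in terms of the exponential integral \(\mathrm{Ei}\). The sketch below is our own derivation under the assumption \(n_-q > 0\), not a formula quoted from the references: with \(x = n_-q/\omega _0\), the principal-value integrals give \(\mathrm{Re}\,[1/\lambda _B^+] = (1 - x\, e^{-x}\, \mathrm{Ei}(x))/\omega _0\) and \(\mathrm{Re}\,[1/\lambda _B^-] = - e^{-x}\, \mathrm{Ei}(x)/\omega _0\), while the imaginary parts are \(\pi \phi ^B_\pm (n_-q)\). Function names are hypothetical, and no evolution of the LCDA is included.

```python
import math

EULER_GAMMA = 0.5772156649015329


def expi(x):
    """Exponential integral Ei(x) for x > 0 via its convergent power series."""
    if x <= 0.0:
        raise ValueError("expi is implemented for x > 0 only")
    total = EULER_GAMMA + math.log(x)
    term = 1.0
    for n in range(1, 200):
        term *= x / n        # x^n / n!
        total += term / n    # adds x^n / (n * n!)
    return total


def inv_lambda_plus(n_minus_q, omega0):
    """1/lambda_B^+(n_-q) for phi_+ = (w / w0^2) exp(-w / w0)."""
    if n_minus_q == 0.0:
        return complex(1.0 / omega0, 0.0)
    x = n_minus_q / omega0
    real = (1.0 - x * math.exp(-x) * expi(x)) / omega0
    imag = math.pi * n_minus_q / omega0**2 * math.exp(-x)
    return complex(real, imag)


def inv_lambda_minus(n_minus_q, omega0):
    """1/lambda_B^-(n_-q) for phi_- = exp(-w / w0) / w0; requires n_-q > 0."""
    x = n_minus_q / omega0
    real = -math.exp(-x) * expi(x) / omega0
    imag = math.pi * math.exp(-x) / omega0
    return complex(real, imag)
```

In this model \(1/\lambda _B^+(n_-q) \rightarrow 1/\omega _0\) for \(n_-q \rightarrow 0\), whereas the real part of \(1/\lambda _B^-(n_-q)\) grows logarithmically, reflecting the endpoint divergence that is regulated by the photon virtuality.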

Fig. 3 \(|F_L^{\mathrm{LP}}|\) at leading order (LP, LO) and next-to-leading order (LP, NLO) as a function of \(q^2\). The bands represent the scale uncertainty from \(\mu _{hc}=1.5\pm 0.5\) GeV (left) and \(\mu _h=5^{+5}_{-2.5}\) GeV (right) and the dashed (dotted) curves correspond to the upper (lower) scale value. The value of \(n_+q\) is fixed to 4 GeV

4.1 Form factors

In Fig. 3, we show \(|F_L^{\mathrm{LP}}|\) at leading order (LP,LO) and next-to-leading order (LP,NLO) as a function of \(q^2\) at fixed \(n_+q=4\) GeV. The band describes the scale uncertainty of the hard-collinear scale \(\mu _{hc}=1.5\pm 0.5\) GeV (left) and that of the hard scale \(\mu _h = 5^{+5}_{-2.5}\) GeV (right). Similar to the \(B\rightarrow \gamma \ell \nu \) case, at small \(q^2\) the form factor in the LO approximation has a large scale uncertainty, which is practically removed by the NLO correction. We conclude that the LP form factor is under very good control away from light-meson resonances, once the B LCDA input is specified. It is worth noting that the form factors do not fall off right away with increasing \(q^2\), but exhibit a maximum near \(q^2 \approx 0.5\,\)GeV\(^2\). The maximum is generated by the sizeable imaginary part \(\pi \phi ^B_+(n_-q)\) of the \(q^2\)-dependent B LCDA moment \(\lambda _B^+(n_-q)\). These features of \(F_L\) are largely independent of the chosen value of \(n_+ q\).

Fig. 4 Illustration of the \(q^2\) dependence of the leading form factor \(|F_L|\) including successively leading power (LP, NLO), next-to-leading power (NLP) local contributions, \(\xi \) and resonances. The value of \(n_+q\) is fixed to 4 GeV

The breakdown of \(|F_L|\) into its various contributions is shown in Fig. 4, starting with LP,NLO, then successively adding the NLP local (loc) contributions (defined as \(F_X^{\mathrm{NLP}}\) without the \(\xi \) term), the \(\xi \)-contribution as defined in (3.15) and finally the resonance contribution (res). We observe that the NLP contribution is of similar size as the NLO correction at LP. In the small \(q^2\) region, the form factor is locally dominated by the resonance contribution, as expected. However, also at larger \(q^2\) the resonance contribution is comparable to the NLP local contribution. This is due to the fact that the fall-off of the form factors in QCD factorization with increasing \(q^2\) is not faster than the \(1/q^2\) fall-off of the Breit–Wigner parametrization of the resonances. Note that we have fixed again \(n_+q=4\) GeV, and only show the \(q^2\) dependence of the form factors as the above observations are generic.

The \(q^2\) dependence of the power-suppressed (NLP) form factors \(F_R\) and \(F_{A_\parallel }\) is shown in the lower panel of Fig. 5. For \(F_R\), we show separately the local NLP contribution and the total by adding the resonance contribution. As we do not include a resonance contribution for \(F_{A_\parallel }\), we only show the total form factor. In addition, we show the dependence on \(\lambda _B\) by varying it from 200 MeV (dashed) to 500 MeV (dotted). Except for very small \(q^2\), the form factor relevant to the longitudinal polarization state of the virtual photon is significantly larger than the one for the right-helicity state.

For \(F_L\) (upper panel of Fig. 5), we show, in addition to the results for \(\lambda _B=200\) MeV (dashed) and 500 MeV (dotted), the dominant uncertainty of \(|F_L|\), which arises from varying \(r_\mathrm{LP}\) by \(\delta r_\mathrm{LP}=0.1 (0.2)\) and is computed with the central value \(\lambda _B=350\) MeV. We note two important features. First, for all three form factors there is a crossing of the dashed and dotted lines, such that the lower value \(\lambda _B = 200\) MeV increases the form factors at small \(q^2\) but decreases them at large \(q^2\), while for the upper value \(\lambda _B = 500\) MeV the situation is reversed. In the region where the crossing occurs (around 3.5 GeV\(^2\) for \(F_L\)) all sensitivity to \(\lambda _B\) is lost. Second, for \(F_L\) at low \(q^2\) the sensitivity to \(\lambda _B\) is larger than the uncertainty coming from \(r_\mathrm{LP}\).

Fig. 5

The \(q^2\) dependence of the form factors at fixed \(n_+q= 4\) GeV for three values of \(\lambda _B\) from 200 MeV (dashed) to 500 MeV (dotted). In addition, for \(|F_L|\) (upper panel) we show the uncertainty from varying \(r_\mathrm{LP}\) by \(\delta r_\mathrm{LP} = 0.1 (0.2)\) at the central value of \(\lambda _B\). For the NLP form factors (lower panel), we show \(|F_R|\) both with and without the resonance contribution. We do not add a resonance contribution to \(F_{A_\parallel }\)

Finally, we comment on the contribution of the three form factors to the differential rate in (2.10). More precisely, we show in Fig. 6 the three terms in the round bracket in (2.10), that is, the form factors squared including their kinematic prefactors. It is remarkable that the longitudinal polarization term \(\frac{\lambda ^2}{q^2}\,|F_{A_\parallel }|^2\) dominates the rate outside the resonance region, despite the fact that it is technically power-suppressed. Moreover, leaving out the resonance term, the longitudinal term would dominate even at small \(q^2\), although it vanishes for \(q^2\rightarrow 0\) (since \(F_{A_\parallel } \sim q^2\) as \(q^2\rightarrow 0\)), while the other two terms approach constants in this limit.

This behaviour can be understood by comparing the analytic expressions for the three terms (without the resonance term) for small \(q^2\). For small \(q^2\), the first two terms in the round bracket of (2.10) combine to \(16 k^2 (m_B^2-k^2)^2 (|F_L|^2+|F_R|^2)\), and we then estimate

$$\begin{aligned} \frac{d^2 \mathrm{Br}^{(F_{A_{\parallel }})}}{dq^2\,dk^2}\big / \frac{d^2 \mathrm{Br}^{(F_L)}}{dq^2\,dk^2}= & {} \frac{m_B}{m_B-n_+q}\frac{q^2}{(n_+ q)^2} \left( \ln ^2\frac{q^2 e^{\gamma _E}}{n_+q \lambda _B^+}+\pi ^2\right) \nonumber \\&+\mathcal {O}\!\left( \frac{q^4}{m_B^4}\right) \nonumber \\\approx & {} \frac{27 \pi ^2 q^2}{4 m_B^2} +\mathcal {O}\!\left( \frac{q^4}{m_B^4}\right) , \end{aligned}$$
(4.3)

where the last line refers to the representative value \(n_+q=2 m_B/3\). The parametric dependence identifies this ratio as power-suppressed in the hard-collinear region \(q^2\sim m_B \Lambda _\mathrm{QCD}\) as it should be. However, the large numerical factor \(27\pi ^2/4\) implies that the longitudinal term dominates whenever \(q^2\) is larger than the very small value 0.4 GeV\(^2\) as seen in the Figure. The origin of the large factor is the \(\pi ^2\) that arises from the large imaginary part of the inverse B LCDA moment, in this case \(\lambda _B^-(n_-q)\) in (3.13), since for values \(q^2 \in [0.1,1]\) the logarithmic term \(\ln ^2\frac{q^2 e^{\gamma _E}}{n_+q \lambda _B^+}\) is small.
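The numbers quoted around (4.3) can be checked with a few lines of arithmetic. The sketch below evaluates the first line of (4.3) without the \(\mathcal {O}(q^4/m_B^4)\) remainder, as well as the \(\pi ^2\)-only estimate of the last line. The B-meson mass value, the identification of \(\lambda _B^+\) in the logarithm with 350 MeV, and all function names are assumptions made here for illustration only.

```python
import math

M_B = 5.279                     # B-meson mass in GeV (assumed input)
GAMMA_E = 0.5772156649015329    # Euler-Mascheroni constant


def ratio_full(q2, n_plus_q, lam_plus, m_b=M_B):
    """First line of (4.3), dropping the O(q^4/m_B^4) remainder."""
    log = math.log(q2 * math.exp(GAMMA_E) / (n_plus_q * lam_plus))
    prefactor = m_b / (m_b - n_plus_q) * q2 / n_plus_q**2
    return prefactor * (log**2 + math.pi**2)


def ratio_pi2_only(q2, m_b=M_B):
    """Last line of (4.3): n_+q = 2 m_B / 3, keeping only the pi^2 term."""
    return 27.0 * math.pi**2 * q2 / (4.0 * m_b**2)


def q2_crossover(m_b=M_B):
    """q^2 (in GeV^2) at which the pi^2-only estimate of the ratio equals 1."""
    return 4.0 * m_b**2 / (27.0 * math.pi**2)


n_plus_q = 2.0 * M_B / 3.0      # representative value used in the text
lam_plus = 0.350                # GeV, assumed value for the logarithm
print(f"crossover q^2 = {q2_crossover():.2f} GeV^2")
for q2 in (0.1, 0.4, 1.0):
    print(q2, ratio_pi2_only(q2), ratio_full(q2, n_plus_q, lam_plus))
```

With these inputs the \(\pi ^2\)-only estimate reaches unity at \(q^2\approx 0.42\) GeV\(^2\), in line with the value of about 0.4 GeV\(^2\) quoted above. Since the squared logarithm is non-negative, the first line of (4.3) can only increase the ratio relative to this estimate.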

Fig. 6

Contribution of the form factors to the rate in (2.10) including kinematic factors. For \(F_V\) and \(F_{A_\perp }\), we show the QCDF results (labelled “No res”) and the full result including resonances

4.2 Predictions for the branching ratios

In this section, we provide theoretical predictions for the branching ratio in various \(q^2\) bins, integrated over \(n_+q\) (alternatively, \(k^2\)). The factorization calculation of the form factors in (3.8), (3.10), (3.11) and (3.13) is valid only for \(n_+q\sim \mathcal {O}(m_B)\). We therefore assume \(n_+q> 3\) GeV, which corresponds to \(E_\gamma =1.5\) GeV at \(q^2=0\), and integrate the double-differential branching fraction over \(n_+q>3\) GeV before forming \(q^2\) bins. A rough estimate, obtained by assuming that our results apply in the full phase space, shows that the \(n_+q\) cut reduces the rate by \(\mathcal {O}(20\%)\) for the [1.5, 6] GeV\(^2\) \(q^2\) bin.

Table 2 Branching ratio for the two non-identical lepton flavour cases (in \(10^{-8}\)) integrated over different bins in \(q^2\) and for \(n_+q> 3\) GeV. We show the individual contributions consecutively adding to the LP result the NLP local and \(\xi \) contributions and finally the resonances. In addition, we quote the uncertainties from varying the scales \(\mu _{h,hc}\), \(r_{\mathrm{LP}}=0.2\pm 0.2\) and \(\lambda _B=350\pm 150\) MeV. The total uncertainty is obtained by adding them in quadrature. For electrons, we also consider a low bin with \(q^2_{\mathrm{min}}= 0.0025\) GeV\(^2\)

For non-identical lepton flavours, \(\ell ^\prime \ne \ell \), the required \(n_+q\) cut can easily be applied, since for each event \(n_+q\) can be inferred from the reconstructed \(k^2\) and \(q^2\) using (2.8). For the \(q^2\) bins, we consider the low bin \([4m_\mu ^2,0.96]\) GeV\(^2\), where the upper boundary of the bin is chosen such that the large experimental background from \(\phi \) mesons decaying into a lepton pair is avoided. This bin was also considered by the LHCb Collaboration [8]. Figure 5 shows that in this bin, the \(\rho \) and \(\omega \) resonances make a large contribution. As mentioned, we do not assign an error to our resonance model, which leaves an additional uncertainty in this region that is challenging to quantify. For \(q^2>1\) GeV\(^2\), the effect of the \(\rho \) and \(\omega \) resonances (and thus a possible uncertainty associated with them) is significantly reduced. We consider three different \(q^2\) bins: [1, 6], [1.5, 6] and [2, 6] GeV\(^2\). In these bins, the resonance contribution is only approximately 10%. In Table 2, we give the branching ratio in these \(q^2\) bins, specifying the contributions which are successively added. In addition, we specify the uncertainties from variations of the scales \(\mu _{h,hc}\), \(r_\mathrm{LP}=0.2\pm 0.2\) and \(\lambda _B=350\pm 150\) MeV. We observe that in the three bins with \(q^2>1\) GeV\(^2\), the effect of the resonances is smaller than the uncertainty from \(r_\mathrm{LP}\).

4.2.1 Identical lepton flavours

A challenge arises when considering identical lepton flavours, \(\ell ^\prime = \ell \), because experimentally the two like-sign leptons cannot be distinguished. This results in the additional interference term (2.11). More challenging is the required cut on \(n_+q\), where q is the photon momentum, to ensure that the photon has hard-collinear momentum. Considering again \(B^-\rightarrow \ell ^-(p_1) \ell ^+(p_2)\ell ^-(p_3) \bar{\nu }_\ell (p_\nu )\), with \(q^2=(p_1+p_2)^2\) and \(\tilde{q}^2=(p_2+p_3)^2\), the two invariant masses \(q^2\) and \(\tilde{q}^2\) cannot be distinguished experimentally. Instead, the invariant masses of the two \(\mu ^- \mu ^+\) pairs are ordered and labelled \(q^2_{\mathrm{low}}<q^2_{\mathrm{high}}\). In this case, the required cut on \(n_+q\) cannot be placed unambiguously, as we cannot determine whether the momentum of the virtual photon is associated with \(q^2_{\mathrm{low}}\) or \(q^2_{\mathrm{high}}\). To deal with this issue, several observations can be made:

  • for small \(q^2_{\mathrm{low}}\), the photon momentum can be associated with \(q^2_{\mathrm{low}}\) most of the time. If this is the case, a cut \(n_+q_{\mathrm{low}}> 3\) GeV suffices (similar to the non-identical lepton flavour case). In fact, a more detailed analysis shows that the cases falling outside this cut (i.e. the region in which the photon has \(q^2_{\mathrm{high}}\) but small \(n_+q\), which cannot be described in QCD factorization) are phase-space suppressed by two powers of \(1/m_b\) compared to the leading contribution.

  • for \(q^2\) bins above 1 GeV\(^2\), the situation is more complicated as the photon more often has \(q^2_{\mathrm{high}}\). Therefore, we have to ensure \(n_+q>3~\)GeV for both \(q^2_{\mathrm{low}}\) and \(q^2_{\mathrm{high}}\).

We thus have to require both \(n_+q_{\mathrm{low}}> 3\) GeV and \(n_+q_{\mathrm{high}}> 3\) GeV. These quantities are defined in the following way (see Footnote 5): for each event, we determine \(q^2_{\mathrm{low}}\) and \(q^2_{\mathrm{high}}\). We can then associate the remaining lepton plus neutrino with \(k^2_{\mathrm{low}}\) and \(k^2_{\mathrm{high}}\), respectively. Here high and low are just labels, and \(k^2_{\mathrm{low}}\) is not necessarily lower than \(k^2_{\mathrm{high}}\). Then, using (2.8), both \(n_+q_{\mathrm{low}}\) and \(n_+q_{\mathrm{high}}\) can be calculated from their corresponding \(k^2\) and \(q^2\). Alternatively, one could cut on \(k^2_{\mathrm{low}}\) and \(k^2_{\mathrm{high}}\) directly.
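The labelling procedure can be sketched in a few lines. The code below is only an illustration under stated assumptions: it works in the B rest frame, takes \(k=p_B-q\), and assumes that (2.8) amounts to the standard light-cone relations \(q^2=(n_+q)(n_-q)\) and \(n_+q+n_-q=2E_q=(m_B^2+q^2-k^2)/m_B\). This is consistent with \(n_+q=3\) GeV corresponding to \(E_\gamma =1.5\) GeV at \(q^2=0\). The mass value and all function names are hypothetical choices made here.

```python
import math

M_B = 5.279  # B-meson mass in GeV (assumed input)


def minkowski_sq(p):
    """Invariant square of a four-vector given as (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz


def n_plus_q(q2, k2, m_b=M_B):
    """Large light-cone component of q in the B rest frame.

    Assumes k = p_B - q and q^2 = (n_+q)(n_-q), so that
    n_+q + n_-q = 2 E_q = (m_B^2 + q^2 - k^2) / m_B.
    """
    e_q = (m_b * m_b + q2 - k2) / (2.0 * m_b)
    return e_q + math.sqrt(max(e_q * e_q - q2, 0.0))


def classify_event(p_minus_a, p_plus, p_minus_b, cut=3.0, m_b=M_B):
    """Label both opposite-sign pairings as 'low'/'high' by their q^2.

    Momenta are (E, px, py, pz) tuples in the B rest frame, in GeV.
    """
    p_b = (m_b, 0.0, 0.0, 0.0)
    pairings = []
    for p_minus in (p_minus_a, p_minus_b):
        q = tuple(a + b for a, b in zip(p_minus, p_plus))
        k = tuple(a - b for a, b in zip(p_b, q))  # remaining lepton + neutrino
        q2, k2 = minkowski_sq(q), minkowski_sq(k)
        pairings.append((q2, k2, n_plus_q(q2, k2, m_b)))
    low, high = sorted(pairings)  # tuples sort by q^2 first
    return {
        "q2_low": low[0], "k2_low": low[1], "n_plus_q_low": low[2],
        "q2_high": high[0], "k2_high": high[1], "n_plus_q_high": high[2],
        "passes_both_cuts": low[2] > cut and high[2] > cut,
    }
```

An event is kept only if `passes_both_cuts` is true; dropping the condition on `n_plus_q_high` corresponds to the single-cut variant. Note that `k2_low` and `k2_high` inherit their labels from the pairing and are not themselves ordered.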

Table 3 Branching ratio for the two identical lepton cases (in \(10^{-8}\)) integrated over different bins in \(q_{\mathrm{low}}^2\) applying two cuts: \(n_+q_{\mathrm{low}}>3\) GeV and \(n_+q_{\mathrm{high}}>3\) GeV. We show the individual contributions consecutively adding to the LP result the NLP local and \(\xi \) contributions and finally the resonances. In addition, we quote the uncertainties from varying the scales \(\mu _\mathrm{h,hc}\), \(r_\mathrm{LP}=0.2\pm 0.2\) and \(\lambda _B=350\pm 150\) MeV. The total uncertainty is obtained by adding these contributions in quadrature. For the total results, we also quote in parentheses the result with only one cut, \(n_+q_{\mathrm{low}}> 3\) GeV. For electrons, we also consider a low bin with \(q^2_{\mathrm{min}}= 0.0025\) GeV\(^2\)

Our final results for the branching ratio in different \(q_{\mathrm{low}}^2\) bins, given in Table 3, thus include two cuts for all bins: \(n_+q_{\mathrm{low}}>3\) GeV and \(n_+q_{\mathrm{high}}>3\) GeV. Again we present the different contributions added successively. We emphasize that placing these two cuts on \(n_+q\) might be conservative, specifically for the low \(q^2\) bin as discussed above, given the phase-space suppression of the region in which \(q^2_{\mathrm{high}}\) is associated with the photon. We confirm numerically that this region is indeed small by calculating the rate with and without the cut on \(n_+q_{\mathrm{high}}\). For comparison, in Table 3 we also give in parentheses the results for the total rate with only the \(n_+q_{\mathrm{low}}\) cut. For identical lepton flavours, the branching ratio contains two contributions as defined in (2.11). With this convention, we find that Br\(_\mathrm{int}\) contributes positively to the rate but is suppressed by at least one order of magnitude compared to the non-identical lepton flavour rate.

A comment on the low-\(q^2\) bin for \(B^-\rightarrow \mu ^-\mu ^+\mu ^-\bar{\nu }_\mu \) is in order. Our prediction for Br(\(B^- \rightarrow \mu ^-\mu ^+\mu ^-\bar{\nu }_\mu \)) is \(1.54 \;(1.77)\cdot 10^{-8}\), where the value in parentheses is obtained with only the \(n_+q_{\mathrm{low}}\) cut, and thus includes cuts on \(n_+q\). Yet, it lies close to the upper limit \(<1.6\cdot 10^{-8}\) given by the LHCb collaboration for this decay mode in this bin [8]. Hence the LHCb result may already point towards a larger value of \(\lambda _B\).

4.3 Sensitivity to \(\lambda _B\)

Fig. 7

Branching ratio (in \(10^{-8}\)) for three different \(q^2\) bins as a function of \(\lambda _B\) for the decay mode \(B^-\rightarrow \mu ^- \mu ^+\,e^- \,\bar{\nu }_{e} \). We include the variation from \(\mu _{h, hc}\) and from \(\delta r_\mathrm{LP}\) as inner (red) and outer (blue) bands, respectively

Our predictions for the branching ratio suffer from a large uncertainty due to \(\lambda _B\). Therefore, a measurement of the branching ratio may be used to obtain a bound on \(\lambda _B\). Figure 7 shows the rate as a function of \(\lambda _B\) for the \([4m_\mu ^2, 0.96]\) GeV\(^2\) bin and for the [1, 3] and [3, 6] GeV\(^2\) bins. We have split the [1, 6] GeV\(^2\) bin to avoid integrating over the region where the sensitivity to the \(\lambda _B\) variation switches sign (see Fig. 5). As the uncertainty of the branching ratio at fixed \(\lambda _B\) is dominated by the error on \(r_\mathrm{LP}\), we consider two options: \(\delta r_\mathrm{LP}=0.2\) and \(\delta r_\mathrm{LP}=0.1\). The latter option shrinks the error by half. These predictions include our model for the long-distance resonance contributions as described above, for which we do not add an uncertainty. We note that the sensitivity to \(\lambda _B\) is best for the small \(q^2\) bins, while it is significantly reduced for higher \(q^2\) bins. Comparing to \(B\rightarrow \gamma \ell \nu _\ell \) [20], we conclude that the sensitivity to \(\lambda _B\) for \(B\rightarrow \ell \nu _\ell \ell ^\prime \bar{\ell }^\prime \) in the low-\(q^2\) bin is comparable (compare to Fig. 5 in [5]) for \(\lambda _B<200\,\)MeV, but weaker when \(\lambda _B\) is larger. However, in this bin the resonance contribution is sizeable and there is unquantified model dependence related to its interference with the factorization contribution. For the [1, 3] GeV\(^2\) bin, the resonance contribution is less pronounced and thus this bin could still provide information on \(\lambda _B\) despite its smaller sensitivity.

4.4 Dependence on the shape of the B LCDA

Up to now, we used the exponential model (4.2) to present our main results. However, it is known that for \(B\rightarrow \gamma \ell \nu \) [20] the shape of the B meson LCDA has a significant effect, both through the dependence of the radiative corrections on the logarithmic inverse moments and through the dependence of the power-suppressed form factor \(\xi \) on the LCDA shape in the sum rule calculation. In the four-lepton decay, the generalized inverse moments \(\lambda _B^\pm (n_-q)\) introduce further dependence on the shape of the LCDA.

To study this dependence, we consider three two-parameter models [20]

$$\begin{aligned} \phi _+^{\mathrm{I}}(\omega )&= \left[ (1-a) + \frac{a \omega }{2\omega _0}\right] \frac{\omega }{\omega _0^2} e^{-\omega /\omega _0} \quad \quad 0\le a \le 1 \nonumber \\ \phi _+^{\mathrm{II}}(\omega )&= \frac{1}{\Gamma (2+a)} \frac{\omega ^{1+a}}{\omega _0^{2+a}} e^{-\omega /\omega _0} \quad \quad -0.5<a<1 \nonumber \\ \phi _+^{\mathrm{III}}(\omega )&= \frac{\sqrt{\pi }}{2 \Gamma (3/2+a)} \frac{\omega }{\omega _0^2} e^{-\omega /\omega _0}U(-a,3/2-a,\omega /\omega _0)\quad \quad \nonumber \\&\quad 0<a<0.5, \end{aligned}$$
(4.4)

where \(U(\alpha , \beta , z)\) is the confluent hypergeometric function of the second kind. Given \((\omega _0,a)\) one determines \(\lambda _B\) and the dimensionless shape parameter \(\widehat{\sigma }_1\), related to the first inverse-logarithmic moment. The range of a is chosen such that the range \(-0.693147< \widehat{\sigma }_1 <0.693147\) is covered, where \(\widehat{\sigma }_1 =0\) in the exponential model, see [20] for more details. To study the influence of the shape of the LCDA, we are then interested in the envelope of theoretical predictions of all three models spanned by the variation of a for given \(\lambda _B\). For simplicity, we assume that these forms of the B meson LCDA hold at the scale \(\mu _{hc}=1.5~\)GeV, so that no renormalization group evolution to the hard-collinear scale needs to be performed. We obtain \(\phi _-(\omega )\) using the Wandzura–Wilczek (WW) relation [18]

$$\begin{aligned} \phi _-(\omega ) = \int _\omega ^\infty \frac{d\omega '}{\omega '} \phi _+(\omega ') \ . \end{aligned}$$
(4.5)
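As a simple illustration, inserting model I of (4.4) with \(a=0\), i.e. the purely exponential form \(\phi _+(\omega )=\frac{\omega }{\omega _0^2}e^{-\omega /\omega _0}\), into (4.5) gives

$$\begin{aligned} \phi _-(\omega ) = \int _\omega ^\infty \frac{d\omega '}{\omega '}\, \frac{\omega '}{\omega _0^2}\, e^{-\omega '/\omega _0} = \frac{1}{\omega _0}\, e^{-\omega /\omega _0} \ . \end{aligned}$$

More generally, setting \(\omega =0\) in (4.5) yields \(\phi _-(0)=\int _0^\infty d\omega '\,\phi _+(\omega ')/\omega ' = 1/\lambda _B\) for any model, so within the WW approximation the value of \(\phi _-\) at the origin is fixed by the same inverse moment.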

The \(n_-q\)-dependent moments \(\lambda _B^\pm (n_-q)\) are then obtained using (3.12) for \(\lambda _B^+(n_-q)\) and its analogue for \(\lambda _B^-(n_-q)\). Again we define \(\lambda _B \equiv \lambda ^+_B(n_-q=0)\), such that \(\omega _0\) can be related to \(\lambda _B\) via

$$\begin{aligned} \omega _0^{\mathrm{I}}&= \lambda _B \left( 1-\frac{a}{2}\right) \ ,\nonumber \\ \omega _0^{\mathrm{II}}&= \frac{\lambda _B}{1+a} \ , \nonumber \\ \omega _0^{\mathrm{III}}&= \frac{\lambda _B}{1+2 a} \ . \end{aligned}$$
(4.6)
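The relations (4.6) follow from \(1/\lambda _B=\int _0^\infty d\omega \,\phi _+(\omega )/\omega \) and can be cross-checked numerically. The following minimal sketch does this for models I and II of (4.4) using only the Python standard library. Model III is omitted because it requires the confluent hypergeometric function \(U\). The quadrature scheme, step count and function names are choices made here for illustration.

```python
import math


def phi_I(w, w0, a):
    """Model I of (4.4)."""
    return ((1.0 - a) + a * w / (2.0 * w0)) * w / w0**2 * math.exp(-w / w0)


def phi_II(w, w0, a):
    """Model II of (4.4)."""
    norm = math.gamma(2.0 + a) * w0**(2.0 + a)
    return w**(1.0 + a) / norm * math.exp(-w / w0)


def inverse_moment(phi, w0, a, n=20000, u_max=7.0):
    """Midpoint-rule estimate of int_0^inf dw phi(w)/w.

    The substitution w = w0*u^2 turns the integral into
    int_0^inf du 2*phi(w0*u^2)/u, which tames the integrable
    endpoint singularity of model II for a < 0.
    """
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += 2.0 * phi(w0 * u * u, w0, a) / u
    return total * h


def omega0_I(lam_b, a):
    return lam_b * (1.0 - a / 2.0)      # first line of (4.6)


def omega0_II(lam_b, a):
    return lam_b / (1.0 + a)            # second line of (4.6)


lam_b = 0.350  # GeV
for name, phi, omega0, a_values in (
    ("I", phi_I, omega0_I, (0.0, 0.5, 1.0)),
    ("II", phi_II, omega0_II, (-0.4, 0.0, 0.8)),
):
    for a in a_values:
        w0 = omega0(lam_b, a)
        print(name, a, 1.0 / inverse_moment(phi, w0, a))
```

Each printed value should reproduce the input \(\lambda _B=0.350\) GeV up to the quadrature error, independently of \(a\).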

In Figs. 8 and 9, respectively, we show the \(q^2\) dependence of \(1/\lambda _B^+(n_-q)\) and \(1/\lambda _B^-(n_-q)\) for fixed \(n_+q= 4\) GeV for the three B LCDA models by varying the parameter a within the ranges indicated in (4.4), fixing \(\lambda _B=350\) MeV. The black solid line represents the exponential model. There is a significant dependence of \(1/\lambda _B^\pm (n_-q)\) on the B-meson LCDA shape. This is expected since, for instance, the \(q^2\) dependence of the imaginary part of \(1/\lambda _B^\pm (n_-q)\) is directly related to the \(\omega \)-dependence of \(\phi _\pm (\omega )\).

Fig. 8

The \(q^2\) dependence of real (left) and imaginary (right) parts of \(1/\lambda _B^+(n_-q)\) at fixed \(n_+q= 4\) GeV for \(\phi ^\mathrm{{I,II, III}}(\omega )\) in blue, green and red, respectively. The bands are obtained by varying the model parameter a within its range indicated in (4.4) for fixed \(\lambda _B=350\) MeV

Fig. 9

As Fig. 8 but for \(\frac{n_-q}{n_+q}\times 1/\lambda _B^-(n_-q)\)

Finally, we compute the effect of the B meson LCDA shape on the branching ratio. In Fig. 10, we show the dependence of the branching ratio in the \([4m_\mu ^2, 0.96]\) GeV\(^2\) \(q^2\) bin on \(\lambda _B\) and the shape parameter a for the three models. For given \(\lambda _B\) on the horizontal axis, the bands are obtained by varying a in its allowed range. In black, we also show the exponential model. Comparing with our previous results, we observe that the dependence on the shape is about as large as the dependence on the \(r_\mathrm{LP}\) variation from the power-suppressed form factor, see Fig. 7. The conclusion is thus similar to the case of \(B\rightarrow \gamma \ell \nu \) [20]. Once sufficient data is available, a correlated determination of \(\lambda _B\) together with the shape parameter \(\widehat{\sigma }_1\) (and, perhaps, others) should be performed. The important point is that the predicted branching fractions are highly sensitive to B meson LCDA input, even if not necessarily \(\lambda _B\) alone.

Fig. 10

Branching ratio (in \(10^{-8}\)) of the \(B^-\rightarrow \mu ^- \mu ^+\,e^- \,\bar{\nu }_{e}\) decay mode in the \([4m_\mu ^2, 0.96]\) GeV\(^2\) \(q^2\) bin for the three B LCDA models of (4.4) as a function of \(\lambda _B\) at scale 1.5 GeV. The bands are obtained by varying the model parameters within their allowed ranges. The solid black curve refers to the exponential model

5 Conclusion

Motivated by the first search and upper limit [8] for the rare charged-current B decay to a four-lepton final state \(\ell \bar{\nu }_\ell \ell ^{(\prime )} \bar{\ell }^{(\prime )}\), this work considered the calculation of the decay amplitude with factorization methods. Combining methods previously applied to \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \) [5] and \(B_s\rightarrow \mu ^+\mu ^-\gamma \) [11], we obtain the \(B\rightarrow \gamma ^*\) form factors, which depend on the invariant masses of the two lepton pairs, in QCD factorization at next-to-leading order in \(\alpha _s\) and leading power in an expansion in \(\Lambda _\mathrm{QCD}/m_b\), and to leading order in \(\alpha _s\) at next-to-leading power. To this we added a simple Breit–Wigner parametrization of the \(\rho \), \(\omega \) intermediate resonances. Although suppressed beyond next-to-leading power, the resonances dominate the spectrum in the \(\ell ^{(\prime )} \bar{\ell }^{(\prime )}\) invariant mass \(\sqrt{q^2}\) locally, making the predictions more uncertain in this region than at large invariant mass or for \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \). Quite generally it must be noted that the parametric counting that justifies the heavy-quark expansion is not well respected, as is evidenced by the large contribution of the longitudinal polarization state of the intermediate virtual photon.

Our calculations predict branching fractions of a few times \(10^{-8}\) in the \(q^2\) bin up to 1 GeV\(^2\), which are accessible to the LHC experiments. The branching fraction rapidly drops with increasing \(q^2\), reaching \(10^{-9}\) in the bin [1.5, 6] GeV\(^2\).

Confronting these results with measurements checks our understanding of Standard Model dynamics in these rare decays. An important further motivation for this investigation has been to explore the sensitivity of the decay rate to the inverse moment \(\lambda _B\) of the leading-twist B meson light-cone distribution amplitude. For non-vanishing \(q^2\) the access to \(\lambda _B\) is less direct than in \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \), and requires some knowledge of the shape of the LCDA as well. At large \(q^2\), the sensitivity disappears. We find these expectations confirmed in Fig. 7, which shows that \(\lambda _B\) is best determined from the small-\(q^2\) bin. In this bin the sensitivity to \(\lambda _B\) is almost comparable to \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \) when \(\lambda _B<200\,\)MeV, but weaker when \(\lambda _B\) is larger. However, one should be aware that in this bin the resonance contribution is sizeable and there is unquantified model dependence related to its interference with the factorization contribution. The \(q^2\) bin above 1 GeV\(^2\) can still yield useful bounds on \(\lambda _B\), despite its weaker sensitivity. As for the case of \(B\rightarrow \gamma \ell \nu \) [20], once sufficient data is available, a correlated determination of \(\lambda _B\) together with B meson LCDA shape parameters should be performed. Overall, we conclude that the four-lepton final state cannot fully replace the \(B^- \rightarrow \mathrm {\ell }^- \bar{\nu }_{\mathrm {\ell }}\gamma \) mode to measure \(\lambda _B\). However, given the current state of knowledge, any complementary experimental result on \(\lambda _B\) is worth pursuing.

5.1 Note added

While this paper was being finalized, Ref. [31] appeared. We note the following important differences: (1) the third independent form factor, related to \(F_{A_\parallel }\), is missed, see Appendix A. (2) The \(q^2\) distribution is computed without a cut on \(n_+ q\), hence includes significant phase-space regions where the adopted QCD factorization treatment is not applicable. (3) The residual scale dependence of the leading-power form factor at NLO in the strong coupling is much larger than ours. Presumably this is because it is assumed (incorrectly) that the form of the exponential model is preserved by renormalization group evolution. (4) Only the region of small \(q^2<1~\)GeV\(^2\) is discussed. In this region our results are dominated locally by the Breit–Wigner parameterization of the \(\rho \) and \(\omega \) resonances, whereas [31] adopts the QCD sum rule expression [32] for the power-suppressed form factor \(\xi \), but in the time-like region. (5) For the case of identical lepton flavours, we present partially integrated branching fractions that correspond to experimental observables.