1 Introduction

\(B\rightarrow K^*\ell \ell \) decays are sensitive to modified short-distance physics from sources beyond the Standard Model (SM), and a great deal of experimental and theoretical work has been devoted to extracting short-distance information from them. However, long-distance physics within the SM also contributes significantly to the decay, and its effects are very difficult to assess reliably from first principles. On the other hand, tighter experimental constraints from increasingly precise measurements of \(b\rightarrow s\) processes have significantly limited the size of allowed New Physics (NP) effects in \(B\rightarrow K^*\ell \ell \), which are now comparable to current SM uncertainties. Thus, our inability to reliably constrain these long-distance contributions to acceptable levels stands in the way of obtaining unambiguous information on physics beyond the SM.

The \(B\rightarrow K^*\ell \ell \) decay is conveniently described by the \(K^*\) transversity amplitudes (\(\lambda = \perp , \parallel , 0\))

$$\begin{aligned} \mathcal{A}_\lambda ^{L,R}= & {} \mathcal{N}_\lambda \ \bigg \{ (C_9 \mp C_{10}) \mathcal{F}_\lambda (q^2) \nonumber \\&+\frac{2m_b M_B}{q^2} \bigg [ C_7 \mathcal{F}_\lambda ^{T}(q^2) - 16\pi ^2 \frac{M_B}{m_b} \mathcal{H}_\lambda (q^2) \bigg ] \bigg \} \end{aligned}$$
(1)

where \(C_{7,9,10}\) are short-distance Wilson coefficients, and \(\mathcal{N}_\lambda \) are normalization factors. The non-trivial matter from the theory point of view is the determination of the “local” and “non-local” long-distance effects encoded in the functions \(\mathcal{F}_\lambda ^{(T)}(q^2)\) and \(\mathcal{H}_\lambda (q^2)\), respectively, which depend on the dilepton invariant mass squared \(q^2\).
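As a reading aid, the structure of Eq. (1) can be sketched in a few lines of Python. This is only an illustration, not the EOS implementation: the function name, argument names and all numerical inputs below are placeholders chosen to exercise the formula for a single polarization \(\lambda \).

```python
import math

def transversity_amplitudes(q2, norm, c7, c9, c10, ff, ff_t, h_nonlocal, m_b, m_B):
    """Return (A_L, A_R) of Eq. (1) for a single polarization lambda."""
    # The dipole (C7) and non-local (H) terms share the prefactor 2 m_b M_B / q^2.
    pole_term = (2.0 * m_b * m_B / q2) * (
        c7 * ff_t - 16.0 * math.pi**2 * (m_B / m_b) * h_nonlocal
    )
    a_left = norm * ((c9 - c10) * ff + pole_term)   # upper sign in Eq. (1)
    a_right = norm * ((c9 + c10) * ff + pole_term)  # lower sign in Eq. (1)
    return a_left, a_right

# Placeholder inputs (hypothetical, not fit results or the paper's values).
a_l, a_r = transversity_amplitudes(
    q2=4.0, norm=1.0, c7=-0.3, c9=4.27, c10=-4.2,
    ff=0.5, ff_t=0.4, h_nonlocal=1e-4 + 2e-4j, m_b=4.2, m_B=5.28,
)
print(a_l, a_r)
```

The sketch makes explicit that \(\mathcal{H}_\lambda \) enters both chiralities in the same way as \(C_7\) and \(C_9\), so the two amplitudes differ only through the \(C_{10}\) term.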

The functions \(\mathcal{F}_\lambda ^{(T)}(q^2)\) are form factors, which can be calculated by means of Light-Cone Sum Rules (LCSRs) at low \(q^2\) (\(\lesssim 10\,\,\mathrm{GeV}^2\)) [1, 2], or by numerical simulations (Lattice QCD) at large \(q^2\) (\(\gtrsim 15\,\,\mathrm{GeV}^2\)) [3, 4]. Both methods agree reasonably well when extrapolated [5, 6], and there are good prospects for improvement [7,8,9,10,11]. The form factors are not the focus of this work.

Here we focus on the functions \(\mathcal{H}_\lambda (q^2)\), which are related to the contribution from 4-quark and chromomagnetic operators in the Weak Effective Hamiltonian, and emerge from the “non-local” matrix element

$$\begin{aligned} \eta ^*_\alpha \, \mathcal{H}^{\alpha \mu } \equiv \ i \int d^4 x\ e^{i q\cdot x}\, \langle \bar{K}^*(k,\eta )|\mathcal{K}^\mu (x,0) | \bar{B}(p) \rangle \ , \end{aligned}$$
(2)

where \(p=q+k\), \(\eta \) is the polarization vector of the \(K^*\), and \(\mathcal{K}(x,y)\) is a bi-local operator. The most relevant contribution to this matrix element in the SM arises from the current-current operators \(\mathcal{O}_{1,2}\), since they come with large Wilson coefficients. In this letter we consider only this contribution – the so-called “charm-loop effect” – for which the object \(\mathcal{K}^\mu (x,y)\) is given by:

$$\begin{aligned} \mathcal{K}^\mu (x,y) = T\big \{ j_\mathrm{em}^\mu (x), C_1 \mathcal{O}_1(y) + C_2 \mathcal{O}_2(y) \big \} \end{aligned}$$
(3)

with \( j_\mathrm{em}^\mu (x) = \sum _q Q_q\,\bar{q}(x) \gamma ^\mu q(x)\) the electromagnetic current. The scalar functions \(\mathcal{H}_\lambda (q^2)\) are given by the Lorentz decomposition:

$$\begin{aligned} \mathcal{H}^{\alpha \mu }(q,k) = M_B^2 \, \big [ S^{\alpha \mu }_\perp \,\mathcal{H}_\perp - S^{\alpha \mu }_\Vert \, \mathcal{H}_\Vert - S^{\alpha \mu }_0\, \mathcal{H}_0 \big ] \end{aligned}$$
(4)

where \(S_\lambda ^{\alpha \mu }\) are a set of structures given in the appendix.

In the heavy b-quark limit and for very small \(q^2\), the functions \(\mathcal{H}_\lambda (q^2)\) factorize into non-perturbative form factors and light-cone distribution amplitudes, up to perturbatively calculable “hard” functions [12]. However this perturbative expansion breaks down when \(q^2\) approaches \(4 m_c^2\), leading to questionable predictions for \(q^2\gtrsim 6\,\mathrm{GeV}^2\). The integral in Eq. (2) is in fact dominated by the region \(x^2\lesssim (2m_c - \sqrt{q^2})^{-2}\) [13], so for \(q^2\ll 4m_c^2\) one may expand the operator \(\mathcal{K}^\mu (x,0)\) around \(x^2 = 0\) (a light-cone operator-product expansion, or LCOPE). This leads to an expansion of Eq. (2) in powers of \((2m_c - \sqrt{q^2})^{-1}\), with matrix elements of operators that are non-local only along the light cone. This theory framework has been worked out up to NLO in \(\alpha _s\) [12, 14] and including subleading terms in the LCOPE [13], and can be safely applied for \(q^2\ll 4m_c^2\) (preferably at \(q^2<0\)). However, reliable predictions for larger values of \(q^2\) remain a challenge.

In this letter we consider a consistent, model-independent and systematically-improvable approach to determine the dominant long-distance contributions \(\mathcal{H}_\lambda (q^2)\) to \(B\rightarrow K^*\ell \ell \) in the region \(q^2 \lesssim 14\,\mathrm{GeV}^2\). It provides genuine SM predictions even in the presence of NP in semileptonic operators. In addition, this approach provides access to the inter-resonance region \(10\,\mathrm{GeV}^2 \lesssim q^2 \lesssim 13\,\mathrm{GeV}^2\). The idea is the following: We determine the analytic properties of the functions \(\mathcal{H}_\lambda (q^2)\) in the complex plane, by considering their dominant singularities. We then use this information to write down a general and model-independent parametrization. Two pieces of information are used to constrain the parametrized functions: data on \(B\rightarrow K^* J/\psi \) and \(B\rightarrow K^* \psi (2S)\), which is independent of NP in semileptonic operators; and theory at \(q^2<0\), where it is reliable. This method, which builds upon Refs. [13, 15], gives the most reliable and consistent a-priori determination of the functions \(\mathcal{H}_\lambda (q^2)\) to date. We use these results to compute SM predictions (assuming no NP in \(\mathcal{O}_{1,2}\)), and to perform a NP fit to \(C_9\). All our numerical computations are performed with the help of EOS [16], which has been modified for this purpose [17].

2 Analytic structure and parametrization

It is a standard assumption in quantum field theory that the only analytic singularities of a correlation function – as a complex function of all its complexified kinematic invariants – are those required by unitarity [18]. This principle of “maximal analyticity” can sometimes be derived from causality, and it is therefore well founded [19]. Unitarity, in turn, relates analytic singularities with on-shell intermediate states: poles for one-particle states, and branch cuts for multi-particle states. Thus, the analytic structure of a correlation function can be learned by analysing its on-shell cuts.

In the case at hand, inspection of the correlation function (2) reveals the following analytic properties of the scalar functions \(\mathcal{H}_\lambda (q^2)\):

\(\blacktriangleright \) On-shell cuts in the variable \(q^2\) include: two poles at \(q^2=M_{J/\psi }^2\simeq 9\,\mathrm{GeV}^2\) and \(q^2=M_{\psi (2S)}^2\simeq 14\,\mathrm{GeV}^2\) corresponding to one-particle intermediate states through \(B\rightarrow K^* \psi _n (\rightarrow \ell ^+\ell ^-)\), with \(\psi _1 = J/\psi \) and \(\psi _2 = \psi (2S)\); a branch cut starting at \(q^2=t_+\equiv 4 M_D^2\) corresponding to two-particle intermediate states through \(B\rightarrow K^* [\bar{D} D] (\rightarrow \ell ^+\ell ^-)\), plus other “\(c\bar{c}\)” cuts with higher thresholds; and “light-hadron” branch cuts starting at \(q^2\simeq 0\) from intermediate states such as \(B\rightarrow K^* [3\pi ](\rightarrow \ell ^+\ell ^-)\), which include finite-width effects of \(J/\psi \) and \(\psi (2S)\). The effects of these “light-hadron” cuts are OZI suppressed [20,21,22]. Given the limited precision of current data, we will neglect these OZI-suppressed contributions, keeping in mind that this assumption remains to be tested as experimental precision improves. These presumably small effects have not been considered in previous analyses.

\(\blacktriangleright \) On-shell cuts in the variable \((q+k)^2\) (the “forward” or “decay” channel) include branch cuts from intermediate states such as \(B\rightarrow \bar{D} D_s \rightarrow K^*\ell ^+\ell ^-\). The physical point \((q+k)^2=M_B^2\) lies on these cuts, which implies that the functions \(\mathcal{H}_\lambda (q^2)\) are complex-valued for all values of \(q^2\). But this imaginary part is not associated with any singularity in the variable \(q^2\). Thus, one can write \(\mathcal{H}_\lambda (q^2) = \mathcal{H}_\lambda ^\mathrm{(re)}(q^2) + i\, \mathcal{H}_\lambda ^\mathrm{(im)}(q^2)\), with \(\mathcal{H}_\lambda ^\mathrm{(re,im)}(q^2)\) satisfying the analytic properties of the previous point as functions of \(q^2\), and obeying the same dispersion relation.

These properties can be exploited to write down a general parametrization for the correlator consistent with unitarity. A convenient way to do so is to re-express the functions \(\mathcal{H}_\lambda (q^2)\) in terms of the “conformal” variable z:

$$\begin{aligned} z(q^2) \equiv \frac{\sqrt{t_+ - q^2} - \sqrt{t_+ - t_0}}{\sqrt{t_+ - q^2} + \sqrt{t_+ - t_0}}, \end{aligned}$$
(5)

where \(t_+= 4M_D^2\) and \(t_0= t_+ - \sqrt{t_+(t_+-M_{\psi (2S)}^2)}\). This transformation maps the \(c\bar{c}\) branch cut in the \(q^2\) plane to the unit circumference \(|z|=1\), and the entire first Riemann sheet in the \(q^2\) plane to the interior of the unit circle \(|z|<1\). Our choice for \(t_0\) implies that within the relevant interval \(-7\,\mathrm{GeV}^2 \le q^2 \le M_{\psi (2S)}^2\), \(|z| < 0.52\).

The approach now resembles and is inspired by the z-parametrization used for the form factors [23, 24]. The functions \(\mathcal{H}_\lambda (z) \equiv \mathcal{H}_\lambda (q^2(z))\) are meromorphic in \(|z|<1\), with two simple poles at \(z_{J/\psi }\equiv z(M_{J/\psi }^2)\simeq 0.18\) and \(z_{\psi (2S)}\equiv z(M_{\psi (2S)}^2)\simeq -0.44\). Therefore, multiplying by the corresponding zeroes will give an analytic function in \(|z|<1\) that can be Taylor-expanded around \(z=0\). This expansion should converge reasonably well in the region of interest, where \(|z| < 0.52\). This is the basis of our proposed parametrization.
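The numbers quoted above can be checked with a short script evaluating Eq. (5). The meson masses below are rounded PDG values and are an assumption of this sketch, since the exact inputs used in the analysis are not listed in this section; the function name `z_of_q2` is likewise only illustrative.

```python
import math

# Assumed meson masses in GeV (rounded PDG values, not taken from the text).
M_D = 1.86484
M_JPSI = 3.0969
M_PSI2S = 3.6861

T_PLUS = 4.0 * M_D**2
T_ZERO = T_PLUS - math.sqrt(T_PLUS * (T_PLUS - M_PSI2S**2))

def z_of_q2(q2):
    """Conformal variable of Eq. (5), for real q2 below the threshold t_+."""
    a = math.sqrt(T_PLUS - q2)
    b = math.sqrt(T_PLUS - T_ZERO)
    return (a - b) / (a + b)

z_jpsi = z_of_q2(M_JPSI**2)
z_psi2s = z_of_q2(M_PSI2S**2)
z_edge = z_of_q2(-7.0)
print(f"z(J/psi) = {z_jpsi:.2f}, z(psi(2S)) = {z_psi2s:.2f}, z(-7 GeV^2) = {z_edge:.2f}")
```

With these inputs the script gives values that agree with the quoted pole positions and with the bound \(|z|<0.52\) to two decimal places.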

To ensure that the leading terms in the expansion capture most of the features of the function (thus improving convergence), we use two more pieces of information. First, the correlator inherits all the singularities of the form factor (e.g. the \(M_{B_s^*}\) pole), and the leading OPE contribution to the correlator is indeed proportional to the form factor. Therefore it is better to parametrize the ratios \(\mathcal{H}_\lambda (q^2)/\mathcal{F}_\lambda (q^2)\) instead. Second, the poles should not modify the asymptotic behaviour. This is achieved by introducing appropriate “Blaschke factors” [23]. All in all, we propose the following parametrization:

$$\begin{aligned} \mathcal{H}_\lambda (z) = \frac{1-z\, z^*_{J/\psi }}{z-z_{J/\psi }} \frac{1-z\,z^*_{\psi (2S)}}{z-z_{\psi (2S)}} \hat{\mathcal{H}}_\lambda (z)\ , \end{aligned}$$
(6)

with

$$\begin{aligned} \hat{\mathcal{H}}_\lambda (z) = \Big [ \sum _{k=0}^K \alpha _k^{(\lambda )} z^{k} \Big ] \mathcal{F}_\lambda (z)\ , \end{aligned}$$
(7)

where \(\alpha ^{(\lambda )}_k\) are complex coefficients, and the expansion is truncated after the term \(z^{K}\). This truncation unavoidably introduces some model dependence. The maximum value that can be chosen for K will depend on the available set of experimental measurements and theory inputs.
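A minimal numerical sketch of Eqs. (6) and (7) is given below. It evaluates the ratio \(\mathcal{H}_\lambda /\mathcal{F}_\lambda \) for a truncated series and checks two properties of the pole factors: they have unit modulus on \(|z|=1\), and they produce a simple pole whose residue in z is fixed by the polynomial. The function names and the coefficients `alphas` (three entries, i.e. \(K=2\)) are made up for illustration and are not fit results.

```python
import cmath

def pole_factor(z, z_pole):
    """Pole factor (1 - z z_pole^*) / (z - z_pole) appearing in Eq. (6)."""
    return (1.0 - z * z_pole.conjugate()) / (z - z_pole)

def ratio_h_over_f(z, alphas, z_jpsi, z_psi2s):
    """H_lambda / F_lambda from Eqs. (6)-(7), truncated at K = len(alphas) - 1."""
    poly = sum(a * z**k for k, a in enumerate(alphas))
    return pole_factor(z, z_jpsi) * pole_factor(z, z_psi2s) * poly

# Pole positions as quoted in the text; the alpha_k are hypothetical numbers.
Z_JPSI, Z_PSI2S = complex(0.18), complex(-0.44)
alphas = [1.0e-4 + 0.5e-4j, -2.0e-4 + 1.0e-4j, 3.0e-4 - 1.0e-4j]

# On the unit circle each pole factor has modulus one.
z_circle = cmath.exp(0.7j)
modulus_on_circle = abs(pole_factor(z_circle, Z_JPSI))

# Near a pole, (z - z_pole) * ratio tends to the residue in the variable z.
eps = 1e-8
residue_numeric = eps * ratio_h_over_f(Z_JPSI + eps, alphas, Z_JPSI, Z_PSI2S)
residue_exact = (
    (1.0 - abs(Z_JPSI) ** 2)
    * pole_factor(Z_JPSI, Z_PSI2S)
    * sum(a * Z_JPSI**k for k, a in enumerate(alphas))
)
print(modulus_on_circle, residue_numeric, residue_exact)
```

Because the pole factors have unit modulus on the circle, they insert the charmonium poles without changing the size of the function on the \(c\bar{c}\) cut, which is the sense in which the poles leave the asymptotic behaviour untouched.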

3 Experimental constraints

According to the LSZ reduction formula [25], the amplitudes for the decays \(B\rightarrow K^* \psi _n\) (with \(\psi _1 = J/\psi \) and \(\psi _2= \psi (2S)\)) are defined by the residues of the functions \(\mathcal{H}_\lambda (q^2)\) on the \(\psi _n\) poles:

$$\begin{aligned} \mathcal{H}_\lambda (q^2 \rightarrow M_{\psi _n}^2) \sim \frac{M_{\psi _n} f^*_{\psi _n} \mathcal{A}^{\psi _n}_\lambda }{M_B^2 (q^2 - M_{\psi _n}^2)} + \cdots , \end{aligned}$$
(8)

where the dots represent regular terms. Here \(\langle 0| j_\mathrm{em}^\mu |\psi _n(q,\varepsilon ) \rangle = M_{\psi _n} f^*_{\psi _n} \varepsilon ^\mu \), and \(\mathcal{A}^{\psi _n}_\lambda \) are the \(B\rightarrow K^* \psi _n\) transversity amplitudes. The most precise constraints on these amplitudes can be obtained from BaBar [26, 27], Belle [28,29,30] and LHCb [31].

We use the data to produce two sets of five pseudo-observables (three magnitudes and two relative phases on each resonance):

$$\begin{aligned} |r_\perp ^{\psi _n}|,\, |r_\Vert ^{\psi _n}|,\, |r_0^{\psi _n}|,\, \arg \{r_\perp ^{\psi _n} r_{0}^{\psi _n*}\},\, \arg \{r_\Vert ^{\psi _n} r_{0}^{\psi _n*}\}, \end{aligned}$$
(9)

where

$$\begin{aligned} r_\lambda ^{\psi _n} \equiv \mathop{\mathrm{Res}}\limits_{q^2\rightarrow M^2_{\psi _n}} \frac{\mathcal{H}_\lambda (q^2)}{\mathcal{F}_\lambda (q^2)} \sim \frac{M_{\psi _n} f^*_{\psi _n} \mathcal{A}^{\psi _n}_\lambda }{M_B^2\, \mathcal{F}_\lambda (M_{\psi _n}^2)}\ . \end{aligned}$$
(10)

The numerical values for these pseudo-observables are obtained from the posterior-predictive distributions of a Bayesian fit. The inputs for this fit and the results are provided for completeness in the appendix. These pseudo-observables will act as constraints on the parameters of the correlators at \(z = 0.18\) and \(z = -0.44\).

4 Theory constraints

At \(q^2<0\) the functions \(\mathcal{H}_\lambda \) can be calculated with the current approaches for the large recoil region. We use QCD-factorization at next-to-leading order in \(\alpha _s\), including the form factor terms and hard-spectator contributions [12, 32]. In addition, we include the soft-gluon correction calculated via a LCSR in Ref. [13]. For the form factors we use the results from the LCSR with B-meson distribution amplitudes [2], in order to have a mutually consistent description of form factors and non-local contributions and to benefit from the theoretical correlations between the two. In this way we compute the ratios \(\mathcal{H}_\lambda (q^2)/\mathcal{F}_\lambda (q^2)\) at the points \(q^2=\{-7,-5,-3,-1\}\,\mathrm{GeV}^2\). These ratios are used as pseudo-observables to constrain the parameters in Eq. (6) at \(z=\{0.52,0.50,0.48,0.46\}\). Further details and results are presented for completeness in the appendix. We emphasize that no theory is used at \(q^2 \ge 0\) at all.

Table 1 Mean values and standard deviations (in units of \(10^{-4}\)) of the prior PDF for the parameters \(\alpha _k^{(\lambda )}\)

5 SM predictions

Fig. 1 Results of the prior and posterior fits for the ratio \(\mathrm{Re}[{\hat{\mathcal{H}}}_\perp (z)]/\mathcal{F}_\perp (z)\). See the text for details

We now perform a fit of Eq. (6) to the combined experimental and theoretical constraints described above in Sects. 3 and 4. We find that Eq. (6) with \(K=2\) provides an excellent fit to all inputs, with a p-value of 0.91. All 1D-marginalised posteriors are reasonably symmetric around their modes. The result of this fit is a set of correlated values for the complex parameters \(\alpha ^{(\lambda )}_k\), which are summarized in Table 1. These values lead to a determination of the non-local correlator in Eq. (2) that is consistent with the \(B\rightarrow K^*\psi _n\) measurements and the theory calculations at negative \(q^2\), and that is independent of new physics in semileptonic operators. This differs markedly from the approach of Ref. [1], which uses short-distance dominated \(B\rightarrow K^*\mu ^+\mu ^-\) measurements to determine the non-local correlators, thereby assuming SM values of the \(b\rightarrow s\mu ^+\mu ^-\) Wilson coefficients. As a consequence, their SM predictions for the angular observables are model-dependent posterior predictions. The study presented here does not suffer from this model dependence, and thus we determine the non-local correlators and provide a genuine SM prediction of the angular observables.

The gray band in Fig. 1 shows the result of this “prior” fit for the case of the real part of \(\mathcal{H}_\perp (q^2)\). Similar plots for the other correlators are provided in the appendix for completeness.

With these results at hand, we can compute SM predictions for all observables of interest within the range \(0\le q^2 \lesssim 14\,\mathrm{GeV}^2\). One of them is the angular observable \(P'_5\) [34], which is the visible face of the “\(B\rightarrow K^*\mu ^+\mu ^-\) anomaly” [35]. Our SM prediction for \(P_5'\) is represented by the gray band in Fig. 2. We find relatively small uncertainties and a clear tension with LHCb data (represented by purple boxes in Fig. 2).

Fig. 2 Prior and posterior predictions for \(P_5'\) within the SM and the NP \(C_9\) benchmark, compared to LHCb data

Another interesting SM prediction that we obtain from our analysis is:

$$\begin{aligned} \begin{aligned} BR(B^0\rightarrow K^{*0}\gamma )&= (4.2^{+1.7}_{-1.3}) \cdot 10^{-5}\ , \end{aligned} \end{aligned}$$
(11)

in agreement with the world average [36]. The larger uncertainties as compared to Ref. [1] are due to our doubling of the form factor uncertainties. SM predictions for all other observables will be given elsewhere.

6 New physics analysis

We now perform a fit to \(B\rightarrow K^*\mu ^+\mu ^-\) data using as prior information the SM predictions derived in Sect. 5. We include the branching ratio and the angular observables \(S_i\) [38] within the \(q^2\) bins in the region \(1 \le q^2 \lesssim 14\,\mathrm{GeV}^2\). We use the latest LHCb measurements [39, 40] and perform several separate fits: using the results from the maximum-likelihood fit excluding (LLH) or including (LLH2) the inter-resonance bin, or the results from the method of moments [41] (MOM and MOM2), in each case both with (NP fit) and without (SM fit) a floating NP contribution to \(C_9\).

The fits provide posterior distributions for the correlator, for \(B\rightarrow K^*\mu ^+\mu ^-\) and \(B\rightarrow K^*\gamma \) observables, and for \(C_9\). We first discuss some illustrative results of the LLH2 fit. The posteriors for the real part of \(\mathcal{H}_\perp (q^2)\) are shown in Fig. 1, both for the SM and the NP fits. In this case it is reassuring that both are consistent within errors with the result of the prior fit, indicating that modifying the long-distance contribution does not lead to improvement in the SM fit, and so the long-distance contribution is not likely to mimic a NP contribution.

The posterior NP prediction for \(P_5'\) (corresponding to the LLH2 fit) is shown in Fig. 2, exhibiting a much better agreement with the experimental measurements than the SM (prior) prediction.

The main conclusion of the fits is the following. The SM fits are relatively inefficient in comparison with the NP fits, with posterior odds [42] ranging from \(\sim 2.7\) to \(\sim 10\) (on the log scale) in favor of the NP hypothesis. The one-dimensional marginalized posteriors yield:

$$\begin{aligned} \text {(LLH)} : C_9&= 2.51 \pm 0.29, \end{aligned}$$
(12)
$$\begin{aligned} \text {(LLH2)}: C_9&= 3.01 \pm 0.25,\end{aligned}$$
(13)
$$\begin{aligned} \text {(MOM)} : C_9&= 2.81 \pm 0.37,\end{aligned}$$
(14)
$$\begin{aligned} \text {(MOM2)}: C_9&= 3.20 \pm 0.31. \end{aligned}$$
(15)

The corresponding pulls with respect to the SM point \(C_9^\mathrm{SM}(\mu = 4.2\,\,\mathrm{GeV}) = 4.27\) range from 3.4 to 6.1 standard deviations, and are illustrated in Fig. 3. These results, from a fit to \(B\rightarrow K^*\mu ^+\mu ^-\) data only, are in qualitative agreement with global fits [42,43,44,45,46,47,48], but rely on a better-founded theory treatment.
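As a rough cross-check, the size of these pulls can be estimated from Eqs. (12)-(15) alone. The sketch below assumes Gaussian posteriors and takes the pull as the distance to the SM point in units of the quoted uncertainty. This is a simplification: the pulls quoted in the text are derived from the full posterior distributions, so small differences are expected.

```python
C9_SM = 4.27  # SM value at mu = 4.2 GeV, as quoted in the text

# Central values and uncertainties from Eqs. (12)-(15).
fits = {
    "LLH": (2.51, 0.29),
    "LLH2": (3.01, 0.25),
    "MOM": (2.81, 0.37),
    "MOM2": (3.20, 0.31),
}

# Naive Gaussian pull: distance from the SM point in units of the posterior width.
pulls = {name: (C9_SM - mean) / sigma for name, (mean, sigma) in fits.items()}

for name, pull in pulls.items():
    print(f"{name}: pull = {pull:.1f} sigma")
```

Under this Gaussian assumption the estimates run from about 3.5 (MOM2) to about 6.1 (LLH) standard deviations, close to the quoted range.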

Fig. 3 Posterior distributions for \(C_9\) from the NP fits and their respective pulls. Dark and light shaded regions correspond to 68% and 99% probability

7 Conclusions

Analyticity provides strong constraints on the hadronic contribution to \(B\rightarrow K^*\ell \ell \) observables, and fixes the \(q^2\) dependence up to a polynomial, which under some circumstances is an expansion in a small kinematical parameter. In this letter we have exploited this idea to propose a systematic approach to determine the dominant non-local contributions, which at this time are the main source of theory uncertainty. This approach is systematically improvable with more precise data on \(B\rightarrow K^*\psi _n\) and/or more precise theory calculations at negative \(q^2\). In addition, this approach allows access to the inter-resonance region, which provides valuable information on short-distance physics. We have focused on \(B\rightarrow K^*\ell \ell \), but the approach applies to any other \(B\rightarrow M\ell \ell \) modes such as \(B\rightarrow \lbrace K,\pi ,\rho \rbrace \ell \ell \) and \(B_s\rightarrow \phi \ell \ell \).

We have performed a numerical analysis implementing this idea, and conclude that significantly improved theory predictions can be obtained, leading to a more precise and robust interpretation of experimental data and an improved sensitivity to short-distance physics. We identify two issues worth exploring further. One has to do with neglecting the OZI-suppressed cut and charmonium width. A dispersive approach should be able to exploit present and future data on charmless non-leptonic multi-body \(B\rightarrow K^* X\) decays in order to properly bound these presumably small effects. The other has to do with the convergence of the z expansion. In this respect, the fit including \(B\rightarrow K^*\ell \ell \) data can provide enough constraints to increase the order of the expansion considerably, especially in view of the extraordinary experimental prospects for the next ten years [49]. We thus believe that this approach will become very useful in future analyses of exclusive \(b\rightarrow s\) and \(b\rightarrow d\) transitions.