Hadronic uncertainties in $B\to K^*\mu^+\mu^-$: a state-of-the-art analysis

In the absence of direct evidence for New Physics at present LHC energies, the focus is set on the anomalies and discrepancies recently observed in rare $b \to s\ell\ell$ transitions which can be interpreted as indirect hints. Global fits have shown that an economical New Physics solution can simultaneously alleviate the tensions in the various channels and can lead to a significant improvement in the description of the data. Alternative explanations within the Standard Model for part of the observed anomalies have been proposed in terms of (unexpectedly large) hadronic effects at low dilepton invariant mass and attributing tensions in protected observables to statistical fluctuations or experimental errors. We review the treatment of hadronic uncertainties in this kinematic regime for one of the most important channels, $B \to K^*\mu^+\mu^-$, in a pedagogical way. We provide detailed arguments showing that factorisable power corrections cannot account for the observed anomalies and that an explanation through long-distance charm contributions is disfavoured. Some optimized observables at very low dilepton invariant mass are shown to be protected against contributions from the semileptonic coefficient $C_9$ (including any associated long-distance charm effects), enhancing their sensitivity to New Physics contributions to other Wilson coefficients. Finally, we discuss how the recent measurement of $Q_5$ by Belle (and in the future by LHCb and Belle-II) may provide a robust cross-check of our arguments.


Introduction
For many years the Standard Model (SM) has been probed and systematically confirmed in collider experiments, with tensions showing up only temporarily and in isolated channels. However, in recent years a consistent picture of tensions has emerged in interrelated channels in the flavour sector. In the 1 fb$^{-1}$ data set [1], evaluated in 2013, LHCb detected a sizeable 3.7$\sigma$ deviation in one bin of the angular observable $P_5'$ [2] in the decay $B \to K^*\mu^+\mu^-$ (the so-called $P_5'$ anomaly [3]). The fact that this anomaly was accompanied by a 2.9$\sigma$ tension in the second bin of another angular observable called $P_2$ (related to the forward-backward asymmetry) pointed, for the first time, to a coherent pattern of deviations [3]. In 2015, using the 3 fb$^{-1}$ data set [4], LHCb provided more accurate results for these angular observables, once again with a discrepancy between the measurement of $P_5'$ and the theory prediction within the SM. The same experiment also uncovered new deviations (larger than 2$\sigma$) in the $B_s \to \phi\mu^+\mu^-$ branching ratio (at low and large $\phi$ recoil) [5,6]. A few months ago, the Belle experiment performed an independent measurement of $P_5'$ [7]: the value, compatible with the LHCb measurements, again agrees poorly with the theory expectations in the SM.
Another interesting tension was observed in the ratio $R_K = \mathcal{B}_{B\to K\mu^+\mu^-}/\mathcal{B}_{B\to Ke^+e^-}$, indicating that this deviation would affect predominantly $b \to s\mu^+\mu^-$ compared to $b \to se^+e^-$ [8], and thus violate lepton-flavour universality. This difference among lepton modes was also supported by the fact that no deviation was observed in $B \to K^*e^+e^-$ data at very large $K^*$ recoil [9]. Very recently, Belle has presented a separate measurement [10] of $P_5'$ in the muon and electron channels, and hence of the observable $Q_5 = P_5^{\prime\,\mu} - P_5^{\prime\,e}$ proposed in Ref. [11]. While the muon channel exhibits a 2.6$\sigma$ deviation with respect to the SM prediction [12], in good agreement with the LHCb measurement, the electron channel agrees with the SM expectation at 1.3$\sigma$. Though it is not yet statistically significant, the result could point to a violation of lepton-flavour universality in $P_5'$ consistent with the one measured in $R_K$. If this result is confirmed by LHCb with higher statistics, and if tensions are also detected in new experimental measurements of other lepton-flavour universality ratios, like the promising $R_{K^*} = \mathcal{B}_{B\to K^*\mu^+\mu^-}/\mathcal{B}_{B\to K^*e^+e^-}$ [11-13], this would hamper any attempt to explain the $P_5'$ anomaly in terms of non-perturbative QCD effects.
It is striking that all the above-mentioned deviations can be alleviated simultaneously by a common mechanism, namely by a New Physics (NP) contribution to the short-distance coefficient of the semi-leptonic operator $O_9^\mu$, i.e. to the vector component of the $b \to s\mu\mu$ transition in the effective Hamiltonian describing these transitions at the $b$-quark scale. Global analyses of $b \to s$ decays performed by independent groups [12,14-16] following different approaches (improved QCD-Factorisation or full form factors), using different form factor input (from Ref. [17] or [18]) and different observables (optimized $P_i^{(\prime)}$ or form-factor dependent $S_i$) have established that a negative NP contribution to the Wilson coefficient $C_9^\mu$ of $\sim -25\%$ with respect to its SM value is favoured with large significance (between 4 and 5$\sigma$ depending on the hypothesis on the Wilson coefficients receiving NP contributions).
However, a controversy arose concerning the interpretation of the observed deviations in the semi-leptonic $B_{(s)}$ decays, since the predictions are plagued by perturbative and non-perturbative QCD effects and some of the non-perturbative effects may mimic a NP signal. It was argued that unexpectedly large effects could be caused by resonance tails leaking into the $q^2 < 8$ GeV$^2$ region. Very recently, LHCb measured the relative phases of the $J/\psi$ and $\psi(2S)$ with the short-distance contribution to $B \to K\mu^+\mu^-$ and reported small interference effects in dimuon mass regions far from the pole masses of the resonances [19]. The obtained fit is consistent with the global analyses [12,14-16] but finds a higher significance for a NP contribution.
Some of us discussed in Ref. [11] how, under the assumption of lepton-flavour universality violation, the presence of NP in b → sµµ can be probed in a clean way via the comparison of b → see and b → sµµ observables, in which hadronic uncertainties cancel. In the present paper we take another approach: we discuss the different sources of hadronic uncertainties and provide robust arguments disfavouring the possibility that these non-perturbative effects are the origin of the observed anomalies. At leading order (LO) in the effective Hamiltonian approach, predictions involve two types of contributions, i.e., tree-level diagrams with insertions of the operators (generated at one loop in the SM), as well as one-loop diagrams with an insertion of the charged-current operator (generated at tree level in the SM). In contributions of the first type, the leptonic and the hadronic currents factorise, and QCD corrections are restricted to the hadronic B → M current. This class of factorisable QCD corrections thus forms part of the hadronic form factors parametrising the B → M transition. Contributions of the second type, on the other hand, receive non-factorisable QCD corrections that cannot be absorbed into form factors.
Both types of corrections have to be taken into account to assess hadronic uncertainties in the computation of $B \to K^*\ell^+\ell^-$ observables. In this paper, we collect arguments to demonstrate that the hadronic uncertainties are sufficiently under control and we further present counter-arguments to recent articles claiming SM explanations based on incomplete analyses. In Sec. 2, we recall the main elements of the computation of $B \to K^*\ell^+\ell^-$ and explain our treatment of the various sources of uncertainties, applied, e.g., in the global fit in Ref. [12]. In Sec. 3, we then discuss, in a pedagogical way, the issue of scheme dependence of the factorisable power corrections of order $O(\Lambda_{\rm QCD}/m_B)$, which was pointed out for the first time in [20]. We derive explicit formulae for the contribution from factorisable power corrections to the most important observables $P_5'$, $P_2$ and $P_1$, which allow us to confirm in an analytic way the numerical findings of Ref. [20]. Moreover, we extract the amount of power corrections (including errors) contained in the form factors from Ref. [18] and find them to be small, typically at the order of 10%, in agreement with dimensional arguments. In Sec. 4, the role of $c\bar c$ loops for non-factorisable QCD corrections is discussed. In the framework of the effective Hamiltonian, these corrections correspond to a one-loop contribution from the operator $O_2$, which can be recast as a contribution to $C_9$ depending on the squared dilepton invariant mass $q^2$, the transversity amplitudes $A_j^{L,R}$ ($j = 0, \perp, \parallel$) and the hadronic states (as opposed to a universal contribution from New Physics). This $c\bar c$ contribution always accompanies the perturbative SM contribution $C_{9\,{\rm SM\,pert}}^{\rm eff}$ and the NP one $C_9^{\rm NP}$:
$$C_{9\,j}^{{\rm eff}\,B\to K^*} = C_{9\,{\rm SM\,pert}}^{\rm eff} + C_9^{\rm NP} + C_{9\,j}^{c\bar c\,B\to K^*}(q^2).$$
Using a polynomial parametrisation, we performed fits for $C_{9\,j}^{c\bar c\,B\to K^*}(q^2)$ in various scenarios. We discuss the quality of the fits and compare our results with those presented in recent articles [21,22], with an emphasis on their statistical interpretation. In Sec. 5, we provide further experimental tests of hadronic uncertainties, in particular $P_2$ at very low $q^2$, which exhibits a kinematic protection with respect to charm-loop contributions entering $C_9$, opening the door to a theoretically clean exploration of NP contributions to the Wilson coefficient $C_{10}$. We also discuss the recent measurement of the observable $Q_5$ by Belle in terms of NP and SM alternatives. Sec. 6 finally contains our conclusion, while App. A provides a dedicated comparison of the parametric uncertainties arising in recent theoretical predictions of $B \to K^*$ observables by different groups and App. B our predictions for the observable $R_{K^*}$ in several benchmark scenarios.

An overview of the computation of $B \to K^*\ell^+\ell^-$ observables
The theoretical framework used in Refs. [12,20] to describe the decay $B \to K^*\ell^+\ell^-$ at low squared invariant dilepton masses $q^2$ (where the most significant tensions with the SM were found) is based on QCD factorisation (QCDF) supplemented by a sophisticated estimate of the power corrections of order $\Lambda_{\rm QCD}/m_B$ (improved QCDF). The use of effective theories [23-26] allows one to relate the different $B \to K^*$ form factors at leading order in $\Lambda_{\rm QCD}/m_B$ and $\Lambda_{\rm QCD}/E$, where $E$ is the energy of the $K^*$. This procedure reduces the required hadronic input from seven to two independent form factors, the so-called soft form factors $\xi_\perp$, $\xi_\parallel$, which in the region of low $q^2$ can be calculated using light-cone sum rules (LCSR). Two sets of LCSR form factors are available in the literature which have been calculated with very different approaches: KMPW form factors [17] that were computed using $B$-meson distribution amplitudes, and BSZ form factors [18] that make use of light-meson distribution amplitudes and a prevalent application of equations of motion. The better knowledge of $K^*$-meson distribution amplitudes led to results with a smaller uncertainty in the BSZ case compared to the KMPW computation. In Refs. [12,20] we took advantage of the possibility of comparing the results for the two different sets of form factors as a robustness test of the optimized observables $P_i^{(\prime)}$ [27,28]. For our default predictions we relied on the KMPW form factors, which have larger uncertainties and thus lead to more conservative predictions for observables. By construction the choice of the set of form factors has a relatively low impact on optimized observables, but it has a large impact on the error size of form-factor sensitive observables like the longitudinal polarisation $F_L$ or the CP-averaged angular coefficients $S_i$.
The large-recoil symmetry limit is enlightening as it allows us to understand the main behaviour of optimized observables in the presence of New Physics in a form-factor independent way. However, for precise predictions of these observables it has to be complemented with different kinds of corrections, separated in two classes: factorisable and non-factorisable corrections. Improved QCDF provides a systematic formalism to include the different corrections as a decomposition of the amplitude of the following schematic form [25]:
$$\langle \ell^+\ell^- \bar K^*_i | H_{\rm eff} | \bar B \rangle = \sum_a \left[ C_{i,a}\,\xi_a + \Phi_B \otimes T_{i,a} \otimes \Phi_{K^*,a} \right] + O(\Lambda_{\rm QCD}/m_b). \qquad (4)$$
Here, $C_{i,a}$ and $T_{i,a}$ ($a = \perp, \parallel$) are perturbatively computable contributions for the various $K^*$ polarisations ($i = 0, \perp, \parallel$) and $\Phi_{B,K^*}$ denote the light-cone distribution amplitudes of the $B$- and $K^*$-mesons.
• Factorisable corrections are the corrections that can be absorbed into the (full) form factors $F$ by means of a redefinition at higher orders in $\alpha_s$ and $\Lambda_{\rm QCD}/m_B$:
$$F(q^2) = F^\infty(\xi_\perp(q^2), \xi_\parallel(q^2)) + \Delta F^{\alpha_s}(q^2) + \Delta F^{\Lambda}(q^2). \qquad (5)$$
The two types of corrections to the leading-order form factor $F^\infty(\xi_\perp, \xi_\parallel)$ are factorisable $\alpha_s$-corrections $\Delta F^{\alpha_s}$ and factorisable $O(\Lambda_{\rm QCD}/m_B)$ corrections $\Delta F^{\Lambda}$. While the former can be computed within QCDF and are related to the prefactors $C_{i,a}$ of Eq. (4), the latter, which can be parameterized as an expansion in $q^2/m_B^2$, represent part of the $O(\Lambda_{\rm QCD}/m_b)$ terms of Eq. (4) that QCDF cannot predict. In our approach we obtain central values for the $\Delta F^{\Lambda}$ corrections by performing a fit to the full LCSR form factors $F^{\rm LCSR}$, yielding results of typically $(5-10)\% \times F^{\rm LCSR}$ in size, as expected for $O(\Lambda_{\rm QCD}/m_B)$ corrections. The errors associated to $\Delta F^{\Lambda}$ are estimated by varying $\Delta F^{\Lambda}$ in an uncorrelated way in the range of $10\% \times F^{\rm LCSR}$ around the central values. Even though there is no rigorous way of validating this assumption on the error size of power corrections, we have already shown in Ref. [20] that our error assignment of 10% for power corrections is conservative with respect to the central values of KMPW form factors, and we show that the same applies for the BSZ form factors [18] (including uncertainties). In Sec. 3 we will further discuss the dependence of improved QCDF predictions on the scheme, i.e., on the choice of definition for $\xi_{\perp,\parallel}$ in terms of full form factors. We will argue that an appropriate scheme is one that naturally minimizes the sensitivity to power corrections in the relevant observables like $P_5'$.
• Non-factorisable corrections refer to corrections that cannot be absorbed into the definition of the form factors due to their different structure. One can identify two types of such corrections. On the one hand, there are non-factorisable $\alpha_s$-corrections originating from hard-gluon exchange in diagrams with insertions of four-quark operators $O_{1-6}$ and the chromomagnetic operator $O_8$: they can be calculated in QCDF [25] and contribute to $T_{i,a}$ in Eq. (4). On the other hand, there are non-factorisable power corrections of $O(\Lambda_{\rm QCD}/m_b)$, some of them involving $c\bar c$ loops. The long-distance $c\bar c$-loop contribution is included as an additional uncertainty, estimated on the basis of the only existing computation [17] of soft-gluon emission from four-quark operators involving $c\bar c$ currents. The calculation in Ref. [17] was done in the framework of LCSRs with $B$-meson distribution amplitudes and makes use of a hadronic dispersion relation to obtain results in the whole large-recoil region. Taken at face value, the resulting correction would increase the anomaly [3]. However, in our predictions of observables, we add the corresponding corrections to the three transversity amplitudes with prefactors $s_i$ that are scanned from $-1$ to $+1$. In this way we allow for the possibility that a large relative phase could flip the sign of the long-distance charm contribution [12]. We note that our conservative approach typically leads to larger uncertainties for observables as compared to other estimates in the literature [18,29].
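The scan over the prefactors $s_i$ described in the second item can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in Ref. [12]: the function `scan_envelope`, the toy observable and all numerical inputs are hypothetical, and the amplitudes are taken real for simplicity.

```python
import itertools
import numpy as np

def scan_envelope(observable, amplitudes, ld_corrections, n_steps=5):
    """Return (min, max) of `observable` while each prefactor s_i in [-1, 1]
    multiplies the long-distance correction added to amplitude i."""
    grid = np.linspace(-1.0, 1.0, n_steps)
    values = []
    for s in itertools.product(grid, repeat=len(amplitudes)):
        shifted = [a + si * d for a, si, d in zip(amplitudes, s, ld_corrections)]
        values.append(observable(shifted))
    return min(values), max(values)

# Hypothetical toy inputs (invented numbers, not taken from the paper)
toy_amplitudes = [1.0, 0.8, 0.6]   # stand-ins for the three transversity amplitudes
toy_ld = [0.10, 0.05, 0.05]        # stand-ins for the charm-loop shifts

def toy_observable(a):
    # Toy ratio of amplitudes, only to have something to scan
    return a[0] / (a[1] + a[2])

lo, hi = scan_envelope(toy_observable, toy_amplitudes, toy_ld)
```

The interval `(lo, hi)` then plays the role of the additional uncertainty band: because the sign of each shift is left free, the estimate does not rely on the sign obtained in the underlying calculation.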
Finally, it is interesting to notice that the use of a different theoretical approach (full form factors [30]) and of different hadronic input (BSZ form factors [18]) gives results for the relevant observable P 5 that are in good agreement with ours (the predictions agree within 1σ in every bin). In the following, we discuss the impact of the two types of low-q 2 hadronic uncertainties in more detail: factorisable power corrections (Sec. 3) and long-distance cc loops (Sec. 4).

Anatomy of factorisable power corrections
In the region of large recoil of the K * meson, the non-perturbative form factors needed for the prediction of B → K * µ + µ − are available from two different LCSR calculations in Refs. [17] (KMPW) and [18] (BSZ). In Ref. [18], the set of form factors has been provided together with the corresponding correlations, essential for the cancellation of the form factors at LO in optimized observables. Instead of using the results provided in Ref. [18], the dominant correlations can alternatively be assessed from first principles, by means of large-recoil symmetries which relate the seven form factors among each other. Among the advantages of this second method, the correlations are free from the model assumptions entering the particular LCSR calculation and the method can be applied also to sets of form factors for which the correlations have not been specified, e.g., Ref. [17]. As a drawback, these correlations are obtained only at leading order, and symmetry-breaking corrections of order O(Λ/m B ) have to be estimated from dimensional arguments, implying a scheme dependence of the predictions at O(Λ/m B ). We will discuss this scheme dependence in the following.
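For orientation, the large-recoil symmetry relations referred to as Eq. (7) are commonly quoted in the literature [23-26] in the form below. The normalisation conventions shown are the standard ones and are an assumption here, since the displayed equation is not available in the present text.

```latex
\begin{align}
\frac{m_B}{m_B+m_{K^*}}\,V(q^2)
  = \frac{m_B+m_{K^*}}{2E}\,A_1(q^2)
  = T_1(q^2)
  = \frac{m_B}{2E}\,T_2(q^2) &= \xi_\perp(E), \\
\frac{m_{K^*}}{E}\,A_0(q^2)
  = \frac{m_B+m_{K^*}}{2E}\,A_1(q^2) - \frac{m_B-m_{K^*}}{m_B}\,A_2(q^2)
  = \frac{m_B}{2E}\,T_2(q^2) - T_3(q^2) &= \xi_\parallel(E).
\end{align}
```

Each equality holds only at leading order in $\alpha_s$ and $\Lambda/m_B$, which is why selecting one particular form factor to define $\xi_\perp$ or $\xi_\parallel$ amounts to a choice of scheme.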

Scheme dependence
Theoretical predictions for the decay $B \to K^*\ell^+\ell^-$ depend on seven hadronic form factors usually denoted as $V, A_0, A_1, A_2, T_1, T_2, T_3$. For small invariant dilepton masses $q^2 \ll m_B^2$ (large-recoil limit), and at leading order in $\alpha_s$ and $\Lambda/m_B$, the set of form factors becomes linearly dependent through the large-recoil symmetry relations of Eq. (7) [23-26], in which $m_B$ and $m_{K^*}$ are the meson masses and $E$ is the energy of the $K^*$. Within the above-mentioned approximations the number of independent form factors thus reduces to two, the so-called soft form factors $\xi_\perp$ and $\xi_\parallel$, and the full set of form factors $V, A_0, A_1, A_2, T_1, T_2, T_3$ can be obtained as linear combinations of $\xi_\perp$, $\xi_\parallel$. Eq. (7) allows us to construct observables in which the form factors cancel at leading order. As an illustration, let us focus on $q^2 = 0$, where the first relation in Eq. (7) fixes the ratios among $V(0)$, $A_1(0)$ and $T_1(0)$ [Eq. (8)], while $T_1(0)/T_2(0) = 1$ holds exactly due to a kinematic identity from the definition of $T_1$ and $T_2$. Observables involving ratios like the ones in Eq. (8) are thus free of form-factor uncertainties at leading order. Accurate QCDF predictions rely in an essential way on quantifying the uncertainty due to power-suppressed $\Lambda/m_B$ effects. This is typically done by assigning uncorrelated errors of the size $\delta \sim 10\%$ to Eq. (7) (and thus to the ratios in Eq. (8)). Note, however, that this cannot be done in a unique way. Let us, for instance, assume that the errors on $A_1(0)/T_1(0)$ and $T_1(0)/V(0)$ are given by $\delta_1$ and $\delta_2$, respectively [Eq. (9)]. The error $\delta_3$ on the ratio $A_1(0)/V(0)$ is then fixed by
$$\delta_3 = \sqrt{\delta_1^2 + \delta_2^2} \quad {\rm or} \quad \delta_3 = \delta_1 + \delta_2, \qquad (10)$$
depending on whether uncertainties are propagated in quadrature or linearly. The assumption of a universal error size $\delta_1 = \delta_2 \equiv \delta$ for the first two ratios thus leads to an error $\delta_3 = \sqrt{2}\,\delta$ or $\delta_3 = 2\delta$ for the third one, although in principle the three ratios should be treated on an equal footing.
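The two ways of propagating the errors on the ratios can be made explicit with a few lines of code. The sketch below assumes small, uncorrelated relative errors and first-order propagation; the function name `ratio_error` is introduced here purely for illustration.

```python
import math

def ratio_error(delta_1, delta_2, linear=False):
    """Relative error on A1/V = (A1/T1) * (T1/V), given relative errors
    delta_1 on A1/T1 and delta_2 on T1/V (first order, uncorrelated)."""
    if linear:
        return delta_1 + delta_2          # linear propagation
    return math.hypot(delta_1, delta_2)   # propagation in quadrature

delta = 0.10                              # universal 10% assignment
quadrature = ratio_error(delta, delta)
linear = ratio_error(delta, delta, linear=True)
```

With a universal $\delta = 10\%$ on the first two ratios, the third ratio thus carries an error larger by a factor $\sqrt{2}$ or 2, which is the asymmetry between the three ratios discussed above.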
The same phenomenon can also be understood from a different point of view. In the QCDF approach, predictions of observables depend on the two soft form factors $\xi_\perp$ and $\xi_\parallel$, for which hadronic input (from LCSR) is needed. According to Eq. (7), there are various possibilities to select the input among the seven full form factors $V, A_1, A_2, A_0, T_1, T_2, T_3$, and the choice defines an input scheme. One possible choice (scheme 1, Eq. (11)) consists in defining $\xi_\perp$ from $V$. A different choice (scheme 2) consists in identifying $\xi_\perp$ with $T_1$, so that $V$ and $A_1$ follow from the symmetry relations as in Eq. (13), where $a_V^{\alpha_s}$, $a_{A_1}^{\alpha_s}$ and $a_V^{\Lambda}$, $a_{A_1}^{\Lambda}$ are $\alpha_s$ and $\Lambda/m_B$ corrections to Eq. (7) for each form factor and terms of higher orders are left implicit. If Eq. (13) were determined to all orders in $\alpha_s$ and $\Lambda/m_B$, predictions for observables would not depend on the chosen input scheme. In practice, QCD corrections are known in QCDF up to $O(\alpha_s^2)$ [31,32] while $\Lambda/m_B$ corrections can only be estimated, implying a scheme dependence in the computation of the observables at $O(\Lambda/m_B)$ and $O(\alpha_s^3)$. While the form factors taken as input inherit their uncertainties directly from the LCSR calculation, the remaining form factors receive an additional error for the unknown $\Lambda/m_B$ corrections $a^{\Lambda}$. In the example above (scheme 2), the error on $T_1(0)$ is simply $\Delta T_1^{\rm LCSR}(0)$, the uncertainty of the LCSR calculation, while $V(0)$ and $A_1(0)$ are subject to two main sources of uncertainties, namely the error $\Delta T_1^{\rm LCSR}(0)$ of the LCSR calculation and the uncertainties $\Delta a^{\Lambda}_{V,A_1}$ from unknown power corrections (we neglect the uncertainty $\Delta a^{\alpha_s}_{V,A_1}$ from the perturbative contribution). On the other hand, if we had chosen $V(0)$ or $A_1(0)$ directly as input for the soft form factor $\xi_\perp(0)$, the only source of error for $V(0)$ or $A_1(0)$ would have been the respective LCSR error $\Delta V^{\rm LCSR}(0)$ or $\Delta A_1^{\rm LCSR}(0)$. The choice of scheme thus defines the precision to which the various full form factors are known, keeping those taken as input free from a pollution by power corrections.
The freedom to choose between different input schemes is equivalent to the ambiguity in implementing the 10% requirement on the symmetry-breaking corrections to Eqs. (7) and (8). In scheme 2, the uncertainties on the form factor ratios are given in Eq. (16), and we find that the resulting errors are in agreement with Eqs. (9) and (10). How can the ambiguity from the scheme dependence be solved? To answer this question, let us first have a look at the decay $B \to K^*\gamma$. The prediction of this branching ratio depends on the single form factor $T_1(0)$ and the natural choice thus consists in taking its LCSR value directly as input for the theory predictions. Of course, one could take as input any other form factor to which $T_1$ is related through the symmetry relations in Eq. (7), e.g. $V$. Unlike $T_1$, the choice of $V$ would generate power corrections of $O(\Lambda/m_B)$ in the prediction for $B \to K^*\gamma$, reflecting the fact that the identification $V = T_1$ is only an approximation, valid up to $O(\Lambda/m_B)$, and that the "wrong" form factor, $V$, has been used for the prediction instead of the "correct" one, $T_1$. The corresponding increase in the uncertainties is thus caused artificially by an inappropriate choice of the input scheme. This becomes even more obvious in the hypothetical limit where the errors of the LCSR calculation go to zero: in this case, the prediction for $B \to K^*\gamma$ would be free from any form factor uncertainty (as it should be) when $T_1$ is taken as input, while the wrong central value would be obtained when $V$ is used, together with an irreducible error of order $O(\Lambda/m_B)$. The example of $B \to K^*\gamma$ clearly illustrates the fact that an inappropriate choice of scheme can artificially increase the uncertainty of the theory prediction. The situation is less obvious in the case of $B \to K^*\mu^+\mu^-$, where typically all seven form factors enter the prediction of the observables.
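The $B \to K^*\gamma$ argument can be illustrated numerically. The sketch below is a toy model under explicit assumptions: the rate is taken to scale as $|T_1(0)|^2$, errors are propagated to first order, the LCSR error and the power-correction error are combined in quadrature, and the function name `br_relative_error` is hypothetical.

```python
import math

def br_relative_error(lcsr_rel_error, power_correction=0.0):
    """Relative error on a quantity proportional to |T1(0)|^2.

    lcsr_rel_error:   relative LCSR error of the form factor used as input
    power_correction: relative size delta of unknown Lambda/m_B terms
                      (zero if T1 itself is the input form factor)
    """
    ff_error = math.hypot(lcsr_rel_error, power_correction)
    return 2.0 * ff_error   # squaring doubles the relative error

delta = 0.10
# Hypothetical limit of vanishing LCSR errors
t1_input = br_relative_error(0.0)                         # T1 taken as input
v_input = br_relative_error(0.0, power_correction=delta)  # V taken as input
```

In this limit the prediction with $T_1$ as input has no form-factor error at all, whereas the prediction with $V$ as input retains an irreducible error set by $\delta$, although the underlying hadronic knowledge is the same in both cases.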
Ignoring the form factor $A_0$, whose contribution is suppressed by the lepton mass, we observe that the form factors $V, A_1, A_2$ enter the amplitudes together with the Wilson coefficients $C_{9,10}^{(\prime)}$, whereas $T_1, T_2, T_3$ enter the amplitudes together with the coefficient $C_7^{(\prime)}$. In the SM, $|C_7^{\rm eff}| \ll {\rm Re}(C_9^{\rm eff})$ (where the effective coefficients $C_{7,9}^{\rm eff}$ include effects from perturbative $q\bar q$ loops), e.g. $C_7^{\rm eff}(q_0^2) = -0.29$ and ${\rm Re}(C_9^{\rm eff})(q_0^2) = 4.7$ at $q_0^2 = 6$ GeV$^2$. Hence the (axial-)vector form factors $V, A_1, A_2$ are in general more relevant than the tensor form factors $T_1, T_2, T_3$, except for the very low $q^2$ region where the $C_7$ contribution can be enhanced by the $1/q^2$ pole from the photon propagator. In particular in the anomalous bins of the observable $P_5'$ ($4 \le q^2 \le 8$ GeV$^2$), we find that the impact from $C_7$ is strongly suppressed compared to the impact from $C_9$. This can be seen by setting some of the Wilson coefficients to zero and determining the resulting change in the predictions: one gets a shift of $\Delta P_5'(C_7 = 0)_{[4,6]} = -0.19$ when $C_7$ is switched off, compared to $\Delta P_5'(C_9 = 0)_{[4,6]} = +1.34$ when $C_9$ is switched off. With respect to the soft form factor $\xi_\perp$, the observable $P_5'$ is thus dominated by the ratio $A_1/V$, suggesting the form factor $V$, or alternatively $A_1$, as a natural input for $\xi_\perp$. Defining $\xi_\perp$ from $T_1$, as done in Refs. [29,33], on the other hand represents an inadequate choice: to a good approximation, the prediction of $P_5'$ in the anomalous bins does not depend on this form factor, due to a suppression by $|C_7/C_9| \ll 1$. Together with the linear propagation of errors applied in Refs. [29,33], the choice of $T_1$ as input leads to an artificial inflation of the uncertainty by a factor of 2 in the anomalous bins of $P_5'$, as we demonstrated in Eqs. (10) and (16).
In other words, we conclude that the results on $P_5'$ obtained in Ref. [33] correspond to an implicit assumption of 20% power corrections, because this is the size of symmetry breaking implicitly assumed for the dominant form factor ratio $A_1/V$. The situation is different for observables that vanish in the limit $C_7 \to 0$, i.e. that depend on $C_7$ already at leading order in $C_7/C_9$, like the observable $P_2$. In this case, it is not clear a priori whether the observable is more sensitive to the (axial-)vector or to the tensor form factors, and the answer to this question requires a closer inspection (see Sec. 3.3). In summary, in the soft-form factor approach, we expect the uncertainties of our predictions to be scheme dependent. An inappropriate choice of definition for the soft form factors will inflate the errors on the predictions. For each observable, we should thus choose a scheme as appropriate as possible to avoid an overestimation of the uncertainties.
Form factor | $a_F$ | $b_F$ | $c_F$ | $r(0\,{\rm GeV}^2)$ | $r(4\,{\rm GeV}^2)$ | $r(8\,{\rm GeV}^2)$
$A_0$ | 0.000 ± 0.000 | 0.054 ± 0.033 | 0.197 ± 0.203 | 0.000 ± 0.000 | 0.026 ± 0.020 | 0.055 ± 0.047
 | ± 0.000 | ± 0.054 | ± 0.112 | ± 0.000 | ± 0.020 | ± 0.038
Table 1: Results for the fit of the power-correction parameters $a_F$, $b_F$, $c_F$ to the $B \to K^*$ form factors from Ref. [18], using the input scheme 1 in the transversity basis. Furthermore, the relative size $r(q^2)$ with which the power corrections contribute to the full form factors is shown for $q^2 = 0, 4, 8$ GeV$^2$. In the first line of each entry, the central value and the error obtained from the fit are given. In the second line, the estimate based on power corrections of generic size $\delta = 10\%$ is given.

Correlated fit of power corrections to form factors
Having clarified the issue of the scheme dependence, we can turn to the question of the actual size δ of the symmetry breaking corrections. Both Refs. [20] and [33] use δ = 10% as an error estimate. It is instructive to study how this ad-hoc value compares to the size of power corrections present in specific LCSR calculations. In Ref. [20], we extracted information on power corrections from the LCSR form factors in Refs. [17] (KMPW) and [34] (BZ), and we will discuss the results from Ref. [18] in a similar way, checking the robustness of this extraction.
The form factors F are parametrised according to Eq. (5). For a specific set of LCSR form factors {F LCSR (q 2 )}, the power corrections ∆F Λ (q 2 ) can then be determined as the difference between the full F LCSR (q 2 ) and the large-recoil result F ∞ (q 2 ) upon including α s -corrections ∆F αs (q 2 ) from QCDF. In practice we fit the coefficients a F , b F , c F of the parametrisation to the central value of the LCSR results. In Tab. 1 we show the results obtained within this approach initiated in Ref. [20] and applied now to the form factors from Ref. [18]. In contrast to previous LCSR calculations, Ref. [18] for the first time provided the correlations among the form factors, enabling us to fit not only the central values of the parameters a F , b F , c F but also their uncertainties according to the correlation matrix of the form factors, which will serve us to illustrate the good control of our method of factorisable power corrections. Tab. 1 displays the results for the input scheme 1, defined in Eq. (11), and parametrising power corrections in the transversity basis {V, A 1 , A 2 , A 0 , T 1 , T 2 , T 3 } (this corresponds to the default choice in Ref. [20]).
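The fit of the coefficients $a_F$, $b_F$, $c_F$ to central values can be sketched as an ordinary least-squares problem. The code below is an illustration only: it assumes a quadratic expansion of $\Delta F^\Lambda$ in $q^2/m_B^2$, uses an approximate $B$-meson mass, and replaces the difference $F^{\rm LCSR} - F^\infty - \Delta F^{\alpha_s}$ by invented toy numbers. The name `fit_power_corrections` is hypothetical.

```python
import numpy as np

M_B = 5.28  # GeV, approximate B-meson mass (assumed value)

def fit_power_corrections(q2, delta_f):
    """Least-squares fit of delta_f(q2) = a + b*x + c*x**2 with x = q2 / M_B**2."""
    x = np.asarray(q2, dtype=float) / M_B**2
    design = np.vstack([np.ones_like(x), x, x**2]).T
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(delta_f, dtype=float), rcond=None)
    return coeffs  # array([a, b, c])

# Toy stand-in for F_LCSR - F_infinity - Delta F_alpha_s on a q2 grid (invented)
q2_grid = np.linspace(0.0, 8.0, 9)          # GeV^2
x_grid = q2_grid / M_B**2
toy_delta_f = 0.02 + 0.05 * x_grid + 0.20 * x_grid**2

a_fit, b_fit, c_fit = fit_power_corrections(q2_grid, toy_delta_f)
```

This unweighted version only reproduces central values. A correlated fit of the kind described above would in addition weight the residuals with the inverse covariance matrix of the form factors, which is what allows uncertainties on $a_F$, $b_F$, $c_F$ to be extracted.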
The relative size of power corrections, $r(q^2) = |\Delta F^{\Lambda}(q^2)/F(q^2)|$, is displayed on the right-hand side of Tab. 1 for different invariant masses $q^2 = 0$ GeV$^2$, 4 GeV$^2$, 8 GeV$^2$ of the lepton pair. Typically, the central values of the power corrections are within the range of $(5-10)\%$, with uncertainties below 5%. These findings are in line with the results for the central values of the form factors from Refs. [34] (BZ) and [17] (KMPW) obtained in Ref. [20]. Exceptions occur at large $q^2$ for the form factors $A_2$ and $T_3$, which are calculated as a linear combination of two functions in Ref. [18]. In the case of $A_2$, the central values of the power corrections reach up to 19%, while the respective uncertainties still do not exceed 10%. Note that in scheme 1, the power corrections to $A_2$ are not an independent function, but they are fixed from the ones to $A_1$ as detailed in Ref. [20]. In the case of $T_3$, the central values are quite small but come with uncertainties that grow up to 18%. It turns out that the power corrections to these two form factors have no impact on the key observables $P_5'$, $P_1$ and $P_2$, as can be seen from the analytic formulae in Sec. 3.3, where these terms are either absent or numerically suppressed.
Figure 1: The form factor ratio $A_1/V$ for the form factors from Ref. [18]. Left: error band according to the LCSR calculation from Ref. [18]. Right: error bands following the soft form factor approach with $\delta = 5\%, 10\%, 20\%$ power corrections.
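As a worked example, the size of the power correction can be evaluated from fitted coefficients. The sketch below assumes the quadratic form $\Delta F^\Lambda(q^2) = a_F + b_F\,q^2/m_B^2 + c_F\,q^4/m_B^4$ and an approximate $B$-meson mass, and it reads the first three entries of the $A_0$ row of Tab. 1 as $a_F$, $b_F$, $c_F$, which is an assumption about the column order. The function names are hypothetical.

```python
M_B = 5.28  # GeV, approximate B-meson mass (assumed value)

def power_correction(q2, a, b, c):
    """Delta F^Lambda(q2) = a + b*x + c*x**2 with x = q2 / M_B**2 (assumed form)."""
    x = q2 / M_B**2
    return a + b * x + c * x**2

def relative_size(q2, a, b, c, form_factor):
    """r(q2) = |Delta F^Lambda(q2) / F(q2)| for a given full form factor value."""
    return abs(power_correction(q2, a, b, c) / form_factor)

# Central values read from the A_0 row of Tab. 1 (column order assumed)
a_A0, b_A0, c_A0 = 0.000, 0.054, 0.197
shift_at_4 = power_correction(4.0, a_A0, b_A0, c_A0)   # q2 = 4 GeV^2
shift_at_8 = power_correction(8.0, a_A0, b_A0, c_A0)   # q2 = 8 GeV^2
```

Since $a_{A_0}$ vanishes, the correction is zero at $q^2 = 0$ and grows with $q^2$, which matches the qualitative pattern of the quoted $r(q^2)$ values. Computing $r(q^2)$ itself requires the value of the full form factor, which has to be supplied from Ref. [18].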
For comparison, Tab. 1 also features the estimate of power corrections by a generic size of $\delta = 10\%$, following the approach of Ref. [20] to estimate the uncertainties on $a_F$, $b_F$, $c_F$ in the absence of information on the correlations among form factors. By definition, the ratio $r(q^2)$ yields 10% for these estimates for all form factors, except for $A_0$ and $A_2$, where the power corrections are not independent but follow from correlations among form factors. The comparison with the results from the fit shows that the estimate of power corrections by a generic size of $\delta = 10\%$ in Ref. [20] is conservative compared to the procedure followed in Refs. [14-16, 21, 22], which consists in a direct extraction of the errors from the uncertainties given in Ref. [18]. This is further illustrated in Fig. 1, where the form factor ratio $A_1/V$ dominating the observable $P_5'$ is shown, comparing the direct error assessment from Ref. [18] (left plot) and our results from uncertainty assignments of $\delta = 5\%, 10\%, 20\%$ power corrections (right plot).
Let us now illustrate how the treatment of power corrections affects the uncertainties of relevant $B \to K^*$ observables. Taking the above results, and following a similar procedure for the scheme 2 defined in Sec. 3.1, we can compute the SM prediction for a particular observable (Tab. 2) with three different options for the treatment of power corrections:
a) Assigning a generic size of $\delta = 10\%$ to the power corrections, with the correlations dictated by the large-recoil symmetries, and using the form factors from Ref. [18] only to extract the soft form factors $\xi_\perp$ and $\xi_\parallel$, which are considered as uncorrelated.
b) Determining the errors of $a_F$, $b_F$, $c_F$ from the fit to the form factors from Ref. [18] but including only the correlations dictated by the large-recoil symmetries exactly as in the previous case.
c) Determining the errors of $a_F$, $b_F$, $c_F$ from a correlated fit to the form factors from Ref. [18] and including the correlations between the $a_F$, $b_F$, $c_F$ and the soft form factors $\xi_\perp$, $\xi_\parallel$ as extracted from the correlation matrix in Ref. [18].
Table 2: SM predictions. Results are shown for the three different options for the treatment of power corrections and for the two different input schemes discussed in the text. The last row contains the prediction from a direct use of the full form factors from Ref. [18].
The error estimate in option a) is mainly based on the fundamental large-recoil symmetries and thus to a large extent independent of the details of the particular LCSR calculation [18]. When going over option b) to c), we include in each step more information from Ref. [18] (the actual size of power corrections for option b), and the correlations for option c)). With option c) the full information from the particular LCSR form factors is used, implying that the result must be independent of the input scheme (apart from a residual scheme dependence from non-factorisable power corrections) and that it must coincide with the one obtained by a direct use of the correlated full form factors (displayed in the last row of Tab. 2). The numerical confirmation of this correspondence provides a consistency check for our implementation of the fit of the power corrections and the various methods. In Tab. 2, the errors obtained in option b) are very similar to the ones using option c). From this observation we conclude that the correlations among the power correction parameters a F , b F , c F and the ones among the soft form factors ξ ⊥ , ξ have very little impact and that the dominant form factor correlations are indeed the ones from the large-recoil symmetries. The difference in the errors for option a) between scheme 1 and scheme 2 is easily understood: while the LCSR results of Ref. [18] end up with about δ ∼ 5% power corrections, a generic size of δ = 10% is assumed for option a). In scheme 1, this leads to the expected increase of the errors by roughly a factor 2. On the other hand, in scheme 2, we find an increase of the errors by more than a factor 4, in accordance with the discussion in the previous section. As argued there, the implementation of option a) in scheme 2 actually corresponds to the assumption of δ = 20% power corrections for the relevant form factor ratio A 1 /V .

Analytic formulae for factorisable power corrections to optimized observables
We have considered a particular observable and demonstrated numerically that predictions for observables depend on the scheme chosen for the soft form factors ξ ⊥ and ξ ∥ . In this section we illustrate this scheme dependence more explicitly by giving analytic formulae for the power corrections to the observables P 5 , P 1 and P 2 , both in the transversity and in the helicity basis. The two bases are related to each other via the relations given in Eq. (31) of Ref. [29]. In both cases we parametrize the power corrections according to Eq. (17). The formulae are given without fixing a particular scheme, i.e., before power corrections are partially absorbed into the non-perturbative input parameters ξ ⊥ and ξ ∥ .
In the helicity basis, the formula for P 5 reads whereξ = (E K * /m K * ) ξ and following Ref. [33], we have defined We denote the large-recoil expression as P 5 | ∞ and leave aside non-local terms, corresponding to non-factorisable corrections. Our result agrees with Eq. (25) of Ref. [33] for the terms proportional to a V − , a T − , a V 0 , a T 0 , but we find an additional term proportional to a V + . We would like to stress that precisely this term, which is hidden in "further terms" and not discussed in Ref. [33], dominates the power corrections in the anomalous region around q 2 0 ∼ 6 GeV 2 , as can be seen from the numerical evaluation of Eq. (19): This means that the discussion on the scheme dependence of P 5 in Ref. [33] only takes into account numerically subleading contributions. Converted into the transversity basis, Eq. (19) becomes with the dominant term being proportional to the combination a A 1 −a V of power correction parameters. If A 1 or V is chosen as input for ξ ⊥ , the corresponding parameter a A 1 or a V vanishes identically. On the other hand, if T 1 is taken as input, both a A 1 or a V survive and their independent variation leads to an increase of the errors associated to power corrections. This behaviour explains part of the inflated errors in Ref. [29] and it is analytically pinned down in Eqs. (19) and (25). The formulae support the numerical analysis reported in Fig. 2 of Ref. [20], where the binned predictions for P 1 , P 2 , P 4 , P 5 were given in the two schemes with ξ ⊥ defined from V or T 1 , respectively. Without any further assumption on the correlations between the parameters a F , Eqs. (19) and (25) manifest an explicit scheme dependence whose origin and interpretation was discussed in detail in Sec. 3.1.
For the observable P 1 , which vanishes in the large-recoil limit, we find in the helicity basis turning in the transversity basis into Our result, Eq. (23), fully agrees with Eq. (26) of Ref. [33]. The authors of Ref. [33] used this result to argue that P 1 should be much cleaner than P 5 because it only involves one soft form factor and a lower number of power correction parameters a F . However, the total number of power correction parameters is not the relevant criterion to decide whether an observable is clean: as seen before, in the case of P 5 the coefficients in front of the power correction parameters exhibit a strong hierarchy, so that in practice only one term becomes relevant. As a matter of fact, the leading power corrections for both P 5 and P 1 stem from a V + and the respective coefficients are of the same size, as seen when comparing the evaluation of Eq. (23) for q 2 0 = 6 GeV 2 , with the corresponding one for P 5 from Eq. (21). Therefore, P 1 and P 5 are on an equal footing with respect to power corrections, and all statements above, regarding the scheme dependence of P 5 , also apply to P 1 . Like P 5 , P 1 suffers from an increase of power corrections when ξ ⊥ is defined from T 1 instead of from V , as already demonstrated numerically in Fig. 2 of Ref. [20] and analytically in Eq. (24). Turning finally to the observable P 2 , we find in the helicity basis which translates into in the transversity basis, with P 2 | ∞ = C 9,⊥ C 10 /(C 2 9,⊥ +C 2 10 ). Unlike P 1 and P 5 , the leading term in P 2 involves both (axial-)vector and tensor power corrections, and at first sight it seems that there is no preference whether to define ξ ⊥ from V or from T 1 . Note, however, that the kinematic relation T 1 (0) = T 2 (0) implies a T 1 = a T 2 and that a definition from T 1 hence absorbs both a T 1 and a T 2 and leads to smaller uncertainties from corrections. Again, this is confirmed by the numerical results in Fig. 2 of Ref. [20].
We see that the scheme dependence of the angular observables can be explicitly worked out by studying the analytic dependence on the power correction parameters. Our results agree with Ref. [33] for P 1 , but we have shown that the formula for P 5 in Ref. [33] actually misses the dominant and manifestly scheme-dependent term. Our analytic formulae allow us to understand how different schemes can yield significantly different uncertainties if one treats power corrections as uncorrelated, in perfect agreement with the numerical discussion in Ref. [20]. We can spot the relevant form factor(s) whose power corrections are going to have the main impact on each observable, and thus identify appropriate schemes to compute each observable accurately.

Reassessing the reappraisal of long-distance charm loops
We now turn to the second main source of hadronic uncertainties: non-factorisable Λ QCD /m B corrections associated with non-perturbative cc loops. Since these contributions can mimic a shift in the Wilson coefficient C 9 , one may wonder how to disentangle them from possible short-distance new physics. While the latter would induce a q 2 -independent C 9 , universal for the three different transversities i = ⊥, ∥, 0, non-factorisable long-distance effects from cc loops in general introduce a q 2 - and transversity-dependence that can be cast into effective coefficient functions C cc 9 i (q 2 ). A promising strategy thus consists in investigating whether the B → K * µ + µ − data point towards a q 2 -dependent effect. To this end the authors of Refs. [21,22] performed a fit of the functions C cc 9 i (q 2 ) to the data using a polynomial parametrisation. In Sec. 4.1 we comment on the results, before presenting in Sec. 4.2 our own analysis based on a different, frequentist, statistical framework.

A thorough interpretation of the results of Refs. [21, 22]
The analysis in Refs. [21,22] introduces for each helicity λ = 0, ±1 a second-order polynomial in q 2 : The functions h λ , with a total number of 18 real parameters, then enter the B → K * µ + µ − transversity amplitudes as follows: with the normalisation Here, s i = 0 indicates that only the perturbative quark-loop contribution Y (q 2 ) has been included in the amplitudes A λ L,R (s i = 0) while any long-distance contribution as the one calculated in Ref. [17] and included in Ref. [12] is switched off.
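As a reading aid, the second-order polynomial mentioned above can be written out explicitly. The display below is only a sketch: it assumes that the expansion variable is q 2 expressed in GeV 2 and that each coefficient is complex, which reproduces the quoted count of 18 real parameters (3 helicities × 3 coefficients × 2 real components). The precise normalisation conventions are those of Refs. [21,22] and are not restated here.

```latex
h_\lambda(q^2) \;=\; h_\lambda^{(0)} \;+\; h_\lambda^{(1)}\, q^2 \;+\; h_\lambda^{(2)}\, q^4 ,
\qquad \lambda = 0, \pm 1 ,
\qquad h_\lambda^{(i)} \in \mathbb{C} .
```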
The coefficients h (i) λ parametrise the q 2 -expansion of the charm-loop contribution to the various helicity amplitudes, but can also (partially) be mimicked by NP contributions to the Wilson coefficients C 7 and C 9 . Note that a NP contribution to C 7 would yield a pole at s = 0 and thus contribute to h (0) λ and higher orders, whereas a NP contribution to C 9 would contribute only starting from h (1) λ and higher orders. Let us stress that both kinds of NP contributions would also contribute to h (2) λ , since they enter the transversity amplitudes as a Wilson coefficient multiplied by a q 2 -dependent form factor 7 . Contrary to Refs. [21,22], we have set h (0) 0 = 0 in order to avoid an unphysical pole at q 2 = 0 in A 0 L,R (which for instance would result in a divergence in BR(B → K * γ)).
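The statement that a NP shift in C 9 feeds into h (2) λ as well as h (1) λ can be illustrated with a toy model. The sketch below is not the parametrisation used in the fits: it assumes a hypothetical single-pole form factor F(q 2 ) = F(0)/(1 − q 2 /m 2 pole ), an overall factor q 2 in front of the C 9 term, and arbitrary illustrative values for `f0` and `m_pole_sq`. It only shows that a constant Wilson coefficient multiplied by a q 2 -dependent form factor generates small higher-order coefficients.

```python
def toy_h_coefficients(delta_c9, f0, m_pole_sq, max_order=3):
    """Taylor coefficients in q^2 of  q^2 * delta_c9 * f0 / (1 - q^2 / m_pole_sq).

    Toy model only: the single-pole form factor and the overall
    normalisation are illustrative assumptions, not the paper's inputs.
    """
    coeffs = [0.0]  # order 0: a shift in C9 gives no constant term in this toy
    for k in range(1, max_order + 1):
        # geometric series: 1 / (1 - x) = 1 + x + x^2 + ...
        coeffs.append(delta_c9 * f0 / m_pole_sq ** (k - 1))
    return coeffs


# Hypothetical inputs (arbitrary units for the form factor, GeV^2 for the pole mass squared)
coeffs = toy_h_coefficients(delta_c9=-1.1, f0=0.3, m_pole_sq=28.0)
for order, value in enumerate(coeffs):
    print(f"toy h^({order}) = {value:+.5f}")
```

In this toy the ratio of consecutive coefficients is 1/m 2 pole , so the quadratic term is suppressed but not zero.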
Footnote 7: It is thus not correct to state that h (2) and higher coefficients can arise only due to long-distance physics as suggested in Refs. [21,22]. Even though the form factors do not vary strongly with q 2 , the presence of NP contributions to Wilson coefficients would generate terms corresponding to (small) contributions to higher orders in the polynomial expansion.
For a proper interpretation of the results obtained in Ref. [21], it is important to note that the authors study two different hypotheses:
• Hypothesis 1: No constraint is imposed on the long-distance charm-loop contribution represented by the coefficients h (i) λ , and the results of the LCSR computation in Ref. [17] are not used in the fit. Instead, after fitting the functions h λ (q 2 ) to the B → K * µ + µ − data they are compared with the functions g̃ M i calculated in Ref. [17]. We have checked the relation between the functions g̃ M i and the long-distance charm-loop contributions h λ , given by Eq. (2.7) in Ref. [21] (up to the correction C 1 → C 2 noticed in Ref. [22]). Rewriting the amplitudes M 1,2,3 in Ref. [17] in terms of helicity amplitudes leads to 8 : Re h − (q 2 ) + Re h + (q 2 ) ]. (31)
Footnote 8: Even though Eq. (31) is also valid for the imaginary part of the functions, we only consider the real part of the g̃ M i here, as the authors of Ref. [17] consider these contributions to be real in the region of interest within their approximations.
It is interesting to observe that the results of the fit in Ref. [21] for g̃ M i seem to agree well with the LCSR estimates of Ref. [17] if in all amplitudes approximately the same q 2 -independent shift is added to the LCSR result. This observation is in line with the conclusions from global fits [12,14,15], bearing in mind that in Refs. [21,22] basically only B → K * µ + µ − data is used and that the authors interpret this constant shift as being of hadronic origin.
Notice that such a q 2 -independent shift (very similar for all helicity amplitudes) is at odds with the q 2 - and helicity-dependent contribution expected in the case of a hadronic effect, in particular if it is attributed to tails of resonances. Note, however, that a firm conclusion can only be drawn by comparing the quality of a fit for a q 2 -independent contribution with the one for q 2 -dependent functions, a task that was not carried out in Refs. [21,22] and that will be performed in Sec. 4.2. In any case, one should keep in mind that a universal shift in C µ 9 due to NP can also explain the deviations in B s → φµ + µ − and the violation of lepton-flavour universality suggested by R K and Q 5 = P µ 5 − P e 5 , which is not the case for hadronic cc contributions.
• Hypothesis 2: In a second analysis, the authors of Ref. [21] impose an additional constraint on the fit: they assume that the results of Ref. [17] hold exactly for q 2 ≤ 1 GeV 2 , while they do not make any assumptions for q 2 > 1 GeV 2 and again set all the Wilson coefficients to their SM value. The results obtained in this second approach have to be interpreted with great care:
i) The authors of Ref. [21] decide to take the results of Ref. [17] as exact in the region q 2 < 1 GeV 2 but to discard them for larger q 2 : this choice of range is rather arbitrary, as the LCSR approach yields a computation valid up to 2 GeV 2 according to Ref. [17], and the extrapolation via the dispersion relation is deemed appropriate up to 4 GeV 2 by the authors of Ref. [21] themselves.
ii) The additional constraint artificially tilts the fit by forcing it to follow a behaviour at q 2 ≤ 1 GeV 2 against the trend of data (which would prefer to have a constant shift C NP 9 , as discussed in Refs. [12,14,15,35], corresponding to nonvanishing h (1) λ in the framework of Ref. [21]). It is compensated by a spurious q 4 -dependence with h (2) λ ≠ 0, which is then interpreted in Refs. [21,22] as an indication of non-local hadronic effects.
iii) In the region below 1 GeV 2 , the treatment of the distribution by LHCb means that the data correspond to slightly different observables from the optimized observables defined in Refs. [27,28], as discussed in Sec. 2.3.1 in Ref. [12] and below. This effect, which can be taken into account by a redefinition of the optimized observables, is not considered in Refs. [21,22] and can affect the outcome of the analysis.
iv) Finally, the LCSR computation of Ref. [17] does not take into account all non-local effects but is an estimate of the soft gluon part with respect to the leading-order factorisable contribution, from which the imaginary part is still missing. In this sense it is not consistent to compare the absolute value of the fitted g̃ M i obtained from data with the computation of Ref. [17], and if one still insists on doing so (ignoring all previous issues), at least one should compare their real parts rather than the absolute values.
We conclude that a fit under the second hypothesis cannot indicate whether a q 2 -dependent effect is favoured over a constant one, since it artificially creates a q 2 -dependence by imposing a constraint on one side (below q 2 = 1 GeV 2 ). A fit under the first hypothesis can be an appropriate method, but it requires comparing the quality of the fits obtained in both cases while taking into account the number of free parameters. We will address this issue in the following.

A frequentist fit
We are going to perform fits using the approach described in Ref. [12], taking LHCb data on B → K * µµ as data. We follow the theoretical framework of Ref. [12] for the predictions of the observables, but modify it slightly to remain as close as possible to the fits shown in Refs. [21,22]: we will not use the computation of long-distance charm effects in Ref. [17]. In practice, this amounts to keeping only the perturbative function Y (q 2 ) while setting all three s i = 0. We treat the form factors using the soft-form-factor approach with the inputs of Ref. [17], and employ the same parametrisation Eq. (29) as Refs. [21,22] for the long-distance charm contribution, extending it in a straightforward way to the order q 6 by introducing the parameters h (3) λ . We take all coefficients of the expansion as real, following Ref. [17]. Note that the results of Ref. [21,22] favour mostly real values for h + and h 0 , but not necessarily for h − .
Our fits differ from the ones in Refs. [21,22] with respect to the statistical framework. We use a frequentist approach and in particular do not assume any a-priori range for the fit parameters h (i) λ , contrary to the Bayesian approach in Refs. [21,22] where (flat or Gaussian) priors are used for the polynomial parameters. Keeping in mind that the functions h λ (q 2 ) are expansions in q 2 , we perform fits allowing for h (i) λ with i ≤ n, increasing progressively the degree of the polynomials n. At each order, we determine the minimum χ 2 min as well as the difference between the χ 2 min with polynomial degrees n − 1 and n, and the pull of the hypothesis h (n) 0,+,− = 0. This information indicates the improvement of the fit obtained by increasing the degree of the polynomial expansion.
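The comparison between polynomial degrees n − 1 and n can be sketched numerically. The snippet below is an illustration, not the implementation of Ref. [12]: it assumes that the pull is the Gaussian-equivalent (two-sided) significance of the difference in χ 2 min , and that this difference follows a χ 2 distribution with as many degrees of freedom as parameters added (three real parameters h (n) 0,+,− per order, under Wilks' theorem). The function names are hypothetical, and only 1, 2 or 3 added parameters are supported so that closed forms can be used.

```python
import math
from statistics import NormalDist


def chi2_sf(x, k):
    """Survival function P(chi2_k > x) for k = 1, 2 or 3 degrees of freedom."""
    if x < 0:
        raise ValueError("delta chi2 must be non-negative")
    if k == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if k == 2:
        return math.exp(-x / 2.0)
    if k == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("only k = 1, 2, 3 are implemented in this sketch")


def pull_in_sigma(delta_chi2, n_added):
    """Convert a chi2 improvement into a two-sided Gaussian significance.

    Note: for extremely large delta_chi2 the p-value underflows to 0 and
    inv_cdf raises an error; this sketch does not handle that regime.
    """
    p_value = chi2_sf(delta_chi2, n_added)
    return -NormalDist().inv_cdf(p_value / 2.0)


# Hypothetical example: adding the three parameters h^(n)_{0,+,-} lowers chi2_min by 9
print(f"pull = {pull_in_sigma(9.0, 3):.2f} sigma")
```

With a single added parameter the pull reduces to the square root of the χ 2 difference, which provides a quick sanity check of the conversion.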
In Tabs. 3 and 4, we provide the results in the SM case and in the NP scenario C NP 9 = −1.1, respectively, using only B → K * µ + µ − data. We see that in both cases, the fit clearly improves when increasing the degree of the polynomial from n = 0 to n = 1 (the addition of the parameters h (1) λ leads to a q 2 dependence similar to that of a NP contribution to the Wilson coefficient C 9 ). On the other hand, including quadratic or cubic terms does not provide any significant improvement. This implies that the fit does not hint at a q 2 -dependence beyond the one generated by the Wilson coefficients C 7 and C 9 . In Refs. [21,22] a different q 2 -dependence was advocated referring to the parameter h (2) − which showed a 2σ deviation from h (2) − = 0. We would like to emphasize that it is impossible to draw conclusions from a single parameter and that a global assessment of the whole fit is required. For instance, from our tables one can see that increasing the order of the expansion can lead to a reshuffling of the overall deviation from zero of the functions h λ (q 2 ) among the various expansion parameters, even in the case that no significant improvement of the fit is obtained. For instance, in the SM fit (Tab. 3) the parameter h (0) + deviates from zero by 1.3σ at the order n = 2, but by 2.8σ at n = 3. We would expect a similar analysis to be possible in the Bayesian framework proposed in Ref. [21], by comparing the information criteria for the two hypotheses "no constraint for q 2 ≤ 1 GeV 2 and h (2) λ left free" and "no constraint for q 2 ≤ 1 GeV 2 and h (2) λ = 0", which is unfortunately not provided in Ref. [21].
Table 3: Fit results for C NP 9 = 0, using LCSR from Ref. [17] in the soft-form-factor approach employed by Ref. [12]. All coefficients are given in units of 10 −4 . Different orders n of the polynomial parametrisation of the long-distance charm-loop contribution are considered. If this contribution is set to zero, the fit yields χ 2 min;h=0 = 71.60 for N dof = 59.
In the SM fit we find the pattern while higher orders are compatible with zero. These findings are in rough agreement with Refs. [21,22] for the λ = 0, + helicities. The differences can be attributed to the different treatment and input for the form factors and to the differences in the statistical approach. The comparison cannot be done easily for the λ = − helicity, as large phases were found in Ref. [21] whereas we considered only real cc contributions. Setting C µ,NP 9 = −1.1 improves the χ 2 min significantly without modifying the above conclusions (see Tab. 4). As mentioned before, it is not strictly equivalent to modify h (1) or C 9 since the latter is multiplied by a q 2 -dependent form factor. Therefore the results of the fits are not exactly identical, both for the χ 2 min and the values of the expansion coefficients h (n) (this explains why the addition of h (1) still brings some improvement to the fit with C µ,NP 9 = −1.1, although more modestly than in the SM case). In Tab. 5, we present the same fit as in Tab. 3 (B → K * µ + µ − only, no NP contributions to the Wilson coefficients), taking the LCSR results from Ref. [18] within the full-form factor approach. As can be seen from the comparison of the two tables, the same conclusions hold independently of the specific input for the form factors.
We also performed another fit (Tab. 6) where we consider the SM case but include all the exclusive b → se + e − and b → sµ + µ − observables discussed in Ref. [12]. We take the same parameters for the charm-loop contributions in B s → φℓ + ℓ − and B → K * ℓ + ℓ − (i.e., we assume an SU (3) flavour symmetry for this long-distance contribution), but we neglect the effect of charm loops in B → Kℓ + ℓ − (in agreement with Ref. [17]). Compared to Ref. [12] and due to its direct relation with the charm-loop contribution, we have also added the B → K * γ branching ratio that was not included in our earlier analyses (we have checked that including this observable affects neither the outcome of the global fits presented in Ref. [12], nor the fits presented in this section). We see again that there is no strong need for quadratic h terms: h (2) − prefers to be slightly different from zero (positive), but the data can be described equally well using only constant and linear contributions.
At this stage, we see that the data require constant and linear contributions, as expected also from Ref. [17]. On the other hand, the data do not require additional quadratic or cubic contributions, contrary to the claim made in Ref. [21]. This claim was later amended in Ref. [22], indicating that a solution with h (2) = 0 also leads to acceptable Bayesian fits. Our own fits indicate that the current data do not show signs of a large and unaccounted for hadronic contribution from charm loops.
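For orientation, the baseline fit qualities quoted in the table captions (charm-loop contribution set to zero) can be compared through their reduced χ 2 . The snippet below only repackages those quoted numbers; the labels attached to each entry are an assumption about which caption belongs to which table, and the simple difference of χ 2 min values is a rough indicator rather than a formal hypothesis test.

```python
# (chi2_min with h = 0, number of degrees of freedom) as quoted in the table captions.
# The labels are an assumed mapping of captions to Tabs. 3-6.
baselines = {
    "Tab. 3: SM, LCSR Ref. [17], soft form factors": (71.60, 59),
    "Tab. 4: C9_NP = -1.1, LCSR Ref. [17]": (63.30, 59),
    "Tab. 5: SM, LCSR Ref. [18], full form factors": (69.80, 59),
    "Tab. 6: SM, all exclusive b -> s l l observables": (98.00, 81),
}


def reduced_chi2(chi2_min, n_dof):
    """Return chi2_min divided by the number of degrees of freedom."""
    return chi2_min / n_dof


for label, (chi2_min, n_dof) in baselines.items():
    print(f"{label}: chi2/dof = {reduced_chi2(chi2_min, n_dof):.3f}")

# Improvement of the h = 0 baseline when C9_NP = -1.1 is switched on (same data set)
delta = baselines["Tab. 3: SM, LCSR Ref. [17], soft form factors"][0] \
    - baselines["Tab. 4: C9_NP = -1.1, LCSR Ref. [17]"][0]
print(f"chi2 improvement from C9_NP = -1.1 (h = 0): {delta:.2f}")
```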

Further experimental tests of the role of hadronic uncertainties
A different approach to hadronic uncertainties consists in identifying observables and kinematic regions totally (or partially) free from some of these uncertainties. Contributions from cc loops enter many B → K * observables, but it is worth noticing that not all of them exhibit the same sensitivity to these effects. Let us start by recalling a few facts concerning the structure of this contribution. The long-distance cc contribution has a 1/q 2 pole due to the photon propagator: following Ref. [17], we have absorbed this singular contribution into an effective C 9 . If only regular expressions (no poles) are preferred, one can split the cc contribution into two parts: the pole term affects C 7 and the remaining regular part enters C 9 .
The Wilson coefficient C 7 (SM and NP) is accurately extracted from the inclusive branching ratio BR(B → X s γ), where hadronic effects are tightly controlled, providing a slight preference for a narrow negative range for the NP contribution to this coefficient (see Refs. [36] and [12]). The comparison between this inclusive observable and exclusive observables that contain long-distance charm contributions (like BR(B + → K * + γ) and BR(B 0 → K * 0 γ)) does not leave much space for a sizeable long-distance charm contribution at q 2 = 0 entering C 7 . The sum of the NP and long-distance charm contributions favours a negative contribution, increasing in absolute value the size of C SM 7 = −0.29 (see for instance Ref. [37]). Indeed the allowed ranges for C 7 and C 7′ found in Ref. [37] (see Fig. 2) are in very good agreement with the results of the global fit shown in Fig. 10 of Ref. [12] under similar conditions (keeping C NP 9 = 0). We can also illustrate this expectation of very small contributions by considering the charm-loop parametrisation introduced in Sec. 4. The long-distance charm contribution to C 7 for the transverse amplitudes can be expressed in terms of the helicity amplitudes as in Eq. (33) [21]. Using the values for the charm contribution obtained from the fit from Tab. 3 (SM) and Tab. 4 (C NP 9 = −1.1) in the optimal case n = 1 one can determine these contributions (see Table 7) 9 .
Table 4: Fit results for C NP 9 = −1.1, using LCSR from Ref. [17] in the soft-form-factor approach employed by Ref. [12]. All coefficients are given in units of 10 −4 . Different orders n of the polynomial parametrisation of the long-distance charm-loop contribution are considered. If this contribution is set to zero, the fit yields χ 2 min;h=0 = 63.30 for N dof = 59.
Table 5: Fit results for C NP 9 = 0, using LCSR results from Ref. [18] in the full-form-factor approach. All coefficients are given in units of 10 −4 . Different orders n of the polynomial parametrisation of the long-distance charm-loop contribution are considered. If this contribution is set to zero, the fit yields χ 2 min;h=0 = 69.80 for N dof = 59.
Table 6: Fit results for C NP 9 = 0, using the same approach as in Ref. [12]. All coefficients are given in units of 10 −4 . Different orders n of the polynomial parametrisation of the long-distance charm-loop contribution for B → V ℓ + ℓ − are considered. If this contribution is set to zero, the fit yields χ 2 min;h=0 = 98.00 for N dof = 81.
Table 7: Charm contribution entering C 7 as obtained from the fit for n = 1 in the SM (Tab. 3) and in presence of NP C NP 9 = −1.1 (Tab. 4).
After discussing C 7 , we can turn our attention to the other Wilson coefficients different from C 9 , which are not affected by long-distance charm contributions. The key observation is that some angular observables exhibit peculiar suppression mechanisms at low q 2 that protect them from contributions from C 9 . One can identify three optimized observables of interest:
• P 1 and P 3 with a sensitivity to C 7 and C 7′ ,
• P 2 with a sensitivity to C 7 C 10 and C 7′ C 10′ .
These observables are protected from C 9 and its associated long-distance charm (but obviously not from charm contributions to C 7 ) as they are built from the helicity amplitudes A L,R ⊥,∥ that exhibit a photon pole contrary to the longitudinal amplitude A L,R 0 10 . We will now discuss more precisely the mechanism at play for these observables.

P 1 and P 3 at very low q 2
The observables P 1 (initially called A 2 T in Ref. [38]) and P 3 (initially called A (Im) T in Ref. [39]) are defined as in Refs. [27,28]. At very low q 2 , P 1 and P 3 are sensitive only to the electromagnetic coefficients because J 3 has a double pole structure stemming from the photon pole. While P 1 is sensitive to Re[C 7 C 7′ ], P 3 depends on Im[C 7 C 7′ ]. For simplicity and in agreement with the absence of significant CP -asymmetries in the current measurements, we will assume that NP does not induce new weak phases and only C 9 is complex, with an imaginary part due to SM effects only. We denote the different C 9 (C cc 9 ) contributions associated with each amplitude (following the notation of Eq. (3)) as
$$C^{R}_{9\,j} \equiv \mathrm{Re}\, C^{\rm eff}_{9\,j}(q^2) = C^{{\rm eff\,SM}\,R}_{9\,{\rm pert}} + C^{\rm NP}_{9} + C^{c\bar c\,R}_{9\,j}(q^2), \qquad C^{I}_{9\,j} \equiv \mathrm{Im}\, C^{\rm eff}_{9\,j}(q^2) = C^{{\rm eff\,SM}\,I}_{9\,{\rm pert}},$$
with j = ⊥, ∥, 0, where the superscript R (I) stands for Real (Imaginary) parts. In a similar way, the Wilson coefficient C 7 can be decomposed, with C cc 7 ⊥,∥ an amplitude-dependent long-distance charm contribution associated with this coefficient and given in terms of helicity amplitudes in Eq. (33).
Under this hypothesis of only real NP contributions, P 3 does not carry relevant information (see below). P 1 can be expanded in powers ofŝ = s/m 2 B (withm b = m b /m B ): The ellipsis denotes higher orders in the expansion inŝ/(2m b ). We have not combined the expansions of the numerator and denominator for simplicity of the discussion. As can be seen from this expansion, the contamination from C 9 is suppressed at very-lowŝ (for s ≤ 1 GeV 2 ,ŝ ≤ 0.04). Long-distance charm pollution from C 7 at very-low dilepton mass is present in both numerator and denominator, but it is expected to be small according to our discussion at the beginning of Sec. 5. The determination of C 7 from P 1 is unlikely to become competitive with the extraction from b → sγ decays. On the contrary, in the absence of NP with imaginary contributions, P 3 becomes uninteresting (in the sense discussed in this section) since the leading term is kinematically suppressed and doubly contaminated by (the imaginary part of) C 9 and by charm inside C 7 : P 3 ∝ŝ C 7 ⊥ C I 9 − C 7 C I 9 ⊥ + C 7 (C I 9 + C I 9 ⊥ ) + . . .
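The size of the expansion parameter quoted above can be checked with a one-line computation. The sketch below assumes a B-meson mass of about 5.28 GeV (a rounded value that is not quoted in the text) and evaluates ŝ = s/m B 2 at s = 1 GeV 2 and at the edges of the first bin [0.1, 0.98] GeV 2 .

```python
M_B = 5.2797  # GeV; rounded B0 mass, an assumption not taken from the text


def s_hat(q2, m_b_meson=M_B):
    """Dimensionless dilepton invariant mass squared, s_hat = q^2 / m_B^2 (q2 in GeV^2)."""
    return q2 / m_b_meson ** 2


for q2 in (0.1, 0.98, 1.0):
    print(f"q2 = {q2:4.2f} GeV^2  ->  s_hat = {s_hat(q2):.4f}")
```

The value at s = 1 GeV 2 stays below 0.04, in line with the bound quoted for the very-low-q 2 region.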

P 2 at very low q 2
The observable P 2 (originally called A (Re) T in Ref. [39]), defined as in Refs. [27,28], involves all Wilson coefficients C (′) 7 , C (′) 9,10 . At very low q 2 , one would naively expect a behaviour similar to P 1,3 , with a sensitivity to C (′) 7 and a suppression of the semileptonic C 9,10 coefficients. Actually one finds that P 2 is independent of C 9 in this range but does exhibit a sensitivity to C 10 . Contrary to P 1,3 , this sensitivity comes from a cancellation between left- and right-handed contributions in the numerator, which eliminates the double pole involving only electromagnetic operators and leaves the single pole as the dominant term. The same cancellation removes the sensitivity to the C 9 coefficient in the leading term. In the denominator the double pole survives and, as a consequence, the observable is globally suppressed by ŝ. This can be seen analytically by expanding the observable in the large-recoil limit, Eq. (41). In the numerator, the contributions from C R 9,(⊥,∥) are suppressed by ŝ with respect to the leading C (′) 7 C (′) 10 contribution. In the denominator, given in Eq. (38), the contributions from C R 9,(⊥,∥) and C 9′ are always suppressed by ŝ. We have checked that this remarkable behaviour does not occur for other optimized observables: for instance, we find a very similar situation in the numerator of P 5 (with a factor m b m B ) but its denominator exhibits no suppression of C eff 9 at small ŝ. For small ŝ (in particular the first bin [0.1,0.98] GeV 2 ), P 2 is protected from contributions due to C eff 9 coming either from the Standard Model, charm loops, ad-hoc non-factorisable power corrections or New Physics. On the contrary, it is sensitive to the product C 7 C 10 and the corresponding chirally flipped ones. Then from Eq. 
(41) the leading term in ŝ in the numerator of P 2 is of a form where the first term is large and positive, the second and third terms are numerically subleading, the last two terms are even more suppressed, and finally the term ∆C cc 7 collects all long-distance charm contributions. Focusing first on the numerator of P 2 , one can see that improving the agreement with the current LHCb data would require the condition given in Eq. (43). Given that |C 10′ | ≪ |C 10 | according to the global fit in Ref. [12], one can safely neglect the right-handed currents in Eq. (43). According to Tab. 7, we see that this long-distance charm term ∆C cc 7 is positive in most of the 1 σ range. Assuming no sizeable right-handed currents and taking into account both the numerator and the denominator one finds that a positive (negative) ∆C cc 7 decreases (increases) the value of the first bin of the observable P 2 with respect to the SM by a factor (1 − ∆C cc 7 /(C eff SM 7 C 10 )). Using central values of Tab. 7, the value of P SM 2 is reduced in the first bin to 0.87 P SM 2 (0.83 P SM 2 ) for C NP 9 = −1.1 (C NP 9 = 0 respectively), when including these charm contributions (a much smaller effect is observed if the values of Ref. [17] for C cc 7 (⊥,∥) are used instead). In order to illustrate the charm sensitivity of P 2 , in particular in the region of the first bin, we consider the impact of a (universal) charm contribution entering C 9 11 . This is illustrated in the left panel of Fig. 2. The right panel shows that the sensitivity to the charm contribution to C 7 yields a larger but still limited effect.
The sensitivity of P 2 for different NP scenarios is explored in Fig. 3. In agreement with Fig. 2, the variations in C 9 (whether from charm or NP) are irrelevant for the first bin. On the other hand, a positive contribution in C NP 10 improves the agreement between the prediction and data in the first bin. This contribution to C 10 also shifts the position of the maximum of P 2 , but not its zero. Let us remark that this shift of the maximum of P 2 (also produced by C NP 9 = −1.1) would increase the value of P 2 in the bin [2,4.3] GeV 2 as observed in the LHCb 2013 data set (with a 2.9 σ tension with respect to the SM).
A comment is in order concerning the comparison between data (blue crosses in Figs. 2 and 3) and theory in the first bin. Figs. 2 (left and right) and 3 (left) show predictions for P 2 . Due to the limited statistics, the LHCb analysis of the full B → K * µ + µ − angular distribution is performed neglecting lepton mass effects, which corresponds to a change of the definition of the longitudinal polarisation F̂ L compared to the definition F L commonly used theoretically (see Sec. 2.3 in Ref. [12] for the definitions). Indeed, the measurement of F L is performed using J 1c , rather than J 2c (used to define the optimized observables [27,28]): both differ by m ℓ -suppressed terms which are generally tiny, but noticeable at very low q 2 . An estimate of the impact of this approximation used by LHCb is shown in Ref. [12] and it was found to decrease the SM prediction of P 2 by around 23% in the first bin compared to a computation based on J 2c . This implies that LHCb does not measure P 2 in this first bin but a modified observable, P̂ 2 [12]. Numerically, in the case of interest analyzed here, we have found that one can easily transform the theoretical values of P 2 into P̂ 2 using P̂ 2 [0.1,0.98] ≃ 0.77 P 2 [0.1,0.98] (as in the SM case). In Fig. 3 (right), we show the variation of P̂ 2 in several scenarios. Once again, a positive NP contribution in C 10 improves the agreement between data and prediction.

The implications of the Belle measurement of Q 5
Our previous arguments show that neither factorisable power corrections nor charm loops are likely to account for the observed anomalies. In addition, one can use a complementary and powerful independent tool to support these arguments, namely data. The recently proposed observable Q 5 = P µ 5 − P e 5 [11] hampers any attempt to explain the anomaly in P µ 5 through an SM alternative. However large, poorly understood or even incorrect the contribution added to the prediction of P µ 5 may be, in the SM the electronic counterpart P e 5 will receive the same contribution. These SM contributions will automatically cancel in Q 5 , up to contributions highly suppressed by m ℓ 2 and q 2 , leading to extremely clean SM predictions (shown in Fig. 4). Belle has been the first experiment to probe the observable Q 5 [10]: in the relevant bin [4,8] GeV 2 , a good agreement with the LHCb measurement of P µ 5 [4] was observed, with a 2.6 σ deviation w.r.t. the SM prediction while only a 1.3 σ deviation for the electronic observable P e 5 was found. This implies a 1.2 σ deviation w.r.t. the SM for the corresponding observable Q 5 in the bin [4,8] GeV 2 , which is reduced to 0.6 σ in the presence of a NP contribution C µ 9 = −1.1 (left-hand side of Fig. 4). In the bin [1,6] GeV 2 , one gets a discrepancy of 1.3 σ in the SM, reduced to 0.7 σ for a NP contribution C µ 9 = −1.1 (right-hand side of Fig. 4). The low statistical significance of this result prevents us from drawing any firm conclusion at this stage. It is however interesting to notice the similarities with the pattern observed in R K . Both LHCb and Belle-II should have the capacity to implement this important test and to provide a robust complementary test of the arguments discussed in this paper.
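The cancellation argument for Q 5 can be made concrete with a toy calculation. The numbers below are purely hypothetical and are not predictions or measurements: the sketch only assumes that a hadronic contribution shifts the muon and electron observables by the same amount, whereas a NP contribution coupling to muons only shifts P µ 5 alone.

```python
def q5(p5_mu, p5_e):
    """Lepton-flavour difference Q5 = P5(muon) - P5(electron)."""
    return p5_mu - p5_e


# Hypothetical inputs for illustration only
p5_sm = -0.80           # toy SM value, identical for muons and electrons here
hadronic_shift = 0.30   # toy lepton-universal hadronic contribution
np_shift_mu = 0.25      # toy NP contribution affecting muons only

# A lepton-universal hadronic shift drops out of Q5 ...
q5_hadronic = q5(p5_sm + hadronic_shift, p5_sm + hadronic_shift)
# ... whereas a muon-specific NP shift survives, whatever the hadronic shift is
q5_np = q5(p5_sm + hadronic_shift + np_shift_mu, p5_sm + hadronic_shift)

print(f"Q5 with universal hadronic shift only: {q5_hadronic:+.2f}")
print(f"Q5 with additional muon-only NP shift: {q5_np:+.2f}")
```

The toy neglects the lepton-mass-suppressed differences mentioned in the text, which is why the SM-like value of Q 5 comes out exactly zero here.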

Conclusions
Over the last few years, a coherent pattern of deviations has emerged in b → sµ + µ − decays, from LHCb and Belle measurements. These deviations and their correlations can be analysed in the effective Hamiltonian approach, as done in several global analyses of b → sµ + µ − and b → se + e − modes [12,[14][15][16]. The outcome is intriguing: a shift in the Wilson coefficient C µ 9 by about −25% of its SM value is sufficient to achieve a significant improvement (by more than 4 σ) in the description of the data (contributions to other coefficients like C µ 10 and $C_{9'}^{\mu}$ are also allowed). There have been several controversies concerning the assessment of theoretical uncertainties in the predictions of B → K * µ + µ − observables: some concerned the factorisable QCD corrections arising in the description of form factors, whereas others dealt with non-factorisable corrections only present at the level of the amplitudes and related to long-distance charm-loop contributions. Even though these effects could not explain the 2.6 σ anomaly in the ratio R K [8], which goes in the same direction as the (not yet statistically significant) trend observed for the difference between P 5 for electrons and muons [10], it is interesting to assess these claims concerning B → K * µ + µ − observables.
The first discussion deals with factorisable corrections. In the limit m b → ∞, the seven B → K * form factors can be reduced to two soft form factors ξ || and ξ ⊥ , but these relations get corrected not only by computable perturbative corrections from hard gluons, but also by power corrections of O(Λ/m B ) (and higher). These power corrections must be modeled on the basis of dimensional estimates. Moreover, a choice must be made to determine ξ || and ξ ⊥ from non-perturbative input (typically obtained from light-cone sum rules). This is done by identifying these soft form factors with (combinations of) full form factors, and thus setting the corresponding power corrections to zero. There are several possible choices ("schemes") for this identification, and we assessed the role played by the scheme prescription for the accuracy of the SM predictions for B → K * µ + µ − observables.
We showed that, in the absence of further information on the correlations among form factors, the choice of scheme has an impact on the theoretical uncertainties for predictions. Uncertainties for observables can easily be overestimated by making an inappropriate choice of scheme, for instance if soft form factors are identified with full form factors playing little to no role in the computation of these observables. We demonstrated the origin of this scheme dependence in a pedagogical way and derived analytic formulae for the contribution from power corrections to the most important optimized observables P 5 , P 2 and P 1 . We further showed that a fit of power corrections for the scheme used in Ref. [12] to the form factor input from BSZ [18] yields uncertainties associated with power corrections in agreement with the generic 10% dimensional estimate, as expected. We compared predictions for P 5 with uncorrelated power corrections and soft form factors to those where correlations are assessed from BSZ form factors, and we established that the main source of correlations among form factors comes from the symmetry relationships in the m b → ∞ limit, whereas the correlations among power corrections are subleading effects. Our findings disprove claims of significantly larger uncertainties from factorisable power corrections made in Refs. [29,33].
Concerning non-factorisable QCD corrections related to long-distance charm loops, the problem of disentangling NP from a non-perturbative QCD effect is more complicated, although a handle is provided by the expected non-trivial dependence of the charm loop on the squared dilepton mass q 2 (and on the initial and final hadrons). The Wilson coefficient C 9 can be written in the particular case of B → K * µ + µ − as $C_{9\,i}^{\rm eff}(q^2) = C_{9\,{\rm SM\,pert.}}^{\rm eff} + C_{9}^{\rm NP} + C_{9\,i}^{c\bar{c}}(q^2)$, where i labels the transversity of the lepton pair. The perturbative SM and NP contributions are accompanied by a long-distance charm loop contribution $C_{9\,i}^{c\bar{c}}(q^2)$. In our analysis in Ref. [12] we included the partial LCSR computation from Ref. [17] as an estimate of the order of magnitude of the functions $C_{9\,i}^{c\bar{c}}(q^2)$. Recently, in Ref. [21] several fits of the $C_{9\,i}^{c\bar{c}}(q^2)$ to B → K * µ + µ − measurements were performed and it was claimed that the data favoured a q 2 -dependent contribution rather than a universal shift in C 9 . We re-analysed these claims and stressed that the q 2 -dependence observed in some of the fits in Ref. [21] was actually due to imposing a pure SM constraint from Ref. [17] at very large recoil, skewing the fit and generating an apparent q 2 -dependence to get a better agreement with data at higher q 2 . Moreover, we pointed out a mismatch in how Ref. [21] identifies its parametrisation with the results of Ref. [17]: the real parametrisation used in Ref. [17] is matched to the modulus of the complex parametrisation adopted in Ref. [21].
We further stress that a potential q 2 -dependence cannot be inferred from considering only the deviation of a single quantity among the large number of parameters entering the fits (as done in Ref. [21]). The relevant issue is the improvement in the quality of the fit when going from the hypothesis of a constant C 9 (NP-like contribution) to the hypothesis of a q 2 -dependent $C_{9\,i}^{c\bar{c}}$ (hadronic contribution). Using the polynomial parametrisation of Ref. [21] and the framework of Ref. [12], we have performed the corresponding analysis using a frequentist statistical approach. We considered only B → K * µ + µ − data, removed long-distance contributions estimated from Ref. [17] and introduced a polynomial parametrisation describing charm-loop contributions with parameters to be fitted. We assumed either the SM value for the Wilson coefficients or we took C µ,NP 9 = −1.1, we used different form factors and approaches, and we even considered a fit including all available data on other b → sµ + µ − and b → se + e − channels. In none of the scenarios is there a motivation to go beyond the linear order in the polynomial parametrisation (corresponding to a q 2 -dependence closely equivalent to a constant contribution to C 9 ): even if in some cases one may get fits with quadratic terms different from zero, the improvement compared to the linear case is completely marginal 12 . These findings show that there is currently no indication for a non-trivial q 2 -dependence for the C 9 contribution 13 , disfavouring an explanation of the B → K * µ + µ − anomalies via non-factorisable QCD effects corresponding to a charm-loop contribution with a pole at $q^2 = m_{J/\psi}^2$.
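One common way to quantify the improvement between two nested fit hypotheses is to convert the decrease in the minimum χ 2 into a p-value, assuming Wilks' theorem holds. The sketch below illustrates this generic procedure only; it is not necessarily the procedure used in the analysis, and the function names and the χ 2 values in the usage example are hypothetical.

```python
import math


def chi2_sf(x, k):
    """Survival function P(X > x) for a chi-square variable with k dof.

    Starts from the closed forms for k = 1 (erfc) and k = 2 (exponential)
    and applies the recurrence of the regularised upper incomplete gamma
    function, Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    with a = k / 2 and z = x / 2.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if x <= 0.0:
        return 1.0
    z = x / 2.0
    if k % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    for _ in range((k - 1) // 2):
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def improvement_pvalue(chi2_simple, chi2_extended, n_extra_params):
    """p-value of the chi2 decrease when n_extra_params are added.

    Assumes nested hypotheses and the asymptotic (Wilks) regime.
    """
    delta = chi2_simple - chi2_extended
    return chi2_sf(delta, n_extra_params)


if __name__ == "__main__":
    # Hypothetical minima: linear vs quadratic polynomial, 3 extra parameters.
    print(improvement_pvalue(chi2_simple=60.0, chi2_extended=58.5,
                             n_extra_params=3))
```

A large p-value means that the additional polynomial terms do not improve the fit beyond what is expected from adding free parameters, which is the sense in which an improvement is called marginal.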
Although we did not find an indication for underestimated hadronic uncertainties affecting the extraction of C 9 from global fits, we would like to stress that it is also important to assess potential NP contributions to other Wilson coefficients, whose interpretations in terms of short-distance physics are not affected by hadronic uncertainties. Indeed, the high sensitivity of a large set of observables to the Wilson coefficient C 9 , which points to a large tension with its SM value, may have hidden contributions from the remaining semi-leptonic Wilson coefficients. Even if a global fit may constrain all Wilson coefficients simultaneously, some observables in specific regions may prove better adapted to track specific coefficients different from C 9 and potentially very interesting in terms of NP. In particular, we have discussed how P 2 for B → K * µ + µ − at very low q 2 could provide further information on the Wilson coefficient C 10 . A deviation from SM expectations for this observable can only be explained by NP in C 10 , which cannot be mimicked by SM hadronic effects: charm-loop contributions to C 7 are constrained to be small from the comparison of inclusive and exclusive b → sγ decays, whereas C 9 contributions are suppressed for this observable in this kinematic range. Interestingly, a positive NP contribution to C 10 could improve the agreement between data and theory in the very low q 2 region. This approach complements the one presented in Ref. [11], which dealt with the case where lepton-flavour universality is violated (as suggested by the observables R K , R(D), R(D * )): two observables, B 5 and B 6s , then provide clean information on (C µ 10 − C e 10 )/C e 10 (with no pollution from C 7 ).
We conclude with the obvious remark that the observation of deviations in optimized and lepton-flavour-violating observables like Q i = P µ i − P e i would be an unambiguous signal of New Physics, rendering the discussion on hadronic explanations in Refs. [29] and [21,22] irrelevant. A first step in this direction, albeit with a still limited statistical significance, is provided by the very recent results of the Belle experiment [10], which suggest that P 5 would agree with the SM for electrons but disagree for muons, in the same direction as global fit results [12,[14][15][16]. Such exciting results call for more measurements from both LHCb and Belle-II collaborations in order to exploit the full potential of b → sℓ + ℓ − transitions in the search for New Physics.

Table 8: Uncertainties on the SM prediction for P 5 in the long bin [1, 6] GeV 2 : the first row gives the parametric and form factor uncertainty added in quadrature, the second row provides the error from factorisable power corrections. In the BBD14 case [41], only the total error size is given for nominal priors, suggesting that this number should be taken as an upper bound of the subset of errors discussed here.
A Parametric and soft form factor errors for B → K * predictions
In this article, we have focused mainly on two sources of uncertainties: (factorisable) power corrections and (non-factorisable) charm-loop contributions. For completeness, we discuss here the size of other error sources computed in different articles, CJ12 [29], DHMV14 [20], BSZ15 [18] and BBD14 [41], considering an observable predicted in all papers: P 5 [1,6] . Given that parametric and form factor errors are not separated in some of the papers, we add them in quadrature for this comparison. The result is shown in Tab. 8, where factorisable power correction errors in this bin are also given. The parametric and soft form factor errors in DHMV14 [20] were computed by performing a random flat scan of all relevant parameters (masses, decay constants, renormalization scale ...) within their uncertainty, keeping all other parameters (form factors, power corrections) at their central values. The observables were then computed at each point of the scan, and their error bars were obtained in DHMV14 [20] from the difference between the extreme values obtained for the observables in the scan and the central value of the observable. The corresponding scan of parameters in BSZ15 [18] yields smaller errors than the ones in DHMV14 due to the much smaller uncertainties of the form factor inputs and the Gaussian treatment of all errors in BSZ15. Let us also remark that the total error in BBD14 [41] in the nominal-prior evaluation is in the same ballpark as the one in DHMV14.
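The flat-scan prescription and the quadrature sum described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in any of the cited analyses: the function names, the toy observable and all numerical inputs are hypothetical.

```python
import math
import random


def flat_scan_errors(observable, central, half_widths, n_points=10000, seed=0):
    """Asymmetric errors from a random flat scan of the input parameters.

    Each parameter is drawn uniformly within central +/- half_width; the
    errors are the largest upward and downward shifts of the observable
    with respect to its value at the central parameter point.
    """
    rng = random.Random(seed)
    reference = observable(central)
    up = down = 0.0
    for _ in range(n_points):
        point = {name: rng.uniform(central[name] - half_widths[name],
                                   central[name] + half_widths[name])
                 for name in central}
        shift = observable(point) - reference
        up = max(up, shift)
        down = min(down, shift)
    return reference, up, -down


def add_in_quadrature(*errors):
    """Combine independent error sources in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


def toy_observable(params):
    """Hypothetical stand-in for an observable; not a physical formula."""
    return params["a"] + 2.0 * params["b"]


if __name__ == "__main__":
    central = {"a": 1.0, "b": 2.0}
    half_widths = {"a": 0.1, "b": 0.2}
    print(flat_scan_errors(toy_observable, central, half_widths))
    print(add_in_quadrature(0.03, 0.04))
```

Taking the extreme values of the scan is more conservative than a Gaussian treatment of the same inputs, which is one of the reasons quoted above for the smaller errors in BSZ15.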
On the other hand, it is interesting to notice that the parametric uncertainty (including form factors) in CJ12 [29] is 2 to 3 times larger than the one in DHMV14 [20] and BSZ15 [18], respectively. This issue is independent of, and adds to, the inflation of errors associated with factorisable power corrections by a factor of 2 due to the choice of scheme, as discussed in Sec. 3.2 and as can be seen in the second row of Tab. 8. Let us also mention that in a subsequent article (CJ14, Ref. [33]) by the same authors, the total error for the same bin increased by 40% with respect to the previous prediction in CJ12. In a later article from the Belle collaboration [7], the prediction for the same quantity, provided by one of the authors of CJ12 and CJ14, got an uncertainty reduced by 60% compared to CJ14 (see Table VI of Ref. [7]). Unfortunately the absence of a precise error budget in Refs. [7,33] prevents us from exploiting the corresponding results for our comparison. Moreover, we are not in a position to explain the origin of the 40% increase and subsequent 60% decrease in these two articles, which is unfortunately not commented on in either case.
One might suspect that the origin of this large difference between the error attached to parametric and soft form-factor uncertainties in DHMV14, BSZ15 and BBD14 on one side and CJ12 and CJ14 on the other could be the error attached to the soft form factor. However, the uncertainty for ξ ⊥ (0) = 0.31 ± 0.04 in CJ14, estimated by considering only the central values of different form factor determinations, is even significantly smaller (by a factor around 4) than the one for $\xi_\perp(0) = 0.31^{+0.20}_{-0.10}$ in DHMV14 from the calculation in Ref. [17].
In summary, in addition to the inflated power correction error related to an inappropriate choice of scheme, discussed in Sec. 3.2, we conclude that the analysis of the parametric errors in CJ12 is at odds with the results of three different groups (DHMV14, BSZ15, BBD14).

B Predictions for R K * in various scenarios
As discussed in the introduction and in section 5.3, it is of utmost importance to have observables able to test lepton-flavour universality. Among observables of this type, R K [8] and the recently measured Q 5 [10] are already providing very interesting information. Following the structure of R K , one can construct observables with similar capacities for other channels. Because of the anomalies observed in the B → K * µµ mode [4][5][6], the observable $R_{K^*} = \mathcal{B}_{B\to K^*\mu^+\mu^-}/\mathcal{B}_{B\to K^*e^+e^-}$ [13] becomes a natural candidate to analyse. In this appendix, we provide our predictions for R K * in three different bins both in the context of the SM and considering several NP scenarios suggested by global fits [12].

Table 9: Predictions for $R_{K^*} = \mathcal{B}_{B\to K^*\mu^+\mu^-}/\mathcal{B}_{B\to K^*e^+e^-}$ in the SM and various NP scenarios.