Non-local matrix elements in $B_{(s)}\to \{K^{(*)},\phi\}\ell^+\ell^-$

We revisit the theoretical predictions and the parametrization of non-local matrix elements in rare $\bar{B}_{(s)}\to \lbrace \bar{K}^{(*)},\phi\rbrace\ell^+\ell^-$ and $\bar{B}_{(s)}\to \lbrace \bar{K}^{*}, \phi\rbrace \gamma$ decays. We improve upon the current state of these matrix elements in two ways. First, we recalculate the hadronic matrix elements needed at subleading power in the light-cone OPE using $B$-meson light-cone sum rules. Our analytical results supersede those in the literature. We discuss the origin of our improvements and provide numerical results for the processes under consideration. Second, we derive the first dispersive bound on the non-local matrix elements. It provides a parametric handle on the truncation error in extrapolations of the matrix elements to large timelike momentum transfer using the $z$ expansion. We illustrate the power of the dispersive bound at the hand of a simple phenomenological application. As a side result of our work, we also provide numerical results for the $B_s \to \phi$ form factors from $B$-meson light-cone sum rules.

To this end, several intrinsically non-perturbative hadronic matrix elements must be calculated reliably. The main contributions to these amplitudes come from the semileptonic and electromagnetic dipole operators, and they are proportional to hadronic matrix elements of local quark currents. These matrix elements, which can be expressed in terms of "local" form factors, are very similar to the ones appearing in semileptonic decays mediated by charged currents. Beyond these terms, there are also contributions from four-quark and chromomagnetic dipole operators, which require the calculation of hadronic matrix elements of the T -product of these local operators and the electromagnetic current [18][19][20][21]. These matrix elements can be expressed in terms of "non-local" form factors and are considerably more complicated to compute than the local form factors.
The decay amplitudes for any of these processes can be written as 1 where q 2 is the invariant squared mass of the lepton pair and L µ V (A) ≡ū (q 1 )γ µ (γ 5 )v (q 2 ) are leptonic currents. We have included explicitly the effects of the semileptonic operators O 7 , O 9 and O 10 but suppressed contributions from other local semileptonic and dipole operators that are not relevant in the SM, as well as from higher-order QED corrections. Nevertheless, the decomposition in Eq. (1.1) is exact in QCD. All non-perturbative effects are contained within the local and non-local hadronic matrix elements F B→M (T ),µ and H B→M µ , defined as where j em µ = q Q qq γ µ q with q = {u, d, s, c, b}. In Eq. (1.4) we retained only the terms containing the operators O 1 = (sγ µ P L T a c)(cγ µ P L T a b) and O 2 = (sγ µ P L c)(cγ µ P L b) , (1.5) which have large Wilson coefficients in the SM. The contribution of these terms is commonly called the "charm-loop effect". The contributions of all the other WET operators are suppressed by small Wilson coefficients and/or by subleading CKM matrix elements. In the literature, the non-local contributions to Eq. (1.4) are sometimes included through a shift of the Wilson coefficients C 7 and C 9 . The resulting effective Wilson coefficients C eff 7,9 (q 2 ) become both process-and q 2 -dependent [20]. In this work we prefer not to use C eff 7,9 (q 2 ) to keep the non-local contributions explicitly separated from the local ones.
The local and non-local form factors F B→M λ,(T ) (q 2 ) and H B→M λ (q 2 ) are the invariant functions of a Lorentz decomposition of the matrix elements The structures S λ define our conventions for the form factors and are given in Appendix A.
The non-local contributions H B→M µ are currently the main source of theoretical uncertainty in the predictions ofB → M + − observables. All theoretical calculations of H B→M µ rely on some form of Operator Product Expansion (OPE), which allows to expand the non-local Tproduct in terms of simpler operators. Depending on the q 2 value, the two relevant OPEs are: The OPE is carried out in terms of local operators of the forms(0)D α . . . D ω b(0) [23,24]. The matching conditions are known to next-toleading order in QCD [25][26][27][28][29][30] and the corresponding matrix elements are related to the local form factors.
The calculation of the non-local contributions in the region below the open-charm threshold, that is for q 2 14 GeV 2 , then proceeds in three steps: 1. Calculation of the LCOPE matching coefficients up to the desired order (both in the QCD coupling and in the LCOPE power counting). This calculation must be carried out for q 2 values that ensure rapid convergence of the LCOPE, which is the case for q 2 −1GeV 2 .
2. Calculation of the hadronic matrix elements of the operators that emerge in the LCOPE using non-perturbative methods.
3. Analytic continuation of the LCOPE results from the point of calculation to q 2 values that correspond to semileptonic decays, i.e. for 4m 2 ≤ q 2 4M 2 D .
The matching coefficients of the leading (local) operators of the LCOPE are the same as the ones in the local OPE. The next (subleading) order in the LCOPE involves lightcone operators with the insertion of a single soft gluon field. These contributions have been previously considered in Refs. [11,31] where the matching has been computed at leading order in α s .
The matrix elements of the leading operators are related to the local form factors, which have been computed using both lattice QCD and light-cone sum rules (LCSRs) with uncertainties of 10% or less [22,[32][33][34][35][36]. The matrix element of the first subleading operator with a soft gluon field has been calculated in the framework of LCSRs with B-meson light-cone distribution amplitudes (B-LCDAs) [11].
The more formal ones involve dispersion relations [11,12] or analyticity [16]. Both approaches have the advantage of providing parametrisations that are consistent with QCD, and their parameters can be determined from both theory and data [16,42,43]. However, the rate of convergence of these analytic expansions is not well understood, which makes it difficult to assign a truncation error to the approach.
The purpose of this paper is to provide improvements on each of the three steps described above. In Section 2.1, we review the calculation of the matching coefficients of the subleading operator in the LCOPE, giving a cleaner representation of the result. In Section 2.2, we recalculate LCSRs for the matrix elements of the non-local subleading operator, employing for the first time a complete set of B-LCDAs and updating the values of several crucial inputs. When putting together these results, we find that the subleading contributions are two orders of magnitude smaller than the previous calculation [11]. Our results significantly reduce the uncertainties on the non-local contributions. In Section 3, we improve on the parametrization of Ref. [16] and derive for the first time the dispersive bound on the non-local form factors H B→M λ (q 2 ). This bound allows us to constrain the possible effect of truncated terms in the analytic expansion. Our conclusions follow in Section 4. A series of appendices contains details on the definition of the hadronic matrix elements in Appendix A, the calculation of the matching coefficients to subleading order in the LCOPE in Appendix B, the outer functions needed for the dispersive bound in Appendix C, and our results for the B s → φ form factors in Appendix D.

Subleading Contributions in the LCOPE
We perform a LCOPE to calculate the time-ordered product in Eq. (1.4). This is achieved by expanding the charm-quark propagators near the light-cone, i.e. for x 2 0. The first two terms in this expansion are [44] The leading power in the LCOPE, which coincides with the leading power of the local OPE, reads 2 The matching coefficients ∆C 7 and ∆C 9 have been computed to next-to-leading order in QCD [25][26][27][28][29][30]. At the leading order, one finds with y(q 2 ) = 4m 2 c /q 2 > 1. The next-to-leading power term in the LCOPE was discussed for the first time in Ref. [11]. In Section 2.1, we review the calculation of its matching coefficient at leading order in α s . In Section 2.2, we recalculate the corresponding non-local matrix elements using LCSRs with B-LCDAs.

Matching Condition for the Subleading Operator
To obtain the next-to-leading power term in the LCOPE, one has to expand one of the two charm-propagators at next-to-leading power in x 2 . This expansion introduces a gluon field G στ (ux), as one can see from the second line of Eq. (2.1). Following Ref. [11], we use the translation operator to rewrite G στ (ux) as To make the calculation simpler it is convenient to introduce the light-like vectors n µ ± , which satisfy the following identities: where v µ is the four-velocity of the B meson. As anticipated in the previous section, to ensure the convergence of the LCOPE one has to impose that 4m 2 c − q 2 Λm b . Hence, we fix The quantities ∆C 7 and ∆C 9 are the same as in Ref. [30]. Now, we can decompose the covariant derivative in Eq. (2.5) in light-cone coordinates: As shown in Ref. [11], the contributions arising from the (n − · D) are further power-suppressed and hence can be neglected. See also Refs. [45,46]. We reproduce the form of the subleading operator as in Eq. (3.14) of Ref. [11]: This allows us to express the next-to-leading power in the LCOPE as Here α s ≡ α s (µ c ), with µ c a perturbative scale, and ω 2 is the n − light-cone component of the gluon momentum, which is related to the variable ω of Ref. [11] through ω = ω 2 /2. We also abbreviate C KMPW The matching coefficientĨ µρστ is a four-tensor that depends on the gluon momentum ω 2 n − and the momentum transfer q. We reproduce the expression for I µραβ ≡ − 1 2 ε αβ στĨ µρστ in Eq. (3.15) of Ref. [11] as well. Nevertheless, a few comments oñ I µρστ are in order: • To leading order in α s Eq. (2.10) only receives contributions from the charm electromagnetic current. Thus, we factored out the charm electric charge in the r.h.s. of Eq. (2.10), rendering the definition ofĨ µρστ consistent with Ref. [11].
• To leading order in α s , the matching coefficientĨ µρστ is finite in the limit ε → 0. Hence, it cannot depend explicitly on the renormalization scale µ to this order. Any residual µ dependence emerges only from the use of scale-dependent quantities; here, from the use of the charm quark mass m c . This behaviour is reflected in the fact that the matching coefficient of the LCOPE term at leading power already compensates the running of the semileptonic and electromagnetic dipole coefficients C 9 and C 7 to order α s .
• The form in which I µραβ is presented in Ref. [11] is explicitly dependent on µ, in apparent contradiction to the previous comment. However, after integrating over u, this explicit scale dependence is removed. We therefore prefer to present the matching coefficient in such a way that the explicit scale dependence does not appear in the first place. Details on the manipulations to achieve this are presented in Appendix B.
Thus, we write the subleading matching coefficient in the form Here, we approximate the square of the momentumq µ = q µ − v µ uω 2 /2 asq 2 q 2 − uω 2 m b and adopt the convention ε 0123 = +1. Even though Eq. (2.11) has a slightly different form with respect to the matching coefficient presented in Ref. [11], we emphasize that the two results are analytically equivalent. The form Eq. (2.11) has also been used to calculate the numerical results of Ref. [11] in a non-public Mathematica code 3 . Therefore, we fully confirm the analytic results of Ref. [11] for the tensor-valued matching coefficient due to soft-gluon interaction at subleading power in the LCOPE.

Calculation of the Non-Local Hadronic Matrix Elements
We proceed to calculate the hadronic matrix elements of the operator using LCSRs with B-LCDAs. To derive the sum rule, we start defining the correlator where j Γ ν ≡q 1 Γ ν s is the interpolating current, withq 1 = {ū,d,s} depending on the decay channel. The Dirac structure Γ ν of the interpolating current is chosen such that the respective M -to-vacuum matrix element does not vanish. We use (2.14) The light-cone sum rule calculation of the matrix elements ofÕ µ (q) is based on a LCOPE of the correlator Eq. (2.13) in the framework of heavy quark effective theory. Here, the momentum q is fixed according to the considerations in the previous section. For a fixed q, we now need to ensure that k 2 is chosen appropriately, i.e. that the expansion of Eq. (2.13) is dominated by bi-local operators with light-cone dominance. We find that for −k 2 ∼ a few Λ 2 the correlator is dominated by contributions at light-like distances y µ (y · n − ) n µ + 2 such that 3 We are very grateful to Yu-Ming Wang for sharing this code with us. y · q (y · n − )4m 2 c /m b [11]. Therefore, we can expand the strange quark propagator near the light-cone as well, keeping only the leading power term. We obtain where a, b are spinor indices. The non-local B-to-vacuum matrix elements can be expressed in terms of B-LCDAs. While in the sum rules of local form factors the leading contribution involves only two-particle B-LCDAs, the leading contribution in Eq. (2.15) comes from the three-particle B-LCDAs. The three-particle distribution amplitudes of the B-meson can be defined as [47,48] where we have suppressed the arguments of the B-LCDAs, e.g. ψ A ≡ ψ A (ω 1 , ω 2 ), for brevity. The derivatives, which act only on the hard-scattering kernel, are abbreviated as ∂ µ ≡ ∂/∂l µ , with l µ = (ω 1 + uω 2 )v µ . In addition, we introduced the shorthand notation where ψ 3p represents any of the three-particle B-LCDAs appearing in Eq. (2.16). In Ref. [11] only the B-LCDAs ψ A , ψ V , X A , and Y A have been taken into account. The distribution amplitudes of Eq. (2.16) have no definite twist, which is defined as the difference between the dimension and the spin of the corresponding operator. It is important to express the B-LCDAs in Eq. (2.16) in terms of B-LCDAs with definite twist, to ensure a consistent power counting. The relations between the twist basis and the basis of Eq. (2.16) are given in Ref. [48]. Inverting these relations, one obtains (using the same nomenclature as in Ref. [35]) where the subscripts 3, 4, 5, 6 indicate the twist of the respective B-LCDA. We include in our calculation contributions up to twist four with the models given in Section 5.1 of Ref. [48].
The models for B-LCDAs of twist five or higher are not currently known. However, they "are not expected to contribute to the leading power corrections O (1/M B ) in B decays" [48].
To proceed with the extraction of our sum rule, we need to obtain the hadronic dispersion relation of the correlator (2.13). This can be done by inserting a complete set of states with the appropriate flavour quantum numbers in between the interpolating current j Γ ν and the operatorÕ µ : where the sum runs over all the possible polarizations. The function ρ Γ µν (s, q 2 ) is the spectral density, which encodes the information about the continuum as well as the exited states, while s h denotes the continuum states threshold. The matrix elements of the interpolating current are expressed in terms of the decay constants f P for P = K and f V for V = K * , φ: The matrix elements of the operator are decomposed in terms of the scalar valued functions V B→M λ : where the definitions of the Lorentz structures S λ are given in Appendix A.
The expressions for the non-local form factors H B→M λ then read where the ellipses stands for higher powers in the LCOPE and spectator scattering interactions. The definitions of the form factors F B→M λ are given in Appendix A as well.
The last step in the calculation of the sum rule is to match the LCOPE result of Eq. (2.15) into the hadronic representation. To get rid of the contributions of the excited and continuum states of Eq. (2.18), we exploit the semi-global quark-hadron duality approximation. We also perform a Borel transform to reduce the impact of potential quark-hadron duality violations. In order to isolate the individual contributions of the functionsṼ B→M λ , we select a suitable Lorentz structure PṼ µν . Hence, we decompose the two-point functions in terms of scalar-valued functions ΠṼ (k 2 , q 2 ): The LCOPE results for the functions ΠṼ can always be written in the form where we have introduced the new variable σ = ω 1 /M B and defined For ease of comparison, we give our results in the basisÃ,Ṽ 1 ,Ṽ 2 ,Ṽ 3 as in Ref. [11] instead The relations between these two bases read is the Källén function. For convenience, we have also introduced the functionṼ We can now write down the sum rule for any of the quantitiesṼ =Ã,Ṽ 1 ,Ṽ 2 ,Ṽ 23 , which readsṼ with IṼ n ≡ JṼ n /NṼ . We abbreviate σ 0 ≡ σ(s 0 , q 2 ), s (σ, q 2 ) ≡ ds(σ, q 2 )/dσ, and the differential operator d dσ Here s 0 is the effective threshold s 0 of the sum rule, which differs in general from the continuum threshold s h . The functions IṼ n can be represented as integrals of the three-particle B-LCDAs consists of a universal part and a structure dependent part: The quantities PṼ µν , NṼ , and KṼ 2 are listed in Table 1, while the coefficients C (Ṽ ,ψ 3p ) n,r of Eq. (2.32) are provided in an ancillary Mathematica file.
We do not expect to find full agreement between our results and those of Ref. [11] for the matching coefficients CṼ ,ψ 3p n,r . The reason is that our results are expressed in terms of the full set of three-particle B-LCDAs as discussed in Ref. [48], while the results of Ref. [11] use an incomplete set of Lorentz structures and B-LCDAs. For the calculation of local form factors, this issue is not numerically relevant, since the three-particle contributions are numerically small compared to the leading twist and even the next-to-leading twist two-particle contributions; see also the discussion in Ref. [35]. For this particular LCSR calculation, the two-particle contributions are absent and hence the three-particle contributions are numerically leading.
Our main results can be summarized as follows: • Restricting our results to the same set of Lorentz structures and independent threeparticle B-LCDAs as in Ref. [11], we find full agreement with the results of that paper.
• Using the full set of three-particle B-LCDAs, the thresholds setting procedure of Ref. [49] produces results that are compatible with the thresholds obtained for the local form factors in Ref. [35]. This is not the case when restricting our analytical expression to the subset of Lorentz structures as discussed in the previous point.
• Our final results are one order of magnitude smaller than in Ref. [11], when using the same input parameters as in that paper. This difference becomes even larger when using up-to-date inputs, as explained in detail in the next subsection. We find that this reduction in size arises from cancellations across the Lorentz structures, since the "new" structures enter the coefficient functions with opposite signs. Consequently, the phenomenological impact of the soft-gluon contribution to the non-local matrix elements is significantly reduced in the region where the LCOPE is applicable.

Numerical Results
The values of the parameters used in our numerical analysis are collected in Table 2. For the B → K and B → K * transitions, these parameter coincides with the ones used in Ref. [35]. In particular, we employ the same effective thresholds s 0 given in that paper. This ensures consistency when using both local and non-local form factors in a simultaneous phenomenological analysis.
For the B s → φ transition, we need additional parameters that are not discussed in Ref. [35]. We follow the procedure outlined in Ref. [54] to estimate the first inverse moment 1/λ Bs,+ of the leading twist B s -LCDA. This estimate agrees with the recent calculation carried out in Ref. [57]. The values of the parameters λ 2 Bs,E and λ 2 Bs,H , which enter in the models of B s -LCDAs, are assumed to be equal to λ 2 B,E and λ 2 B,H , respectively. Given the large uncertainties of these latter parameters, we expect potential SU (3)-flavour symmetrybreaking effects to be negligible. To determine the effective threshold for the LCSRs in the B s → φ transition, we first calculate the local B s → φ form factors using the analytical results of Ref. [35]. In fact, these results can be employed to predict the local form factors for any B → V transition. We can then set the effective threshold applying the procedure described in Ref. [49]. Our numerical predictions of the local B s → φ form factors are given in Appendix D. Note that this is the first calculation of these form factors using LCSRs with B-LCDAs.
For all the transitions considered, we vary the scale of the charm quark mass in the MS scheme between m c itself and 2m c and the Borel parameter in the interval 0.75 GeV 2 < M 2 < 1.25 GeV 2 , as in Ref. [11]. We verified that in this Borel window the tail of the LCOPE result is much smaller than the LCOPE result integrated between 0 and σ 0 . Moreover, the sum rule dependence on M 2 is mild (< 6% in the Borel window considered here) and negligible compared to the parametric uncertainties in our calculation.
We can now evaluate the sum rule (2.30) for B → K ( * ) and B s → φ transitions. The computer code needed to obtain our numerical results will be made publicly available under an open source license as part of the EOS software [58]. Our predictions forÃ,Ṽ 1 ,Ṽ 2 , andṼ 3 are shown in Table 3. For the B → K ( * ) transitions we also compare our results with Ref. [11], while the results for the B s → φ transition are calculated for the first time. One can easily observe that our results are roughly two orders of magnitude smaller than in Ref. [11]. As explained in the previous subsection, one order magnitude can be attributed to the different treatment of the three-particle B-LCDAs between the two papers. The remaining difference is due to the updated input parameters used in our numerical analysis, in particular to the values of λ 2 B,E and λ 2 B,H . These parameters enter as the normalization of the three-particle B-LCDAs, and have therefore a large impact on the overall size of the hadronic matrix elements. In Ref. [11] the approximation λ 2 B,E = λ 2 B,H is adopted, which is not justified by calculations of these parameters [22,55,59]: In Ref. [11] it is also assumed that λ 2 B,E = 3 2 λ 2 B,+ , based solely on the desire that the exponential model for the leading-twist B-LCDA satisfies exactly the Grozin-Neubert relations [59]. This assumption yields a central value for λ 2 B,E approximately 20 times bigger than the one found in its most recent calculation [55]. We do not use this rather strong assumption, and use the calculated values instead.
We emphasize that although the relative uncertainties of our results listed in Table 3 are similar to the ones in Ref. [11], our absolute uncertainties are much smaller.  4 The uncertainties of the local form factors can be further reduced by using combined fits to lattice QCD and LCSR results [35,36].
In anticipation of the next section, we keep only the contributions proportional to the charm quark electric charge Q c in the above results. Their uncertainties are negligible compared to the ones of the local form factors. The findings of Ref. [11] imply that the next-to-leading power contribution in the LCOPE could be larger than the leading power contribution in the computation of the non-local form factors H B→M λ . This has been cause for concern about the rate of convergence of the LCOPE even at spacelike momentum transfer. One of the main findings of our work is that the contribution at next-to-leading power is in fact negligible compared to the theory uncertainties of the leading-power term. The authors of Ref. [11] come to a different conclusion, due to the missing terms in the calculation of the hadronic matrix element. As a consequence, theoretical predictions of the H B→M λ are dominated by the leading-power of the LCOPE, i.e. stem from the first two terms in Eq. (2.24). Therefore, our findings thoroughly eliminate the concern and give confidence that a precise theoretical prediction in the spacelike region is now possible.

Dispersive Bound
The results for the non-local form factors H B→M λ of Section 2 need to be analytically continued from the spacelike region of q 2 , where they are obtained, to the timelike region, where they are required for phenomenological studies. This requires a suitable parametrization of the hadronic matrix elements. Previous parametrizations based on series expansions involve an uncontrollable truncation error [11,13,15,16]. In the case of local form factors, this problem is solved by imposing dispersive bounds, which provide control over the systematic truncation errors by turning them into parametric errors. However, no such bound has been derived for  non-local matrix elements as of yet.
The purpose of this section is to derive the dispersive bound for the non-local matrix elements for the first time. To this end, we construct a parametrization of these matrix elements that manifestly satisfies a dispersion relation derived from the total cross section of e + e − → bsX. We obtain the dispersive bound by matching two representations of the discontinuity due to bs on-shell states of a suitable correlation function Π: Here, the operators O µ (q; x) and O †,ν (q; 0) are defined as 5 The discontinuity Disc bs Π satisfies a subtracted dispersion relation: Here n is the yet-to-be-determined number of subtractions and Q 2 is the subtraction point, chosen so that an OPE can be performed. We first isolate the contribution to Π that stems exclusively from the bs cut in Section 3.1, thereby determining Disc bs Π using a local OPE. We then derive the hadronic dispersion relation for χ in terms of the non-local hadronic matrix elements in Section 3.2. We then use our knowledge from the previous two sections to construct a parametrization of the non-local hadronic matrix elements that manifestly fulfils a dispersive bound on its parameters in Section 3.3. We finally present a practical application of the dispersive bound in Section 3.4.

Calculation of the Discontinuity in a Local OPE
We now calculate the contributions to the discontinuity of the correlation function (3.1) that exclusively arise from bs-flavoured intermediate states.
To simplify this task, we use the lowrecoil OPE for the operators in Eq. (3.2) well within its region of applicability, that is for Here, d is the mass dimension of the local operators O µ d,n , while n labels the different operators with the same mass dimension. The Wilson coefficients of operators of dimension d scale as The first few operators in this expansion read [24] (3.5) The Wilson coefficients of the leading dimension-three operators are given by 2c (q 2 ) , 1c (q 2 ) + C 2 F 2c (q 2 ) , where we use the same definition of f LO and F (7,9) 1c,2c as in Ref. [30], thereby only retaining the contributions proportional to the charm quark electric charge Q c . Here and in this section we use α s ≡ α s (m b ). Already in the calculation of C 3,1 at leading order in α s one encounters UV divergences in dimensional regularization. Hence, the coefficients C 3,j are always understood to be renormalized [25].
A few comments are in order regarding the OPE: • The Wilson coefficients of the operators of dimensions 4 and 5 start at O (α s ); • operators of dimensions d = 3, 4 interfere with each other, but these interference terms arise only at order α s m s /m b ; • the operators at dimension d ≥ 5 do not interfere with the ones at mass dimension d = 3 or d = 4 to leading-order in α s .
Based on the above, we adopt the power counting ε 2 ∼ Λ had /m b ∼ α 2 s . Thus, up to corrections of order ε 3 , we can express the discontinuity of Π as Disc bs Π OPE (s) = |C 3,1 (s)| 2 Disc bs Π 1,1 (s) + |C 3,2 (s)| 2 Disc bs Π 2,2 (s) where we have used the short hand notation It is instructive to investigate the perturbative expansion of Disc bs Π OPE by writing Π i,j (s) = (3.9) To LO we find the compact expressions Here λ kin ≡ λ(m 2 b , m 2 s , s). The NLO expressions have been calculated in the context of the gauge boson self-energies [60]. The one needed here is given by where Im Π + T (s) has been calculated in Ref. [60]. Inserting these results in Eq. (3.3) we find that at least two subtractions (n = 2) are needed in Eq. (3.3), and thus (3.14) For

Hadronic Dispersion Relation
By means of unitarity, the discontinuity of the correlation function Π can be expressed in terms of a sum of sesquilinear combinations of hadronic matrix elements: The one-body contributions to Disc bs Π involveB * s -to-vacuum matrix elements of the nonlocal operators. While we do not include these contributions here, they can be easily accounted for in future works. The two-body contributions to Disc bs Π arise from intermediateBK, BK * ,B s φ and further bs states that also include baryons, such as Λ bΛ . Their contributions to Disc bs Π can be expressed as follows: Here H b and Hs denote hadrons with flavour quantum numbers B = −1 and S = 1, respectively, and the two-body phase space measure is given by Since we work in the isospin limit, the contributions due toB 0 K ( * )0 and B − K ( * )+ are identical. Hence, we simply multiply theB 0 K ( * )0 = B − K ( * )+ ≡BK ( * ) contributions by a factor of 2. Keeping only the contributions due toBK ( * ) andB s φ, and using Eq. (A.10), we find 3 32iπ 3 Disc bs Π had (s) = The non-local matrix elements H B→M µ develop a series of branch cuts starting at q 2 = 4M 2 D , that is below the (M B + M M ) 2 threshold. Although they do not contribute to Disc bs Π, they still spoil the analyticity of the non-local form factors H B→M λ in the semileptonic region 0 ≤ q 2 ≤ (M B − M M ) 2 , which is the phenomenologically interesting one. This makes the derivation of the bound for non-local matrix elements considerably more complicated than the one for local matrix elements (see e.g. Ref. [62]). In particular, it implies that the coefficients of the Taylor expansion in the variable z of the non-local form factors do not fulfil a dispersive bound. In the next section, we show that appropriately chosen functions of z, which fulfil a non-trivial orthogonality relation on the integration domain, cure this problem.

Derivation of the Bound
We start by matching the OPE result onto the hadronic representations of χ(Q 2 ) -defined in Eq. (3.3) with n = 2 -by means of global quark-hadron duality: We then use Eq. (3.19) to rewrite Eq. (3.20) as a dispersive bound on weighted integrals of the hadronic matrix elements: Following the usual procedure to obtain dispersive bounds of the parameters of the hadronic matrix elements [63], we define the map Here s + is the lowest branch point of the matrix element and s 0 can be chosen freely in the open interval (−∞, s + ). In our case, we have s + = 4M 2 D rather than (M B + M K ( * ) ) 2 as in the case for the local B → K ( * ) form factors. Using this map and the fact that z = e iα on the unit circle, we obtain + further positive terms , (3.23) where the integral limits are given by with s XY defined previously below Eq. (3.18). The central improvement of this paper is the change of the parametrization discussed in Ref. [16] to one that fulfils a dispersive bound. As in that paper, we remove the dynamical singularities of the non-local form factors H B→M λ using the Blaschke factor (3.25) Here z ψ = z(s = M 2 ψ ) is the location of the two narrow charmonium poles in the complex z plane. These are the only poles on the open unit disk. In addition, to formulate the bound in a concise form and to avoid kinematical singularities, we introduce suitable outer functions φ B→M λ (z) [64]. These outer functions are defined such that on the integration domain their modulus squared coincides with Eqs. (C.1)-(C.3) and they are free of unphysical singularities inside the unit disk. The precise form of these functions and their derivation is provided in Appendix C. We can then define the functionŝ 27) which are analytical on the open unit disk.
At this point, we can express the dispersive bound as The next step is to find a basis of orthogonal functions on an arc of the unit circle covering angles −α XY to +α XY . In lieu of a closed formula, we construct the first three orthonormal polynomials p X→Y n as p X→Y .
The higher order polynomials can be determined using an orthogonalization procedure. For an in-depth review of the mathematical properties of these orthogonal polynomials on the unit circle, we refer to Ref. [65]. The practical considerations of this application of the orthogonal polynomials are discussed in Ref. [66].
Using the orthogonal polynomials, we can now expand The dispersive bound then takes the simple form In that case, the z monomials constitute a complete and orthonormal basis of polynomials on the integration domain, which is the unit circle in the z plane. We have seen that the integration domain for the non-local form factor case is only an arc of the unit circle, due to the appearance of DD and similar branch cuts below theBM thresholds. As a consequence, the orthonormal polynomials in this integration domain are the ones given in Eq. (3.29). While these polynomials are clearly much more complicated than the z monomials, they allow us to write the dispersive bound in the diagonal form shown in (3.31). An inconvenient feature of the p B→M n polynomials is that their magnitude increases for n → ∞ in the semileptonic region. Nevertheless, since the series in Eq. (3.30) is convergent for |z| < 1 due to the analyticity ofĤ B→M λ , this only implies that the coefficients a B→M λ,n must fall off sufficiently fast such that higher order terms in the series are suppressed.
In the next subsection we present a simple application of the bound in Eq. (3.31) to H B→K 0 . We remark that it is also possible to expand theĤ B→M λ functions in terms of z monomials using the same Blaschke factors and outer functions give here. However, the coefficients of that expansion do not satisfy any dispersive bound.

Application toB →K + −
We now explore some of the implications of the dispersive bound (3.31). Considering only thē B →K + − contribution to the bound, we havê where we have assumed that the series expansion is truncated at n = N . Depending on the value of N , the bound sets a global constraint on the size of |Ĥ B→K 0 |. This is shown in the left panel of Figure 1, where the constraints for N = 0, 1, 2 are shown in yellow, green and blue, respectively. In order to express the result as a function of q 2 , we have taken s 0 = 0.
Of course, one may include the theoretical calculation based on the LCOPE at negative q 2 . In order to see how this information impacts the constraints on the size of |Ĥ B→K 0 |, we impose that H B→K 0 takes the central values quoted in Table 4 at q 2 = −1 GeV 2 and q 2 = −5 GeV 2 . At this point we need to make a choice on the value of the subtraction point Q 2 . Here we take Q 2 = −m 2 b in the outer functions. In the case N = 2, these two theory constraints fix two independent (complex-valued) combinations of a B→K 0,1,2 , leaving one complex free parameter. This free parameter is then constrained by the dispersive bound, and leads to the black region shown in the left panel of Figure 1. This black region could be regarded as an estimate of the truncation error when using two theory data points at negative q 2 to fix the N = 1 series expansion ofĤ B→K 0 (q 2 ). In relative terms, one may consider the ratio of the resulting allowed values forĤ B→K 0 (q 2 ) with N = 2 to the theoretical curve for the N = 1 expansion fixed by the two theory data points. This is shown by the black region in Figure 2. according to the dispersive bound on the expansion with one (orange), two (green) and three (blue) coefficients. The black region shows the allowed region for the three-coefficient expansion including two theory constraints at q 2 = −1 GeV and −5 GeV. Right: The same, assuming allB →K ( * ) + − andB s → φ + − modes contribute equally to the bound. In this case the region that includes the theory constraints is shown in red.  Figure 1). This example does not take full power of the dispersive bound, in particular by neglecting the fact that other modes (e.g.B →K * + − andB s → φ + − ) also contribute to the bound. An analysis that takes this into account will lead to simultaneously correlated bounds for all H B→K 0 (q 2 ), H B→K * λ (q 2 ) and H Bs→φ λ (q 2 ), and thus it is beyond the scope of this section. In order to estimate what the impact of adding these other modes might be in a simple setting, we can make the reasonable simplifying assumption that all eleven modes in Eq. (3.31) contribute equally to the bound, and thus take N n=0 a B→K n 2 < 1 11 (involves assumption) . (3.33) The corresponding bounds are shown in the right panel of Figure 1, and the red region in Figure 2. In this case the constraint is tightened by more than a factor of two. To finish, it is interesting to see what these bounds look like at the amplitude level in comparison to the contribution from C 9 . To that end, we consider the quantity , (3.34) which is defined in such a way that The corresponding limits on the magnitude of ∆C B→K 9,had (q 2 ) in the same circumstances as those in Figure 2 are shown Figure 3 (left panel). One can see that the non-local contribution cannot exceed (in magnitude) the SM value for C 9 (m b ) 4 for q 2 3 GeV 2 , or even for q 2 5 GeV 2 in the simplified scenario where the K * and φ modes are included.
The raise of the bound for q 2 → 9 GeV 2 is due to the fact that ∆C B→K 9,had (q 2 ) contains a pole at q 2 = M 2 J/ψ . In this sense it may be convenient to consider instead the combination P(z) × ∆C B→K 9,had (q 2 ) where the narrow charmonium poles have been removed. This is shown in the right panel of Figure 3. This quantity is also directly related with the B → Kψ n amplitudes, at q 2 = M 2 ψn , and thus the experimental measurement of these amplitudes can be used to constrain further the non-local form factor [11,12,16]. In particular, with the conventions of Ref. [16], . These experimental data points are shown in the right-hand plot of Figure 3, and are situated well within the dispersive bounds, as required. Including this experimental information in the determination of H B→K 0 (q 2 ) is rather important [11,12,16], since it essentially turns the extrapolation from the spacelike to the timelike region into an interpolation, up to undetermined strong phases that are not fixed by the non-leptonic amplitudes. The result of adding these two charmonium data points on the allowed ranges for |∆C B→K 9,had (q 2 )| is shown by the dashed region in the left-hand plot of Figure 3. In this case, only the theory data point at q 2 = −1 GeV has been kept, since otherwise the system would be overconstrained. Again, the resulting region is situated well within the dispersive bound.
A more detailed and complete phenomenological study of the implications of the dispersive bound and the determination of the non-local contributions toB → M amplitudes is left for future work.

Summary and Conclusions
The contributions from four-quark effective operators are an essential part of the exclusivē B (s) → {K ( * ) , φ} + − andB (s) → {K * , φ}γ amplitudes. These contributions must be under reasonable theoretical control in order to derive solid conclusions from the measurements of such decay observables. However, they enter the decay amplitudes through a non-local matrix element of non-perturbative nature, which is very difficult to calculate with controlled uncertainties. In this work we have revisited this non-local effect and made progress at two fronts.
First, we have recalculated the main subleading effect beyond the local OPE contribution, which arises from a soft gluon coupling to a quark loop. This recalculation involves a lightcone sum rule with B-meson light-cone distribution amplitudes (B-LCDAs), and improves upon the only previous calculation of this quantity by including the full set of B-LCDAs up to and including twist-four. Our reanalysis leads to a result for this soft-gluon effect that is two orders of magnitude smaller than the previous calculation. A substantial part of the difference is due to cancellations arising from the inclusion of the B-LCDAs that were missing. The remaining difference is due to updated inputs.
Second, we have revisited the analytic continuation of the non-local effect from the LCOPE region (where it is calculated) to the physical region relevant for B decays. In particular, we have proposed a modified analytic parametrization of the non-local matrix element and derived a dispersive bound that constrains this parametrization. The combination of our new parametrization with the dispersive bound allows for the first time to control the inevitable systematic truncation error from which every existing parametrization of the matrix elements suffers.
Our results lead to a better understanding of the non-local contributions to decay modes such asB →K ( * ) + − andB s → φ + − , which are currently under intense experimental and theoretical scrutiny. An important question is whether the anomalies observed in these modes are due to physics beyond the SM. Our ability to answer this question hinges on our ability to bound poorly known QCD effects that could potentially be responsible for the discrepancies. We can now say that the first subleading correction to the hadronic non-local contribution is very small in the decays considered here, giving support to theory calculations that neglect this subleading effect. In addition, our dispersive bound sets a solid ground for the analytic continuation of calculations from the LCOPE region to the physical one. Future phenomenological applications of the results presented here will lead to more accurate global analyses of b → s + − data.

Acknowledgements
We are very grateful to Alexander Khodjamirian, Sebastian Jäger, Roman Zwicky and Yu-Ming Wang for helpful discussions. We also would like to acknowledge helpful discussion with Christoph Bobeth, who participated in the first steps of this project. The

A Definitions of the Hadronic Matrix Elements
Following Ref. [16], we use a common set of Lorentz structure S λ to decompose both the local and the non-local hadronic matrix elements emerging in B → P (seudoscalar) and B → V (ector) transitions. In this work, we restrict ourselves to the B → K, B → K * , and B s → φ transitions, but the considerations of this appendix also apply to, e.g., the B s → K and B s → K * transitions.

A.1 B → P Transitions
For B → P transitions there are two independent Lorentz structures S λ µ in the decomposition of the matrix elements. Here µ is a Lorentz index and λ = 0, t denotes either longitudinal or timelike polarization of the underlying current. The structures read Using these structures, we decompose the B → P matrix elements into form factors as follows: Here and throughout this article we suppress the argument of the local and non-local form factors, which are functions of the momentum transfer squared: F B→M λ ≡ F B→M λ (q 2 ) and H B→M λ ≡ H B→M λ (q 2 ). The relations between our local form factor basis and the traditional basis of form factors (see, e.g., Refs. [35,56]) read The non-local form factor H B→P 0 defined in Eq. (A.4) is related to the non-local form factor H B→P defined in Ref. [11] through

A.2 B → V Transitions
For B → V transitions there are four independent Lorentz structures S λ αµ in the decomposition of the matrix elements. Here α and µ are Lorentz indices and λ =⊥, , 0, t denotes the different polarization of the underlying current. The structures read where λ kin ≡ λ(M 2 B , M 2 V , q 2 ) is the Källén function. We decompose the local and non-local B → V matrix elements as The minus signs in front of F B→V and F B→V 0 have been introduced to ensure that the form factors are all positive in the semileptonic phase space; the signs in front of H B→V and H B→V 0 then follow. The relations between our local form factor basis and the traditional basis of form factors (see, e.g., Refs. [35,56]) read The non-local form factor H B→V 0 defined in Eq. (A.10) is related to the non-local form factor H i defined in Ref. [11] through (A.12)

B Matching Coefficient at Subleading Power
The matching coefficient for the next-to-leading power of the LCOPE of correlator (2.2) was computed for the first time in Ref. [11]. The result was written in the form I µραβ (q, ω 2 ) = 1 8π 2 1 0 du ūq µqα g ρβ + uq ρqα g µβ −ūq 2 g µα g ρβ dI(q 2 ) dq 2 −ū − u 2 g µα g ρβ I(q 2 ) , (B.1) where I(q 2 ) = 1 0 dt ln µ 2 m 2 c − t(1 − t)q 2 , It is convenient to rewrite the function I(q 2 ) as Since the part of I µραβ proportional to I(q 2 ) is multiplied by (ū − u), the term ln µ 2 m 2 c vanishes after integrating over u. In addition, using the identity In this work we adopt the convention ε 0123 = +1.
Here s 0 is a parameter of the s → z mapping (3.22). The values of the parameters a, b, c, d of Eq. (C.5) for the outer functions considered in this work are listed in Table 5.