Long distance effects in inclusive rare $B$ decays and phenomenology of $\bar{B}\to X_d \ell^+\ell^-$

Rare inclusive $B$ decays such as $\bar{B}\to X_{s(d)} \ell^+\ell^-$ are interesting probes for physics beyond the Standard Model. Due to the complementarity to their exclusive counterparts, they might shed light on the anomalies currently seen in exclusive $b \to s$ transitions. Distinguishing new-physics effects from the Standard Model requires precise predictions and necessitates the control of long distance effects. In the present work we revisit and improve the description of various long distance effects in inclusive decays such as charmonium and light-quark resonances, nonfactorisable power corrections, and cascade decays. We then apply these results to a state-of-the-art phenomenological study of $\bar{B}\to X_d \ell^+\ell^-$, including also logarithmically enhanced QED corrections and the recently calculated five-body contributions. To fully exploit the new-physics potential of inclusive flavour-changing neutral current decays, the $\bar{B}\to X_d \ell^+\ell^-$ observables should be measured in a dedicated Belle II analysis.


Introduction
Since the Higgs discovery at the LHC in 2012 [1,2] completed the particle content of the Standard Model (SM) of particle physics, no new fundamental degrees of freedom have been discovered in direct searches for physics beyond the SM (BSM). The current situation therefore underlines the importance of indirect searches for BSM particles via virtual effects. The latter require precision studies of low-energy observables, most prominently in quark and lepton flavour physics.
Inclusive flavour-changing neutral current (FCNC) decays of B mesons provide a perfect environment for this kind of program for several reasons. First, FCNC decays are especially sensitive to potential BSM effects because they proceed through loop-suppressed electroweak interactions in the SM. Second, the necessary precision can be achieved on both the theoretical and experimental side. Theoretically, inclusive FCNC B-meson decays can be reliably predicted using an Operator Product Expansion (OPE), in which non-perturbative effects appear as corrections to the partonic rate at inverse powers of the heavy b-quark mass.
The theory approach to inclusive FCNC decays is in this sense somewhat different compared to exclusive ones; in particular, the underlying hadronic uncertainties in inclusive modes are largely independent of those in exclusive transitions. Hence, one useful way to shed light on the nature of the anomalies currently outstanding in exclusive B decays at various experiments [3][4][5][6][7][8][9][10][11][12][13][14][15] is a cross-check via the corresponding observables in the inclusive modes. Indeed, a study on the combined new-physics sensitivity clearly revealed the synergy and complementarity of exclusive versus inclusive FCNC decays [16].
The FCNC decays that have been studied most intensively are $b \to s$ transitions. The amplitude for these decays contains the three combinations of Cabibbo-Kobayashi-Maskawa (CKM) elements $V_{ts}^* V_{tb}$, $V_{cs}^* V_{cb}$, and $V_{us}^* V_{ub}$. In an expansion in the Wolfenstein parameter $\lambda \approx |V_{us}| \simeq 0.22$ they start at orders $O(\lambda^2)$, $O(\lambda^2)$, and $O(\lambda^4)$, respectively. Neglecting $V_{us}^* V_{ub}$ compared to the other two and using CKM unitarity, the $b \to s$ amplitudes are thus proportional to the single combination $V_{ts}^* V_{tb}$. In $b \to d$ transitions the situation is different since $V_{td}^* V_{tb}$, $V_{cd}^* V_{cb}$, and $V_{ud}^* V_{ub}$ are all $O(\lambda^3)$. This renders the size of the $b \to d$ rate about two orders of magnitude smaller than its $b \to s$ counterpart. On the other hand, the $b \to d$ unitarity triangle is non-degenerate. Trading $V_{cd}^* V_{cb}$ for the other two via CKM unitarity, one obtains a piece proportional to $V_{td}^* V_{tb}$ (which is analogous to the $V_{ts}^* V_{tb}$ term in $b \to s$ transitions except for a replacement of the overall CKM factor) and a piece proportional to $V_{ud}^* V_{ub}$ that contains the effective operators $P^u_{1,2}$, whose matrix elements are not CKM-suppressed in the $b \to d$ case.
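The power counting above can be checked numerically. The following is a minimal sketch, assuming the leading-order Wolfenstein parametrisation and purely illustrative values for $\lambda$, $A$, $\rho$ and $\eta$ (none of these numbers are taken from the text); the helper names `ckm_factor` and `summary` are hypothetical.

```python
# Leading-order Wolfenstein parametrisation of the CKM matrix.
# ASSUMPTION: the numerical inputs below are illustrative, not the paper's inputs.
lam, A, rho, eta = 0.225, 0.82, 0.16, 0.35

V = {
    "ud": 1 - lam**2 / 2, "us": lam, "ub": A * lam**3 * (rho - 1j * eta),
    "cd": -lam, "cs": 1 - lam**2 / 2, "cb": A * lam**2,
    "td": A * lam**3 * (1 - rho - 1j * eta), "ts": -A * lam**2, "tb": 1.0,
}


def ckm_factor(q, D):
    """Return V*_{qD} V_{qb} for an up-type quark q and a down-type quark D."""
    return complex(V[q + D]).conjugate() * V[q + "b"]


def summary(D):
    """Return the three CKM combinations for b -> D and the modulus of their sum."""
    factors = {q: ckm_factor(q, D) for q in "uct"}
    return factors, abs(sum(factors.values()))


b_to_s, closure_s = summary("s")
b_to_d, closure_d = summary("d")

for label, factors in (("b->s", b_to_s), ("b->d", b_to_d)):
    for q, f in factors.items():
        print(f"{label}  q={q}:  |V*V| = {abs(f):.2e}")
```

In this truncated parametrisation the three $b \to d$ combinations come out with comparable moduli, while the $u$-quark combination in $b \to s$ is suppressed by roughly $\lambda^2$ relative to the other two; the unitarity sums close only up to the higher-order terms dropped here.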
While $b \to d\,\ell^+\ell^-$ decays have played little role in the program of flavour experiments so far because of their low statistics, they will become accessible in the Belle II era. A naive rescaling of the corresponding $\bar{B}\to X_s \ell^+\ell^-$ errors given in [16] (without taking detector efficiencies etc. into account) shows promising prospects for this decay at Belle II. Therefore, it would be worthwhile to carry out a dedicated $\bar{B}\to X_d \ell^+\ell^-$ analysis at Belle II. Besides serving as a cross-check of inclusive $\bar{B}\to X_s \ell^+\ell^-$ and exclusive $b \to d\,\ell^+\ell^-$ measurements, it has the potential to yield important information on the phenomenon of CP violation.
On the theoretical side the latest phenomenological study of $\bar{B}\to X_d \ell^+\ell^-$ dates back fifteen years [17]. Since it is based on short-distance partonic contributions only and includes neither power corrections nor effects from resonances, it lacks many features that are inherent to inclusive semileptonic FCNC decays. In view of the prospects on the experimental side, a new theory analysis of $\bar{B}\to X_d \ell^+\ell^-$ including nonperturbative features is therefore timely.
In the theoretical description of inclusive $\bar{B}\to X_d \ell^+\ell^-$ decays, many of the results obtained for inclusive $\bar{B}\to X_s \ell^+\ell^-$ apply after trivial modifications. The short-distance partonic amplitude of the latter is known to NLO [18,19] and NNLO [20][21][22][23][24][25][26][27][28][29][30][31][32][33] in QCD, and to NLO in QED [34][35][36]. Power corrections that scale as $1/m_b^2$ [37][38][39][40], $1/m_b^3$ [41,42], and $1/m_c^2$ [43] have been analysed. The contributions specific to $\bar{B}\to X_d \ell^+\ell^-$ decays are available from [17,44], where two-loop virtual and bremsstrahlung corrections involving $P^u_{1,2}$ have been computed. Recently, contributions from multi-particle final states at leading power have also been calculated analytically [45]. In the present work we derive the logarithmically enhanced QED corrections to the matrix elements of $P^u_{1,2}$. Whereas the theoretical prediction of the branching ratio in the low-$q^2$ region is well under control and a precision below $\sim 8\%-10\%$ can be achieved, the same quantity in the high-$q^2$ region suffers from large uncertainties of $O(40\%)$ due to the breakdown of the heavy-mass expansion near the kinematic endpoint: the partonic rate tends to zero while the local $1/m_b^2$ and $1/m_b^3$ power corrections within the heavy mass expansion approach a finite, non-zero value. It was found in [46,47] that the expansion is effectively in inverse powers of $m_b(1 - \sqrt{s_{\min}}/m_b)$ and depends on the lower dilepton mass cut $s_{\min}$. Therefore, only integrated observables are meaningful in the high-$q^2$ region. In practice the large uncertainty originates from poorly known HQET matrix elements of dimension-five and dimension-six operators that scale as $1/m_b^2$ and $1/m_b^3$, respectively. In the present work we obtain their values and uncertainties from analyses of moments of inclusive charged-current semi-leptonic $B$ [48] and $D$ decays [49].
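To make the size of the effective expansion scale concrete, the following worked estimate may help. It assumes an illustrative $b$-quark mass of $m_b \approx 4.8$ GeV (this value is an assumption, not an input quoted in the text) together with the high-$q^2$ cut $s_{\min} = 14.4~\text{GeV}^2$ used in this work.

```latex
\[
  m_b\left(1 - \frac{\sqrt{s_{\min}}}{m_b}\right)
  = m_b - \sqrt{s_{\min}}
  \approx 4.8~\text{GeV} - \sqrt{14.4}~\text{GeV}
  \approx 4.8~\text{GeV} - 3.8~\text{GeV}
  \approx 1.0~\text{GeV}.
\]
```

Under this assumption the power corrections are effectively organised in inverse powers of a scale of about 1 GeV rather than $m_b$, which illustrates why they are numerically much more important in the high-$q^2$ region.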
We emphasize that the precision of theoretical predictions for semileptonic FCNC decays in the high-$q^2$ region would greatly benefit from further studies and lattice calculations of these HQET matrix elements. In order to reduce the uncertainties from $1/m_b^2$ and $1/m_b^3$ corrections, it was proposed in [42] to normalise the $\bar{B}\to X_s \ell^+\ell^-$ rate to the inclusive semi-leptonic $\bar{B}^0 \to X_u \ell\nu$ rate with the same dilepton mass cut. Subsequent phenomenological analyses indeed showed a pronounced reduction of the uncertainties for $\bar{B}\to X_s \ell^+\ell^-$ [35,36], and we confirm this behaviour for $\bar{B}\to X_d$ [...] with a $u\bar{u}$ component such as $\rho$ and $\omega$ are also relevant. The resonance regions can be removed by appropriate kinematic cuts in the dilepton invariant mass squared $q^2$. This leads to the so-called low-$q^2$ region $1~\text{GeV}^2 < q^2 < 6~\text{GeV}^2$ and the high-$q^2$ region with $q^2 > 14.4~\text{GeV}^2$. Whereas the low-$q^2$ region is only affected by the tail of the $c\bar{c}$ peaks, rather broad resonances are present in the high-$q^2$ region itself. One way of dealing with the resonances was proposed by Krüger and Sehgal (KS) [50,51]. They relied on the assumption that the $c\bar{c}$ loop and the $b \to s(d)$ transition factorise into two color-singlet currents, and used a dispersion relation to connect the electromagnetic vacuum polarisation, whose imaginary part is proportional to the hadronic R-ratio, to the $b \to s(d)\,c\bar{c} \to s(d)\,\ell^+\ell^-$ long distance amplitude. In the present work, we revisit, refine and improve the KS approach in several respects. We use all available data from BESII and BaBar on $e^+e^- \to$ hadrons as well as from ALEPH on $\tau \to \nu$ + hadrons for a precise description of the imaginary part of the vacuum polarisation. Moreover, we carefully investigate the impact of the choice of the subtraction point of the dispersive integral, and the replacement of the perturbative loop functions by the KS functions.
Finally, we comment on the size of the uncertainties that originate from the KS integral and their impact on the $\bar{B}\to X_d \ell^+\ell^-$ observables.
It has been pointed out in the literature that color-octet production of charmonium resonances can be sizeable [52][53][54][55][56], and that the pure color-singlet treatment by the KS approach does not capture the full size of the $\psi$ resonances. In the present article we further elaborate on the size and treatment of color-octet $c\bar{c}$ production and its impact on $B \to X_{s(d)} \ell^+\ell^-$ observables. To cure the situation, a purely phenomenological factor has been introduced [50] to reproduce the hadronic branching fraction $\mathcal{B}(B \to \psi X_s)$. However, as was already argued in refs. [30,35], the introduction of such a factor leads to double counting because nonfactorizable corrections due to a $c\bar{c}$ loop are already taken into account as one of the so-called resolved contributions in the low-$q^2$ region. These are nonlocal power corrections which occur when operators other than the leading ones are considered in the effective field theory. They indicate a breakdown of the local heavy mass expansion in $\Lambda_{\rm QCD}/m_b$. Recently the factorisation of these nonlocal power corrections in $\bar{B}\to X_s \ell^+\ell^-$ was analysed within the soft-collinear effective theory (SCET) [57,58] by systematically computing these resolved contributions in the low-$q^2$ region. Furthermore, it was shown that in the high-$q^2$ region the dominating power contribution (due to operators beyond the leading ones), the nonfactorizable $c\bar{c}$ contribution, can be expanded in local operators again and can be treated along the lines of [43,59]. The systematic SCET analysis of the resolved power corrections in the low-$q^2$ region allows for another crucial observation: the resolved $u\bar{u}$ contribution vanishes at order $\Lambda_{\rm QCD}/m_b$ in CP-averaged observables in the low-$q^2$ region [57,58], which significantly reduces the uncertainties in $\bar{B}\to X_d \ell^+\ell^-$. For the CP asymmetry, these long distance effects dominate, making our theoretical prediction less clean.
Nevertheless, the CP asymmetry remains an interesting observable because it might receive sizeable contributions from BSM effects. An additional long distance effect at low $q^2$ comes from cascade decays $\bar{B}\to X_1 (c\bar{c} \to X_2 \ell^+\ell^-)$ through the radiative decay of a narrow charmonium resonance such as $\eta_c$, $J/\psi$, $\chi_{cJ}$ etc. They form a background that has to be removed by suitable kinematic cuts. Inclusive radiative charmonium decays have been discussed in the context of $\bar{B}\to X_s \gamma$ [43] and $B \to X_s \ell^+\ell^-$ [54]. Here we revisit and systematically investigate the role these decays play as a background, as well as their dependence on a kinematic cut on the hadronic invariant mass $M_X$. A cut on the hadronic mass $M_X$ might still be required on the experimental side to remove other sources of background at Belle II. The effect of such an $M_X$ cut was previously analysed in $\bar{B}\to X_s \ell^+\ell^-$ in [60][61][62]. However, the authors of ref. [63] indicated a conceptual problem in those analyses, and the authors of refs. [57,58] showed that the assumption made in refs. [60][61][62] that the photon virtuality in the low-$q^2$ region scales as a hard mode in SCET is problematic, since the kinematics in the presence of an $M_X$ cut implies the scaling of $q^2$ as (anti-)hard-collinear in the low-$q^2$ region. This leads to a different matching and power counting, as well as to the existence of resolved contributions within SCET. It was shown in refs. [57,58] that the resolved contributions represent an irreducible uncertainty even in the absence of an $M_X$ cut. The results of the numerical analysis of these corrections, as given in [57,58], are used in the phenomenological part of the present paper. Finally, we emphasize that our predictions are given for the case without a hadronic mass cut, leaving such a study for future work. This article is organised as follows.
In sections 2 and 3 we define the $\bar{B}\to X_d$ observables under consideration and give master formulas for their phenomenological implementation, respectively. Section 4 contains our study of long distance contributions such as the $q\bar{q}$ resonances, cascade decays, and resolved contributions. Sections 5 and 6 contain the extraction of our input parameters and the phenomenological results, respectively. We conclude in section 7. Appendix A contains the expressions for certain two-loop functions, while appendix B contains those of the logarithmically enhanced QED corrections.

Definition of the observables
In this work, we consider the CP-averaged branching ratio, the forward-backward asymmetry and the CP asymmetry of the inclusive $\bar{B}\to X_d \ell^+\ell^-$ decay. Additional angular-distribution observables [64] are left for possible future studies in case they become accessible experimentally. As for the inclusive $\bar{B}\to X_s \ell^+\ell^-$ decays, appropriate kinematic cuts have to be applied in order to remove the large peaks of the $c\bar{c}$ resonances. Here we focus on two regions, the low dilepton mass region $1~\text{GeV}^2 < q^2 < 6~\text{GeV}^2$ and the high dilepton mass region $q^2 > 14.4~\text{GeV}^2$.
The definitions of the differential decay width $d\Gamma/dq^2$ and of the differential forward-backward asymmetry $dA_{\rm FB}/dq^2$ are given by [...] where $z = \cos\theta$ and $\theta$ is the angle between the $\ell^+$ and the $B$ meson in the dilepton rest frame. The differential forward-backward asymmetry $dA_{\rm FB}/dq^2$ is related to the angular-distribution observable $H_A(q^2)$ by [64] [...]. To remain consistent with [36], we give the master formula for $H_A(q^2)$, from which $dA_{\rm FB}/dq^2$ can be derived. We will calculate the normalized forward-backward asymmetry $A_{\rm FB}$ integrated in a region $q_m^2 < q^2 < q_M^2$. [...]
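For orientation, the definitions referred to in this paragraph can be written out as follows. This is a reconstruction based on the standard conventions for the angular decomposition of inclusive $b \to q\,\ell^+\ell^-$ decays as used in [64], not a verbatim copy of eqs. (2.1)-(2.4); in particular the label $q_M^2$ for the upper integration limit and the bracket notation $A_{\rm FB}[q_m^2, q_M^2]$ are assumptions.

```latex
\begin{align}
  \frac{d^2\Gamma}{dq^2\,dz} &= \frac{3}{8}\Big[(1+z^2)\,H_T(q^2) + 2z\,H_A(q^2) + 2(1-z^2)\,H_L(q^2)\Big], \\
  \frac{d\Gamma}{dq^2} &= \int_{-1}^{+1} dz\,\frac{d^2\Gamma}{dq^2\,dz} = H_T(q^2) + H_L(q^2), \\
  \frac{dA_{\rm FB}}{dq^2} &= \int_{-1}^{+1} dz\,\frac{d^2\Gamma}{dq^2\,dz}\,\mathrm{sign}(z) = \frac{3}{4}\,H_A(q^2), \\
  A_{\rm FB}[q_m^2, q_M^2] &= \frac{\int_{q_m^2}^{q_M^2} dq^2\,\dfrac{dA_{\rm FB}}{dq^2}}{\int_{q_m^2}^{q_M^2} dq^2\,\dfrac{d\Gamma}{dq^2}}.
\end{align}
```

The factors follow from $\int_{-1}^{1}(1+z^2)\,dz = \int_{-1}^{1}2(1-z^2)\,dz = 8/3$ and $\int_{-1}^{1} 2z\,\mathrm{sign}(z)\,dz = 2$.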
The integrations in eq. (2.4) are performed numerically.
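As an illustration of such a numerical integration, the sketch below computes a normalized, integrated forward-backward asymmetry as the ratio of two one-dimensional integrals using a composite Simpson rule. The differential shapes `toy_dgamma` and `toy_dafb` are hypothetical placeholders and carry no physical content; the actual integrands are the master formulas of section 3, and the quadrature used in the analysis is not specified in the text.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def normalized_afb(dafb_dq2, dgamma_dq2, q2_min, q2_max):
    """Ratio of the integrated asymmetry to the integrated rate over [q2_min, q2_max]."""
    numerator = simpson(dafb_dq2, q2_min, q2_max)
    denominator = simpson(dgamma_dq2, q2_min, q2_max)
    return numerator / denominator


# HYPOTHETICAL toy shapes (arbitrary units), used only to exercise the code.
def toy_dgamma(q2):
    return 1.0


def toy_dafb(q2):
    return 0.1 * (q2 - 3.5)


print(normalized_afb(toy_dafb, toy_dgamma, 1.0, 3.5))
print(normalized_afb(toy_dafb, toy_dgamma, 1.0, 6.0))
```

Because numerator and denominator are integrated separately before the ratio is taken, the result is a ratio of integrals rather than the integral of a ratio, in line with the statement above that only integrated observables are meaningful near the endpoint.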
In the high-$q^2$ region we include hadronic power corrections up to $O(1/m_b^3)$. As we will show, similar to the $b \to s$ case, the uncertainties on these power corrections dominate in that region. These uncertainties can be significantly reduced by normalising the $\bar{B}\to X_d \ell^+\ell^-$ decay rate to the semileptonic $\bar{B}\to X_u \ell\nu$ decay rate with the same $q^2$ cut [42]: [...] We note that the above quantities are all CP averaged. In addition, we also calculate the normalized integrated CP asymmetry, defined by [...]

Master formulas for the observables
As emphasized earlier, the inclusive $\bar{B}\to X_d \ell^+\ell^-$ decay distinguishes itself from $\bar{B}\to X_s \ell^+\ell^-$ since the $u$-quark current-current operators are not CKM suppressed and have to be taken into account. The effective Lagrangian is as follows [65,66], [...] where [...] and $q = u, d, s, c, b$ and $\ell$ runs over the three charged lepton flavours. Similar to the analyses for $\bar{B}\to X_s \ell^+\ell^-$ [34,35], we make a double expansion in $\tilde{\alpha}_s$ and $\kappa$ [...], and in the squared amplitude retain terms up to $O(\tilde{\alpha}_s^3 \kappa^3)$. In addition, we normalize our observables to the inclusive $\bar{B}\to X_c e\nu$ decay, [...] As mentioned above, among the angular observables $H_{T,A,L}$ [64] we consider only the branching ratio ($I = B = T + L$) and the forward-backward asymmetry ($I = A$). In the following, we give the expressions for $\bar{B}\to X_d \ell^+\ell^-$ only. From these, the CP-averaged quantities and the CP asymmetry can be trivially obtained. Here [32,67] [...] and $\Phi_u$ is defined by [34] [...] Explicitly [36], [...] where the $O(\alpha_s^2)$ terms are taken from [68]. Here, $n_h = 2$ and $n_l = 3$ are the numbers of heavy and light quark flavours, respectively, and $\beta_0^{(5)} = 23/3$ is the one-loop QCD $\beta$-function for five active flavours. We include explicitly the power-suppressed $1/m_b^2$ terms $\lambda_1$, $\lambda_2$, the $1/m_b^3$ term $\rho_1$ and the four-quark matrix element $f_u$. These matrix elements are defined and discussed in more detail in section 5.
The dimensionless function $\Phi_{\ell^+\ell^-}(\hat{s})$ arises from the matrix elements of all the operators and is written as [...] where $i, j = 1u, 2u, 1c, 2c, 3 \ldots 10, 3Q \ldots 6Q, b$. The low-scale Wilson coefficients $C_i^{\rm eff}$ are given explicitly (both analytically and numerically) in [34], and the $C_i^{\rm eff}$ differ from the $C_i$ only for $i = 7, 8$. To be specific, we use: [...] The different CKM prefactors are given by [...]. For the branching ratio, [...] (3.11). For $I = A$, only the interference terms contribute: [...] (3.12). The matrix elements $M_i$ will be discussed in section 3. For the relevant combinations, we find [...] (3.14). The one-loop QCD functions for the branching ratio, $\omega_{NM,B}(\hat{s})$, are given by eqs. (127) and (129)-(131) in [34] (see also [18,19,23,69]) and the non-vanishing function for $H_A$ is given in eq. (A.2) of [36] (see also [27,64]). The two-loop QCD function $\omega^{(2)}_{99,B}(\hat{s}) = \omega^{(2)}_{1010,B}(\hat{s})$ is given by [...] where $\beta_0^{(5)} = 23/3$. The $\omega = \hat{s}$ expanded result for the low-$q^2$ function $X_2(\omega)$ is given in eq. (60) of [70] and $X_0(\omega)$ in eq. (2) of [71]. The $\delta = 1 - \hat{s}$ expanded results for the high-$q^2$ functions $X_2(\delta)$ and $X_0(\delta)$ are given in eqs. (2) and (3) of [72]. Note that the normalizations of the $X_i$ differ by a factor of 2 between [70,71] and [72]. We have checked the consistency between eq. (3.15) and the fit results for the two-loop QCD functions $\omega^{(2)}_{99,T}(\hat{s})$ and $\omega^{(2)}_{99,L}(\hat{s})$ in eq. (A.3) of [36]. For the two-loop QCD function $\omega^{(2)}_{910,A}(\hat{s})$ for $H_A$, we use the fit result given in eq. (A.3) of [36], which was extracted from the fully differential calculation of the inclusive $\bar{B}\to X_u \ell\nu$ decay at two loops in QCD [73]. Other two-loop QCD functions such as $\omega$ [...]
[Table fragment: $1 + \tilde{\alpha}_s \kappa f_9^{\rm pen}(\hat{s})$, 0, 0; $i = 10$: 0, 0, 1]
The plus distribution can be defined via [...] These expressions have been checked to be consistent with [36-38, 40, 64].
The $O(1/m_b^3)$ corrections to the forward-backward asymmetries are missing, but fortunately we are only concerned with the forward-backward asymmetries in the low-$q^2$ region, where the $O(1/m_b^3)$ corrections are negligible. The quantities $\Delta H^I_{ij}$ contain additional corrections that can be parameterized as [...] where $b^I_{ij}$ represent finite bremsstrahlung corrections that appear at NNLO and are given in [17,25] for $I = B$ and in [17,28] for $I = A$. In addition, $c^I_{ij}$ ($u^I_{ij}$) are the non-perturbative $c$($u$)-loop power corrections, while $e^I_{ij}$ are the $\ln(m_b^2/m_\ell^2)$-enhanced electromagnetic corrections and $f^I_{ij}$ are five-body contributions. We discuss these contributions in the following subsections.

Matrix elements
The matrix elements entering the master formula in eqs. (3.11) and (3.12) are obtained from one-loop penguin contractions of the four-fermion operators. They are given by [...] and $f_9^{\rm pen}(\hat{s}) = 8 \ln$ [...]
Table 2. Coefficients $\rho_i^q$ and $\gamma_i$ occurring in the four-quark operator matrix elements in eq. (3.23).
Here $y_a = 4(m_{a,\rm pole}^2 - i\eta)/q^2$ with an infinitesimally small, positive quantity $\eta$ that takes care of the correct analytic continuation. Compared to the widely-used one-loop function $g(y_a)$ in the literature, we introduce $h(y_a) = g(y_a) + (8/9) \ln(m_b/m_a)$ here. Contrary to previous studies for $\bar{B}\to X_s \ell^+\ell^-$, we split the coefficients $\rho_i$ into their different quark flavour contributions $\rho_i^q$, collected in table 2. These numbers reduce to those presented in table 7 of [34] once the distinction between the light quark flavours is given up and $h(y_a)$ is traded for $g(y_a)$. The perturbative one-loop function is [...] for $y_a < 0$, which holds for massive particles, i.e. $a = b, c, \tau$. The harmonic polylogarithm of weight "+" simply reads $H_+(z) = \ln\frac{1+z}{1-z}$. For the light quarks $u, d, s$, this function reduces to [...] The two-loop contributions $F_i^A(\hat{s})$ for the $c$-operators $P^c_{1,2}$ valid in the low-$q^2$ region are given in [23] as an expansion in $\hat{s}$ up to $O(\hat{s}^3)$. In the high-$q^2$ region they were first calculated in [30] using a semi-numerical method and given analytically in [31]. For the $u$-operators, which correspond to the massless limit of the charm case, analytic functions for all $q^2$ are available [44], [...] where $A(\hat{s})$, $B(\hat{s})$ and $C(\hat{s})$ are listed in eqs. (29)-(31) of [44]. In a recent study [33] (see also [74]), the analytical expressions of the two-loop functions $F_i^A(\hat{s})$ for both the up and charm cases have been worked out for arbitrary $q^2$. As in our previous studies [34,35] we convert the pole masses of the bottom and charm quark perturbatively to a short-distance mass at the level of the squared amplitude in order to eliminate renormalon ambiguities.
When non-perturbative corrections are also considered, the factorizable pieces of these perturbative corrections are replaced by their corresponding Krüger-Sehgal (KS) functions, which we discuss in detail in section 4.1. Contrary to previous works, we replace not only the one-loop perturbative corrections $h(y_{u,d,s,c})$ by the KS functions, but also the factorizable pieces of the two-loop corrections $F_i^A$. These two-loop factorizable pieces, defined as $h_q^{(1)}$ with $q = u, d, s, c$, can be found in [33] and are listed in Appendix A.

Resolved photon corrections at low $q^2$
Contrary to previous studies in $\bar{B}\to X_s \ell^+\ell^-$ [36], we do not use the local description of the nonfactorizable $c\bar{c}$ power correction of order $O(\Lambda_{\rm QCD}^2/m_c^2)$ in the low-$q^2$ region [43,75]. Instead, we follow the recent analysis in [57,58] and adopt a more systematic approach to describe these so-called resolved power corrections, including the effect of non-local shape functions. We relegate the conceptual description of these contributions to section 4.2, and at this point quote the numerical result for the relevant contributions available up to order $\Lambda_{\rm QCD}/m_b$ [57,58]. The sum of the resolved contributions, including also an estimate of a numerically relevant term at quadratic order, leads to an additional uncertainty on the branching ratio of $[-4.9, +5.1]\,\%$. We add this uncertainty to our numerical results in section 6.1. For the forward-backward asymmetry, the first nontrivial resolved contribution is of order $\Lambda_{\rm QCD}^2/m_b^2$ and as yet unknown, which leads us to add an uncertainty of $\pm 5\%$ to our final result until an explicit estimate is available [76].
We emphasize that in the low-$q^2$ region the nonfactorizable $u\bar{u}$ power correction to CP-averaged observables vanishes due to specific properties of the corresponding shape functions. This contribution previously represented the main uncertainty in $b \to d$ decays (for more details see section 4.2).

Nonfactorizable power corrections at high $q^2$
For the high-$q^2$ region the power corrections from nonfactorizable $c\bar{c}$ and $u\bar{u}$ loops are available. They can be described as a local power correction as shown in [43] and discussed in more detail in section 4.3. For the $c\bar{c}$ contributions the coefficients are given by [43] [...] where $r = q^2/(4m_c^2) > 1$ for the high-$q^2$ region and [43] $F(r) = \frac{3}{2r}$ [...] For the $u\bar{u}$ contribution, the results are obtained by taking the $m_c \to 0$ limit of the $c^I_{ij}$ in eq. (3.33). Explicitly, [...]

Logarithmically enhanced electromagnetic corrections
The functions $e^I_{ij}$ describe the logarithmically enhanced electromagnetic corrections. For $I = B$, we find [...] with $a = u, c$ and $j = 7, 9$. The $\sigma$ functions were introduced already in eq. (3.14). Exact analytical expressions are available for most of the $\omega^{\rm (em)}_{ij}(\hat{s})$ functions [34][35][36] and for completeness are listed in Appendix B. However, the operators $P^u_1$ and $P^u_2$ induce additional functions which were not available until now. We derived these formulas following the methods discussed in [34], and list them in Appendix B.

Five-particle contributions
The five-particle processes $b \to d\,q\bar{q}\,\ell^+\ell^-$ at the partonic level also contribute to the inclusive $\bar{B}\to X_d \ell^+\ell^-$ decay. While similar contributions are CKM suppressed for the $b \to s$ transition, in $b \to d$ such five-particle contributions are of the same order in the Wolfenstein expansion as the partonic three-particle ones. The branching ratios and the forward-backward asymmetries of $b \to d\,q\bar{q}\,\ell^+\ell^-$ have been calculated at tree level in [45]. Correspondingly, the $f^I_{ij}$ functions in eq. (3.21) summarizing such contributions can be written as [...] for the others, where the functions $F_{ij}(\hat{s})$ and $A_i(\hat{s})$ can be found in eqs. (31) and (37)

Long distance contributions and backgrounds
If only the operators $P_{7,9,10}$ in the effective Hamiltonian were considered, the local heavy mass expansion would be applicable to $\bar{B}\to X_{s(d)} \ell^+\ell^-$ observables integrated over the hadronic mass $M_X$. Then, a local OPE would hold and the hadronic decay would be described in terms of the partonic decay plus local power corrections. Inclusion of operators other than $P_{7,9,10}$ introduces various other long distance effects, and the purpose of the present section is to categorize them for $\bar{B}\to X_{s(d)}$ [...] [54]. These resonances are a long-distance feature of the partonic decay; they are not power suppressed. While these two resonances can be removed by appropriate $q^2$ cuts, the persistence of higher charmonium resonances in the high-$q^2$ region renders a purely perturbative prediction unreliable there, even if the large $1/m_b$ corrections are taken into account. Perturbation theory is likewise unreliable in the light-quark resonance region $q^2 \lesssim 4~\text{GeV}^2$, affecting low- [...] To incorporate the resonances into the phenomenological analysis the Krüger-Sehgal (KS) approach [50] is adopted [35,36]. It connects the factorizable part of the resonant amplitude to the hadronic vacuum polarization, which can be extracted from $e^+e^- \to$ hadrons via a dispersion relation. In the present work we significantly improve the KS approach in several aspects. Following ref. [56] we use accurate interpolations of $e^+e^- \to$ hadrons data directly as opposed to parameterizations of the resonances, and in contrast to ref. [51] we show that in order to extract the $u\bar{u}$, $d\bar{d}$ and $s\bar{s}$ correlators we need to use $\tau$-decay data, which project out the $u$-quark vacuum polarization from the rest. We also investigate the uncertainties associated with the KS functions and find them to be small. Moreover, for the first time we properly combine resonant amplitudes and $O(\alpha_s)$ corrections.
Lastly, we emphasize that the subtraction point of the dispersion relation must be chosen large and negative to avoid sensitivity to vacuum condensates. We thoroughly investigate all these points in section 4.1.
At this point we want to pick up the issue of color-octet production of charmonium resonances. It was pointed out in the literature that this production mechanism leads to sizeable effects [52][53][54]56], and that the pure color-singlet treatment of the KS approach does not capture the full size of the $\psi$ resonances. In the case of the narrow $J/\psi$ and $\psi(2S)$ resonances, color-octet effects are of course very important at the position of the resonances, but due to their sharpness are confined to the close neighborhood of the peaks. One can therefore expect the low-$q^2$ region to remain unaffected by these effects. The high-$q^2$ region is more delicate in this respect since one integrates over broad resonances. However, there the non-factorisable $c\bar{c}$ resonances are included in the Voloshin term [43,75], which corresponds to a local power correction in the high-$q^2$ region (see below and section 3.3), as long as one considers integrals over sufficiently large dilepton invariant mass intervals. In that case one can, via global quark-hadron duality, expect that the color-octet induced "wiggles" average out and are effectively taken into account by the partonic description of the Voloshin effect. In total, we reason that the color-singlet resonances are under control with the KS method, and for the color-octet ones we correctly include their integral via the Voloshin term, thereby also avoiding double counting.
As pointed out above, the local heavy mass expansion breaks down if operators other than $P_{7,9,10}$ in the effective field theory are considered. This breakdown leads to nonlocal power corrections that can be described in the low-$q^2$ region within SCET using subleading shape functions [57,58]. In the high-$q^2$ region the Voloshin term already mentioned above can be expanded locally [43,75]. In sections 4.2 and 4.3 we review the essential conceptual steps which lead to this behaviour in the low- and high-$q^2$ regions, respectively, while the numerical impact of these findings was already given in sections 3.2 and 3.3.
Finally, in section 4.4 we emphasize that the charmonium cascade decays $\bar{B}\to X_1 (c\bar{c} \to X_2 \ell^+\ell^-)$, where the total $X_d = X_1 + X_2$ is measured, are not captured by other elements of our calculation and would form a large background in the low-$q^2$ region if not for the $M_X$ cut, which is very effective in removing them.

The Krüger-Sehgal approach
Under the assumption that the currents associated with the production of a vector hadronic system $V$ in $B \to X_d V$ and the subsequent electromagnetic decay $V \to \ell^+\ell^-$ factorize, the hadronization of the $X_d$ system following the electroweak decay is described by an OPE in $\Lambda_{\rm QCD}/m_b$, while the lepton pair production is modified by the quark vacuum polarization amplitudes accessible in hadron spectroscopy experiments. Krüger and Sehgal (KS) used $e^+e^- \to$ hadrons data and a dispersion relation for $B \to X_d \ell^+\ell^-$ applications [51], following similar work in $B \to X_s \ell^+\ell^-$ [50,77]. Here we supplement the procedure with data from inclusive hadronic $\tau$ decays for the first time in a data-driven analysis. The correlation functions between each individual quark current and the electromagnetic current which couples to the leptons are needed. We define the following KS function for each flavour, normalized in accordance with their evaluation in perturbation theory, cf. eqs. (3.25) and (3.26),
[...] where $Q_q$ is the quark charge and the currents are [...] The electromagnetic current in (4.1) guarantees that the correlator has a transverse structure according to the Ward identity. The contributions from the correlators $\langle J_{q_1} J_{q_2} \rangle$ between different quark flavours are systematically included here, although they are suppressed in perturbation theory at $O(\alpha_s^3)$. The imaginary part of the photon vacuum polarization $\Pi_\gamma$ in [...] is accessible in the inclusive cross section $\sigma_{\rm had}$ for $e^+e^- \to$ hadrons, represented in terms of the hadronic R-ratio [...] Similarly, the imaginary part of the charged vector current correlator $\Pi_{\bar{u}q}$ in [...] where $J^\mu_{\bar{u}q} = \bar{u}\gamma^\mu q$ ($q = d, s$), is related to the vector spectral function [...] which in turn parameterizes the nonperturbative effects in inclusive hadronic $\tau$ decays into [...] In sections 4.1.[...] and 4.1.3, we explain under what approximations ${\rm Im}[h_q^{\rm KS}]$ is obtained from $R_{\rm had}$ and $V_{1q}$. In particular, the charge-weighted sum of (4.1) is fixed to experiment: [...] Both the real and imaginary parts of $h_q^{\rm KS}$ appear in $\bar{B}\to X_d \ell^+\ell^-$ observables through interference effects with the short distance amplitudes. Therefore, the real parts are obtained through the subtracted dispersion relation [...] where $s_0 < 0$. The subtraction point $s_0$ should be chosen sufficiently large and negative such that $h_q^{\rm KS}(s_0)$ is dominated by short distance fluctuations of the correlator (4.1) and can be reliably computed in perturbation theory. This is especially important for light quark loops, for which the perturbative matrix elements in (3.26) diverge at $s = 0$. In the following we choose $s_0 = -(5~\text{GeV})^2$ to minimize the impact of higher order perturbative corrections which depend on $\log(s_0/\mu_b^2)$. When replacing the perturbative functions by the KS functions there are a number of subtleties at higher orders in the coupling and power expansion. The KS functions encompass factorizable corrections to all orders in $\alpha_s$. Therefore, the KS functions should replace not only the one-loop but also the $n$-loop factorizable perturbative contributions to avoid double counting. Factorizable QCD corrections are known analytically up to two loops (see [33]$^{1}$ and Appendix A). The procedure is schematically shown in the first line of [...]
Figure 3. The hadronic R-ratio [78] and spectral function from hadronic $\tau$ decay [90]. Under isospin considerations, $3V_{1d}$ is to be compared with the isovector contribution to $R_{\rm had}$ (from the $\rho$ resonance), with the discrepancy due to the $\omega$ shown in the subfigure. Flavour $SU(3)$ symmetry predicts $R_{\rm had} = 4V_{1d}$. For further details see section 4.1.3.
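The mechanics of obtaining a real part from an imaginary part via a subtracted dispersion relation can be illustrated on a toy function. The sketch below assumes the textbook once-subtracted form $h(s) = h(s_0) + \frac{s - s_0}{\pi}\int_{M^2}^{\infty} dt\, \frac{{\rm Im}\,h(t)}{(t - s_0)(t - s)}$ for $s$ below threshold, which may differ in normalization and conventions from the relation used in this work. The function `toy_h`, the threshold `M2`, and all numerical values are hypothetical and chosen only because the answer is known in closed form; no experimental data enter.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


M2 = 1.0  # HYPOTHETICAL threshold of the toy cut (arbitrary units)


def toy_h(s):
    """Toy analytic function with a cut for s > M2; real for s < M2."""
    return -math.log(1.0 - s / M2)


def toy_im_h(t):
    """Imaginary part of toy_h just above the cut (constant for this toy)."""
    return math.pi


def h_from_dispersion(s, s0, h_s0):
    """Once-subtracted dispersion relation for s < M2, subtracted at s0 < 0.

    The integral over t in [M2, inf) is mapped to u in [0, 1] via t = M2 / u.
    """
    def integrand(u):
        t = M2 / u if u > 0.0 else math.inf
        return toy_im_h(t) * M2 / ((M2 - s0 * u) * (M2 - s * u))

    return h_s0 + (s - s0) / math.pi * simpson(integrand, 0.0, 1.0)


s0 = -25.0  # large and negative subtraction point, in units of M2
for s in (-1.0, 0.5):
    print(s, h_from_dispersion(s, s0, toy_h(s0)), toy_h(s))
```

In the toy case the subtraction constant is known exactly; in the actual analysis it is the perturbative value of $h_q^{\rm KS}(s_0)$, which is why $s_0$ must lie where perturbation theory is reliable.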

Experimental inputs
For the flavour threshold regions we use a compilation of all available data on the hadronic $R$-ratio [78], in which the data are provided at centre-of-mass energy points with a total point-to-point covariance matrix. The BESII data [79] dominate the statistics in the charm threshold region. The broad oscillation at $\sqrt{s}\sim 1.6$ GeV due to the phase space enhancements of isobar processes, including $e^+e^-\to\rho^+\rho^-\to\pi^+\pi^-\pi^0\pi^0$, has been resolved by precise measurements of multi-body final states at BABAR [80][81][82][83][84][85][86][87][88][89]. The total nonstrange vector spectral function from $\tau$ decays is taken from ALEPH [90]. A compilation of the data is shown in figure 3. We do not use the strange spectral function because the vector (V) and axial vector (A) contributions are more difficult to distinguish experimentally in this case, and are currently only available in the form V+A. Outside the resonance region we supplement these data with the results of the program Rhad [91] [92]. The default decoupling scales $\mu_c = 2m_c$, $\mu_b = m_b$ and $\mu_t = m_t$ are used, and the scale is varied in the range $\sqrt{s}/2 < \mu < 2\sqrt{s}$ to estimate the effect of higher order corrections. In our data-driven approach, we integrate the data directly rather than fit them to a specific model for the resonances.
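Integrating tabulated data directly, with a point-to-point covariance matrix, can be sketched as follows. The trapezoid rule and the function names are illustrative assumptions, not the procedure actually used in the analysis. The point is only that a linear quadrature rule $I = \sum_i w_i y_i$ propagates the covariance $C$ as $\sigma_I^2 = w^T C\, w$.

```python
import math

def trapezoid_weights(x):
    """Weights w_i such that sum_i w_i * y_i is the trapezoid-rule integral."""
    w = [0.0] * len(x)
    for i in range(len(x) - 1):
        half = 0.5 * (x[i + 1] - x[i])
        w[i] += half
        w[i + 1] += half
    return w

def integrate_with_covariance(x, y, cov):
    """Return (integral, uncertainty) for data y at points x with covariance cov."""
    w = trapezoid_weights(x)
    value = sum(wi * yi for wi, yi in zip(w, y))
    n = len(w)
    variance = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return value, math.sqrt(variance)
```

With fully correlated points the uncertainty adds linearly across the integration range, whereas uncorrelated points average down; this is why the full covariance matrix matters for the integrated result.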
We note that a charmonium resonance can form from a vector current of light quarks in $e^+e^-\to$ hadrons through single photon or three gluon exchange. This mixing has a substantial effect on the CP asymmetry in $\bar B\to X_d\psi$ decays [93,94]. Since the QED correction contributes to the present calculation without logarithmic enhancement, we neglect it together with the comparable QCD correction$^2$, and in this case we expect no major nonperturbative enhancement away from the $\psi$ and $\psi'$ resonances.

Light quark resonances at low q 2
The most important new feature which we introduce in this paper for the light quark resonances is the inclusion of matrix elements $\langle J_q J_{q'}\rangle_{q\neq q'}$ involving different light-quark currents at very low $q^2$. As a consequence, $R_{\rm had}$ alone is not sufficient to extract ${\rm Im}[h^{\rm KS}_{u,d,s}(q^2)]$. The dominant contributions to the $e^+e^-\to$ hadrons OPE are from $\langle J_q J_q\rangle$ for $q = u, d, s$ as they enter at $O(\alpha_s^0)$. The leading power contributions to $\langle J_q J_{q'}\rangle_{q\neq q'}$ (see left panel of figure 4) are $O(\alpha_s^3)$ and therefore very small$^3$. An expression for these contributions can be found in [91]. This is what led the authors of ref. [51] to systematically neglect all terms with $q\neq q'$ in (4.1). Unfortunately, at low $q^2$ the OPE for $e^+e^-\to$ hadrons breaks down and we have to adopt a sum-over-hadrons picture. For instance, at very low $q^2$ ($\lesssim 1$ GeV$^2$) the dominant hadronic final states are two and three pions, corresponding essentially to the $\rho$ and $\omega$ resonances, and $\langle J_u J_d\rangle \sim \langle J_u|\pi\pi(\pi)\rangle\langle\pi\pi(\pi)|J_d\rangle \sim \langle J_u|\pi\pi(\pi)\rangle\langle\pi\pi(\pi)|J_u\rangle \sim \langle J_u J_u\rangle$. At larger $q^2$ ($\gtrsim 4$ GeV$^2$) there is a proliferation of multiparticle intermediate states and the OPE result that $\langle J_u J_d\rangle \sim 0$ is recovered via dramatic cancellations between various exclusive final states [95,96], as confirmed by lattice-QCD calculations [97].

$^2$ In [93] it was stated that three gluon exchange is comparable to single photon exchange because of the similarities of the branching fractions $B(\psi\to\ell^+\ell^-)$ and $B(\psi\to ggg)$.
$^3$ Moreover, in the $SU(3)$ limit, the sum of all these contributions vanishes due to an exact cancellation among light quark charges.
To quantify these effects, it is convenient to work in terms of a basis of neutral isospin currents, where $J_0$ and $J_s$ are singlets under isospin and $J_1$ transforms as a vector. The correlation functions $\Pi_{ab}$ of these currents describe the propagation of the relevant degrees of freedom in the low energy resonance region. Note that if the six correlators $\Pi_{ab}$ were known exactly, the six correlators between quark currents $\langle J_{q_1} J_{q_2}\rangle$ and finally the KS functions (4.1) would be determined exactly through simple relations at the operator level. We note in particular that the electromagnetic current and the $u$-quark current are given exactly by (4.13). In the isospin limit, the correlators $\Pi_{10}$ and $\Pi_{1s}$ vanish, (4.3) and (4.5) simplify, and the KS functions simplify as well. Since the KS functions in the isospin limit depend on the four correlators $\Pi_{11}$, $\Pi_{00}$, $\Pi_{ss}$, $\Pi_{0s}$ and only two observables, $R_{\rm had}$ and $V_{1d}$, are available, additional assumptions are required whose range of applicability depends on the energy. Below $K\bar K$ threshold, in addition to isospin symmetry, we assume that the hidden strange contributions to the final states $\pi^+\pi^-$ and $\pi^+\pi^-\pi^0$ are small (${\rm Im}[\Pi_{ss}] = {\rm Im}[\Pi_{0s}] = 0$). Then the Krüger-Sehgal functions as well as $R_{\rm had}$ and $V_{1d}$ depend only on $\Pi_{00}$ and $\Pi_{11}$, and inverting these equations yields the second column of table 3.

Table 3. Imaginary parts of the KS functions in various regions as determined from experimental data. The rescaled quantities entering the table are $4\pi/9\,R_{\rm had}$ and $4\pi/9\,V_{1d}$.
The φ resonance, which we identify as the region between KK and KKπ thresholds, decays predominantly into π + π − π 0 and KK final states, and these contributions are understood as contributions to Π ss up to rescattering effects suppressed in Π 0s and further suppressed in Π 00 , which we both neglect. The isovector background dominated by the tail of the ρ is understood from the τ data, using the isospin correspondence R I=1 had → 3V 1d (see figure 3). This yields the third column of table 3.
Above $K\bar K\pi$ threshold, all four correlators $\Pi_{ab}$ appearing in (4.15) are important. To proceed, we consider the consequences of enlarging the symmetry group to flavour $SU(3)$, introducing currents of which $J_0$ is an $SU(3)$ singlet and $J_{3,8}$ transform as vectors (the subscripts refer to Gell-Mann matrix indices). In the flavour symmetry limit, the correlators satisfy simple relations. Therefore, flavour symmetry predicts $R_{\rm had} = 4V_{1d}$, to be compared with experiment (figure 3). The difference between $R_{\rm had}$ and $4V_{1d}$ corresponds to the breaking of flavour symmetry, which is apparent but moderate. In the flavour symmetry limit the Krüger-Sehgal functions are independent of flavour, but there is a systematic error associated with the difference between $R_{\rm had}$ and $4V_{1d}$. We account for this by introducing standard normal variables $\delta_{u,d,s}$ which are varied in the error analysis (see fourth column of table 3). We investigated the dependence on the subtraction point $s_0$ and found it to be small.
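The flavour-symmetry relation $R_{\rm had} = 4V_{1d}$ can be made plausible by a free-quark counting of charges. The sketch below is only a parton-level illustration: it ignores QCD corrections and quark masses, assumes that correlators of different quark flavours vanish at this level, and infers the normalisation of $V_{1d}$ from the isospin correspondence $R^{I=1}_{\rm had}\to 3V_{1d}$ quoted above. The variable names are hypothetical.

```python
from fractions import Fraction

# Electric charges of the light quarks and number of colours
Q = {"u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3)}
N_C = 3

# Free-quark R-ratio for u, d, s: N_c * sum of squared charges
r_uds = N_C * sum(q * q for q in Q.values())

# Isovector part: coefficient of (J_u - J_d)/sqrt(2) in the electromagnetic current
r_isovector = N_C * (Q["u"] - Q["d"]) ** 2 / 2

# Parton-level V_1d inferred from R^{I=1} -> 3 V_1d (assumed normalisation)
v1d_parton = r_isovector / 3

# Sum of light-quark charges, relevant for the SU(3) cancellation of footnote 3
charge_sum = sum(Q.values())

print(r_uds, r_isovector, v1d_parton, charge_sum)
```

In this counting the full $u,d,s$ contribution is $4/3$ of the isovector one, which is the content of $R_{\rm had} = 4V_{1d}$, and the vanishing charge sum is the cancellation mentioned for the mixed-flavour correlators in the $SU(3)$ limit.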

Resolved contributions at low q 2
As discussed at the beginning of this section, the local heavy mass expansion breaks down if one includes operators beyond the leading ones in the effective field theory. One then finds nonlocal power corrections in the low-$q^2$ region which can be systematically analysed within soft-collinear effective theory (SCET). The resolved photon contributions to the inclusive decay $\bar B\to X_{s(d)}\ell^+\ell^-$ contain subprocesses in which the virtual photon couples to light partons instead of connecting directly to the effective weak-interaction vertex.
These resolved contributions to the $\bar B\to X_s\ell^+\ell^-$ decay were calculated in SCET in the presence of an $M_X$ cut to order $1/m_b$ [57,58]. They can be represented as convolution integrals of a jet function, characterizing the hadronic final state $X_{s(d)}$, and of a soft (shape) function which is defined by a non-local heavy-quark effective theory matrix element. The hard contribution is factorized into the Wilson coefficients. It was explicitly shown [57,58] that the resolved contributions stay nonlocal when the hadronic cut is released and thus represent an irreducible uncertainty. The support properties of the shape function imply that the resolved contributions (besides the $O_{8g}-O_{8g}$ one) are almost cut-independent. Within the inclusive decay $\bar B\to X_d\ell^+\ell^-$, there are four resolved contributions at leading order in $1/m_b$ for the decay rate, arising from interference terms. For the $b\to d$ case the resolved contributions need some obvious modifications compared to the $b\to s$ case calculated in refs. [57,58]: the CKM parameter combinations $\lambda^s_i = V^*_{is}V_{ib}$ have to be replaced by $\lambda^d_i = V^*_{id}V_{ib}$, and $s$-quark fields have to be replaced by $d$-quark fields in the shape functions. These modifications only change the numerical results.
It is well known that the $O^u_1 - O_{7\gamma}$ contribution is CKM-suppressed in the $b\to s$ case, but not in the $b\to d$ case. However, in both $b\to d$ and $b\to s$, this contribution from the $u$-quark loop vanishes in CP-averaged quantities at order $1/m_b$, as one can derive from the results given in ref. [58]. Starting from the $O^c_1 - O_{7\gamma}$ contribution in eq. (6.3) of ref. [58] and considering the penguin functions given in eqs. (4.4) and (4.5) of that reference, which enter the jet function, we find in the limit $m_c\to m_u = 0$ that the $\omega_1$ integral simplifies. The trace formalism of HQET (see ref. [98]) implies a further relation; moreover, it is a consequence of PT invariance that $g_{17}$ is real. Thus, the integration over $\omega_1$ leads to the result that the interference term $O^u_1 - O_{7\gamma}$ vanishes in the integrated CP-averaged rate. This is a crucial result for all CP-averaged inclusive $b\to d\ell^+\ell^-$ quantities because previously no estimate for this up-quark loop of order $\Lambda_{\rm QCD}/m_b$ was available (see ref. [43]), and it thus represented the main uncertainty in the inclusive $b\to d\ell^+\ell^-$ observables. Further insight into the moments of $g_{17}$ was recently given in [99].
The calculation of the other (nontrivial) resolved contributions given in refs. [57,58] starts with the explicit form of the shape functions as HQET matrix elements and derives general properties of those. One can then use various model functions which have all these properties to get conservative estimates of the resolved contributions by maximizing the value of the convolution integral of the subleading shape function with the perturbatively calculable jet function (for more details see ref. [58]). We are interested in the relative magnitude of the resolved contributions compared to the total decay rate. We finally obtain estimates for the various contributions at order $1/m_b$ in the $b\to s$ and $b\to d$ decays and sum them up in a conservative way. It was found [57,58] that at leading order in $1/m_b$ there is no resolved contribution to the forward-backward asymmetry. Such a contribution starts only at order $1/m_b^2$, for example with an interference term of $O^c_1 - O_{10}$. Also the resolved $O^c_1 - O_9$ term, contributing to the rate, only occurs at the subleading $1/m_b^2$ order. This is a consequence of the fact that the virtual photon is hard-collinear and not hard in the low-$q^2$ region, as explicitly shown in refs. [57,58]. On the other hand, these $1/m_b^2$ terms might be numerically relevant due to the large ratio $|C_{9/10}|\sim 13\,|C_{7\gamma}|$ of Wilson coefficients, which necessitates their calculation [76].
Because of the opposite sign of C 9 compared to C 7 one can also expect the same behaviour of the For the first nontrivial resolved contribution to the forward backward asymmetry from the O c 1 − O 10 term at order 1/m 2 b we add an error of ±5% in our final result before an explicit estimate is available [76].

Nonfactorizable power contributions at high q 2
Power corrections due to operators beyond the leading ones also exist in the high-$q^2$ region. The only available pieces are the nonfactorizable charm- and up-loop diagrams of the four-quark operators $P_{1,2}$ with a soft gluon which interacts with the spectator cloud. However, in the high-$q^2$ region the dilepton invariant mass squared $q^2$ is a hard scale and any cut on the hadronic mass has no influence. Thus, the kinematic situation differs from that in the low-$q^2$ region; in particular, there is no nonlocal shape function involved. In this case the original treatment of Voloshin [59] is applicable, which again leads to a local expansion [43].
Here we briefly recall the crucial details of the calculation presented in ref. [43]. The nonperturbative effect due to the $\bar s b\gamma g$ vertex is represented by a form factor $F$ which depends on the two variables $r = q^2/(4m_c^2)$ and $t = k\cdot q/(2m_c^2)$, where $k$ denotes the soft gluon momentum ($k^2 = 0$) and $q$ the virtual photon momentum. The form factor $F(r+t, t)$ is given in eq. (4.28) of ref. [43]. One may expand $F$ in powers of $t$. In the high-$q^2$ region $q$ is a hard momentum (of order $m_b$) and $k$ is a soft momentum. Thus, if $m_b\Lambda_{\rm QCD}/(2m_c^2)$ is small, the first term in the expansion about $t = 0$ can be regarded as dominant. Moreover, one may additionally expand the form factor in $1/r$, which is of order $4m_c^2/m_b^2$. The authors of ref. [43] then keep only the leading term in $1/r$ in each of the coefficients of $t^n$ and find that the leading corrections to the $t = 0$ result are suppressed by $t/r = 2k\cdot q/q^2$. An additional numerical test [43] suggests that the $t = 0$ term is the dominant one in the high-$q^2$ region. The concrete results for the leading $1/m_c^2$ term are given in section 3.3. If we consider the corresponding nonfactorizable contribution with an up-quark loop in the high-$q^2$ region, one finds that the leading term is of order $\Lambda_{\rm QCD}^2/q^2$ and corrections are suppressed by powers of $t/r = 2k\cdot q/q^2$ [43]. The leading order results for the up-quark loop are also given in section 3.3.
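For orientation, the suppression factor quoted above follows directly from the definitions of $r$ and $t$. The parametric estimates on the right are a rough sketch that assumes $k\cdot q\sim\Lambda_{\rm QCD}\,m_b$ and $q^2\sim m_b^2$ in the high-$q^2$ region; they are not taken from ref. [43].

```latex
\frac{t}{r}
  \;=\; \frac{k\cdot q/(2m_c^2)}{q^2/(4m_c^2)}
  \;=\; \frac{2\,k\cdot q}{q^2}
  \;\sim\; \frac{\Lambda_{\rm QCD}}{m_b}\,,
\qquad
\frac{1}{r} \;=\; \frac{4m_c^2}{q^2} \;\sim\; \frac{4m_c^2}{m_b^2}\,.
```

Note that the charm mass drops out of $t/r$, which is why the same ratio controls the corrections for the up-quark loop.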

Charmonium cascade backgrounds
Another long distance effect at low $q^2$ arises from the cascade decays $\bar B\to X_1(c\bar c\to X_2\ell^+\ell^-)$ through the radiative decay of a narrow charmonium resonance $\eta_c$, $\eta_c'$, $\psi$, $\psi'$, $\chi_{cJ}$, $h_c$ or exotic XYZ state, collectively referred to as $c\bar c$, as depicted in the left panel of figure 6. In contrast to the infamous $\psi\to\ell^+\ell^-$, the decay $\psi\to\eta\ell^+\ell^-$, for example, completely escapes the upper cut $q^2 = 6$ GeV$^2$ in $\bar B\to X_q\ell^+\ell^-$. In the following we focus on $\bar B\to X_s(c\bar c\to X_2\ell^+\ell^-)$ with the understanding that the relative effects of the cascades on $\bar B\to X_d\ell^+\ell^-$ and $\bar B\to X_s\ell^+\ell^-$ are roughly the same due to the CKM scaling of charmonium production, eq. (4.25).

Table 4. Branching ratios of (direct) inclusive $B$ decays into charmonium, and of vector charmonium dilepton decays to light pseudoscalars. Numbers marked with † are NRQCD estimates; unless otherwise stated, values are taken from the PDG [92].

Charmonium production from $B$ decays is reasonably well described by an expansion in the heavy quark velocity (NRQCD) [53,100] and has been investigated by several experiments, as summarized in table 4. The inclusive spectra from $c\bar c\to X\ell^+\ell^-$ are not yet available, although the decays $\psi\to(\pi^0,\eta,\eta')\ell^+\ell^-$, which happen to be the most important ones, have been measured at BESIII [101][102][103]. We note that the dilepton decays between charmonium states [104][105][106] are not pertinent to $\bar B\to X_s\ell^+\ell^-$ because the leptons in this case come with invariant mass below the difference in charmonium masses, which is less than 1 GeV. Radiative and dilepton charmonium decays have been mentioned in the context of $\bar B\to X_s\gamma$ [43,54,107] and $\bar B\to X_s\ell^+\ell^-$ [43]. The background in $\bar B\to X_s\ell^+\ell^-$ could simply be subtracted by vetoing events where the two leptons and any permutation of light hadrons reconstruct any of the charmonium masses. However, only the direct leptonic decays of $\psi$ and $\psi'$ were treated as background to $\bar B\to X_s\ell^+\ell^-$ at Belle [108,109] and BaBar [110]. This is problematic because there are also cascade decays of the type $\bar B\to X_1\psi\to X_1X_2\ell^+\ell^-$ that form a reducible background in the limit in which interference between the cascade and the genuine short distance $\bar B\to X_s\ell^+\ell^-$ amplitudes is negligible.
On general grounds this interference is expected to be much smaller than the square of the cascade amplitude. If estimates of the cascade contributions are low enough, we can argue that interference effects can be neglected, implying that these cascades are a reducible background that can be either separately calculated and subtracted or experimentally removed. As we show below, cascades in $\bar B\to X_s\ell^+\ell^-$ satisfy this requirement, but only after taking into account the cut on the invariant mass of the $X_s$ system which is required experimentally to remove the double semileptonic background, as shown in the right panel of figure 6. In the rest of this subsection we show how to estimate the impact of the $M_X$ cuts on a generic cascade.
The relative momentum between $X_1$ and $X_2$ implies that the cascade events come with a rather large total $M_X$ when the two systems are combined. Since it is already necessary to measure $\bar B\to X_s\ell^+\ell^-$ with an $M_X$ cut to remove the double semileptonic background, there would be no need to consider cascade events further if they were efficiently removed by this cut. The invariant mass of the $X$ system depends on $\theta_X$, the angle between the $X_1$ and $X_2$ systems in the charmonium rest frame. It is interesting to consider the minimum value of $M_{X_2}$ such that $M_X > M_X^{\rm cut}$ for all $q^2$ and $M_{X_1}$. This value, given in closed form in (4.29), corresponds to the extreme values $\theta_X = 0$, $M_{X_1} = 0$ and $q^2 = 0$. States with $M_{X_2}$ heavier than (4.29) are completely removed by the $M_X$ cut. In the case of $\psi$ decays with $M_X^{\rm cut} = 2$ GeV one finds $M_{X_2}^{\rm cut}\sim 1.2$ GeV (the minimum mass $M_{X_1} = M_K$ and the cut at $q^2 = 1$ GeV$^2$ cause this to be slightly smaller). The decays $\psi\to(\pi,\eta,\eta')\ell^+\ell^-$ and nonresonant S- or D-wave $\psi\to 2\pi\ell^+\ell^-$ are therefore of interest, while the resonances $\psi\to(f_0,f_2)\ell^+\ell^-$ are cut away. Judging from the photon energy spectrum in $\psi\to 2\pi\gamma$, nonresonant $\psi\to 2\pi\ell^+\ell^-$ is probably very small. Similarly, $h_c\to(\eta,\eta',2\pi)\ell^+\ell^-$ and $\chi_{c1}\to(\rho,\omega)\ell^+\ell^-$ are of interest but they are suppressed by an order of magnitude compared with the $\psi$ decays in the real photon case. The sequence $\bar B\to X_1\psi'$, $\psi'\to 2\pi\psi$, $\psi\to\eta(\eta')\ell^+\ell^-$ is also of interest but the total $M_X$ from $X_1$, $2\pi$ and $\eta(\eta')$ is rather large. Finally, we observe that the branching ratio of $\psi\to\pi\ell^+\ell^-$ is about two orders of magnitude smaller than that of $\psi\to\eta(\eta')\ell^+\ell^-$ (see table 4). Hence the conclusion is that the direct decays $\psi\to\eta\ell^+\ell^-$ and $\psi\to\eta'\ell^+\ell^-$ dominate the background to $\bar B\to X_s\ell^+\ell^-$ from all charmonium radiative decays in the presence of a cut $M_X < 2$ GeV. The helicity-projected rates for $\psi\to\eta(\eta')\ell^+\ell^-$ normalized to the rate $\psi\to\eta(\eta')\gamma$ can be calculated from first principles in terms of a single $q^2$-dependent form factor [114].
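The kinematics behind the $M_X$ cut can be sketched with elementary two-body relations in the charmonium rest frame. The expressions below are reconstructed from energy-momentum conservation rather than copied from the paper's equations, so the function names, the rounded masses and the closed form $M_{X_2}^{\rm cut} = M_X^{\rm cut}\,M_{c\bar c}/M_B$ (which follows from these relations at $\theta_X = 0$, $M_{X_1} = 0$, $q^2 = 0$) should be read as assumptions. They do reproduce the value of about 1.2 GeV quoted above for $M_X^{\rm cut} = 2$ GeV.

```python
import math

M_B = 5.279    # GeV, B-meson mass (rounded)
M_PSI = 3.097  # GeV, J/psi mass (rounded)

def m_x(m_x1, m_x2, q2, cos_theta, m_b=M_B, m_cc=M_PSI):
    """Invariant mass of X = X1 + X2 in B -> X1 (cc -> X2 l+ l-).

    Energies and momenta are evaluated in the charmonium rest frame;
    cos_theta is the cosine of the angle between X1 and X2 in that frame.
    """
    e1 = (m_b**2 - m_cc**2 - m_x1**2) / (2.0 * m_cc)   # energy of X1
    e2 = (m_cc**2 + m_x2**2 - q2) / (2.0 * m_cc)       # energy of X2
    p1 = math.sqrt(max(e1**2 - m_x1**2, 0.0))          # momentum of X1
    p2 = math.sqrt(max(e2**2 - m_x2**2, 0.0))          # momentum of X2
    m2 = m_x1**2 + m_x2**2 + 2.0 * (e1 * e2 - p1 * p2 * cos_theta)
    return math.sqrt(m2)

def m_x2_threshold(m_cut, m_b=M_B, m_cc=M_PSI):
    """X2 mass above which every cascade event fails M_X < m_cut."""
    return m_cut * m_cc / m_b
```

For example, with a kaon as $X_1$ and an $\eta$ as $X_2$ at $q^2 = 1$ GeV$^2$, the collinear configuration passes a 2 GeV cut while the back-to-back configuration does not, which illustrates why the cut removes a large part, but not all, of the cascade phase space.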
The angular distribution is simply given in terms of the polarization $\alpha$.

Figure 7. The background from $B^+\to K^+(\psi\to h\ell^+\ell^-)$ with $h = \pi, \eta, \eta'$ on the $\bar B\to X_s\ell^+\ell^-$ phase space. The color bar corresponds to the branching ratio differential in dilepton mass ($\sqrt{s}$) and hadronic mass ($M_X$), in units of $10^{-8}$ GeV$^{-2}$. The outlined box indicates the cuts $1$ GeV$^2 < q^2 < 6$ GeV$^2$ and $M_K < M_X < 2$ GeV. A single-pole parameterization of the transition form factor [114] with the same $\Lambda = 3.686$ GeV for $\pi, \eta, \eta'$ was used to generate the plots. The fact that the cuts are less efficient for $\psi\to\pi$ is not problematic due to the small rate of this channel.
Due to the V−A coupling of the underlying transition $b\to sc\bar c$, $\bar B\to\psi X_1$ prefers the longitudinal polarization [115,116], with corrections quantified in NRQCD [117]; this fact reduces the background from the cascades because $\eta(\eta')$ and $X_1$ cannot be collinear in the dominant longitudinal polarization. The resulting distributions in the $[q^2, M_X]$ plane for the decays into $\eta$ and $\eta'$ are presented in figure 7, where we take for simplicity a constant value $\alpha = -0.59$ corresponding to the low-$M_{X_1}$ bin of the BaBar measurement; this causes the $M_X$ cut to be more efficient by about 20%. The contributions of the cascades into $\eta$ and $\eta'$ to the total $\bar B\to X_s\ell^+\ell^-$ branching ratio before any $M_X$ cut are 1% and 5%, respectively. After imposing $M_X < 2$ GeV these effects are diluted to 0.05% and 0.0005%. Keeping in mind that the impact of the $M_X$ cut on the non-resonant decay is about 60%, we conclude that the experimental cut on $M_X$ completely removes any pollution from cascade charmonium decays. This conclusion persists as long as the $M_X$ cut is at most 3 GeV.

Inputs
The numerical inputs used in the phenomenological analysis are presented in table 5. Most of the quantities listed in the table have been determined with great accuracy and will not be discussed further. The required HQET matrix elements, on the other hand, necessitate a more in-depth discussion. For our phenomenological study, we need the matrix elements of dimension five and six operators, where $a = 0, \pm$ denotes the charge of the meson and $q = u, d, s$ is the flavour of the spectator quark [75]. The leading matrix elements $\lambda_{1,2}$ and ($\rho_{1,2}$, $f^a_q$) scale as $m_b^{-2}$ and $m_b^{-3}$, respectively. The matrix elements $\lambda_2$ and $\rho_2$ appear only in the combination $\lambda_2^{\rm eff}\equiv\lambda_2 - \rho_2/m_b$. Note that we consider exclusively HQET matrix elements between physical $B$ mesons. Matrix elements in the infinite mass limit are independent of the heavy quark mass and are sometimes used when combining fits involving both $b$- and $c$-hadrons.

Table 5 (recoverable entries): $\rho_1 = 0.038(70)$ GeV$^3$ [48]; $[\delta f]_{SU(3)} = (0\pm 0.04)$ GeV$^3$; $[\delta f]_{SU(2)} = (0\pm 0.004)$ GeV$^3$.

Table 5. Numerical inputs used in the phenomenological analysis. Unless specified otherwise, they are taken from PDG [92]. In order to avoid somewhat uncontrolled non-perturbative effects, we use the pole mass of the top quark obtained from cross section measurements. The CKM matrix elements have been obtained by propagating the uncertainties on the four CKM Wolfenstein parameters ($\lambda$, $A$, $\bar\rho$ and $\bar\eta$) taken from the global fit as of Summer 2018 presented by the CKMfitter Group [118]. All HQET matrix elements are calculated between physical $B$ mesons. Only the combination $\lambda_2^{\rm eff}\equiv\lambda_2 - \rho_2/m_b$ enters in $\bar B\to X_{d,s}\ell^+\ell^-$. The annihilation matrix elements required for $\bar B\to X_s\ell^+\ell^-$ and $\bar B\to X_d\ell^+\ell^-$ are $f^{0,\pm}_{u,s}$ and $f^{0,\pm}_{u,d}$, respectively, where we use the notation $f_q\equiv(f^0_q + f^\pm_q)/2$. As explained in the text we express them in terms of the valence and non-valence matrix elements, $f_V$ and $f_{NV}$, and the flavour $SU(3)$ and isospin breaking differences $[\delta f]_{SU(3)}$ and $[\delta f]_{SU(2)}$.
For the two leading dimension five operators the relation between these matrix elements involves the non-local matrix elements $\tau_i$, which can be found, for instance, in ref. [123]. The $\lambda_i$ and $\rho_i$ matrix elements can be extracted from measurements of several leptonic and hadronic moments of the inclusive $\bar B\to X_c\ell\nu$ spectrum, under the assumption that its shape is unaffected by new physics effects. The most recent analysis has been presented in ref. [48], where the results are expressed in the kinetic scheme [124][125][126][127]. In this scheme the renormalized matrix elements at a scale $\mu = 1$ GeV are connected to the usual pole-scheme ones by calculating several leptonic $\bar B\to X_c\ell\nu$ moments in the small velocity (SV) limit (see ref. [124] for a pedagogical review) and using $\mu$ as a Wilsonian cut-off; this implies that the difference between pole and kinetic scheme matrix elements is proportional to powers of $\mu$ and not just logarithms (as in the MS scheme). When using these matrix elements in the calculation of any other observable (e.g. in $\bar B\to X_q\ell^+\ell^-$) one has to modify the perturbative part of the calculation accordingly by introducing the same Wilsonian cut-off in both virtual and real corrections. The alternative, which we adopt, is to convert the matrix elements to the pole scheme (which corresponds to setting $\mu = 0$) and keep the rest of the calculation unchanged.
The explicit expressions that we use are (see eqs. (9) of ref. [127] and eqs. (11) -(13) of ref. [128]): with β 0 = 9 (three active flavours), C F = 4/3 and C A = 3. Note that following ref. [128] (see footnote above eq. (13)) we omit terms of order µ 3 in eqs. (5.15) and (5.16); this is necessary in order to convert the matrix elements extracted from the fit presented in ref. [48] (which differ from the kinetic scheme matrix elements by terms of order µ 3 ) into the pole scheme. A discussion of the absence of µ 2 terms in [µ 2 G (µ)] pert can be found in ref. [129]. The inputs summarized in table 5 are obtained from the results presented in table II of ref. [48] with the help of eqs. (5.11) -(5.18); we estimate the uncertainty due to missing O(α 3 s ) corrections in eqs. (5.15) and (5.17) by assuming that the relative magnitudes of NNLO and NNNLO terms are identical.
The discussion of the weak annihilation matrix elements f a q is greatly simplified by isospin and flavour SU (3) considerations: where we indicate the valence and non-valence terms with respect to external B 0,± states. Therefore, up to isospin and flavour SU (3) breaking effects, the six matrix elements needed forB → X s,d + − reduce to two. In the vacuum saturation approximation these matrix elements vanish: they can be written as [42,75]. Assuming violations of this approximation at the δB A ∼ O(0.1) we find f V ∼ f NV 0.4. As numerical inputs we adopt upper limits extracted from branching ratios and the first two moments of semileptonic D 0,± and D s decays rescaled by a factor m B f 2 B /(m D f 2 D ). The result of a re-analysis of semileptonic D decay data from the CLEO-c Collaboration [130] following closely ref. [49] 5 are summarized in table 5, where we present the two largely uncorrelated quantities f NV and f V − f NV . Additionally, following ref. [42], we assume SU (3) and SU (2) breaking effects at the level of [δf ] SU (3) = 0.04 and [δf ] SU (2) = 0.004, respectively.
Note that we calculate $\Gamma(\bar B\to X_{s(d)}\ell^+\ell^-)/\Gamma(\bar B\to X_u\ell\nu)$; therefore, for the $X_s$ case we need $f^{0,\pm}_s$ and $f^{0,\pm}_u$, and for the $X_d$ case we need $f^{0,\pm}_d$ and $f^{0,\pm}_u$. The required inputs for the various observables are expressed through $f_q = (f^0_q + f^\pm_q)/2$.

Phenomenological results
In this section, we present the final numerical results, for which we use the inputs defined in table 5. We give the results for the branching ratios integrated over the low-$q^2$ region ($1$ GeV$^2 < q^2 < 6$ GeV$^2$) and over the high-$q^2$ region ($q^2 > 14.4$ GeV$^2$). The corresponding CP asymmetries are given as well. In order to reduce large uncertainties from power corrections in the high-$q^2$ region we compute the ratios $R(s_0)$. The forward-backward asymmetries $A_{\rm FB}$ and the related angular observable $H_A$ are computed for the low-$q^2$ region. Due to a zero crossing in the differential $A_{\rm FB}$ and $H_A$, we subdivide the low-$q^2$ region into two bins, bin 1 ($1$ GeV$^2 < q^2 < 3.5$ GeV$^2$) and bin 2 ($3.5$ GeV$^2 < q^2 < 6$ GeV$^2$), when presenting the results for these two observables. In addition, we give the position of the zero crossing. As is customary, we present our results for electron and muon final states separately. To obtain our phenomenological results, we expand our observables up to $\tilde\alpha_s^3$ and $\kappa^3$, and neglect all higher terms. In addition, we expand up to linear terms in the power-correction parameters $\lambda_{1,2}$, $\rho_1$, $f^{0,\pm}_u$, $f_s$ and drop all higher powers and products of these parameters. For the low-$q^2$ region, we neglect $1/m_b^3$ corrections. Below, we give the central values of all the observables with uncertainties from different sources. These uncertainties are obtained by varying the inputs within their ranges indicated in table 5, where we assume that $m_c$ and $C$ are fully anti-correlated. The total uncertainties are obtained by adding the individual ones in quadrature. The uncertainties from the Krüger-Sehgal functions are always below the percent level of the central values and are therefore not included. We present our results to two decimal digits; in cases where this would lead to 0.00, we give the first significant digit.
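The combination of individual uncertainties described above can be sketched as follows. The function name, the dictionary keys and the numerical shifts are purely illustrative and do not correspond to the actual error budget; the only assumption is the usual linear error propagation $\sigma^2 = \sum_{a,b}\delta_a\,\rho_{ab}\,\delta_b$, which reduces to a sum in quadrature for uncorrelated sources and to $|\delta_a - \delta_b|$ for two fully anti-correlated ones.

```python
import math

def combined_uncertainty(shifts, corr=None):
    """Combine signed shifts of an observable into a total uncertainty.

    shifts: dict mapping a source name to the signed shift of the observable
            under a +1 sigma variation of that source.
    corr:   optional dict mapping (name_a, name_b) to a correlation
            coefficient; pairs that are not listed are taken as uncorrelated.
    """
    c = corr or {}
    variance = 0.0
    for a in shifts:
        for b in shifts:
            rho = 1.0 if a == b else c.get((a, b), c.get((b, a), 0.0))
            variance += shifts[a] * rho * shifts[b]
    return math.sqrt(max(variance, 0.0))

# Illustrative numbers only
uncorrelated = combined_uncertainty({"scale": 3.0, "m_b": 4.0})
anticorrelated = combined_uncertainty({"m_c": 2.0, "C": 0.5},
                                      corr={("m_c", "C"): -1.0})
```

In a full error budget the anti-correlated pair would first be merged into a single entry in this way, and the result would then be added in quadrature to the remaining, independent sources.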
We emphasize that the contribution of λ 1 to the error budget is tiny in both low and high-q 2 and therefore it is not displayed.

Branching ratio, low-q 2 region
The branching ratios for the low-$q^2$ region are found to be nearly $10^{-7}$, smaller than the $\bar B\to X_s\ell^+\ell^-$ values by about 2 orders of magnitude, mainly due to the CKM suppression.
The total uncertainties are about 8%.

Branching ratio, high-q 2 region
In the high-q 2 region, the 1/m Here we do not quote the uncertainty coming from the variation of α s as this is negligible. We quote the uncertainty coming from f u and f d together, by summing quadratically individual uncertainties from variation of f NV , f V − f NV and [δf ] SU (2) , where f NV gives the dominant uncertainty.

Forward-backward asymmetry, low-q 2 region
The integrated $H_A$ and the forward-backward asymmetry are tiny when integrated over the full low-$q^2$ bin, as was already observed in the $\bar B\to X_s\ell^+\ell^-$ case [36]. This is because of a zero crossing in $H_A$ which occurs close to the middle of the low-$q^2$ region. Therefore, we present our results in two additional bins. As discussed in sec. 4.2, at order $1/m_b^2$, $H_A$ receives resolved-photon contributions from the $P^c_{1,2} - P_{10}$ interference. Since an explicit estimate of such contributions is not yet available [76], we add an additional uncertainty of 5% to our results.
For completeness, we also quote the value of the normalized forward-backward asymmetry, which is obtained by normalizing $H_A$ to the corresponding branching ratio; both quantities receive resolved-photon contributions. Since the contributions to $H_A$ and to the corresponding branching ratios are induced by different operators, i.e. $P^c_{1,2} - P_{10}$ and $P^c_{1,2} - P_{7,9}$, respectively, we have assumed that the resolved-photon uncertainties of the branching ratios and of $H_A$ are independent. We emphasize that the uncertainties stemming from the scale and $\alpha_s$ are very pronounced. This is caused by the opposite effects of the scale and $\alpha_s$ variations in $H_A$ and in the branching ratios.
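The relation between $H_A$, the rate and the forward-backward asymmetry can be checked numerically. The sketch below assumes the angular decomposition commonly used in the inclusive literature, $d^2\Gamma/dq^2\,dz = \tfrac{3}{8}\,[(1+z^2)H_T + 2zH_A + 2(1-z^2)H_L]$ with $z = \cos\theta_\ell$; the equations referred to in the text are not reproduced here, so this form, the function names and the numerical inputs are assumptions.

```python
def double_diff(z, h_t, h_a, h_l):
    """Assumed angular decomposition of the double differential rate."""
    return 3.0 / 8.0 * ((1 + z * z) * h_t + 2 * z * h_a + 2 * (1 - z * z) * h_l)

def integrate(f, a, b, n=2000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def rate_and_afb(h_t, h_a, h_l):
    """Return (rate, unnormalized A_FB) from the angular integrals."""
    f = lambda z: double_diff(z, h_t, h_a, h_l)
    forward = integrate(f, 0.0, 1.0)
    backward = integrate(f, -1.0, 0.0)
    return forward + backward, forward - backward

# Illustrative inputs in arbitrary units
rate, afb = rate_and_afb(1.0, 0.2, 0.5)
normalized_afb = afb / rate
```

With this decomposition the rate is $H_T + H_L$ and the unnormalized asymmetry is $\tfrac{3}{4}H_A$, so the normalized asymmetry is a ratio of two separately computed quantities, and independent relative uncertainties on numerator and denominator add in quadrature.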
In the high-q 2 region, the perturbative uncertainty is drastically reduced. However, as for the branching ratio, large uncertainties arise from the non-perturbative 1/m 2,3 b corrections.
We emphasize that for high-q 2 , the nonfactorizable contributions from both charm and up-loops are taken into account (see section 3.3). We found that the contributions of the latter are negligible.

Conclusion
As an FCNC process, the inclusive $\bar B\to X_d\ell^+\ell^-$ decay provides many observables sensitive to BSM physics. In contrast to the more frequently studied $\bar B\to X_s\ell^+\ell^-$ channel, $\bar B\to X_d\ell^+\ell^-$ receives contributions from the operators $P^u_{1,2}$ without CKM suppression and can thus yield additional, complementary information. In particular, the CP violation in $\bar B\to X_d\ell^+\ell^-$ is expected to be much larger than that in $\bar B\to X_s\ell^+\ell^-$. In the present work, we perform a state-of-the-art phenomenological analysis of $\bar B\to X_d\ell^+\ell^-$, providing the SM predictions for observables including the branching ratio, the forward-backward asymmetry and the CP asymmetry, which are promising targets for Belle II. Disentangling potentially small new-physics effects from SM uncertainties requires both precise theoretical predictions and accurate experimental measurements. For inclusive FCNC decays, this not only necessitates the inclusion of perturbative and local power corrections associated with the partonic rate, but also requires attention to additional long distance contributions, on which we put particular emphasis in the present work.
The most prominent among the long distance contributions arises from intermediate charmonium and light-quark resonances such as $J/\psi$, $\rho$ and $\omega$, which are not captured by the local OPE. Even with kinematic cuts, the resonances may still have sizable effects on the observables, especially the high-$q^2$ ones. To handle the color-singlet resonance contributions, we adopt the Krüger-Sehgal (KS) approach with improvements in several aspects compared to previous studies. The most up-to-date $e^+e^-\to$ hadrons and $\tau\to\nu$ + hadrons data are used to interpolate the imaginary parts of the KS functions. In the dispersion integral used to obtain the real parts, we choose the subtraction point $s_0 = -(5\,{\rm GeV})^2$ large and negative, which is not only far away from the charm and light-quark resonances but also avoids large logarithms $\log(s_0/\mu_b^2)$ from higher-order perturbative corrections. Besides the $u$- and $c$-quark KS functions we also obtain the $d$- and $s$-quark KS functions, which also feature in $c\to u$ transitions. Finally, the one- and two-loop factorizable perturbative functions together get replaced by the corresponding KS functions. We find that the asymptotic behaviour of the perturbative and the KS functions match very well. We also study the uncertainties associated with the KS functions and find that they are negligible in the numerical results for the observables.
Beyond the KS treatment, the nonfactorizable resonance contributions to $\bar{B}\to X_d \ell^+\ell^-$ can also be considerable. For the high-$q^2$ observables we adopt the description from [43] of the nonfactorizable charm-loop diagrams, in which a soft gluon connects the quark loop and the spectator cloud. It was pointed out in [43,58] that for large $q^2$ the effects due to nonfactorizable charm-loop diagrams are local and hence easy to handle. The corresponding nonfactorizable up-loop contribution is obtained by taking the $m_c \to 0$ limit of the charm result. It turns out that both the nonfactorizable charm- and up-loop contributions are very small in the high-$q^2$ region. In the low-$q^2$ region, the procedure in [43] does not apply because the local heavy mass expansion fails. Such nonfactorizable contributions, including the one from charm loops, have been systematically studied within SCET by calculating resolved-photon contributions up to order $1/m_b$ [58]. We also point out that the effects from up loops vanish in all CP-averaged quantities, but might give rise to a large uncertainty in the CP asymmetry. Combining the results of [58] with a conservative estimate of the potentially large $1/m_b^2$ resolved contributions, we find that they lead to an additional 5% uncertainty in the branching ratio and the forward-backward asymmetry.
Finally, we identify another long distance effect contributing to the inclusive $\bar{B}\to X_{s,d} \ell^+\ell^-$ decays in the low-$q^2$ region, the cascade decays $\bar{B}\to X_1 (c\bar{c} \to X_2 \ell^+\ell^-)$. In the context of inclusive FCNC decays they are considered for the first time in the present work. We thoroughly investigate their kinematics and phase space distributions. We find that under a typical cut $M_X < 2$ GeV, applied in the experiments to suppress the double semileptonic background, the potentially most important cascade channels, $\bar{B}\to X_1 (\psi \to \eta \ell^+\ell^-)$ and $\bar{B}\to X_1 (\psi \to \eta \ell^+\ell^-)$, contribute only 0.05% and 0.0005%, respectively, to the total $\bar{B}\to X \ell^+\ell^-$ branching ratio. Therefore a hadronic mass cut effectively removes the pollution from the cascade decays as long as $M_X < 3$ GeV. As mentioned earlier, the theoretical predictions in the present work are given for the case without a hadronic mass cut. We leave the thorough theoretical and phenomenological study of the $\bar{B}\to X_{s(d)} \ell^+\ell^-$ decays with an $M_X$ cut to a future project.
In our calculation, we take into account all available perturbative and power corrections. While many of the expressions for $\bar{B}\to X_d \ell^+\ell^-$ can be taken over from $\bar{B}\to X_s \ell^+\ell^-$, effects from the current-current operators $P^u_{1,2}$ are new. In particular, we derive the logarithmically enhanced QED corrections associated with these operators and also include the partonic multi-particle contribution recently calculated in [45]. As for the power corrections, we extract the relevant HQET matrix elements that scale as $1/m_b^2$ and $1/m_b^3$ from inclusive semileptonic $B$- and $D$-decay data. We find that the poorly determined HQET matrix elements dominate the uncertainties in all high-$q^2$ observables. We therefore stress again that lattice calculations of these HQET matrix elements should be performed, as they would significantly improve the theoretical precision of semileptonic FCNC decays in the high-$q^2$ region.
We update the SM predictions for the branching ratio, the unnormalized and normalized forward-backward asymmetry, the zero crossing of the forward-backward asymmetry and the CP asymmetry of $\bar{B}\to X_d \ell^+\ell^-$ in the low-$q^2$ region. The uncertainties of the CP-averaged quantities generally range from 5% to 20%, except for the forward-backward asymmetries integrated over the entire low-$q^2$ region, for which the central values are small due to the zero crossing in the middle of that region. The low-$q^2$ CP asymmetry may receive large and unknown uncertainties from the resolved contributions in addition to the parametric and perturbative ones. For the high-$q^2$ region, we give the predictions for the branching ratio, the observable $R(14.4)$ and the CP asymmetry. Owing to the poorly determined hadronic parameters characterising the $1/m_b^{2,3}$ power corrections, the branching ratio and the CP asymmetry have large relative uncertainties of $O(40\%)$. On the other hand, the uncertainties arising from the hadronic parameters largely cancel in the ratio $R(14.4)$, such that its uncertainty is smaller than 10%.
In light of the persistent anomalies in exclusive $b \to s$ transitions, a cross-check via the corresponding inclusive $b \to s$ and $b \to d$ modes is highly desirable. The complementarity between inclusive and exclusive $b \to s$ decays in the search for new physics was already pointed out in [16], and $b \to d$ transitions will yield additional useful insights. The $\bar{B}\to X_d \ell^+\ell^-$ observables should therefore be measured in a dedicated Belle II analysis.
where the factorizable perturbative function is given by

Here $h_q^{(1)}$ are the factorizable perturbative two-loop functions. For the $q = u, d, s$ cases, the analytical expression is available:

For the charm case the functions can be found in [33]. The fits in $s = q^2$ for our default value of $m_c$ read
$$
\begin{cases}
\ldots.04 + 0.939\,s - 0.0421\,s^2 + 0.0178\,s^3 + \frac{32}{9}\log\mu_b\,, & \text{low-}q^2\,,\\[4pt]
78.5 - 11.1\,s + 0.472\,s^2 - 0.00683\,s^3 + \frac{32}{9}\log\mu_b + i\,(105 - 9.42\,s + 0.354\,s^2 - 0.00475\,s^3)\,, & \text{high-}q^2\,,
\end{cases}
$$
where $s$ and $\mu_b$ have to be inserted in units of GeV$^2$ and GeV, respectively. We have checked that these functions are consistent with the corresponding two-loop photon self-energy functions calculated in [132].

with $\delta \equiv 1 - \hat{s}$ and $b = \hat{s} - (4m_c^2/m_b^2)^2$. The polynomials in the high-$\hat{s}$ region were obtained such as to have a double zero at $\hat{s} = 1$.
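The high-$q^2$ fit for the charm two-loop function quoted above is fully specified by its polynomial coefficients, so it can be evaluated directly. The following is a minimal sketch under stated assumptions: the function name `charm_two_loop_fit_high_q2` is hypothetical, $\log$ is taken to be the natural logarithm, the value $\mu_b = 5$ GeV in the usage line is only an illustrative input, and no check of the validity range of the high-$q^2$ fit is performed. The low-$q^2$ branch is not coded here.

```python
import math


def charm_two_loop_fit_high_q2(s, mu_b):
    """Evaluate the quoted high-q^2 fit (sketch).

    s    : q^2 in GeV^2
    mu_b : renormalization scale in GeV
    Returns a complex number (real and imaginary parts of the fit).
    Assumes 'log' in the fit is the natural logarithm.
    """
    real = (78.5 - 11.1 * s + 0.472 * s**2 - 0.00683 * s**3
            + 32.0 / 9.0 * math.log(mu_b))
    imag = 105.0 - 9.42 * s + 0.354 * s**2 - 0.00475 * s**3
    return complex(real, imag)


# Illustrative call; both inputs are assumptions, not values from the text.
value = charm_two_loop_fit_high_q2(s=15.0, mu_b=5.0)
print(value.real, value.imag)
```

Only the real part depends on $\mu_b$, through the $\frac{32}{9}\log\mu_b$ term, so a change of scale shifts the real part by a constant and leaves the imaginary part untouched.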
The new contributions induced by $P^u_{1,2}$ can be obtained analytically for all but the 2u2c interference term. We find