Insights on the current semi-leptonic B -decay discrepancies – and how B s → µ + µ − γ can help

B s → µ + µ − γ, measured at high q 2 as a partially reconstructed decay, can probe the origin of the existing discrepancies in semi-leptonic b → s and b → c decays. We perform a complete study of this possibility. We start by reassessing the alleged discrepancies, with a focus on a unified EFT description. Using the SMEFT, we find that the tauonic Wilson coefficient required by R ( D ( ∗ ) ) implies a universal muonic Wilson coefficient of precisely the size required by semi-muonic BR data and, separately, by semi-muonic angular analyses. We thus identify reference scenarios. Importantly, B s → µ + µ − γ offers a strategy to access them without being affected by the long-distance issues that hamper the prediction of semi-leptonic B decays at low q 2 . After quantifying, to the best of our knowledge, the B s → µ + µ − γ experimental uncertainty over the long haul, we infer the B s → µ + µ − γ sensitivity to the couplings relevant to the anomalies. In the example of the real-δC 9 , 10 scenario, we find significances below 3 σ . Such a figure is to be compared with other single-observable sensitivities that one can expect from e.g. BR and angular data, whether at low or high q 2 , and not affected by long-distance issues such as narrow resonances or intermediate charmed di-meson rescattering.

Importantly, this disagreement is seen only for low di-muon invariant mass squared q 2 , and is consistent across the whole ensemble, in the sense that the same Wilson-coefficient shift accommodates all discrepancies. However, interpreting this shift as due to new physics (NP) relies on a strong assumption about the possible size of long-distance contributions to the different observables concerned. These contributions are, on the one side, difficult to estimate, and on the other side largely equivalent to (i.e. parametrically interchangeable with) the NP shifts that one wants to probe.
A complete estimation of such long-distance effects is, as is well known, as important as it is challenging. The existing calculations in Refs. [18,19] (building on Refs. [20,21,22]) focus on the "charm-loop"-to-γ * (q 2 ) amplitude, whose long-distance effects correspond to poles and cuts in the q 2 variable. On the other hand, Ref. [23] (see also Refs. [24,25,26,27,28]) emphasizes the importance of including contributions from B to di-meson rescatterings, which correspond to cuts in the full decay variable (q + k) 2 , where k is the momentum of the final-state K ( * ) . Ideally, one should start from an amplitude that is a function of q 2 and of (q + k) 2 , such that cuts in the second variable reproduce B → D D * decay rates, and only then set (q + k) 2 = m 2 B . In other words, inclusion of (q + k) 2 as a variable should allow one to take into account the so-called anomalous cut in addition to the "usual" cut associated with the c c̄ threshold. The state-of-the-art calculations in Refs. [18,19] allow for complex-valued helicity amplitudes (see e.g. the discussion in Ref. [22]), which are nevertheless functions of q 2 only, and are denoted as H λ (q 2 ). We are lacking a proof that the theoretical and experimental input used to constrain complex H λ (q 2 ) fully encapsulates the structure due to the cuts in (q + k) 2 . It is clear that going beyond the calculations in Refs. [18,19], which constitute a benchmark for all that is calculable with regard to this issue, is a daunting task, primarily because there is no known EFT that would allow a quantitative estimate of the anomalous-cut contributions.
There are two clear avenues towards resolving the above quandary: on the theory side, an estimate of the mentioned missing long-distance effects, dispelling any doubt that they may be mimicking NP; on the experimental side, the consideration of alternative observables (in particular, in alternative q 2 regions) that are sensitive to the very same short-distance physics without being affected by the long-distance effects in question.
The branching ratio for B s → µ + µ − γ, measured at high q 2 as a partially reconstructed decay [31] (see Refs. [32,33] for a first application at LHCb), offers an example of such an alternative observable. It actually provides a litmus test between the two mentioned explanations: NP versus long-distance effects mimicking it. In fact, at high q 2 this observable's long-distance contributions are dominated by two form factors that should be accessible by first-principle methods in the short term, whereas the helicity amplitudes that are input to the presently discrepant b → sµ + µ − data do not enter at all.
A clear test then emerges: one may predict B(B s → µ + µ − γ) assuming NP shifts as large as suggested by the currently discrepant low-q 2 b → sµ + µ − observables if the incalculable long-distance contributions to these observables are negligible. A B(B s → µ + µ − γ) measurement that confirms our prediction would circumstantially support NP shifts of that size. This paper aims at discussing the relation between NP sensitivity and the amount of data required for such a test. To this end, the paper is structured as follows. In Sec. 2, we re-assess the b → sµ + µ − discrepancies by performing global likelihood fits of all relevant observables and their up-to-date measurements, among others R K ( * ) and B s → µ + µ − . From such fits we extract NP shifts to the semi-leptonic Wilson coefficients known as C bsµµ 9 and C bsµµ 10 , to be used as reference for the rest. We consider separately the cases of real and complex shifts to these Wilson coefficients, and we examine to what extent data (including b → cℓ − ν) obey a coherent effective-theory picture. In Sec. 3, we attempt an extrapolation of the experimental and theoretical uncertainties associated with B s → µ + µ − γ at high q 2 and we use such extrapolation to project the NP sensitivity of the observable, using as NP benchmarks the couplings discussed in Sec. 2. We draw our conclusions in Sec. 4. Finally, the Appendix collects additional information and plots related to the global fits in Sec. 2.

NP benchmarks
Table 1: List of the most constraining observables and their measurements implemented in the flavio v2.5.4 Python package at the date of publication. For the exclusive channels, the predictions refer to the particular set of form factors of the relevant transition. Helicity amplitudes are detailed in Refs. [57,58,59].
One can consider the list of measurements in Table 1 and perform a global fit, using the Python software package flavio [64]. As a first step, we focus on shifts to semi-leptonic Wilson coefficients involving dimuons, defining (k = 9, 10)

$$\underbrace{\delta C_k^{(\ell)} + \delta C_k^{u}}_{\text{NP shift: } \ell\text{-specific + } \ell\text{-univ. parts}} \;\equiv\; \underbrace{C_k^{bs\ell\ell}}_{\text{full WC (NP + SM)}} - \, C_k^{\rm SM} \tag{1}$$

where C SM k denotes the SM part. As highlighted in eq. (1), NP shifts are identified by a leading δ (at variance with much of the literature). In addition, the full NP shift generally consists of a lepton-specific component, labeled as (e), (µ), (τ), plus a lepton-universal one. In most of our discussion we will actually focus on universal components concerning only the light leptons, i.e. on light-lepton-universal shifts, which are labeled with u(e,µ).
With eq. (1) setting the notation, we show the case δC (µ) 9 vs. δC (µ) 10 in the top-left panel of Fig. 1. The fit suggests that the updated R K ( * ) measurement and the anomalies observed in the b → sµ + µ − sector (branching ratios, BRs, and angular observables) do not yield a coherent picture in a scenario where one only shifts C (µ) 9 and C (µ) 10 , i.e. where their counterparts for all other flavours are assumed SM-like (footnote 5). Specifically, the combination of the recent B s → µ + µ − measurements by both the LHCb and CMS collaborations provides a strong constraint on δC (µ) 10 , which is now consistent with zero. Besides, the R K ( * ) measurements on the one side and the discrepant b → sµ + µ − observables on the other side each constrain the δC (µ) 9 vs. δC (µ) 10 plane obliquely, but the two respective regions overlap at no better than 2σ. These findings may be contrasted with the analysis in Refs. [66,67].
As a second case of interest, we consider a scenario that is especially appealing in the light of a UV interpretation, where NP fulfils the constraint δC 9 = −δC 10 ≡ δC LL /2 (footnote 6). [...]

[Footnote 5: [...] = −4.262 (see e.g. Ref. [65] for accurate notation on the scale and the short-distance contributions included). Also, C SM 7 = −0.303, to be used around eq. (2).]

[Footnote 6: This identification follows trivially from writing the effective Hamiltonian either in the "traditional basis", ∝ C 9 O 9 + C 10 O 10 with O 9 ∝ (s̄ γ µ L b) (ℓ̄ γ µ ℓ) and O 10 ∝ (s̄ γ µ L b) (ℓ̄ γ µ γ 5 ℓ), or in the chiral basis, ∝ C LL O LL , where O LL ∝ (s̄ γ µ L b) (ℓ̄ γ µ L ℓ). Here we omit flavour indices for simplicity, as well as proportionality factors immaterial to the identification.]

[...] (we refer again to eq. (1) and the surrounding text for the meaning of u(e, µ)). This case is displayed in the bottom-left panel of Fig. 1. In this panel the R K ( * ) constraint is trivially satisfied in the whole plane and thus not displayed. What we deem significant in this scenario is the fact that BR (in blue) and angular data (in green) constrain this WC plane in two independent directions, and the preferred regions of either set neatly overlap at a non-zero value of δC u(e,µ) 9 , while staying consistent with a SM-like, i.e. null within errors, value of δC u(e,µ) 10 . As a result, this scenario is consistent with all observables at the 1σ level. We note that the hint at δC [...]

The main difference between these two fits is in the significance of the null solution for the WC on the y axis: slightly above 1σ in the bottom-left panel and within 1σ in the bottom-right panel. The difference is clearly due to the R K ( * ) constraint, which as already mentioned is ineffectual in the bottom-left panel.
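The identification between the "traditional" and chiral bases invoked for this scenario, δC 9 = −δC 10 ≡ δC LL /2, can be made explicit with a short worked step. The sketch below assumes the convention L = (1 − γ 5 )/2 and a common normalization for the three operators; both are assumptions made here for illustration, and overall proportionality factors are dropped as in the text.

```latex
% Assumption: L = (1 - \gamma_5)/2, and O_9, O_{10}, O_{LL} share the same prefactor.
\begin{align}
\bar\ell\,\gamma_\mu L\,\ell
  &= \tfrac{1}{2}\left(\bar\ell\,\gamma_\mu\,\ell - \bar\ell\,\gamma_\mu\gamma_5\,\ell\right)
  \quad\Longrightarrow\quad
  O_{LL} = \tfrac{1}{2}\left(O_9 - O_{10}\right), \\
\delta C_{LL}\,O_{LL}
  &= \frac{\delta C_{LL}}{2}\,O_9 - \frac{\delta C_{LL}}{2}\,O_{10}
  \quad\Longrightarrow\quad
  \delta C_9 = -\,\delta C_{10} = \frac{\delta C_{LL}}{2}.
\end{align}
```

With this reading, a purely left-handed shift moves C 9 and C 10 by equal and opposite amounts, which is why a constraint on δC 10 from B s → µ + µ − translates directly into a constraint on δC LL .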
As a whole, Fig. 1 delivers two clear-cut messages from data: (a) the B s → µ + µ − update disfavors shifts to muonic δC 10 , and (b) the R K ( * ) measurement favors a solution where the muonic vs. electronic WC shifts are equal. Jointly, these pieces of information would tend to disfavor the δC LL scenario (top-right panel of Fig. 1). We will see, however, that this is true only if the shift is weak-phase-aligned with the SM; the relevant discussion is in the section on complex Wilson-coefficient shifts.
We deem the last finding non-trivial. In the next section we will see that it can be understood within a SMEFT picture that relates b → s and b → c discrepancies and that turns out to quantitatively account for both.

The SMEFT-induced connection between b → s and b → c anomalies: an update
While the δC u(e,µ) 9 shift just discussed has to be corroborated by further data (e.g. updates in the channels of Refs. [1,2]), it leaves a testable imprint already. In fact, quite generally a universal δC 9 (footnote 7) tends to correlate with effects in charged-current semi-leptonic transitions, in particular b → cℓ − ν. This mechanism is to be expected as a consequence of the inescapable RG running between the scale of the new dynamics and the b scale, to the extent that the new dynamics is sizeably above the EW symmetry-breaking threshold.

[Footnote 7: We do not use δC u(e,µ) 9 to denote such a shift, because the considerations in these paragraphs generally apply to a δC 9 shift common to all three leptonic generations. As a rule, when "universal" is used textually rather than as u(e,µ), we mean that it applies to all three generations.]
As a matter of fact, the possibility of a low-scale universal C 9 shift arising from high-scale interactions has been pointed out within the WET [68], the SMEFT [69] and in the context of renormalizable UV models [70]. A particularly plausible example is that of semi-tauonic, SU(2) L -symmetric SMEFT operators, which leave a direct imprint on R(D ( * ) ). The corresponding SMEFT WCs are denoted as [C (1),(3) lq ] ijmn , with ij leptonic and mn quark indices, and (1), (3) labeling the SU(2) L -singlet or triplet instances. The latter are customarily assumed to be equal in order to automatically fulfil B → K ( * ) ν ν̄ constraints [73]. We adhere to this assumption and drop the (1), (3) index, so that the SMEFT WCs will be simply denoted as C ii23 . It is then interesting to study the implications that different assumptions on ii = 11 vs. 22 vs. 33 have on the global consistency between b → s and b → c discrepancies. For example, C ii23 generates a matching contribution to δC 10 ; in turn C 3323 can also contribute a sizeable lepton-universal δC 9 shift. (In principle, every ii contributes such an RG-induced lepton-universal δC 9 shift. However, ii = 33 is the least constrained in the light of data and can thus afford to provide the dominant contribution.)

The above scenario can thus be realized in different variants, which we explore in Fig. 2. We switch on C 2223 and C 3323 independently. Through RG running, each of them would induce a contribution to δC u(e,µ) 9 . However, C 2223 also generates a matching contribution to δC LL , that (in order to fulfil the R K ( * ) constraint) one would want to be of similar size as δC (µ) LL , that in turn is constrained to be nearly zero because of the B s → µ + µ − constraint. In short, the above scenario corresponds to |C 3323 | ≫ |C 2223 | > 0, and C 1123 = 0, which can be justified on grounds of hierarchical NP.
Data, however, are also compatible with the alternative scenario C 2223 = C 1123 , with |C 3323 | again hierarchically larger in magnitude. This scenario can be justified on grounds of light-lepton universality, and is displayed in the top-right panel of Fig. 2, in the same WC plane as the top-left panel. We see that the main difference is the impact of the R K ( * ) constraint, which becomes ineffectual in the top-right panel. This situation thus parallels that of the bottom-right vs. bottom-left panels of Fig. 1, and was already commented upon in the discussion of that figure.
In the bottom-left panel of Fig. 2 [...] appropriate to capture the most promising effects in the light of present data. The y axis corresponds to the left-handed WC shift relevant for R(D ( * ) ), namely δC (τ ) LL . In turn, the x axis is the total (i.e. lepton-specific, matching-induced plus lepton-universal, RG-induced) C 9 shift, δC u(e,µ) 9 9 , which is identical for both light leptons in view of the assumptions in the underlying SMEFT scenario.

[Figure 2 (caption, partially recovered): [...] Re (δC u(e,µ) 9 + δC [...] LL (bottom-left panel) and with the R(D ( * ) ) constraint inferred from the DM method rather than from the HFLAV average (bottom-right panel). We refer to eq. (1) for WET WC notation, and to the text (Sec. 2.2, 2nd paragraph) for the meaning of C ii23 .]
The top-right and bottom-left panels are strongly correlated with one another: the latter is basically a rotation plus a reflection of the former. Jointly, however, they show that while the muonic δC LL direction has to a large extent been "trivialized" by B s → µ + µ − , the tauonic δC LL direction is the now-promising one, in the light of a SMEFT interpretation of the R(D ( * ) ) discrepancy.
We emphasize that the SMEFT scenarios behind the top-left panel on one side and the top-right and bottom-left panels on the other side are distinct. The fact that they yield mutually consistent results is non-trivial, and is due to the lack of constraining power in electronic-channel BR and angular data. Basically, the strongest constraint on b → se + e − at present is the one inferred indirectly from R K ( * ) . Once channel-specific data, i.e. b → se + e − BR and angular ones, start to be constraining, they will either provide a striking confirmation of the above coherent picture, or expose an inconsistency. Such an inconsistency would signal that (a) the anomalous R(D ( * ) ) measurements, (b) the anomalous b → sµ + µ − BR ones, (c) the anomalous b → sµ + µ − angular ones, and (d) the electronic counterparts (whether also anomalous or not) of sets (b) + (c) are not consistent with a SMEFT picture. For reference, the counterparts of the top-left and top-right panels, but in the plane of the underlying SMEFT coefficients, are reported in the Appendix.
The above findings show once again that the wealth of semileptonic b-quark decay data offers a unique probe into possible heavy beyond-SM dynamics, and that this dynamics fulfils at the moment the basic constraints imposed by the SMEFT. We are lucky enough that this picture will be corroborated or falsified soon by data. We meanwhile reiterate that a cautious approach to the above findings is in order. In this spirit, we also show in the bottom-right panel the counterpart of the bottom-left-panel plot, except that the R(D ( * ) ) best-fit region is inferred from the Dispersive-Matrix (DM) method [74,75,76] rather than from the HFLAV average, the latter being the default choice in the rest of our numerical study. The DM method suggests a much milder R(D ( * ) ) anomaly than emerges from the HFLAV average. The bottom-right panel of Fig. 2 [...]

For the sake of the discussion to follow, we focus on the scenario with a NP shift that is LFU in both the C 9 and C 10 directions, shown in the bottom-left panel of Fig. 1. Because LFU is imposed on both axes, the constraining power of R K ( * ) vanishes, and the discrepant data show a very clear departure from the SM expectation, with a substantial improvement over the SM hypothesis, and an agreement between the different sets of data at the 1σ level.
The best-fit intervals for this scenario are reported in [...], whereas shifts in C u(e,µ) 10 stay consistent with zero. We take this case as one of our NP benchmarks, referred to as the "real δC 9,10 " scenario in the following. The rest of the NP benchmarks in [...]

Complex Wilson-coefficient shifts
It has been known for some time that the constraint on the magnitude of Wilson-coefficient shifts is loosened if their phase is not aligned with the phase of the SM contribution [77] (footnote 9). We quantify this possibility, i.e. we consider scenarios where Wilson-coefficient shifts are complex, with generic phases. We do so by alternatively including or excluding CPV observables in the b → sµ + µ − sector, in order to study their possible role in favoring a particular sign for the imaginary part of the Wilson coefficients. In compliance with the R K ( * ) measurement, every scenario enforces LFU on both the real and imaginary parts of the Wilson coefficients. The results are presented in Fig. 3, and the 1σ intervals for the scenarios that are most interesting for our purposes are also collected in Table 2.
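Why a misaligned phase loosens the bound can be seen from a schematic expansion. The sketch below assumes, purely for illustration, a CP-averaged rate proportional to |C SM + δC| 2 with real C SM ; the actual observables depend on several Wilson coefficients and form factors, and the value 0.3 is a hypothetical input chosen only to show orders of magnitude.

```latex
% Schematic: CP-averaged rate \propto |C^{\rm SM} + \delta C|^2, with C^{\rm SM} real (assumption).
\begin{align}
\left|C^{\rm SM} + \delta C\right|^2
  &= \left(C^{\rm SM}\right)^2
   + 2\,C^{\rm SM}\,\mathrm{Re}\,\delta C
   + \left(\mathrm{Re}\,\delta C\right)^2
   + \left(\mathrm{Im}\,\delta C\right)^2 . \\
% Illustrative numbers: a shift of relative size 0.3 (hypothetical).
\text{real, aligned: } \frac{\delta C}{C^{\rm SM}} = 0.3
  &\;\Rightarrow\; \frac{\Delta\,\text{rate}}{\text{rate}} = 2(0.3) + (0.3)^2 = 0.69, \\
\text{purely imaginary: } \frac{\mathrm{Im}\,\delta C}{C^{\rm SM}} = 0.3
  &\;\Rightarrow\; \frac{\Delta\,\text{rate}}{\text{rate}} = (0.3)^2 = 0.09 .
\end{align}
```

The real part enters linearly through interference with the SM, while the imaginary part enters only quadratically, so CP-even data alone constrain it much more weakly.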
We find that the current data are compatible with a sizeable (≃ a few 10% of the size of the SM contribution, see footnote 5) imaginary component in any of the δC [...] 2. Also, none of the scenarios hints at a non-zero imaginary part, which is in all cases compatible with zero below the 2σ level. These conclusions, to be qualitatively expected, follow from the fact that CP-odd observables are still underconstraining, as already noted. The figure also shows that the only scenario where all considered subsets of data are in mutual agreement (i.e. all the colored subregions overlap somewhere in the plane) is the first scenario, where one fits to Re δC u(e,µ) 9 vs. Im δC u(e,µ) 9 . This suggests, again, that δC u(e,µ) 9 is a necessary ingredient, and δC (µ) 10 = 0 a preferred requirement, to achieve mutual agreement among all data.
The first three of the scenarios collected in Table 2 perform comparably well; in particular, δC u(e,µ) 9 or δC u(e,µ) LL represent the best-performing complex scenarios. Interestingly, the complex-δC u(e,µ) LL case shows that a sizeable real NP contribution, of O(20%) of the SM value, is still allowed by the current B s → µ + µ − measurement. This is because [...] plane. Also interestingly, B s → µ + µ − is likewise compatible with a sizeable imaginary part, which is entirely plausible (see footnote 9) and which existing data barely probe. As regards δC 7 , the strongest constraint comes from the wealth of b → sγ data available, plus CPV observables in the b → sµ + µ − sector. Taking also into account that B s → µ + µ − γ at high q 2 is mostly sensitive to C 9,10 (see discussion in Sec. 3.2), we do not consider a δC 7 scenario in the rest of this work.
As a final remark for Sec. 2 we emphasize again, as we did in the Introduction, that a complete calculation of non-local contributions entering the currently discrepant b → sµ + µ − data may, or may not, show that the shifts collected in Table 2, in particular those involving the muonic C 9 , are actually due to SM long-distance dynamics. In these circumstances, it is meaningful to pursue observables that bear sensitivity to the very same Wilson-coefficient shifts one wants to probe, but are not affected by the same long-distance physics. The B s → µ + µ − γ branching ratio at high q 2 is one such observable. In the following we will discuss the significance one may expect for a shift as large as the real-δC 9,10 scenario (first entry of Table 2) or the complex-δC LL one (second entry of the same table) through a high-q 2 analysis of B(B s → µ + µ − γ), as a function of the data accumulated.

Experimental uncertainties
In short, measuring B s → µ + µ − γ at the LHC using the partially reconstructed method [31] relies on a fit to the q 2 -differential distribution in an appropriate high-q 2 region, using as external constraints the known BRs of the purely leptonic modes B d,s → µ + µ − as well as the shapes and normalizations for certain well-defined backgrounds. It is difficult to assess the sensitivity of such a procedure to B s → µ + µ − γ in the absence of a search optimized for B s → µ + µ − γ, rather than for the leptonic modes, as is the case for existing analyses.
With this important caveat in mind, the aim of the present section is to assess, to the best of our knowledge, the LHC prospects of measuring the B s → µ + µ − γ observable defined above. To this end, we need to make certain assumptions, spelled out next. The first is that the efficiency of B s → µ + µ − γ in the considered integrated high-q 2 region is equal to that of B s → µ + µ − . This assumption relies on two opposing effects. On the one side, because B s → µ + µ − γ is not reconstructed as a peak but as a shoulder within a broad dimuon invariant mass window, the efficiency will have a certain q 2 dependence. Typically the efficiency will decrease moving from the B 0 s mass to lower q 2 values. In fact, lowering q 2 means a lower transverse momentum in the laboratory frame, thus lowering the reconstruction and triggering efficiencies. It also means more abundant backgrounds, from more channels, to be rejected with tighter requirements. On the other side, this effect could be mitigated by developing an analysis optimised for the B s → µ + µ − γ decay, which Refs. [32,33] are not. More precisely, a dedicated B s → µ + µ − γ analysis could mitigate the q 2 dependence of the efficiency by deploying techniques, specific to partially reconstructed decays, that were not necessary for the aims of Refs. [32,33]. Explicitly, two dominant sources of background, in addition to the combinatorial one, are B → hµν decays, with h being a hadron misidentified as a muon, and B → π 0 µ + µ − , with a π 0 not reconstructed. Both sources need to be constrained in terms of the dimuon mass distribution as well as of the yield, in order not to limit the sensitivity to the signal. The improvements required are in the calibration of the trigger, of the reconstruction and selection efficiencies, and of particle (mis)identification for muons (hadrons). These improvements are impossible to quantify reliably without a dedicated experimental study [79].
Clearly, since the background increases for smaller dimuon masses, the lower bound of the q 2 integration, q 2 min , has to be chosen as a compromise between larger statistics and larger background. In addition, below a certain q 2 min value around (4 GeV) 2 , c c̄ resonances start to play an important role, lessening experimental and theoretical sensitivities alike. Within a given q 2 min choice, the actual sensitivity will rely on controlling the background within an associated error that is smaller than the expected signal yield. For example, the expected number of B s → µ + µ − γ events in the analysis of Ref. [33] was N exp (B s → µ + µ − γ) = 1.7 with the full Run-2 dataset, for q 2 min = (4.9 GeV) 2 (table 7 of Ref. [33]) and a multivariate analysis BDT score > 0.25, meaning that the efficiency corresponding to the BDT requirement alone is 75%. This N exp (B s → µ + µ − γ) figure may be compared with the accuracy to which the leading background is controlled. From table 9 of Ref. [33], this accuracy is 5 events with the same BDT requirement. This comparison suggests that background calibration clearly has to improve, but it leaves hope, because the two numbers (1.7 signal events against a background uncertainty of 5) are not far from each other even within an analysis far from optimized for our signal of interest. It is important to note that the latter argument considers the expected yields and their uncertainties integrated over the entire mass window, while a fit to the dimuon mass distribution gives a superior sensitivity provided the shapes of the background contributions are known.
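The two numbers compared above can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the BDT score is uniformly distributed in [0, 1] for signal (which is what makes a cut at 0.25 correspond to 75% efficiency), and the function names are hypothetical helpers, not part of any analysis framework.

```python
def flat_score_efficiency(threshold: float) -> float:
    """Efficiency of the cut `score > threshold`, assuming a score
    that is uniform in [0, 1] for signal (an assumption of this sketch)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return 1.0 - threshold


def signal_to_background_error(n_signal: float, sigma_background: float) -> float:
    """Ratio of the expected signal yield to the background uncertainty."""
    if sigma_background <= 0.0:
        raise ValueError("sigma_background must be positive")
    return n_signal / sigma_background


# Numbers quoted in the text for the Run-2 analysis of Ref. [33].
bdt_efficiency = flat_score_efficiency(0.25)
yield_ratio = signal_to_background_error(n_signal=1.7, sigma_background=5.0)

print(f"BDT efficiency: {bdt_efficiency:.2f}")
print(f"signal / background uncertainty: {yield_ratio:.2f}")
```

A ratio of order one third is what the text means by the two numbers being "not far from each other": the background uncertainty must shrink by a factor of a few, not by orders of magnitude.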
In the light of the above considerations, we will henceforth assume that all the backgrounds are under control, i.e. that their uncertainties will eventually fall safely below the signal yield.Under this "no-background" hypothesis the B s → µ + µ − γ-signal uncertainty is dominated by the sheer amount of data collected.We accordingly assess the B s → µ + µ − γ sensitivity to the NP scenarios in Table 2 as a function of the data size.Finally, while what follows assumes the LHCb experiment as reference, the same techniques can be used also in other setups where we can expect copious amounts of B s mesons to be produced, e.g. at ATLAS and CMS.The level of the combinatorial background will however depend on the interactions pile-up and the control of the misidentified background will rely on details of the particle identification of each experiment.

Theory uncertainties
A reappraisal of the theoretical uncertainty in B(B s → µ + µ − γ) for high q 2 above narrow charmonium was presented in Ref. [80]. It was shown that the main theory uncertainty arises from the vector and axial form factors V ⊥,∥ (q 2 ) of the B s → γ transition; that tensor form factors T ⊥,∥ (q 2 ) play, in comparison with vector and axial ones, a subdominant role irrespective of the parameterization used; and that broad charmonium plays an entirely negligible role for √q 2 ≳ 4.2 GeV. With the existing estimations of the vector and axial form factors, none of which is based exclusively on a first-principle calculation, the uncertainty in B(B s → µ + µ − γ) integrated in √q 2 ∈ [4.2, 5.0] GeV is currently on the order of 50%. To resolve a relative shift δC 9 /C SM 9 ≃ 15%, one has to control the vector and axial form factors multiplying this Wilson coefficient to an accuracy better than δC 9 /C SM 9 , because the BR has the same, dominantly quadratic dependence on both C (µ) 9 and the vector and axial form factors. Controlling form factors with such accuracy is already within reach of LQCD calculations. Specifically, Ref. [81] calculates D s → γ vector and axial form factors with a quoted error around 10% or even below. An even higher precision has been achieved in the very recent Ref. [82], where some lattice data reach a relative error of a few percent. Besides, these form factors are computed directly in the very same high-q 2 region of interest for the indirect B(B s → µ + µ − γ) measurement, i.e. no kinematic extrapolation is required.
In short, a theory error as small as needed is well within reach to the extent that the calculation in Refs. [81,82] is extended to the B s case. Besides, it is realistic to assume that ripe LQCD calculations of B s → γ vector and axial form factors in the very kinematic region of interest to us will come with typical errors not larger than 5%, which is negligible in comparison with the nominal NP shift mentioned above and also with the experimental error to be expected in the foreseeable future.
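The requirement on form-factor accuracy follows from first-order error propagation. The sketch below keeps only the C 9 -driven term of the branching ratio and treats the vector and axial form factors as a single quantity V; both are simplifications made here for illustration, and the quoted 15% is taken as the relative size of the shift.

```latex
% Schematic: \mathcal{B} \propto |C_9|^2\, V^2 (only the C_9-driven term is kept; assumption).
\begin{align}
\frac{\Delta\mathcal{B}}{\mathcal{B}}\bigg|_{\rm NP}
  \simeq 2\,\frac{\delta C_9}{C_9^{\rm SM}},
\qquad
\frac{\sigma_{\mathcal{B}}}{\mathcal{B}}\bigg|_{\rm FF}
  \simeq 2\,\frac{\sigma_V}{V},
\qquad
\text{so that resolving the shift requires}\quad
\frac{\sigma_V}{V} < \frac{\delta C_9}{C_9^{\rm SM}} .
\end{align}
% Example: \sigma_V / V = 5\% gives \sigma_{\mathcal{B}}/\mathcal{B} \simeq 10\%.
```

In this approximation, a 5% form-factor error translates into roughly a 10% error on the branching ratio, which is why form factors known at the 5% level suffice for a shift of relative size 15%.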
For the purpose of the present study we thus mostly need realistic central values for the vector and axial B s → γ form factors, and we adopt the recent parametrization proposed in Ref. [80].In that work, the contributions from tensor and axial-tensor form factors were found to be negligible within the still large uncertainty on the vector and axial-vector form factors obtained in the same work.However, in the present case the vector and axial form-factor uncertainty is assumed to be below 5%, as already discussed, and the choice of tensor form factors does have an impact, as can be seen in Fig. 4. For example, one may compare the parameterization in Ref. [83] with setting T ⊥,∥ = 0, which means assuming a 100% uncertainty.The difference between the two choices induces a difference in the BR prediction of the same order as the NP shift one wants to access.
In short, a first-principle evaluation of the tensor form factors will also be required.But importantly, a limited accuracy, of the order 20%, will be sufficient in this case, given the subdominant nature of these contributions.In fact, a 20% error on tensor form factors, which appear in contributions whose overall nominal size does not exceed about 25% of those induced by vector or axial form factors, is sufficient to control tensor contributions with an overall theory uncertainty of about 5%, which is already below the size of the expected NP shift and the assumed uncertainty on the vector and axial form factors.The final uncertainties we assume for the form factors are shown in Fig. 5.
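The error budget described above can be checked numerically. The sketch below is a back-of-the-envelope estimate, not the procedure used for the figures: it assumes first-order propagation with a quadratic dependence of the BR on the vector and axial form factors, and it combines the two sources in quadrature, which is an assumption of this example. All function names are hypothetical.

```python
import math


def br_error_from_form_factor(rel_ff_error: float, power: int = 2) -> float:
    """First-order relative BR error when the BR scales as FF**power."""
    return power * rel_ff_error


def tensor_error_on_br(rel_tensor_error: float, tensor_fraction: float) -> float:
    """Relative BR error from tensor form factors, given their relative
    error and the fractional size of the contributions they enter."""
    return rel_tensor_error * tensor_fraction


def combine_in_quadrature(*rel_errors: float) -> float:
    """Quadrature sum of independent relative errors (assumed uncorrelated)."""
    return math.sqrt(sum(e * e for e in rel_errors))


# Inputs quoted in the text: 5% on vector/axial FFs, 20% on tensor FFs,
# tensor contributions at most about 25% of the vector/axial ones.
vector_axial_error = br_error_from_form_factor(0.05)
tensor_error = tensor_error_on_br(0.20, 0.25)
total_error = combine_in_quadrature(vector_axial_error, tensor_error)

print(f"vector/axial: {vector_axial_error:.3f}")
print(f"tensor:       {tensor_error:.3f}")
print(f"total:        {total_error:.3f}")
```

Under these assumptions the total comes out at about 11%, in line with the 10-12% theory uncertainty assumed in the outlook of this paper.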
As concerns the short-distance side of the tensor contributions, we vary δC 7 in the 1σ interval identified from the complex fit of Fig. 3. We do not pursue a sensitivity study to possible NP contributions to C 7 , in particular because B s → µ + µ − γ at high q 2 is vastly more sensitive to δC [...] counterparts, requires to be close to the lower q 2 endpoint. Also, sensitivity to C 7 occurs through terms ∝ Re(C 7 C (µ) * 9,10 ) and is thus only linear. Indeed, the complex NP shift to C 7 shown in Fig. 3 is too small to be resolved.
3.3 Outlook on B s → µ + µ − γ as a probe of NP

In this section, we put to work the considerations made in Secs. 3.1-3.2 and use them to infer the NP sensitivity of a measurement of B(B s → µ + µ − γ) at high q 2 with the partially reconstructed method [31]. For clarity, the main conclusions of Secs. 3.1-3.2 are that:
(i) we may assume a theory uncertainty on B(B s → µ + µ − γ) of about 10-12%, dominated by FFs. This uncertainty can be understood intuitively, because FFs contribute quadratically to the BR, and their uncertainty is dominated by V ⊥,∥ , which in our q 2 range of interest is determined with an accuracy of 5% or less. As discussed in the previous section, other sources of theory uncertainties, including the T ⊥,∥ error, the dependence on δC 7 , and broad charmonium, are subdominant;
(ii) the theory error is actually negligible with respect to the experimental error. The latter is difficult to infer with any confidence. Hereafter, we consider the case where it is dominated by sheer statistics, which may be a safe and not unrealistic assumption (see Sec. 3.1 for details).

[Figure caption (partially recovered): [...] and (right) integrated branching ratio of B s → µ + µ − γ in the high-q 2 region as a function of the lower bound of integration q 2 min , for various parameterizations of the tensor form factors T ∥,⊥ and two relevant theory scenarios. A 5% uncertainty is assumed for the vector and axial form factors, with central values from Ref. [80]. The central values for the T ∥,⊥ form factors labelled as KMN are taken from Ref. [83]. They are shown with a 20% uncertainty, following the discussion at the end of Sec. 3.2. The C 7 Wilson coefficient is floated within the 1σ region allowed by the complex fit of Fig. [...]]

As regards the assumed NP shifts, we focus on two scenarios, referred to as "real δC 9,10 " and "complex δC LL ", and corresponding to the first two entries of Table 2. In either of the two cases, the shift is assumed to be LFU, hence we drop the flavour index hereafter. Note that we do not consider other logically possible scenarios, to which the B s → µ + µ − γ sensitivity is either too small (e.g. scenarios involving C 7 ) or similar to that of the real-δC 9,10 and complex-δC LL scenarios we focus on.
Using the best-fit values in Table 2, and the theoretical uncertainties discussed in Sec. 3.2 and summarized above, we extract the integrated BR for B s → µ + µ − γ in eq. (3). A qualification is in order about the chosen q 2 range. The photon in B s → µ + µ − γ is to be understood as "Initial State Radiation". Then, as discussed in Refs. [31,80], B(B s → µ + µ − γ) is well-defined for √q 2 < 5 GeV, because above this threshold "Final State Radiation" (FSR), or bremsstrahlung from the muons, is no longer negligible and can even be dominant. However, experimentally B s → µ + µ − γ and B s → µ + µ − are two components of the same fit; di-muon candidates in this fit are bremsstrahlung recovered, i.e. FSR is in practice subtracted by Monte Carlo [84]; finally, choosing [4.2, 5.0] GeV or [4.2 GeV, m B 0 s ] leads to a difference in the prediction of barely 2%. Hence, following Ref. [80], the predictions in eq. (3) refer to

complex δC LL : (0.99 ± 0.10

We emphasize that the quoted uncertainty is theoretical only, and takes into account all other known sources, including V cb and the broad-charmonium resonances, 15 and also marginalizing over δC 7 in the NP scenarios (lines 2 and 3 of eq. (3)). The central values in the second and third lines represent 20% shifts compared to the SM expectation (first line). The latter matches the estimate from Ref. [80], where central values for tensor FFs are taken from Ref. [83].
The sensitivity of B s → µ + µ − γ in the high-q 2 region to the two NP scenarios is displayed in Fig. 6. The leftmost panel shows that B(B s → µ + µ − γ) provides stronger constraints on the real part of δC LL , as expected from a CP-even observable. Using this NP benchmark (i.e. a δC LL leading to the last of eqs. (3)) as the central-value prediction for the integrated branching ratio, one can compute the relative error on the B s → µ + µ − γ measurement. The pull of this observable to the SM is shown in Fig. 7. We note that, for each luminosity value in the figure, the range in the pull corresponds to the 1σ range in the Wilson-coefficient shift. In turn, the sensitivity of B s → µ + µ − γ to the real-δC 9,10 scenario is shown in the rightmost panel of Fig. 6. We see that sensitivity to this scenario is enhanced compared to the complex-δC LL one. Even if the presence of NP lowers the integrated BR, and thus

used to be before the updates, because it prefers C 10 to be SM-like within errors. Instead, data provide circumstantial support to a tauonic δC 9 = −δC 10 ≡ C LL /2 shift generated at a high scale, and leaving as imprints effects in R(D ( * ) ) on the one side, and a lepton-universal δC 9 on the other side. SMEFT allows one to make this connection between b → s and b → c effects quantitative. We find that the tauonic C LL required by R(D ( * ) ) implies a universal δC 9 of precisely the size required, separately, by the anomalous b → sµ + µ − BR data, and by the anomalous angular b → sµ + µ − analyses. Needless to say, this conclusion is to be taken with great caution, because both imprints are far from established: first, angular, and especially BR data demand updates; second, the significance of the R(D ( * ) ) anomalies is in the eye of the global-fit beholder and ranges from ∼ 2σ [74,75,76] to higher significances [85], depending on the analysis strategy and/or on the theoretical inputs used. The above-mentioned global fits allow us to identify reference scenarios, that
we use as benchmarks for our stated aim of exploring the potential of B s → µ + µ − γ at high q 2 as a probe of the flavour anomalies. For this purpose, we first discuss the outlook on the total error of this observable, from both theory and experimental standpoints. Importantly, the theory accuracy of B(B s → µ + µ − γ) is not limited by the long-distance effects that inherently hinder predictions for semi-leptonic b → s modes at low q 2 , for example effects due to B → D D * rescattering. In this respect, B s → µ + µ − γ offers a neat strategy to probe the very same short-distance physics possibly responsible for the anomalies, but in a different kinematic region. The B s → µ + µ − γ theory error at high q 2 is dominated by the form-factor component. While still large, this component is scalable, because for one thing high q 2 is the preferred kinematic region for lattice-QCD calculations.
We find that, over the long haul, the total error for B s → µ + µ − γ at high q 2 is dominated by the experimental component. Absent an analysis optimized for the B s → µ + µ − γ search, we estimate this error from the sheer statistical component. We then infer the B s → µ + µ − γ sensitivity as the distance of the SM prediction vs. the prediction within the aforementioned NP benchmarks, in units of the total error determined as described. In the example of the real-δC 9,10 scenario, we find that the pull as a function of the acquired data reaches the 2σ level at the border of the 1σ region for the Wilson-coefficient shift.
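The definition of sensitivity used above (distance between the SM and NP predictions in units of the total error) can be sketched numerically as follows. This is only an illustration under stated assumptions: branching ratios are in arbitrary units with the NP value 20% below the SM one, the theory error is taken as 10%, the statistical error is assumed to scale as the inverse square root of the integrated luminosity, and the reference statistical error and reference luminosity are hypothetical placeholders rather than numbers from the analysis. The function name `pull_to_sm` is likewise illustrative.

```python
import math


def pull_to_sm(br_sm, br_np, rel_theory_err, rel_stat_err_ref, lumi_ref, lumi):
    """Distance between SM and NP predictions in units of the total error.

    The NP prediction is used as the central value of the hypothetical
    measurement; errors are relative to it and combined in quadrature.
    """
    # Assumption: purely statistical scaling, error ~ 1/sqrt(luminosity)
    rel_stat_err = rel_stat_err_ref * math.sqrt(lumi_ref / lumi)
    sigma_tot = br_np * math.hypot(rel_theory_err, rel_stat_err)
    return abs(br_sm - br_np) / sigma_tot


BR_SM = 1.0              # arbitrary units
BR_NP = 0.8              # 20% shift below the SM value
REL_THEORY_ERR = 0.10    # about 10% theory uncertainty
REL_STAT_ERR_REF = 0.50  # hypothetical statistical error at LUMI_REF
LUMI_REF = 50.0          # hypothetical reference luminosity, fb^-1

for lumi in (50.0, 100.0, 300.0):
    p = pull_to_sm(BR_SM, BR_NP, REL_THEORY_ERR, REL_STAT_ERR_REF, LUMI_REF, lumi)
    print(f"L = {lumi:5.0f} fb^-1 -> pull = {p:.2f} sigma")
```

In this toy setup the pull grows with the collected data and saturates at the theory-limited value once the statistical error becomes negligible, which is why a reduction of the form-factor uncertainty matters in the long run.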
In case such sensitivity may look underwhelming, we make the following final remarks. First, the above sensitivity is to be compared with other single-observable sensitivities that one can expect from e.g. BRs and angular analyses. For observables at low q 2 , sensitivities that are quoted in the literature tacitly rely on a breakthrough in the understanding of long-distance effects. As already emphasized, this issue is absent at high q 2 in the case of B s → µ + µ − γ. Yet, we are not aware of other detailed studies of high-q 2 exclusive observables to compare our study against. An interesting (semi-)inclusive example is the recent Ref. [86].
Second, in all likelihood any NP in semi-leptonic B decays will first be established collectively, i.e. through many modes showing a coherent trend, and one that persists with increasing statistics. Only later will such a trend be consolidated by single observables reaching the canonical 5σ departures required. If this is the case, then it is of the highest importance to find as many measurables as possible that allow one to confirm the trend (e.g. experimental branching ratios below the theory prediction) in observables devoid of the long-distance issues that at present plague low q 2 . This study goes in this direction.
LL /2, 6 in both the muonic and electronic channels. This is shown in the top-right panel of Fig. 1,
zero holds irrespective of its relation to δC (e) 10 . In the last two panels of Fig. 1 we assume δC

2 . 3 ;
(b) BR & angular data separately point towards a non-zero, but light-lepton-universal shift δC u(e,µ) 9 LL , which is now constrained to a SM-like value. On the other hand, the δC u(e,µ) 9 shift induced by C 3323 is much less constrained and can be numerically dominant. Quite remarkably, fixing C 3323 to account for R(D ( * ) ) yields a δC u(e,µ) 9 shift of precisely the correct size to account for, separately, b → sµ + µ − BR measurements and b → sµ + µ − angular analyses. These facts are shown in the top-left panel of Fig. 2. In this panel, we assume C 1123 = 0. In fact, a non-zero C 1123 would generate a matching contribution to δC (e) we show the implications of the same scenario (|C 3323 | ≫ |C 2223 = C 1123 |) in a different plane of WET WC combinations, that we consider more

Figure 2: Semileptonic b → s and b → c constraints in the plane δC u(e,µ) 9 vs. δC (µ) LL obtained from SMEFT WCs under the assumptions |C 3323 | ≫ |C 2223 | > 0, and C 1123 = 0 (top-left panel) or |C 3323 | ≫ |C 2223 = C 1123 | > 0 (top-right panel). The latter scenario is also shown in the plane of total δC 9 , equal for light leptons, vs. δC thus allows to address quantitatively the question to what extent a non-zero tauonic δC LL gets closer to zero should the R(D ( * ) ) discrepancy fade to the below-2σ figure suggested by the DM method.

2 ,
leading to an approximately circular shape in the Re δC u(e,µ) LL vs. Im δC u(e,µ) LL

10 .
Sensitivity to C 7 , whose SM value is much smaller than the C

Figure 4: (left) Differential branching fraction in q 2 and (right) integrated branching ratio of B s → µ + µ − γ in the high-q 2 region as a function of the lower bound of integration q 2 min .

Figure 5: Set of B s → γ transition form factors. For the assumed V ∥,⊥ and T ∥,⊥ uncertainties, see considerations in the caption of Fig. 4 and end of Sec. 3.2.

Figure 7: Pull to the SM (top) in the complex δC LL scenario and (bottom) in the real δC 9,10 scenario. The colored area on the leftmost panels represents the 1σ region spanned by the respective NP scenarios. The rightmost panels show the pull for 300 fb −1 of collected data.

Table 2 (
see first entry), and suggest a 20% NP effect in C

Table 2 concern complex Wilson coefficients, to which we turn next.
Table 2: Reference scenarios for two-real or one-complex WC combinations.