Comprehensive Bayesian analysis of rare (semi)leptonic and radiative \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\varvec{B}$$\end{document}B decays

The available data on \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$|\Delta B| = |\Delta S| = 1$$\end{document}|ΔB|=|ΔS|=1 decays are in good agreement with the Standard Model when permitting subleading power corrections of about \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$15\,\%$$\end{document}15% at large hadronic recoil. Constraining new-physics effects in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {C}_{7}^{\mathrm {}}$$\end{document}C7, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {C}_{9}^{\mathrm {}}$$\end{document}C9, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {C}_{10}^{\mathrm {}}$$\end{document}C10, the data still demand the same size of power corrections as in the Standard Model. In the presence of chirality-flipped operators, all but one of the power corrections reduce substantially. The Bayes factors are in favor of the Standard Model. Using new lattice inputs for \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$B\rightarrow K^*$$\end{document}B→K∗ form factors and under our minimal prior assumption for the power corrections, the favor shifts toward models with chirality-flipped operators. We use the data to further constrain the hadronic form factors in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$B\rightarrow K$$\end{document}B→K and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$B\rightarrow K^*$$\end{document}B→K∗ transitions.

The goal is to measure a large number of observables accessible in the angular analysis of the decay distributions of the three-and four-body final states. With the CP-averaged observables one can test-in a model-independent fashionthe underlying short-distance couplings of the B = 1 a e-mail: frederik.beaujean@lmu.de b e-mail: bobeth@ph.tum.de c e-mail: vandyk@physik.uni-siegen.de effective theory. CP-asymmetric observables also probe new sources of CP violation beyond the SM. In B → K * (→ K π) + − decays, certain combinations of the angular observables, the "optimized observables" [20][21][22][23][24][25][26][27][28], are free of form factors to leading order in the 1/m b expansion and consequently expected to have smaller theoretical uncertainties. The same framework also provides observables that are dominated by ratios of B → K * form factors [22,[28][29][30][31], providing some additional data-driven control over these hadronic quantities.
The 1/m b expansions are important tools for the prediction of exclusive decays. At large hadronic recoil of the K ( * ) meson, QCD factorization (QCDF) yields corrections beyond naive factorization [32,33]. Effects of (qq)resonances, dominantly from charm, as well as the chromomagnetic dipole operator can be calculated using a light-cone operator product expansion (OPE) in combination with dispersion relations [34][35][36][37][38]. At low hadronic recoil, a local OPE [39,40] can be employed. Its prediction of correlations between different observables can be tested experimentally through measurement of the observables H (1) T and J 7 in B → K * + − [28]. Hadronic form factors are a major source of theoretical uncertainties in the prediction of angular observables, whereas optimized observables are sensitive to higher-order terms in the 1/m b expansions, especially at large recoil. At present, the associated uncertainties due to the unknown 1/m b contributions are estimated based on simple power-counting arguments.
Based on the framework developed in our previous work [30], we perform a global analysis using Bayesian inference, and we include a total of 28 nuisance parameters to account for theory uncertainties. Apart from new-physics parameters, our framework allows us to infer also B → K ( * ) form factors and the size of subleading contributions, thereby shedding some light on the origin of tensions between data and SM predictions. Compared to [30], we include the most recent measurements, add additional observables to the fit, and account also for recent lattice calculations of form factors [50,51] and other new theoretical results.
In Sect. 2, the model-independent framework of B = 1 decays is briefly revisited, and three scenarios of new physics (NP) are introduced. In Sect. 3, we list the updated experimental input. The results of the global analysis are presented in Sect. 4: (1) for the most important Wilson coefficients of the SM operator basis C 7,9,10 and their chirality-flipped counterparts assuming them to be real-valued, (2) for the B → K ( * ) form factors in the SM and the two NP scenarios, and (3) the size of subleading contributions. We also compare our results with recent analyses that had access to the same experimental data. In Appendix A we summarize the theoretical predictions of newly included observables and changes in the treatment of form factors and subleading contributions.

Model-independent scenarios
For the global analysis of b → s(γ , + − ) data we use a model-independent approach based on the | B| = | S| = 1 effective theory. The Hamiltonian reads with dimension-six flavor-changing operators O i and their respective short-distance couplings, the Wilson coefficients C i (μ). We evaluate the hadronic matrix elements of the operators at the scale μ = 4.2 GeV of the order of the bottomquark mass m b . We restrict our analysis to the set of operators present in the SM (i = 7, 9, 10) O 7(7 ) = m b e sσ μν P R(L) b F μν , O 9(9 ) = sγ μ P L(R) b ¯ γ μ , O 10(10 ) = sγ μ P L(R) b ¯ γ μ γ 5 (2.2) and their chirality-flipped counterparts (i = 7 , 9 , 10 ), denoted by SM . The Wilson coefficients of the four-quark and the chromomagnetic dipole operators are set to their NNLO SM values at μ = 4.2 GeV [52,53].
In principle, the scalar, pseudo-scalar, and tensor b → s + − operators can contribute to the angular distributions of B → K * (→ K π) + − and B → K + − as discussed in great detail in [28]. However, only a few measurements of sensitive observables are currently available, with rather large uncertainties. We therefore abstain from including these operators into our analysis. Instead, we focus on comparing the SM to several new-physics scenarios with regard to their ability to describe the data well.
In the model-independent approach, the SM and SM Wilson coefficients are the parameters of interest; they are assumed real valued and independent a priori. The nuisance parameters ν serve to model theory uncertainties, including CKM parameters, quark masses, and hadronic matrix elements; see Appendix A. The following scenarios of fit parameters: Expressing vague prior knowledge, we assign a flat prior distribution to the Wilson coefficients. However, each nuisance parameter ν comes with an informative prior as discussed in detail in Appendix A.
The SM values C 7,9,10 are obtained at NNLO [52,53] and depend on the fundamental parameters of the top-quark and the W -boson masses, as well as on the sine of the weak mixing angle. For new-physics models that fall into one of the scenarios SM and SM+SM , the obtained fit results of the Wilson coefficients can be subsequently used to constrain those models' fundamental parameters after accounting for the renormalization group evolution from the high matching scale down to μ ∼ m b .

Observables and experimental input
In this section we describe changes of the experimental inputs that enter our global analysis with respect to our previous work [30]. We first introduce observables which are newly added to the global analysis and refer the reader for details of their theoretical treatment to Appendix A. Afterward, we summarize those observables whose measurements have been updated, or for which additional measurements have since become available. In general we employ the full set of observables listed in Table 1 except for the last row and denote this as "full". Inclusion of the B → K * lattice points from the last row is denoted as "full (+FF)". For the sake of comparison with [47] we also repeat the analysis with a smaller subset called "selection" as specified in the same table. Generally, we model the probability distributions of experimental measurements as (multivariate) Gaussian distributions. In practice, however, experiments do not yet provide correlations, except for S and C in B → K * γ . For measurements with asymmetric uncertainties, we model the probability distribution as a split Gaussian with two different widths. For the measurement of B(B s → μ + μ − ), we use the Amoroso distribution to avoid the unphysical region B < 0 as described in [30].
In the following, all observables are understood to be CPaveraged unless noted otherwise. The dilepton invariant mass in inclusive and exclusive b → s + − decays is denoted by q 2 throughout.

New observables
Measurements of the branching ratio of the inclusive radiative decay B → X s γ are included with a lower cut on the photon energy E γ > 1.8 GeV. For the branching ratio of the inclusive semileptonic decay B → X s + − integrated over the low-q 2 region q 2 ∈ [1, 6] GeV 2 , we use B [1,6] A source of parametric uncertainty in inclusive decays arises from matrix elements of dimension-five operators, μ 2 π and μ 2 G , discussed in more detail in Appendix A.1. They appear in the heavy-quark expansion at order 2 QCD /m 2 b , and μ 2 G enters also the B * -B mass splitting [54] M B * − M B = (4.578 ± 0.035) · 10 −2 GeV, (3.5) incorporated as an additional experimental constraint. Previous experimental angular analyses of B → K * (→ K π) + − were restricted to the measurements of the longitudinal K * -polarization fraction, F L , and the lepton forward- Table 1 List of all observables in the various inclusive and exclusive b → s(γ , + − ) decays that enter the global fits with their respective kinematics and experiments that provide the measurements. The B * -B mass splitting is used to constrain matrix elements of dimension-five operators. Lattice results of B → K form factors are used to constrain their parameters, and theoretical constraints on B → K * form factors are included. For more details we refer to Sect. 3 and Appendix A . †: Note that we include only the LHCb measurements in the [1, 6] GeV 2 bin as part of the "selection" data set, but not the low-recoil bins [14.18, 16], [> 16] GeV 2 [12][13][14] - [14.18, 16], [16,18], [18,22] (2) T and the CP-asymmetry A 9 = A im [55]. Most recently, LHCb extended the angular analysis to measure A re T [16] as well as S 4,5,7,8 and their optimized analogs P 4,5,6,8 [19] in addition to the previously published results on A (2) T , S 3 , A 9 , and S 9 . The original definitions of the observables S i and P i can be found in [56] and [27], respectively. Here we include the measurements of A re T and P 4,5,6 in the q 2 -bins [1,6], [14.18, 16.0], and [>16.0] GeV 2 . We replace S 3 data from LHCb by the corresponding A (2) T results.

Updated experimental input
We use the same experimental input as in our previous analysis [30], unless the experimental collaborations provide updated measurements. For some of the observables, additional measurement by further experimental collaborations have become available; they are added to the previous ones. Both types of updates are listed below.
The measurement of the time-integrated and CP-averaged branching ratio of the leptonic decay B s → μ + μ − has been recently updated by LHCb and measured for the first time by CMS with 4.0 σ and 4.3 σ signal significance, respectively. To faithfully model the physical constraint B ≥ 0 and the reported asymmetric uncertainties of the experimental probability distribution (PDF), we use the Amoroso distribution [30]. The LHCb measurement of the B + → K + + − branching ratio (CP-averaged) [15], based on 1 fb −1 integrated luminosity, is used in the q 2 bins [1,6], [14.18, 16. [14] rather than a partial data set [57]. Very recently, LHCb reported a broad peaking structure in the branching ratio at high q 2 compatible with ψ(4160) [58] using the larger data set of 3 fb −1 . Within the picture of quarkhadron duality, an adapted larger bin [≥ 15] GeV 2 around the peak should satisfy the necessary conditions required by the theoretical framework [39,40] in the future. For the moment, we continue to use the binning provided by experiments and do not enlarge theoretical uncertainties in B → K ( * ) + − at high q 2 .
There are numerous updates on B → K * + − for which we use the three q 2 bins [1,6], [14.18, 16.0], and [> 16.0] GeV 2 . We add recent measurements of the branching ratio from CMS [17] as well as for F L and A FB from CMS [17] and ATLAS [18]. For the branching ratio, F L , and A FB we now use the updated values [16] instead of [59] in the case of LHCb, and [14] instead of [55,57] in the case of CDF.
For B 0 → K * 0 (→ K π) + − LHCb provides results of "optimized observables". Combining these with both the observables F L and A FB can lead to double counting [25] in the strict limit of vanishing lepton masses that is well justified in the NP scenarios considered in this work. 1 For this reason, we replace the LHCb measurements of A FB [1,6] by A re T [1,6] . However, we continue to use A FB rather than A re T for the low-recoil bins because neither is an optimized observable at high q 2 . It is difficult to include the LHCb measurement of A re T [14.18,16.0] , since it implies that A re T (q 2 ) is constant and attains its theoretical maximum.
Overall, the measured q 2 dependence of individual observables is in quite good agreement with the SM predictions. The largest deviation of 3.7σ is reported for the optimized observable P 5 [4.3,8.68] [19] when compared to the SM prediction [47]. We do not use any of the [4.3, 8.68] GeV 2 bins since their theory predictions receive large contributions from cc-loops [35]. The [1, 6] GeV 2 bins are less affected by these effects. The remaining uncertainty is accounted for by the parameters of subleading contributions at the level of the decay amplitudes [30] in our predictions. The SM predictions available in the literature for this bin (and the low-recoil bins) are compared with our results in Table 2. Based on our prior input, we obtain central values as in [47] deviating by 2.5σ from the measurement whereas the analysis [38] has a different central value and larger errors with a 1.0σ deviation from experiment.
A second interesting deviation appears in the optimized observable P 4 [14.18,16.0] . In the SM operator basis it is given by a ratio of form factors [31] up to strongly suppressed subleading corrections. The extrapolation of LCSR form-factor results [35] from low to high q 2 yields a much larger value compared to the measurement. This is also observed [49] with recent lattice QCD determinations of B → K * form factors at high q 2 [51]. They allow us to further constrain the form factor parameters a priori. However, these determinations do not reliably consider systematic uncertainties due to finite width effects [51] of the K * . For this reason, we provide our fit results for analyses with and without the B → K * lattice inputs. Note, however, that this issue does not affect the lattice determinations of B → K form factors.
A third deviation from the SM prediction is seen in the preliminary measurements of F L [1,6] from BaBar and ATLAS that are both too low by more than 3σ and 2σ , respectively. Table 2 Predictions based on SM(ν-only) and wide (tripled uncertainty) priors, and postdictions after the fits, for the optimized observable P 4,5,6 in various q 2 bins; see Appendix A.3 for details. We compare our results with several sources. Note that for our predictions (postdictions) the uncertainties correspond to 68 % credibility intervals that arise from variation of only the nuisance parameters (all fit parameters This stands in contrast to the published results of Belle, CDF, CMS, and LHCb, which are all in good agreement with the SM at low q 2 . The BaBar results are an average of B 0 → K * 0 + − and B + → K * + + − . While the neutral mode yields F L consistent or close to the SM, the charged mode deviates strongly in the low q 2 region and points in principle to a large isospin asymmetry for the longitudinally polarized K * branching fraction [60]. The ATLAS measurement has been performed only for the neutral mode. Although preliminary and despite the isospin average in the case of BaBar, we include both measurements in the fit.

Results
In this section, we review briefly our statistical approach and summarize our fit results in several subsections, providing measures for the goodness of fit. We describe several solutions in the subspace of the Wilson coefficients for all four scenarios introduced in Sect. 2. Using Bayes factors, we compare models with non-SM Wilson coefficients to the SM(ν-only) fit. Finally, we present results for the nuisance parameters of the form factors and subleading corrections to the B → K * + − transition amplitudes in each of the scenarios. Throughout we will compare with recent similar analyses in the literature.

Statistical approach
Our results are obtained from a Bayesian fit, similar to our previous work [30]. The main outputs are samples drawn from the posterior distribution using the EOS flavor program [61]. The samples are obtained using an algorithm that employs Markov chains, hierarchical clustering, and adaptive importance sampling (for a detailed description we refer to [62]). Throughout we denote by P(θ |D, M) the posterior, where D ∈ {full, full (+FF), selection} represents the data set, and θ all parameters (C i and nuisance parameters ν) of the model M ∈ {SM(ν-only), SM, SM+SM , SM+SM (9)} as defined in Sect. 2. The weighted posterior samples provide access to all marginal distributions and to the evidence where the integration extends over the whole prior volume V 0 spanned by the parameters θ . The likelihood and prior distribution are denoted by P(D|θ , M) and P 0 (θ |M), respectively. For M ∈ {SM, SM+SM }, the posterior has numerous well separated local maxima. Most of these have a negligible impact, and we consider only those solutions with significant posterior mass. We require the ratio R of the local evidence-integration volume V 0 restricted to contain only a single solution-to the global evidence (4.1) exceeds 0.001. We label the individual solutions as A in the SM(ν-only) scenario, A and B in the SM scenario, and A through D in the SM+SM scenario, whereas in SM+SM (9) only A appears. For each model, A ( ) denotes the solution in which the signature of (C 7 , C 9 , C 10 ) is (−, +, −) as predicted in the SM(ν-only), and B ( ) indicates flipped signs; i.e., (+, −, +).
To determine the goodness of fit, we first find the bestfit point, θ * , in each solution using Minuit [63]. Next, we calculate the pull value as in [30] for fixed M, θ * for each constraint, and finally χ 2 as the quadratic sum of pulls. From χ 2 , a p value follows assuming N dof degrees of freedom. Note that there are N experimental constraints and dim ν informative priors. For the goodness of fit, we consider each informative prior as one constraint. With K Wilson coefficients varied in M, we have degrees of freedom. For the full data set we have dim ν = 28 nuisance parameters, and we include N = 93 constraints. In addition there are 11 or 5 theory constraints on the form factors, depending on whether we include the lattice results of the B → K * form factors or not. Below we denote both setups as "full (+FF)" and "full", respectively. For the selection data set, we have N = 27 inputs-including two theory constraints-and dim ν = 23. For this data set we do not include the lattice form factor results since their theory uncertainty, when extrapolated to low q 2 , is comparable to the uncertainty of the LCSR results. We calculate the Bayes factor between two statistical models M 1 and M 2 and for a common data set D, The standard quantity to compare two models, the posterior odds, are defined as i.e., the product of the Bayes factor and the prior odds P 0 (M 1 )/P 0 (M 2 ). It is important to note that the prior of the model parameters, P 0 (θ |M), is an integral part of the model M. Therefore, the Bayes factor penalizes M 2 versus M 1 if M 2 contains extra parameters because the evidence (4.1) is just the likelihood weighted by the prior, and the average typically decreases when the same unit probability mass is smeared over a larger volume V 0 . This occurs, e.g., in the present analysis for M 1 = SM(ν-only) and M 2 = SM as the Wilson coefficients are fixed in M 1 but variable with flat priors over a large volume covering multiple solutions in M 2 . With the evidence given separately for each solution in Table 3, the reader can, for example, compute the Bayes factor between M 1 and M 2 as though only one of the solutions had been allowed a priori by reducing the (flat) prior ranges of the Wilson coefficients and scaling the evidence accordingly. We focus on the SM-like solution of each scenario that is fully contained in a hyperrectangle with edge lengths 0.4 for C 7,7 , 4.0 for C 9,9 ,10,10 .
The penalty due to extra parameters can be overcome if M 2 provides a significantly better description of the data; i.e., We stress that the evidence by itself is not meaningful because of the arbitrary likelihood normalization due to the fact that we do not have the actual events seen by an experiment but only a concise summary usually in the form of an observable's value maximizing the likelihood value plus uncertainties. Using a consistent normalization, at least ratios of evidences for identical data D-as in the Bayes factorhave a well defined interpretation.
The fit results regarding model comparison and goodness of fit are listed in Table 3, in which the local evidence is available at a relative precision of 1 % on the linear scale. For an overview of all of the following results on the Wilson coefficients, we refer to Table 4. The pull values entering the p value are compiled for each experimental constraint in Table 5 for the solution A of SM and A of the SM+SM .

Fitting the nuisance parameters
Let us begin the summary of our results with the fit of the scenario SM(ν-only); i.e., the fit of nuisance parameters by fixing Wilson coefficients to their values in the SM at the Table 4 The 68-and 95 %-credibility intervals and the local modes of the marginalized 1D posterior distributions of the Wilson coefficients at μ = 4.2 GeV, P(C i |D), i = 7, 9, 10, 7 , 9 , 10 , for nominal priors of nuisance parameters in the various scenarios. Note that for the SM+SM scenario the individual solutions cannot be disentangled within the 1D posterior distributions, unlike for the SM scenario. For comparison, the SM values of the Wilson coefficients read C SM 7 = −0.34, C SM 9 = +4.27, C SM 10 = −4.17, C SM 7 = −0.01, C SM 9 = C SM 10 = 0 Scenario SM "selection" SM "full" SM+SM "full" The main purpose is to check whether the SM(ν-only), including all theory uncertainties, provides a good description of the available data. This scenario serves as the reference point to compare with scenarios SM, SM+SM (9) and SM+SM later on. Within the SM(ν-only) scenario, we perform three fits to the data sets "full", "full (+FF)", and "selection". All fits exhibit good or satisfactory p values of 0.10, 0.04, and 0.75, respectively, and they show that the posterior is unimodal. For the values of the evidence and χ 2 , we refer to Table 3. The large p value for the data set "selection" can be explained through the smaller overall number of measurements, and especially the absence of measurements deviating substantially from their SM predictions, such as ATLAS' and BaBar's results on F L [1,6] . The measurements with the largest pull values above 2σ are all in B → K * + − as can be seen in Table 5: the aforementioned F L [1,6] from BaBar and ATLAS, B [16,19] from Belle, A FB [16,19] from ATLAS and the two optimized observables P 4 [14.18,16] and P 5 [1,6] from LHCb (see also the caption of Table 5).
We note that removing the ATLAS and BaBar measurements of F L [1,6] increases the p value substantially, from 0.10 to 0.38 and from 0.04 to 0.30 for the "full" and "full (+FF)" data sets, respectively. The smaller p value of the "full (+FF)" data set compared to the one of the "full" data set arises dominantly from a tension of the B(B → K * + − ) data at high q 2 with predictions based on B → K * lattice form factors [49,51].
We find that the impact of experimental measurements published after our previous analysis [30] does not change the main outcome; i.e., the data can be accurately described without resort to new physics beyond the SM. This result may appear surprising given the large tensions that were seen in [47]. Within our approach, however, the tension between SM prediction and measurement of P 5 [1,6] can be eased by shifts in the parameters ζ  Table 2), thereby reducing the tension and explaining the B → K * + − "anomaly". The prior and posterior distributions of ζ Lχ K * are shown in Fig. 1. We do not find any shifts Table 5 Compilation of the pull values in units of σ at the SM-like best fit points A in the SM fit (left columns) and A in the SM+SM fit (right columns), listed per experiment and observable. Only pull values for fits with the "full" data set are listed. The single CLEO measurement of B(B → K * γ ) has a pull value +0.3σ in both the SM and the SM+SM fits. The pull values for the SM(ν-only) fit are very similar to the SM fit for the "full" data set, with the exception of +2.1σ for the LHCb measurement of P 5 [1,6] . The pull values for SM+SM with the "full (+FF)" data set deviate by less than 0.2 from those given for SM+SM "full". The pull values for SM "full (+FF)" also agree with those of SM "full" except for notable deviations of ∼ 0.6σ for the high q 2 bins of the B → K * + − branching ratio in the parameters ζ Rχ K * , χ =⊥, , 0 exceeding a few percent. There are no significant differences between the "full" and "full (+FF)" fits, except for two shifts in subleading parameters. At low q 2 , ζ L0 K * reduces from −15 % to about −5 %, and at high q 2 , changes from 0 % to about −5 % due to the tension between the data and predictions using lattice B → K * form factors for the branching ratio B (see also the B predictions in [49]). We find also substantial negative sub-leading contributions to the B → K + − amplitude at low q 2 , which lowers B [1,6] to accommodate the measurement by LHCb. This is in agreement with the observations of [36].
Beyond inference of the size of subleading contributions to the amplitudes, we extract information on the B → K and B → K * hadronic form factors. We confront our results for the various fit scenarios with the prior values in Table 6. Within the "full" data set, the ratio V (0)/A 1 (0) is in over- all good agreement with the results labeled "SE2 LCSR" of [31], with less than 1σ tension for our scenarios SM(ν-only) and SM+SM , and 1σ deviation for our scenario SM. Our results for the ratio A 2 (0)/A 1 (0) are consistently higher than those of [31], with a 2σ variance which can be attributed to our usage of the constraint on A 0 (0); see (6.3). In the case of the "full (+FF)" data set, the prior values of B → K * form factors at q 2 = 0 show two to three times smaller uncertainties when taking into account the lattice results [51], whereas the shape parameters b 1 are not affected except for the form factor A 2 . Although lattice results provide a reduction of the prior uncertainty, there is still agreement with [31] within 1σ for the "full (+FF)" data set. Except for a slight increase of the variance of V (0)/A 1 (0) in scenarios SM and SM+SM , the ratios do not change qualitatively.

Fit in the SM basis
Although the SM(ν-only) fit shows that the SM provides a reasonable description of the available data, we still extend our analysis to obtain model-independent constraints on NP couplings. In the SM scenario we fit the real-valued Wilson coefficients C 7,9,10 in addition to the nuisance parameters (see (2.3)) to the data sets "full", "full (+FF)" and "selection". For all data sets we obtain two dominant solutions A and B with SM-like and flipped signs of the Wilson coefficients, and many more solutions with negligible posterior mass.
In the case of the "selection" data set, the p values of 0.88 and 0.89 are obtained for solutions A and B, respectively, depicted in Fig. 2. They are larger compared to 0.75 obtained in the SM(ν-only) scenario, indicating that the additional parameters further reduce the tension with the data by χ 2 −5. Within solution A, the fit yields a deviation from the SM value of C 9 with 68 % probability; see Table 4. However, we find no significant deviations in either C 7 or C 10 . This observation is compatible with the findings of [47], where the sign-flipped solution B had been discarded. However, in our results the 2D-marginalized posterior shows merely a 2σ deviation from the SM, in contrast to 3.2σ as in [47]. The ratio of posterior masses is R A : R B = 47 % : 53 %, slightly in favor of the flipped-sign solution.
As in the case of SM(ν-only), the p values of the "full" data set are much smaller compared to the "selection" data, but they still indicate a decent fit at 0.08 for both solutions A and B. Contrary to the "selection", solution A is now strongly favored over solution B: R A : R B = 74 % : 26 %. This underlines the importance of a combined analysis of all available experimental data rather than a selected subset.
As can be seen in Fig. 2, the SM lies within the 1σ credibility regions of all 2D-marginalized posterior distributions. With the updated experimental data, the credibility regions are reduced in size by roughly a factor of two when compared to our previous results [30]. For the 1D credibility regions with both solutions A and B we refer to Table 4 The authors of [48] do not consider a scenario of simultaneous NP contributions to C 7,9,10 , but only single-Wilsoncoefficient scenarios C 7 and C 9 , the two-Wilson-coefficient scenario C 7,9 and the full set of Wilson coefficients of Table 6 1D-marginalized posterior results at 68 % probability in comparison to the prior inputs for the various B → K * (upper rows) and B → K (middle two rows) form-factor parameters. The results are shown for the "full" (left) and "full (+FF)" (right) data set in various scenarios. The priors for the "full" data set comprise LCSR [35] inputs combined with the additional constraints (6.1)-(6.3) and B → K lattice results [50], whereas for "full (+FF)" the B → K * lattice results [51] are added. Note that the marginalization has been performed over all solutions A, B in the case of SM and A − D in the case of SM+SM  Their results show a decrease of | 7 | once allowing NP contributions to C 9,10 in the ballpark of our findings. 2 The NP contributions 9 and 10 are also found to be preferentially negative. 2 Note that in [48] Wilson coefficients are determined at the scale μ = 160 GeV but RGE effects are only of concern for 7 .
The situation of the P 5 anomaly is the same as in the SM(ν-only) fit, and the modifications to the posterior distributions of ζ L(R)χ K * , χ =⊥, , 0 are of the same type and similar size for both data sets. The same applies to the postdiction P 5 [1,6] given in Table 2. The pull value of P 5 [1,6] decreases only little from 2.1σ in the SM(ν-only) fit to 1.6σ in the SM fit when allowing NP contributions to C 7,9,10 . However, the tensions in other measurements are not eased; see Table 5.  Table 3. The posterior distributions marginalized to the 2D (C i − C i ) planes (i = 7, 9, 10) are shown in Fig. 3 with the SM point and the projection of the best-fit points in each solution A through D . Note that the projection of the best-fit point can deviate from the position of the modes of the marginalized distributions; compare to the 1D intervals in Table 4. Unlike in the SM scenario, it is not possible to disentangle the individual solutions A through D within the 1Dmarginalized posterior distributions. In order to compare our findings with [48] we choose those intervals that contain the SM-like signs for C 7,9,10 and find with 68 % probability 7 = +0.01 +0.02 −0.05 , 9 = −0.8 +0.2 −0.5 , 10 = −0.1 +0.6 −0.5 , in agreement with [48]. Our results hardly change when taking B → K * lattice results into account. The best-fit points for C 7 ,9 ,10 of [48] fall into the intervals given in Table 4, with larger deviations from the modes of the 1D posterior distributions. The SM prediction C SM 7 = −0.01 and C SM 10 = 0 is contained in the smallest 68 % region of the 1D-marginalized posterior distributions, whereas C SM 9 = 0 is not. In the 2Dmarginalized (C 9 -C 9 ) plane, the SM point is just outside of the 95 % region, dominantly due to a shift in C 9 ; see Fig. 3.
The additional NP contributions in chirality-flipped operators in scenario SM+SM can address the tension in the measurement of P 5 [1,6] . However, the previously mentioned large pull values for F L [1,6] , B [16,19] , P 4 [14.18,16] and A FB [16,19] remain almost unchanged (see Table 5). This corroborates the findings of [31,48] that the pull value of P 4 [14.18,16] cannot be pushed below 2σ .
Focusing on the solution A [cf. (4.5)], the Bayes factor becomes Thus the NP hypothesis with chirality-flipped Wilson coefficients is disfavored in comparison to the SM(ν-only) hypothesis. However, the data favor SM+SM over SM with roughly 5 : 1. Taking the lattice form factor results into account, we find P(full (+FF)|SM+SM ) P(full (+FF)|SM(ν − only)) A = 5 : 1, (4.10) a significant increase by almost 100 compared to (4.9); see the discussion in Sect. 4.5.
In the SM+SM , the size of subleading contributions to transversity amplitudes χ = 0 ( ) reduces to about −5 % (+5 %) for ζ Lχ K * , in contrast to ζ L⊥ K * , which remains large; see Fig. 1. The solutions A through D are clearly distinct when viewed in the ζ L⊥ K * − C 9 plane. The subleading parameter at high q 2 decreases back to 0 % for the "full (+FF)" data set. There are no differences between the "full" and "full (+FF)" data sets. Similarly, the subleading contributions to the B → K + − amplitude disappear. The small shifts we observe between the SM and SM+SM scenarios suggest a common property to ease the tensions between predictions and data, that is shared by the ζ Lχ K * and the chirality flipped Wilson coefficients. It is therefore desirable to better understand size, chirality structure, and q 2 -dependence of the power corrections.
In view of the significantly enlarged Bayes factor (4.9) in the scenario with chirality-flipped operators compared to (4.8) in the SM, we investigate also the variant SM+SM (9). This scenario has only two additional new-physics parameters C 9,9 compared to six in the SM+SM scenario, implying a "smaller punishment" due to the larger prior volume w.r.t. the SM(ν-only) case. The 2D-marginalized (C 9 − C 9 ) plane in Fig. 4 shows the SM point on the border of the 3σ region for the only solution A . The results do not change much when using the "full (+FF)" data set. The 1σ probability regions from the 1D-marginalized posterior distributions are which is compatible with the results of [48]. With the same parameter ranges as in (4.9), the Bayes factor with the "full" data set comparing the SM-like solution A to SM(ν-only) is slightly favoring the SM+SM (9) over the SM. With the "full (+FF)" data set, the Bayes factor increases to P(full (+FF)|SM+SM (9)) P(full (+FF)|SM(ν − only)) A = 820 : 1. (4.12)

Interpretation
We conclude this section with a remark on the model comparison. We emphasize again that, to derive the posterior odds, the Bayes factor has to be multiplied by the appropriate prior odds [see (4.4)], the determination of which is beyond the scope of the present work. Given that the standard model successfully describes the vast majority of particle physics data, our prior odds are in strong favor of the SM(ν-only). This is true even for the considered rare decays despite their particular sensitivity to new physics because the majority of standard model predictions are in the ballpark of the other experimental measurements of B = 1, 2 observables. Currently the lack of experimental evidence for righthanded weak interactions significantly reduces our prior probabilities of scenarios with chirality-flipped Wilson coefficients SM+SM and SM+SM (9) compared to SM(ν-only). Among the scenarios with chirality-flipped Wilson coefficients we would set prior odds in favor of SM+SM because SM+SM (9) is only a restricted subset of SM+SM .
It remains to be noted that the SM+SM (9) scenario can be realized in a model-independent way due to operator mixing with b → s ff four fermion operators. Such scenarios have been considered previously for f = quarks [64][65][66] as well as f = τ [67] and recently discussed in the present context of the data [68] for f = b. In the case of f = quarks also hadronic charmless decays would be affected [69]. Explicit Z -models have been discussed recently in [70][71][72][73].
We find for nearly all combinations of scenarios and data sets, except the "selection" data set, satisfactory p values of about 5-10 %. This indicates that it is possible to find a single parameter point (the best-fit point), at which the data is adequately described. The Bayes factor, in contrast, quantifies the ability to describe the data, averaged over the full parameter space. Concerning the SM scenario, the Bayes factor indicates that the additional parameters do not yield an advantage over the SM(ν-only) scenario. However, it is interesting to see that the Bayes factors are undecided or even in favor of the scenarios SM+SM and SM+SM (9) for the "full" data set. Taking lattice results into account, the Bayes factors increase roughly by a factor 100 for the "full (+FF)" data set.
We emphasize that this increase in the latter scenarios for the "full (+FF)" data set can be traced to three causes. First, the prior information based on the lattice results is in line with the information on form factor parameters as inferred from the data; see Table 6. Second, chirality-flipped Wilson coefficients can, at least partially, take on the role of subleading contributions to the B → K ( * ) + − amplitudes. Since changing the Wilson coefficients (with flat priors) is "less costly" than changing the subleading parameters (with Gaussian prior distributions), the Bayes factor penalizes the latter. Third, the goodness of fit decreases for SM(ν-only) while it stays constant for SM+SM as follows from the χ 2 values of Table 3.
Evaluating the sensitivity to the prior shape, we repeated the "full (+FF)" fits replacing Gaussian priors with flat priors of the same range-as in Fig. 1-for all subleading parameters. The Wilson coefficients are determined with the same or slightly higher precision as before. Most of the subleading parameters play a minor role in the fit, so the prior equals the posterior. The two exceptions are the large-recoil parameters ζ L⊥ K * and ζ L0 K * . Their posteriors are very similar to those of the fit with Gaussian priors shown in Fig. 1 but marginally wider. Due to the extra freedom, all models are able to better describe the data. At the best-fit point, the p values roughly double from 0.04 to 0.1 (SM(ν-only)), from 0.09 to 0.16 (SM+SM ), and from 0.12 to 0.2 (SM+SM (9)). The Bayes factors slightly shift in favor of SM(ν-only) by 2.6 (SM+SM (9)) and 4 (SM+SM ) indicating that the models with variable C i on average do not benefit as much from the added flexibility. In summary, changing the prior shape of the subleading parameters does not entail any big surprise and corroborates our main findings.

Conclusions
Our Bayesian analysis indicates that the standard model provides an adequate description of the available measurements of rare leptonic, semileptonic, and radiative B decays. Compared to our previous analysis [30], we determine the Wilson coefficients C 7,9,10 more accurately, dominantly due to the reduction of the experimental uncertainties in the exclusive decays and the addition of the inclusive decay B → X s γ .
Contrary to all similar analyses, our fits include the theory uncertainties explicitly through nuisance parameters. We observe that tensions in the angular and optimized observables in B → K * + − decays can be lifted through (10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20) % shifts in the transversity amplitudes at large recoil due to subleading contributions. These shifts are present within the SM as well as the model-independent extension of real-valued Wilson coefficients C 7,9,10 . For the scenarios introducing additional chirality-flipped coefficients C 7 ,9 ,10 , the shifts reduce to a few percent (except ζ L⊥ K * ). We find |C 9 | < 4 (4.1) and |C 10 | < 2 (3) with 68 % (95 %) probability for the right-handed couplings, which holds in the absence of scalar and tensor contributions. These constraints are insensitive to the shape (Gaussian vs. flat) of the priors of subleading corrections.
Among the information inferred from the data are constraints on the parameters of the B → K ( * ) form factors. We have performed all fits with and without the very recent lattice B → K * form factor predictions [51]. These form factors constitute very strong additional prior information leading to a tension with branching fraction measurements of B → K * + − at high q 2 [49]. It can be eased by the presence of chirality-flipped Wilson coefficients and currently disfavor new physics without them. Notably, in the fits without the lattice results the data still yields larger posterior form factor values in scenarios with chirality-flipped Wilson coefficients, such that independently of the inclusion of the lattice predictions, the posterior ranges of the Wilson coefficients differ only little in these scenarios.
The rough picture emerging from current data may be summarized as follows. The low-q 2 B → K * + − data prefer a negative new-physics contribution to C 9 [47], which is not supported by B → K + − data unless one allows a positive contribution to C 9 (or alternatively C 10 ) [48]. The additional C 9 would be in conflict with current measurements of the branching fraction of B → K * + − at high q 2 , unless one increases the B → K * form factors A 1,2 by about 15 % (see Table 6), which is well within the theoretical uncertainties of their extrapolations of light-cone sum rule predictions from low to high q 2 . Perhaps not accidentally, the central values of recent lattice predictions for these form factors at high q 2 [49] are also higher by the same amount, whereas their theoretical uncertainties are below 10 %. Including the lattice predictions, the model comparison suggests that sce-narios with chirality-flipped Wilson coefficients can provide an explanation of the data as efficient or even better than the standard model.
A substantial reduction of uncertainties can be expected for LHCb, CMS, and ATLAS measurements of B 0 → K * 0 + − and B + → K + + − once they publish the analysis of their 2012 data sets. It should also be mentioned that B → K * γ and B → K ( * ) + − results from Belle are not based on the final reprocessed data set and that BaBar's angular analysis of B → K * + − is still preliminary. It remains to be seen whether these improved analyses further substantiate the present hints of a 2σ deviation from the SM prediction in the (C 9 -C 9 ) plane.
In our opinion, however, there remain two major challenges on the theory side. The first is to improve our analytic knowledge of the 1/m b corrections to the exclusive decay amplitudes. The second is to reduce the uncertainty from hadronic form factors, especially at low q 2 . Without improvements on either, there is little prospect to discern between small NP effects and large subleading corrections. Another point of concern are potentially large duality-violating effects that render the OPE at high q 2 invalid. They have been estimated, though model-dependently, to be small [40]. In this regard, the experimental verification of certain relations [28] among angular observables in B → K * + − that are predicted by the OPE would be very desirable. In the case that some of these relations are not fulfilled, the analysis of the breaking pattern can provide information on duality violation but also on additional new-physics scalar and tensor interactions.
are discussed. Further we explain changes to the choice of priors for some of the nuisance parameters, as well as a different parametrization of B → K * form factors, always with respect to our previous analysis [30]. We also list the updated values of input parameters whose uncertainty is neglected in Table 7.
Concerning the set of common nuisance parameters-the Wolfenstein parameters of the CKM-quark-mixing matrix and the MS bottom-and charm-quark masses entering the majority of observables-we use the updated values given in Table 8 based on the most recent PDG combinations [54] and the tree-level result of the UTfit collaboration [74].
We model asymmetric uncertainty intervals for the priors with LogGamma distributions (see [30] for details), while symmetric intervals are implemented as Gaussian priors.
Appendix A.1: Inclusive decays B → X s (γ , + − ) The branching ratio of the inclusive decay B → X s γ , B(B → X s γ ) ∼ |C 7 | 2 + |C 7 | 2 , represents the most stringent constraint on the magnitude of the dipole Wilson coefficients C 7,7 . In our analysis we include the known corrections to next-to-leading order in α s [77,78] as well as α s corrections [79]. Contrary to the common normalization to the semileptonic inclusive decay B → X c ν, we express the branching ratio in terms of the averaged B meson life time for a 50:50 production ratio of B + B − to B 0B0 pairs at the ϒ(4S), τ B +/0 given in Table 7, and the bottom-quark pole mass. In order to avoid renormalon ambiguities we calculate the pole mass value from the MS mass m b (m b ) using the 3loop result [80]. The MS mass is part of our set of common nuisance parameters and its uncertainty dominates the overall theory uncertainty of inclusive decays. At order 2 QCD /m 2 b in the heavy-quark expansion hadronic matrix elements of two dimension-five operators enter. They are parametrized in terms of μ 2 π and μ 2 G for the expectation values of the kinetic and the chromomagnetic operators, respectively. The parameters μ 2 π and μ 2 G enter the fit with priors according to [81], which are listed in Table 9. The correlation between μ 2 G and the b-quark mass is accounted for by the B * -B mass splitting. When confronting the theoretical prediction of the mass splitting with the measurement (3.5), we include also the effect of dimension-six operators as described in [81]. Our SM prediction B(B → X s γ ) = (3.14 +0. 22 −0.19 ) · 10 −4 is in good agreement with the NNLO result [82]. The theory uncertainty of our prediction is determined from variation of the Wolfenstein parameters, m b (m b ), m c (m c ), μ 2 π , and μ 2 G . For the prediction of the branching ratio of the inclusive decay B → X s + − we work at NNLO in QCD and NLO in QED, including also 2 QCD /m 2 b subleading cor- rections, as described in [52,53] except for the SM-SM interference terms. We adopt the same normalization in terms of τ B +/0 and bottom-quark pole mass as described for B → X s γ . The chirality-flipped operators are included following [83] and NLO QCD corrections to matrix elements are accounted for in case they can be derived easily within the SM operator basis. The overall theory uncertainty is determined as for B → X s γ from the variation of the same nuisance parameters. We obtain as the SM prediction B(B → X s + − ) [1,6] = (1.4 ± 0.1) · 10 −6 .

Appendix A.2: Form factors and decay constants
The most important change in the treatment of form factors is the consistent use of the parametrization as in [35], for both B → K and B → K * transitions. It has the merits of (a) a convenient expansion in a small parameter z that respects unitarity, (b) correct behavior at the B K ( * ) production threshold, (c) correct asymptotic behavior for q 2 → ∞, and (d) a convenient parametrization at q 2 = 0. For B → K we have modified the prior of the nuisance parameter f + (0) of the f + form factor parametrization w.r.t. our previous analysis but kept the slope parameter b + 1 as is. This change accounts for both LCSR results [35,84] that use the same approach of B-interpolating currents and on-shell K -mesons. As a result the prior on f + (0) is wider and the tension between the SM prediction of the B → K + − branching ratio at q 2 ∈ [1, 6] GeV 2 [36] and the LHCb measurement [15] is reduced. Moreover, the recent lattice predictions [50] of the form factor f + at high q 2 are included in our analysis as part of the likelihood for technical reasons. For this purpose we reproduced lattice predictions at three values of q 2 = 17, 20, 23 GeV 2 as well as their correlation matrix based on the parametrization given in [50]; see Table 10. Due to the change of parametrization of the B → K * form factors V, A 1 and A 2 , their three respective nuisance parameters are replaced by the three form factor normalizations at q 2 = 0 (V (0), A 1,2 (0)) and three slope parameters Table 11 Reproduction of mean values, uncertainties (top) and correlation information (bottom) of lattice points based on the results in the arXiv v1 of [51] for the vector form factors V (q 2 ) and A 1,12 (q 2 ) in B → K * transitions. Note that the update of the results from the arXiv v1 to v2 introduce only minor changes to the mean values and the correlation coefficient for the A 12 lattice points q 2 (GeV) 15 19 ). The LCSR results of [35] are chosen as the priors for normalizations and slopes. We note that these priors are less precise than the results of [88] due to a novel LCSR setup involving an on-shell B meson and interpolation of the K * final state. Beyond the informative priors we also include two additional constraints on B → K * form factors at q 2 = 0. First, the ratio V (0)/A 1 (0) is constrained in the large energy limit as given by [31] (see also references therein) where the uncertainty has been estimated based on power counting. Second, we make use of the relation where A 0 (0) = 0.29 +0.10 −0.07 is the LCSR result given in [35]. We note that (6.1) and (6.3) represent additional constraints on the nuisance parameters V (0) and A 1,2 (0). The motivation for this treatment is to tighten the constraints on A 1 (0), and to avoid unphysical (i.e. negative) values of the form-factor combination A 0 . Similar to our treatment of B → K lattice results, we reproduce lattice predictions [51] for the B → K * form factors V , A 1 , and A 12 at two kinematic points q 2 = 15, 19.21 GeV 2 . We obtain mean values, variances, and the correlation coefficients between neighboring points as given in Table 11. Note that Ref. [51] does not provide information on cross-form factor correlation. We chose the number and spacing of q 2 values so that the correlation of neighboring points does not exceed 60 %.
The updated prior of the B s decay constant f B s , entering the branching ratio of B s → μ + μ − , takes into account recent lattice results; see Table 9.

Appendix A.3: Subleading 1/m b
With increasing knowledge of the B → K ( * ) form factors and measurement of optimized-i.e., form factor insensitive -observables, the treatment of subleading contributions to the amplitudes of both B → K + − and B → K * + − decays has increased in relevance. Especially their analytic q 2 dependence is currently unknown, and their determination is not within the scope of this work. However, we strive to infer the size of contributions that go beyond the known QCDF and low-recoil terms. In order to achieve this goal we keep the parametrization of subleading terms as in our previous work, except for the complex phases of the low-recoil terms. Inference of these phases is not possible in the absence of data on CP asymmetries in B → K ( * ) + − decays at low recoil [30]. We therefore remove these phases from the analysis.
The overall theory uncertainty of our predictions in the region of large recoil differ substantially from those given in [38], due to the different treatment of subleading corrections to the form factor relations and the contributions from cc resonances. We keep the parametrization as in our previous work, χ= ⊥, , 0, (6.4) and obtain similar uncertainties in predictions of observables as in [47]. We collect the experimental measurements as well as theoretical predictions from the literature and this work in Table 2 for the optimized observables P 4,5,6 . There we give predictions, i.e., before the fit, and postdictions that include experimental information, and proceed for this purpose as described in [30]. The predictions are restricted to the SM(ν-only) scenario and are based on the prior distributions of the nuisance parameters. Besides our nominal Gaussian prior choice with 1σ ranges of ±0.15 (15 % at amplitude level) for subleading parameters at large recoil, ζ L(R)χ K * ,K , and also at low recoil, we also show the results for wider 1σ range of ±0.45. We do not include the recent lattice results of B → K * form factors [51].
The central values of our predictions agree within errors with [38,47,49]. At large recoil our theoretical uncertainties are of the same size as in [47] and much smaller than in [38], which treats subleading corrections differently. Choosing wider prior ranges leads to small shifts in the central value and can double the theoretical uncertainty that is still smaller than the one in [38]. At low recoil, subleading corrections are less important and the wider prior ranges practically do not affect the overall uncertainties.
The postdictions are based on the posterior distributions of the fits for each scenario with nominal prior distributions. This includes NP effects in the Wilson coefficients in the scenarios SM and SM+SM . The overall uncertainties of the postdictions are smaller than the uncertainties of the predictions. This can be attributed to the improved knowledge of form factors and subleading nuisance parameters as witnessed by the prior-to-posterior compression in Table 6 and Fig. 1. The additional NP contributions in SM compared to SM(ν-only) do not change postdictions of P 4,5,6 noticeably. On the other hand, chirality-flipped operators in SM+SM can induce small shifts only at low recoil.