Comprehensive Bayesian Analysis of Rare (Semi)leptonic and Radiative B Decays

The available data on | ∆ B | = | ∆ S | = 1 decays are in good agreement with the Standard Model when permitting subleading power corrections of about 15% at large hadronic recoil. Constraining new-physics eﬀects in C 7 , C 9 , C 10 , the data still demand the same size of power corrections as in the Standard Model. In the presence of chirality-ﬂipped operators, all but one of the power corrections reduce substantially. The Bayes factors are in favor of the Standard Model. Using new lattice inputs for B → K ∗ form factors, the evidence of models with chirality-ﬂipped operators increases, but does not outperform the Standard Model. We use the data to further constrain the hadronic form factors in B → K and B → K ∗ transitions. in the various inclusive and exclusive b → s ( γ, (cid:96) + (cid:96) − ) decays that enter the global ﬁts with their respective kinematics and experiments that provide the measurements. The B ∗ – B mass splitting is used to constrain matrix elements of dimension ﬁve operators. Lattice results of B → K form factors are used to constrain their parameters, and theoretical constraints on B → K ∗ form factors are included. For more details we refer to sec. III and app. A. † : Note that we include only LHCb measurements in the [1 , 6] GeV 2 bin part “selection” data set, not low recoil bins.

The goal is to measure a large number of observables accessible in the angular analysis of the decay distributions of the three-and four-body final states. With the CP-averaged observables one can test -in a model-independent fashion -the underlying shortdistance couplings of the ∆B = 1 effective theory. CPasymmetric observables also probe new sources of CP violation beyond the SM. In B → K * (→ Kπ) + − decays, certain combinations of the angular observables, the "optimized observables" [20][21][22][23][24][25][26][27][28], are free of form factors to leading order in the 1/m b expansion and consequently expected to have smaller theoretical uncertainties. The same framework also provides observables that are dominated by ratios of B → K * form factors [22,[28][29][30][31], providing some additional data-driven control over these hadronic quantities.
The 1/m b expansions are important tools for the prediction of exclusive decays. At large hadronic recoil of the K ( * ) meson, QCD factorization (QCDF) yields corrections beyond naive factorization [32,33]. Effects of (qq)-resonances, dominantly from charm, as well as the chromomagnetic dipole operator can be calculated using a light-cone operator product expansion (OPE) in combination with dispersion relations [34][35][36][37][38]. At low hadronic recoil, a local OPE [39,40] can be employed. Its prediction of correlations between different observables can be tested experimentally through measurement of the observables H (1) T and J 7 in B → K * + − [28]. Hadronic form factors are a major source of theoretical uncertainties in the prediction of angular observables, whereas optimized observables are sensitive to higher-order terms in the 1/m b expansions, especially at large recoil. At present, the associated uncertainties due to the unknown 1/m b contributions are estimated based on simple powercounting arguments.
As of 2011, several global analyses of the available data -differing in the degree of sophistication, the statistical approach, and the estimation of theory uncertainties -have been performed [27,30,[41][42][43][44]. Recently, experimental updates from LHCb [5,15,16,45,46] and CDF [14] became available as well as analogous measurements from CMS [6,17] and ATLAS [18]. LHCb is the first experiment to measure optimized observables [19]. Perhaps the most outstanding LHCb result is that the measured value of P 4,5 is in some tension with the SM predictions, stimulating new global fits [47][48][49].
Based on the framework developed in our previous work [30], we perform a global analysis using Bayesian inference, and include a total of 28 nuisance parameters to account for theory uncertainties. Apart from newphysics parameters, our framework allows us to infer also B → K ( * ) form factors and the size of subleading contributions, thereby shedding some light on the origin of tensions between data and SM predictions. Compared to [30], we include the most recent measurements, add additional observables to the fit, and account also for recent lattice calculations of form factors [50,51] and other new theoretical results.
In sec. II, the model-independent framework of ∆B = 1 decays is briefly revisited, and three scenarios of new physics (NP) are introduced. In sec. III, we list the updated experimental input. The results of the global analysis are presented in sec. IV: 1) for the most important Wilson coefficients of the SM operator basis C 7,9,10 and their chirality-flipped counterparts assuming them to be real-valued, 2) for the B → K ( * ) form factors in the SM and the two NP scenarios, and 3) the size of subleading contributions. We also compare our results with recent analyses that had access to the same experimental data. In app. A we summarize the theoretical predictions of newly included observables and changes in the treatment of form factors and subleading contributions.

II. MODEL-INDEPENDENT SCENARIOS
For the global analysis of b → s(γ, + − ) data we use a model-independent approach based on the |∆B| = |∆S| = 1 effective theory. The Hamiltonian reads and their chirality-flipped counterparts (i = 7 , 9 , 10 ), denoted by SM . The Wilson coefficients of the fourquark and the chromomagnetic dipole operators are set to their NNLO SM values at µ = 4.2 GeV [52,53]. In principle, the scalar, pseudo-scalar, and tensor b → s + − operators can contribute to the angular distributions of B → K * (→ Kπ) + − and B → K + − as discussed in great detail in [28]. However, only a few measurements with rather large uncertainties of sensitive observables are currently available, and many other results by LHCb cannot be used because they have been made with the explicit assumption that those new operators do not contribute. We therefore abstain from including those operators into our analysis. Instead, we focus on comparing the SM to several new-physics scenarios with regard to their ability to describe the data well.
In the model-independent approach, the SM and SM Wilson coefficients are the parameters of interest; they are assumed real valued and independent a priori. The nuisance parameters ν serve to model theory uncertainties, including CKM parameters, quark masses, and hadronic matrix elements; see tab. V and tab. VI. We introduce the following scenarios of fit parameters SM(ν-only) : 9,10 SM values C 7 ,9 ,10 SM values ν free floating , SM : Expressing vague prior knowledge, we assign a flat prior distribution to the Wilson coefficients. However, each nuisance parameter ν comes with an informative prior as discussed in detail in app. A.
The SM values C 7,9,10 are obtained at NNLO [52,53] and depend on the fundamental parameters of the topquark and W -boson masses, as well as on the sine of the weak mixing angle. For new-physics models that fall into one of the scenarios SM and SM+SM , the obtained fit results of the Wilson coefficients can be subsequently used to constrain those models' fundamental parameters after accounting for the renormalization group evolution from the high matching scale down to µ ∼ m b .

III. OBSERVABLES AND EXPERIMENTAL INPUT
In this section we describe changes of the experimental inputs that enter our global analysis with respect to our previous work [30]. We first introduce observables which are newly added to the global analysis and refer the reader for details of their theoretical treatment to app. A. Afterward, we summarize those observables whose measurements have been updated, or for which additional measurements have since become available. In general we employ the full set of observables listed in tab. I except for the last row and denote this as "full". Inclusion of the B → K * lattice points from the last row is denoted as "full (+FF)". For the sake of comparison with [47] we also repeat the analysis with a smaller subset called "selection" as specified in the same table. Generally, we model the probability distributions of experimental measurements as (multivariate) Gaussian distributions. In practice, however, experiments do not yet provide cor-relations, except for S and C in B → K * γ. For measurements with asymmetric uncertainties, we model the probability distribution as a split Gaussian with two different widths. For the measurement of B(B s → µ + µ − ), we use the Amoroso distribution to avoid the unphysical region B < 0 as described in [30].
In the following, all observables are understood to be CP-averaged unless noted otherwise. The dilepton invariant mass in inclusive and exclusive b → s + − decays is denoted by q 2 throughout.

A. New observables
Measurements of the branching ratio of the inclusive radiative decay B → X s γ are included with a lower cut on the photon energy E γ > 1.8 GeV. For the branching ratio of the inclusive semileptonic decay B → X s + − integrated over the lowq 2 region q 2 ∈ [1, 6] GeV 2 , we use [4] (III.4) A source of parametric uncertainty in inclusive decays arises from matrix elements of dimension-five operators, µ 2 π and µ 2 G , discussed in more detail in app. A 1. They appear in the heavy-quark expansion at order Λ 2 QCD /m 2 b , and µ 2 G enters also the B * -B mass splitting [54] M B * − M B = (4.578 ± 0.035) · 10 −2 GeV , (III.5) incorporated as an additional experimental constraint. Previous experimental angular analyses of B → K * (→ Kπ) + − were restricted to the measurements of the longitudinal K * -polarization fraction, F L , and the lepton forward-backward asymmetry, A FB . The CDF collaboration was the first to measure A (2) T and the CPasymmetry A 9 = A im [55]. Most recently, LHCb extended the angular analysis to measure A re T [16] as well as S 4,5,7,8 and their optimized analogues P 4,5,6,8 [19] in addition to the previously published results on A (2) T , S 3 , A 9 , and S 9 . The original definitions of the observables S i and P i can be found in [56] and [27], respectively. Here we include the measurements of A re T and P 4,5,6 in the q 2bins [1, 6], [14.18, 16.0], and [> 16.0] GeV 2 . Compared to [30], we replace S 3 data from LHCb by the corresponding A (2) T results.

B. Updated experimental input
We use the same experimental input as in our previous analysis [30], unless the experimental collaborations provide updated measurements. For some of the observables, additional measurements by further experimental collaborations have become available; they are added to the previous ones. Both types of updates are listed below.
The measurement of the time-integrated and CPaveraged branching ratio of the leptonic decay B s → µ + µ − has been recently updated by LHCb and measured for the first time by CMS with 4.0 σ and 4.3 σ signal significance, respectively. To faithfully model the physical constraint B ≥ 0 and the reported asymmetric uncertainties of the experimental probability distribution (PDF), we use the Amoroso distribution [30]. The LHCb measurement of the B + → K + + − branching ratio (CP-averaged) [15], based on 1 fb −1 integrated luminosity, is used in the q 2 bins [1, 6], [14.18, 16. [14] rather than a partial data set [57]. Very recently, LHCb reported a broad peaking structure in the branching ratio at high q 2 compatible with ψ(4160) [58] using the larger data set of 3 fb −1 . Within the picture of quark-hadron duality, an adapted larger bin [≥ 15] GeV 2 around the peak should satisfy the necessary conditions required by the theoretical framework [39,40] in the future. For the moment, we continue to use the binning provided by experiments and do not enlarge theoretical uncertainties in B → K ( * ) + − at high q 2 .
There are numerous updates on B → K * + − for which we use the three q 2 bins [1, 6], [14.18, 16.0], and [> 16.0] GeV 2 . We add recent measurements of the branching ratio from CMS [17] as well as for F L and A FB from CMS [17] and ATLAS [18]. For the branching ratio, F L and A FB we now use the updated values [16] instead of [59] in the case of LHCb, and [14] instead of [55,57] in the case of CDF.
For B 0 → K * 0 (→ Kπ) + − , LHCb provides results of "optimized observables". Combining these with both the observables F L and A FB can lead to double counting [25] in the strict limit of vanishing lepton masses that is well justified in the NP scenarios considered in this work 1 . For this reason, we replace the LHCb measurements of A FB [1, 6] by A re T [1, 6] . However, we continue to use A FB rather than A re T for the low recoil bins because neither is an optimized observable at high q 2 . It is difficult to include the LHCb measurement of  6], [14.18, 16], [> 16] GeV 2 [12][13][14] - [14.18, 16], [16,18], [18,22]  A re T [14.18,16.0] , since it implies that A re T (q 2 ) is constant -against the generic theory prediction -and it attains the maximum value allowed by the theory.
Overall, the measured q 2 dependence of individual observables is in quite good agreement with the SM predictions. The largest deviation of 3.7σ is reported for the optimized observable P 5 [4.3,8.68] [19] when compared to the SM prediction [47]. We do not use any of the [4.3, 8.68] GeV 2 bins since their theory predictions receive large contributions from cc-loops [35]. The [1, 6] GeV 2 bins are less affected by these effects. The remaining uncertainty is accounted for by the parameters of subleading contributions at the level of the decay amplitudes [30] in our predictions. The SM predictions available in the literature for this bin (and the low-recoil bins) are compared with our results in tab. IX. Based on our prior input, we obtain central values as in [47] deviating by 2.5σ from the measurement whereas the analysis [38] has a different central value and larger errors with a 1.0σ deviation from experiment.
A second interesting deviation appears in the optimized observable P 4 [14.18,16.0] . In the SM operator basis it is given by a ratio of form factors [31] up to strongly suppressed subleading corrections. The extrapolation of LCSR form-factor results [35] from low to high q 2 yields a much larger value compared to the measurement. This is also observed [49] with recent lattice QCD determinations of B → K * form factors at high q 2 [51]. They allow us to further constrain the form-factor parameters a priori. However, these determinations do not reliably consider systematic uncertainties due to finite width effects [51] of the K * . For this reason, we provide our fit results for analyses with and without the B → K * lattice inputs. Note, however, that this issue does not affect the lattice determinations of B → K form factors.
A third deviation from the SM prediction is seen in the preliminary measurements of F L [1, 6] from BaBar and ATLAS that are both too low by more than 3σ and 2σ, respectively. This stands in contrast to the published results of Belle, CDF, CMS, and LHCb that are all in good agreement with the SM at low q 2 . The BaBar results are an average of B 0 → K * 0 + − and B + → K * + + − . While the neutral mode yields F L consistent or close to the SM, the charged mode deviates strongly in the low q 2 region and points in principle to a large isospin asymmetry for the longitudinally polarized K * branching fraction [60]. The ATLAS measurement has been performed only for the neutral mode. Although preliminary and despite the isospin average in the case of BaBar, we include both measurements in the fit.

IV. RESULTS
In this section, we review briefly our statistical approach and summarize our fit results in several subsec-tions, providing measures for the goodness of fit. We describe several solutions in the subspace of the Wilson coefficients for all four scenarios introduced in sec. II. Using Bayes factors, we compare models with non-SM Wilson coefficients to the SM(ν-only) fit. Finally, we present results for the nuisance parameters of the form factors and subleading corrections to the B → K * + − transition amplitudes in each of the scenarios. Throughout we will compare with recent similar analyses in the literature.

A. Statistical Approach
Our results are obtained from a Bayesian fit, similar to our previous work [30]. The main outputs are samples drawn from the posterior distribution using the EOS flavor program [61]. The samples are obtained using an algorithm that employs Markov chains, hierarchical clustering, and adaptive importance sampling (for a detailed description we refer to [62]).
Throughout we denote the posterior probability density by P ( θ |D, M ), where D ∈ {full, full (+FF), selection} represents the data set, and θ all parameters (C i and nuisance parameters ν) of the model M ∈ {SM(νonly), SM, SM+SM , SM+SM (9)} as defined in sec. II. The weighted posterior samples provide access to all marginal distributions and to the evidence where the integration extends over the whole prior volume V 0 spanned by the parameters θ. The likelihood and prior distribution are denoted by P (D| θ, M ) and P 0 ( θ |M ), respectively. For M ∈ {SM, SM+SM }, the posterior has numerous well separated local maxima. Most of these have a negligible impact, and we consider only those solutions with significant posterior mass. We require the ratio R of the local evidence -integration volume V 0 restricted to contain only a single solution -to the global evidence (IV.1) exceeds 0.001. We label the individual solutions as A in the SM(ν-only) scenario, A and B in the SM scenario, and A through D in the SM+SM scenario, whereas in SM+SM (9) only A appears. For each model, A ( ) denotes the solution in which the signature of (C 7 , C 9 , C 10 ) is (−, +, −) as predicted in the SM(ν-only), and B ( ) indicates flipped signs; i.e, (+, −, +).
To determine the goodness of fit, we first find the best-fit point, θ * , in each solution by running the two local gradient-free optimizers BOBYQA [63] and COBYLA [64] via NLopt [65] with the same initial point; usually the two results differ only slightly, and we accept the point with the higher posterior. Next, we calculate the pull value as in [30] for fixed M , θ * for each constraint, and finally χ 2 as the quadratic sum of pulls.
From χ 2 , a p value follows assuming N dof degrees of freedom. Note that there are N experimental constraints and dim ν informative priors. For the goodness of fit, we consider each informative prior as one constraint. With K Wilson coefficients varied in M , we have degrees of freedom. For the full data set we have dim ν = 28 nuisance parameters, and include N = 93 constraints. In addition there are eleven or five theory constraints on the form factors, depending on whether we include the lattice results of the B → K * form factors or not; see tab. VII and tab. VIII. We consider those constraints part of the prior, and do not include them in N dof . Below we denote both setups as "full (+FF)" and "full", respectively. For the "selection" data set, we have N = 20 experimental inputs, two theory constraints, and dim ν = 24. For this data set we do not include the lattice form-factor results since their theory uncertainty, when extrapolated to low q 2 , is comparable to the uncertainty of the LCSR results.
We calculate the Bayes factor between two statistical models M 1 and M 2 and for a common data set D, The standard quantity to compare two models, the posterior odds, are defined as i.e., the product of the Bayes factor and the prior odds P 0 (M 1 )/P 0 (M 2 ). It is important to note that the prior of the model parameters, P 0 ( θ|M ), is an integral part of the model M . Therefore, the Bayes factor penalizes M 2 versus M 1 if M 2 contains extra parameters because the evidence (IV.1) is just the likelihood weighted by the prior, and the average typically decreases when the same unit probability mass is smeared over a larger volume V 0 . This occurs, e.g., in the present analysis for M 1 = SM(νonly) and M 2 = SM as the Wilson coefficients are fixed in M 1 but variable with flat priors over a large volume covering multiple solutions in M 2 . With the evidence given separately for each solution in tab. II, the reader can, for example, compute the Bayes factor between M 1 and M 2 as though only one of the solutions had been allowed a priori by reducing the (flat) prior ranges of the Wilson coefficients and scaling the evidence accordingly. We focus on the SM-like solution of each scenario that is fully contained in a hyperrectangle with edge lengths 0.4 for C 7,7 , 4.0 for C 9,9,10,10 . (IV.5) These reduced ranges yield Bayes factors that minimally penalize the NP models. In other words, if   We stress that the evidence by itself is not meaningful because of the arbitrary likelihood normalization due to the fact that we do not have the actual events seen by an experiment but only a concise summary usually in the form of an observable's value maximizing the likelihood value plus uncertainties. Using a consistent normalization, at least ratios of evidences for identical data Das in the Bayes factor -have a well defined interpretation.
The fit results regarding model comparison and goodness of fit are listed in tab. II, in which the local evidence is computed at a relative precision of 1% on the linear scale. For an overview of all of the following results on the Wilson coefficients, we refer to tab. X. The pull values entering the p value are compiled for each experimental constraint in tab. XI for the solution A of SM and A of the SM+SM . The main purpose is to check whether the SM(ν-only), including all theory uncertainties, provides a good description of the available data. This scenario serves as the reference point to compare with scenarios SM, SM+SM (9), and SM+SM later on. Within the SM(ν-only) scenario, we perform three fits to the data sets "full", "full (+FF)", and "selection". All fits exhibit satisfactory or excellent p values of 0.12, 0.06 and 0.90, respectively, and show that the posterior    TABLE III. 1D-marginalized posterior results at 68% probability in comparison to the prior inputs for the various B → K * (upper rows) and B → K (middle two rows) form-factor parameters. The results are shown for the "full" (left) and "full (+FF)" (right) data set in various scenarios. The priors for the "full" data set comprise LCSR [35] inputs combined with the additional constraints (A.1) -(A.3) and B → K lattice results [50], whereas for "full (+FF)" the B → K * lattice results [51] are added. Note that the marginalization has been performed over all solutions A, B in the case of SM and A − D in the case of SM+SM .
has only one relevant mode. For the values of the evidence and χ 2 , we refer to tab. II. The large p value for the data set "selection" can be explained through the smaller overall number of measurements relative to dim ν, and especially the absence of measurements deviating substantially from their SM predictions. The measurements with the largest pull values above 2σ are all in B → K * + − as can be seen in tab. XI: the aforementioned F L [1, 6] from BaBar and ATLAS, B [16,19] from Belle, A FB [16,19] from ATLAS and the two optimized observables P 4 [14.18,16] and P 5 [1, 6] from LHCb (see also the caption of tab. XI). We note that removing the ATLAS and BaBar measurements of F L [1,6] increases the p value substantially, from 0.12 to 0.63 and from 0.06 to 0.55 for the "full" and "full (+FF)" data sets, respectively. The smaller p value of the "full (+FF)" data set compared to the one of the "full" data set arises in part from a known tension of the B(B → K * + − ) data at high q 2 with predictions based on B → K * lattice form factors [49,51].
We find that the impact of experimental measurements published after our previous analysis [30] does not change the main outcome; i.e., the data can be accurately described without resort to new physics beyond the SM. This result may appear surprising given the large tensions that were seen in [47]. Within our approach, however, the tension between SM prediction and measurement of  fig. 1. We do not find any shifts in the parameters ζ Rχ K * , χ =⊥, , 0 exceeding a few percent. There are no significant differences between the "full" and "full (+FF)" fits, except for two shifts in subleading parameters. At low q 2 , ζ L0 K * reduces from −15% to about −10%, and at high q 2 , Λ changes from 0% to about −3% due to the tension between the data and predictions using lattice B → K * form factors for the branching ratio B (see also the B predictions in [49]). The large shift in ζ L⊥ K * is mostly due to B → K * γ and reduces to a few percent once removing measurements of this decay from the fit. Choosing a q 2 -dependent parametrization of subleading corrections should alleviate this tension. We also find a negative subleading contribution to the B → K + − amplitude at low q 2 , which lowers B [1,6] to accommodate the measurement by LHCb. This is in agreement with the observations of [36].
Beyond inference of the size of subleading contributions to the amplitudes, we extract information on the B → K and B → K * hadronic form factors. We confront our results for the various fit scenarios with the prior values in tab. III. Within the "full" data set, the ratio V (0)/A 1 (0) is generally higher than the results labeled "SE2 LCSR" of [31], with less than 1σ tension for our scenarios SM and SM+SM , and 1σ deviation for our scenario SM(ν-only). Our results for the ratio A 2 (0)/A 1 (0) are consistently higher than those of [31], with a 2σ variance, which can be attributed to our usage of the constraint on A 0 (0), see (A.3). In the case of the "full (+FF)" data set, the prior values of B → K * form factors at q 2 = 0 show two to four times smaller uncer-tainties when taking into account the lattice results [51], whereas the shape parameters b 1 are not affected except for the form factor A 2 . Although lattice results provide a reduction of the prior uncertainty, there is still agreement with [31] within 1σ for the "full (+FF)" data set. Except for a slight increase of the variance of V (0)/A 1 (0) in scenarios SM and SM+SM , the ratios do not change qualitatively.

C. Fit in the SM basis
Although the SM(ν-only) fit shows that the SM provides a reasonable description of the available data, we still extend our analysis to obtain model-independent constraints on NP couplings. In the SM scenario we fit the real-valued Wilson coefficients C 7,9,10 in addition to the nuisance parameters (see eq. (II.3)) to the data sets "full", "full (+FF)" and "selection". For all data sets we obtain two dominant solutions A and B with SM-like and flipped signs of the Wilson coefficients, and many more solutions with negligible posterior mass.
In the case of the "selection" data set, a p value of 0.99 is obtained for solution A, depicted in fig. 2. It is larger compared to 0.90 obtained in the SM(ν-only) scenario, indicating that the additional parameters allow to further reduce the tension with the data by ∆χ 2 −6. Within solution A, the fit yields a deviation from the SM value of C 9 ∆ 9 = C 9 − C SM 9 −1.7 ± 0.7 , (IV.7) with 68% probability; see tab. X. However, we find no significant deviations in either C 7 or C 10 . This observation is compatible with the findings of [47], where only solution A had been kept. However, in our results the 2D-marginalized posterior shows a 2.5σ deviation in the (C 7 − C 9 ) plane from the SM, in contrast to 3.2σ as in [47].
As in the case of SM(ν-only), the p values are much smaller for the "full" data set compared to the "selection" data set, but still indicate a decent fit at 0.12 and 0.07 for both solutions A and B, respectively. Removing the F L [1,6] measurement from BaBar and ATLAS increases the p values to 0.59 and 0.53 for solutions A and B, respectively. Contrary to the "selection", solution A is now strongly favored over solution B: R A : R B = 82% : 18%. This underlines the importance of a combined analysis of all available experimental data rather than a selected subset.
As can be seen in fig. 2, the SM lies within the 1σ credibility regions of all 2D-marginalized posterior distributions. With the updated experimental data, the credibility regions are reduced in size by roughly a factor of two when compared to our previous results [30]. For the 1D credibility regions with both solutions A and B we refer to tab. X, whereas for the single solution A we find smaller 68% probability regions of ∆ 7 = 0.0 ± 0.02, ∆ 9 = −0.5 ± 0.3, ∆ 10 = −0.2 ± 0.3. This value ∆ 9 seems to contradict tab. X and fig. 2 as the SM value of C 9 is within the 68% region in both cases. But the 68% regions would only agree if solution B had either negligible or equal weight compared to solution A.
The authors of [48] do not consider a scenario of simultaneous NP contributions to C 7,9,10 , but only single-Wilson-coefficient scenarios C 7 and C 9 , the two-Wilsoncoefficient scenario C 7,9 and the full set of Wilson coefficients of SM+SM . Their results show a decrease of |∆ 7 | once allowing NP contributions to C 9,10 similar to our findings 2 . The NP contributions ∆ 9 and ∆ 10 are also found to be preferentially negative.
The situation of the P 5 anomaly is the same as in the SM(ν-only) fit, and the modifications to the posterior distributions of ζ

D. Fit in the extended SM+SM basis
We proceed with fitting the SM-like and chiralityflipped Wilson coefficients in the SM+SM scenario. Using the "full" data set we obtain a good fit with p values between 0.07 and 0.09 in four well separated solutions A through D , as can be seen in the 2D-marginalized planes in fig. 3. Here A and B denote solutions that show the same signs of the Wilson coefficients C 7,9,10 of the SM operator basis as the solutions A and B in the previous section, and C and D denote further solutions. The corresponding p values of the "full (+FF)" data set are slightly smaller for A and C , ranging from 0.05 to 0.09. Of all four solutions, A and D dominate over B and C in terms of the posterior mass: The posterior distributions marginalized to the 2D (C i − C i ) planes (i = 7, 9, 10) are shown in fig. 3 with the SM point and the projection of the best-fit points in each solution A through D . Note that the projec-tion of the best-fit point can deviate from the position of the modes of the marginalized distributions; compare to the 1D intervals in tab. X. Unlike in the SM scenario, it is not possible to disentangle the individual solutions A through D within the 1D marginalized posterior distributions. In order to compare our findings with [48] we choose those intervals that contain the SM-like solution of C 7,9,10 and find with 68% probability ∆ 7 = 0.01 ± 0.02, ∆ 9 = −0.8 ± 0.4, ∆ 10 = −0.2 ± 0.3, in agreement with [48]. Here ∆ 9 represents a 1.8σ deviation from the SM. These results hardly change when taking B → K * lattice results into account, the deviation of ∆ 9 from the SM increases only slightly to 2.0σ. The best-fit points for C 7 ,9 ,10 of [48] fall into the intervals given in tab. X, with larger deviations from the modes of the 1D posterior distributions. The SM prediction C SM 7 = −0.01 and C SM 9 ,10 = 0 is contained in the smallest 68% region of the 1D-marginalized posterior distributions. The largest deviation of the SM point in any 2D marginalized distribution is 1.6σ in the (C 9 -C 7 ) plane, dominantly due to a shift in C 9 .
The additional NP contributions in chirality-flipped operators in scenario SM+SM do not help in reducing the tension in the measurement of P 5 [1, 6] . The previously mentioned large pull values for F L [1, 6] , B [16,19] , P 4 [14.18,16] and A FB [16,19] remain almost unchanged (see tab. XI). This corroborates the findings of [31,48] that the pull value of P 4 [14.18,16] can not be pushed below 2σ.
Focusing on the solution A (cf. (IV.5)), the Bayes factor becomes P (full|SM+SM ) P (full|SM(ν-only)) A = 1 : 401 . In the SM+SM , the size of subleading contributions to transversity amplitudes χ = 0 ( ) reduces to about −5% (+5%) for ζ Lχ K * , in contrast to ζ L⊥ K * , which remains large, see fig. 1. The subleading parameter Λ at high q 2 decreases back to 0% for the "full (+FF)" data set. There are no differences between the "full" and "full (+FF)" data sets. Similarly, the subleading contributions to the B → K + − amplitude disappear. The small shifts we observe between the SM and SM+SM scenarios suggest that both ζ Lχ K * and C i can ease the tensions between predictions and data. It is therefore desirable to better understand size, chirality structure, and q 2 -dependence of the power corrections.
For comparison with [47][48][49], we investigate also the variant SM+SM (9). This scenario has only two additional new physics parameters C 9,9 compared to six in the SM+SM scenario, implying a "smaller punishment" due to the larger prior volume w.r.t. the SM(ν-only) case. The 2D-marginalized (C 9 − C 9 ) plane in fig. 4 shows the SM point on the border of the 2.0σ region for the only solution A when using the "full (+FF)" data set. However, this changes to 1.4σ when discarding the B → K * lattice form factor results. The 1σ probability regions from the 1D marginalized posterior distributions for "full" data set are which is compatible with the results of [48]. For the "full (+FF)" data set, we find ∆ 9 = −0.7 ± 0.3 , ∆ 9 = +0.6 ± 0.4 .
With the same parameter ranges as in (IV.9), the Bayes factor with the "full" data set comparing the SM-like Credibility regions obtained from the fit in the SM+SM (9) model. We show the results of the "full" data set at 68.3% (dark red) and 95.4% (red) probability. The regions from the fit of the "full (+FF)" data set are overlaid by blue lines at 68.3% (solid) and 95.4% (dashed) probability. The projection of the SM point is represented by the black diamond, whereas the black and blue crosses mark the best-fit points.

E. Interpretation
We conclude this section with a remark on the model comparison. We emphasize again that, to derive the posterior odds, the Bayes factor has to be multiplied by the appropriate prior odds (see (IV.4)), the determination of which is beyond the scope of the present work. Given that the standard model successfully describes the vast majority of particle physics data, our prior odds are in strong favor of the SM(ν-only). This is true even for the considered rare decays despite their particular sensitivity to new physics because the majority of standard model predictions are close to the other experimental measurements of ∆B = 1, 2 observables.
Currently the lack of experimental evidence for righthanded weak interactions significantly reduces our prior probabilities of scenarios with chirality-flipped Wilson coefficients SM+SM and SM+SM (9) compared to SM(ν-only). Among the scenarios with chirality-flipped Wilson coefficients we would set prior odds in favor of SM+SM because SM+SM (9) is only a restricted subset of SM+SM .
It remains to be noted that the SM+SM (9) scenario can be realized in a model-independent way due to operator mixing with b → sff four-fermion operators. Such scenarios have been considered previously for f = quarks [66][67][68] as well as f = τ [69] and recently discussed in the present context of the data [70] for f = b. In the case of f = quarks also hadronic charmless decays would be affected [71]. Explicit Z -models have been discussed recently in [72][73][74][75].
For nearly all combinations of scenarios and data sets, we find satisfactory p values of about 5% to 14%, and in the case of "selection", p ≥ 90%. This indicates that it is possible to find a single parameter point (the best-fit point), at which the data is adequately described. The Bayes factor, in contrast, quantifies the ability to describe the data on average over the full parameter space. Concerning the SM scenario, the Bayes factor indicates that the additional parameters do not yield an advantage over the SM(ν-only) scenario. However, it is interesting to see that the Bayes factors are undecided in the case of scenario SM+SM (9) for the "full(+FF)" data set. Taking lattice results of B → K * form factors into account, the Bayes factors increase roughly by a factor 3 for the "full (+FF)" data set in scenarios SM+SM and SM+SM (9), but not in scenario SM.
Evaluating the sensitivity to the prior shape, we repeated the "full (+FF)" fits replacing Gaussian priors with flat priors of the same range -as in fig. 1 -for all subleading parameters. The Wilson coefficients are determined with the same or slightly higher precision as before. Most of the subleading parameters play a minor role in the fit, so the posterior is similar to the prior. Small shifts of a Gaussian prior become larger shifts -in the same direction -of a flat prior. The two exceptions are the large-recoil parameters ζ L⊥ K * and ζ L0 K * . Their posteriors are very similar to those of the fit with Gaussian priors shown in fig. 1 but marginally wider. Due to the extra freedom, all models are able to better describe the data. At the best-fit point, the p values roughly double from 0.06 to 0.13 in the SM(ν-only), and from 0.08 to 0.17 in the SM+SM (9). The Bayes factor slightly shifts in favor of SM(ν-only) compared to SM+SM (9) from 1 : 1 to 1 : 1.6, indicating that the models with variable C i on average do not benefit as much from the added flexibility. In summary, changing the prior shape of the subleading parameters does not entail any big surprise and corroborates our main findings.

V. CONCLUSIONS
Our Bayesian analysis indicates that the standard model provides an adequate description of the available measurements of rare leptonic, semileptonic, and radia-tive B decays. Compared to our previous analysis [30], we determine the Wilson coefficients C 7,9,10 more accurately, dominantly due to the reduction of the experimental uncertainties in the exclusive decays and the addition of the inclusive decay B → X s γ.
Contrary to all similar analyses, our fits include the theory uncertainties explicitly through nuisance parameters. We observe that tensions in the angular and optimized observables in B → K * + − decays can be lifted through (10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20)% shifts in the transversity amplitudes at large recoil due to subleading contributions. These shifts are present within the SM as well as the modelindependent extension of real-valued Wilson coefficients C 7,9,10 . For the scenarios introducing additional chiralityflipped coefficients C 7 ,9 ,10 , the shifts reduce to a few percent (except ζ L⊥ K * ). We find |C 9 ,10 | 5 at 95% probability, see fig. 3 and tab. X, for the right-handed couplings, which holds in the absence of scalar and tensor contributions. These constraints are insensitive to the shape (Gaussian vs. flat) of the priors of subleading corrections.
Among the information inferred from the data are constraints on the parameters of the B → K ( * ) form factors. We have performed all fits with and without the very recent lattice B → K * form factor predictions [51]. In both cases, the posterior ranges of the Wilson coefficients C 7,7 ,10,10 are essentially the same apart from minor shifts in C 9,9 . Again in both cases, the posteriors of the B → K * form-factor parameters are very similar. This comes as a surprise given the large difference in prior uncertainties but implies that the combination of measurements supports the lattice input, even independently of the scenario.
The rough picture emerging from current data may be summarized as follows. The low-q 2 B → K * + − data prefer a negative new-physics contribution to C 9 [47], which is not supported by B → K + − data unless one allows a positive contribution to C 9 (or alternatively C 10 ) [48]. Our Bayesian analysis shows strong support for the standard model SM(ν-only) compared to additional new physics in Wilson coefficients C 7,9,10 in the SM-scenario and/or chirality-flipped C 7 ,9 ,10 in the SM+SM -scenario in terms of Bayes factors. Only a reduced scenario SM+SM (9) of the two Wilson coefficients C 9,9 comes close to the standard model. Including the B → K * form-factor lattice predictions, the model comparison suggests that scenario SM+SM (9) can provide an explanation of the data as efficient as in the standard model with a Bayes factor of 1 : 1.
A substantial reduction of uncertainties can be expected for LHCb, CMS, and ATLAS measurements of B 0 → K * 0 + − and B + → K + + − once they publish the analysis of their 2012 data sets. It should also be mentioned that B → K * γ and B → K ( * ) + − results from Belle are not based on the final reprocessed data set and that BaBar's angular analysis of B → K * + − is still preliminary. It remains to be seen whether these improved analyses further substantiate the present hints of a 1 to 2σ deviation from the SM prediction in C 9 .
In our opinion, however, there remain two major challenges on the theory side. The first is to improve our analytic knowledge of the 1/m b corrections to the exclusive decay amplitudes. The second is to reduce the uncertainty from hadronic form factors, especially at low q 2 . Without improvements on either, there is little prospect to discern between small NP effects and large subleading corrections. Another point of concern are potentially large duality-violating effects that render the OPE at high q 2 invalid. They have been estimated, though model-dependently, to be small [40]. In this regard, the experimental verification of certain relations [28] among angular observables in B → K * + − that are predicted by the OPE would be very desirable. In the case that some of these relations are not fulfilled, the analysis of the breaking pattern can provide information on duality violation but also on additional new-physics scalar and tensor interactions. ACKNOWLEDGMENTS D.v.D is grateful to Thorsten Feldmann, Tobias Huber, Soumitra Nandi and the late Nikolai Uraltsev for useful discussions and suggestions. C.B. thanks David Straub for discussions. We are grateful to Christoph Langenbruch for pointing out a small numerical error within EOS. We acknowledge the support by the DFG Cluster of Excellence "Origin and Structure of the Universe". The analyses have been carried out on the computing facilities of the Computational Center for Particle and Astrophysics (C2PAP) and on the HORUS cluster at Siegen University to perform the fits. We further wish to thank Andrzej Buras, Thorsten Feldmann, Christian Hambrock, Gudrun Hiller, Alexander Khodjamirian and Stefan Schacht for comments on the manuscript. This work is supported by the Deutsche Forschungsgemeinschaft (DFG) within research unit FOR 1873 ("QFET"). C.B. received partial support from the ERC Advanced Grant project "FLAVOUR" (267104).

Appendix A: Theoretical Treatment
This appendix details the theoretical predictions for newly added observables, and their respective nuisance parameters are discussed. Further we explain changes to the choice of priors for some of the nuisance parameters, as well as a different parametrization of B → K * form factors, always with respect to our previous analysis [30]. We also list the updated values of input parameters whose uncertainty is neglected in tab. IV.
Concerning the set of common nuisance parameters -the Wolfenstein parameters of the CKM-quark-mixing matrix and the MS bottom-and charm-quark masses entering the majority of observables -we use the updated values given in tab. V based on the most recent PDG  combinations [54] and the tree-level result of the UTfit collaboration [76]. We model asymmetric uncertainty intervals for the priors with LogGamma distributions (see [30] for details), while symmetric intervals are implemented as Gaussian priors.
The branching ratio of the inclusive decay B → X s γ, B(B → X s γ) ∼ |C 7 | 2 + |C 7 | 2 , represents the most stringent constraint on the magnitude of the dipole Wilson coefficients C 7,7 . In our analysis we include the known corrections to next-to-leading order in α s [79,80] as well as α s Λ 2 QCD /m 2 b corrections [81]. Contrary to the common normalization to the semi-leptonic inclusive decay B → X c ν, we express the branching ratio in terms of the averaged B meson life time for a 50:50 production ratio of B + B − to B 0B0 pairs at the Υ(4S), τ B +/0 given in tab. IV, and the bottom-quark pole mass. In order to avoid renormalon ambiguities we calculate the pole mass value from the MS mass m b (m b ) using the 3-loop result [82]. The MS mass is part of our set of common nuisance parameters and its uncertainty dominates the overall theory uncertainty of inclusive decays. At order Λ 2 QCD /m 2 b in the heavy quark expansion hadronic matrix elements of two dimension-five operators enter. They are parametrized in terms of µ 2 π and µ 2 G for the expectation values of the kinetic and the chromomagnetic operators, respectively. The parameters µ 2 π and µ 2 G enter the fit with priors according to [83], which are listed in tab. VI. The correlation between µ 2 G and the b-quark mass is accounted for by the B * -B mass splitting. When confronting the theoretical prediction of the mass splitting with the measurement (III.5), we include also the effect of dimension-6 operators as described in [83]. Our SM prediction B(B → X s γ) = (3.14 +0. 22 −0.19 ) · 10 −4 is in good agreement with the NNLO result [84]. The theory uncertainty of our prediction is determined from variation of the Wolfenstein parameters, m b (m b ), m c (m c ), µ 2 π , and µ 2 G . For the prediction of the branching ratio of the inclusive decay B → X s + − we work at NNLO in QCD and NLO in QED, including also Λ 2 QCD /m 2 b subleading corrections, as described in [52,53] except for the SM-SM interference terms. We adopt the same normalization in terms of τ B +/0 and bottom-quark pole mass as described for B → X s γ. The chirality-flipped operators are included following [85] and NLO QCD corrections to matrix elements are accounted for in case they can be derived easily within the SM operator basis. The overall theory uncertainty is determined as for B → X s γ from the variation of the same nuisance parameters. We obtain as the SM prediction B(B → X s + − ) [1, 6] = (1.4 ± 0.1) · 10 −6 .

Form factors and decay constants
The most important change in the treatment of form factors is the consistent use of the parametrization as in [35], for both B → K and B → K * transitions. It has the merits of a) a convenient expansion in a small parameter z that respects unitarity, b) correct behavior at the BK ( * ) production threshold, c) correct asymptotic behavior for q 2 → ∞, and d) a convenient parametrization at q 2 = 0.
For B → K we have modified the prior of the nuisance parameter f + (0) of the f + form factor parametrization w.r.t our previous analysis but kept the slope parameter b + 1 as is. This change accounts for both LCSR results [35,86] that use the same approach of B-interpolating currents and on-shell K-mesons. As a result the prior on f + (0) is wider and the tension between the SM prediction of the B → K + − branching ratio at q 2 ∈   [51] for the vector form factors V (q 2 ) and A1,12(q 2 ) in B → K * transitions. Note that the update of the results from the arXiv v1 to v2 introduce only minor changes to the mean values and the correlation coefficient for the A12 lattice points.
[1, 6] GeV 2 [36] and the LHCb measurement [15] is reduced. Moreover, the recent lattice predictions [50] of the form factor f + at high q 2 are included in our analysis as part of the likelihood for technical reasons. For this purpose we reproduced lattice predictions at three values of q 2 = 17, 20, 23 GeV 2 as well as their correlation matrix based on the parametrization given in [50], see tab. VII. (The q 2 values and number of points are chosen such that the correlation of neighboring points does not exceed 80%). This constraint is included in the likelihood by means of a multivariate Gaussian. Due to the change of parametrization of the B → K * form factors V, A 1 and A 2 , their three respective nuisance parameters are replaced by the three form-factor normalizations at q 2 = 0 (V (0), A 1,2 (0)) and three slope parameters (b V,A1,2 1 ). The LCSR results of [35] are chosen as the priors for normalizations and slopes. We note that these priors are less precise than the results of [90] due to a novel LCSR setup involving an on-shell B meson and interpolation of the K * final state. Beyond the informative priors we also include two additional constraints on B → K * form factors at q 2 = 0. First, the ratio V (0)/A 1 (0) is constrained in the large energy limit as given by [31] (see also references therein) where the uncertainty has been estimated based on power counting. Second, we make use of the relation where A 0 (0) = 0.29 +0.10 −0.07 is the LCSR result given in [35]. We note that (A.1) and (A.3) represent additional constraints on the nuisance parameters V (0) and A 1,2 (0). The motivation for this treatment is to tighten the constraints on A 1 (0), and avoid unphysical (i.e. negative) values of the form-factor combination A 0 . Similar to our treatment of B → K lattice results, we reproduce lattice predictions [51] for the B → K * form factors V , A 1 and A 12 at two kinematic points q 2 = 15, 19.21 GeV 2 . We obtain mean values, variances, and the correlation coefficients between neighboring points as given in tab. VIII. Note that ref. [51] does not provide information on cross-form factor correlation. We chose the number and spacing of q 2 values so that the correlation of neighboring points does not exceed 60%.
The updated prior of the B s decay constant f Bs , entering the branching ratio of B s → µ + µ − , takes into account recent lattice results, see tab. VI.

Subleading 1/m b
With increasing knowledge of the B → K ( * ) form factors and measurement of optimized -i.e., form factor  insensitive -observables, the treatment of subleading contributions to the amplitudes of both B → K + − and B → K * + − decays has increased in relevance. Especially their analytic q 2 dependence is currently unknown, and their determination is not within the scope of this work. However, we strive to infer the size of contributions that go beyond the known QCDF and low recoil terms. In order to achieve this goal we keep the parametrization of subleading terms as in our previous work, except for the complex phases of the low recoil terms. Inference of these phases is not possible in the absence of data on CP asymmetries in B → K ( * ) + − decays at low recoil [30]. We therefore remove these phases from the analysis. The overall theory uncertainty of our predictions in the region of large recoil differ substantially from those given in [38], due to the different treatment of subleading corrections to the form factor relations and the contributions from cc resonances. We keep the parametrization as in our previous work, and obtain similar uncertainties in predictions of observables as in [47].
We collect the experimental measurements as well as theoretical predictions from the literature and this work in tab. IX for the optimized observables P 4,5,6 . There we give predictions, i.e., before the fit, and postdictions that include experimental information, and proceed for this purpose as described in [30]. The predictions are restricted to the SM(ν-only) scenario and are based on the prior distributions of the nuisance parameters. Besides our nominal Gaussian prior choice with 1σ ranges of ±0.15 (15% at amplitude level) for subleading parameters at large recoil, ζ L(R)χ K * ,K , and also at low recoil, we also show the results for wider 1σ range of ±0.45. We do not include the recent lattice results of B → K * form factors [51].
The central values of our predictions agree within errors with [38], [47], and [49]. At large recoil our theoretical uncertainties are of the same size as in [47] and much smaller than in [38], who treat subleading corrections differently. Choosing wider prior ranges leads to small shifts in the central value and can double the theoretical uncertainty that is still smaller than the one in [38]. At low recoil, subleading corrections are less important and the wider prior ranges practically do not affect the overall uncertainties.
The postdictions are based on the posterior distributions of the fits for each scenario with nominal prior distributions. This includes NP effects in the Wilson coefficients in the scenarios SM and SM+SM . The overall uncertainties of the postdictions are smaller than the uncertainties of the predictions. This can be attributed to the improved knowledge of form factors and subleading nuisance parameters as witnessed by the prior-to-posterior compression in tab. III and fig. 1. The additional NP contributions in SM compared to SM(ν-only) do not change postdictions of P 4,5,6 noticeably. On the other hand, chirality-flipped operators in SM+SM can induce shifts at large and low recoil in P 4,5 . [2] J. Lees      Only pull values for fits with the "full" data set are listed. The single CLEO measurement of B(B → K * γ) has a pull value +0.3σ in both the SM and SM+SM fits. The pull values for the SM(ν-only) fit deviate by less than 0.3σ from those of the SM fit for the "full" data set, with the exception of +2.3σ for the LHCb measurement of P 5 [1, 6] . The pull values in scenario SM+SM with the "full (+FF)" data set deviate by less than 0.4σ from those given for SM+SM "full", except for the LHCb measurements of A with +0.6σ and P 5 [1,6] with +1.2σ. The pull values for SM "full (+FF)" deviate by less then 0.3σ from those of SM "full" except for P 5 [1,6] with a slight increase of +0.4σ.