Comprehensive Bayesian Analysis of Rare (Semi)leptonic and Radiative B Decays

The available data on $|\Delta B| = |\Delta S| = 1$ decays are in good agreement with the Standard Model when permitting subleading power corrections of about 15% at large hadronic recoil. Constraining new-physics effects in $\mathcal{C}_7$, $\mathcal{C}_9$, $\mathcal{C}_{10}$, the data still demand the same size of power corrections as in the Standard Model. In the presence of chirality-flipped operators, all but one of the power corrections reduce substantially. The Bayes factors are in favor of the Standard Model. Using new lattice inputs for $B\to K^*$ form factors, the evidence of models with chirality-flipped operators increases, but does not outperform the Standard Model. We use the data to further constrain the hadronic form factors in $B\to K$ and $B\to K^*$ transitions.

The goal is to measure a large number of observables accessible in the angular analysis of the decay distributions of the three-and four-body final states. With the CP-averaged observables one can test -in a modelindependent fashion -the underlying short-distance couplings of the ∆B = 1 effective theory. CP-asymmetric observables also probe new sources of CP violation beyond the SM. In B → K * (→ Kπ) + − decays, certain combinations of the angular observables, the "optimized observables" [20][21][22][23][24][25][26][27][28], are free of form factors to leading order in the 1/m b expansion and consequently expected to have smaller theoretical uncertainties. The same framework also provides observables that are dominated by ratios of B → K * form factors [22,[28][29][30][31], providing some additional data-driven control over these hadronic quantities.
The 1/m b expansions are important tools for the prediction of exclusive decays. At large hadronic recoil of the K ( * ) meson, QCD factorization (QCDF) yields corrections beyond naive factorization [32,33]. Effects of (qq)-resonances, dominantly from charm, as well as the chromomagnetic dipole operator can be calculated using a light-cone operator product expansion (OPE) in combination with dispersion relations [34][35][36][37][38]. At low hadronic recoil, a local OPE [39,40] can be employed. Its prediction of correlations between different observables can be tested experimentally through measurement of the observables H (1) T and J 7 in B → K * + − [28]. Hadronic form factors are a major source of theoretical uncertainties in the prediction of angular observables, whereas optimized observables are sensitive to higher-order terms in the 1/m b expansions, especially at large recoil. At present, the associated uncertainties due to the unknown 1/m b contributions are estimated based on simple powercounting arguments.
As of 2011, several global analyses of the available data -differing in the degree of sophistication, the statistical approach, and the estimation of theory uncertainties -have been performed [27,30,[41][42][43][44]. Recently, experimental updates from LHCb [5,15,16,45,46] and CDF [14] became available as well as analogous measurements from CMS [6,17] and ATLAS [18]. LHCb is the first experiment to measure optimized observables [19]. Perhaps the most outstanding LHCb result is that the measured value of P 4,5 is in some tension with the SM predictions, stimulating new global fits [47][48][49].
Based on the framework developed in our previous work [30], we perform a global analysis using Bayesian inference, and include a total of 28 nuisance parameters to account for theory uncertainties. Apart from newphysics parameters, our framework allows us to infer also B → K ( * ) form factors and the size of subleading contributions, thereby shedding some light on the origin of tensions between data and SM predictions. Compared to [30], we include the most recent measurements, add additional observables to the fit, and account also for recent lattice calculations of form factors [50,51] and other new theoretical results.
In sec. II, the model-independent framework of ∆B = 1 decays is briefly revisited, and three scenarios of new physics (NP) are introduced. In sec. III, we list the up-arXiv:1310.2478v2 [hep-ph] 13 Feb 2014 dated experimental input. The results of the global analysis are presented in sec. IV: 1) for the most important Wilson coefficients of the SM operator basis C 7,9,10 and their chirality-flipped counterparts assuming them to be real-valued, 2) for the B → K ( * ) form factors in the SM and the two NP scenarios, and 3) the size of subleading contributions. We also compare our results with recent analyses that had access to the same experimental data. In app. A we summarize the theoretical predictions of newly included observables and changes in the treatment of form factors and subleading contributions.

II. MODEL-INDEPENDENT SCENARIOS
For the global analysis of b → s(γ, + − ) data we use a model-independent approach based on the |∆B| = |∆S| = 1 effective theory. The Hamiltonian reads with dimension-six flavor-changing operators O i and their respective short-distance couplings, the Wilson coefficients C i (µ). We evaluate the hadronic matrix elements of the operators at the scale µ = 4.2 GeV of the order of the bottom quark mass m b . We restrict our analysis to the set of operators present in the SM (i = 7, 9, 10) and their chirality-flipped counterparts (i = 7 , 9 , 10 ), denoted by SM . The Wilson coefficients of the fourquark and the chromomagnetic dipole operators are set to their NNLO SM values at µ = 4.2 GeV [52,53]. In principle, the scalar, pseudo-scalar and tensor b → s + − operators can contribute to the angular distributions of B → K * (→ Kπ) + − and B → K + − as discussed in great detail in [28]. However, only a few measurements of sensitive observables are currently available, with rather large uncertainties. We therefore abstain from including these operators into our analysis. Instead, we focus on comparing the SM to several new physics scenarios with regard to their ability to describe the data well.
In the model-independent approach, the SM and SM Wilson coefficients are the parameters of interest; they are assumed real valued and independent a priori. The nuisance parameters ν serve to model theory uncertainties, including CKM parameters, quark masses, and hadronic matrix elements; see tab. V and tab. VI. We introduce the following scenarios of fit parameters SM(ν-only) : 9,10 SM values C 7 ,9 ,10 SM values ν free floating , SM : Expressing vague prior knowledge, we assign a flat prior distribution to the Wilson coefficients. However, each nuisance parameter ν comes with an informative prior as discussed in detail in app. A.
The SM values C 7,9,10 are obtained at NNLO [52,53] and depend on the fundamental parameters of the topquark and W -boson masses, as well as on the sine of the weak mixing angle. For new physics models that fall into one of the scenarios SM and SM+SM , the obtained fit results of the Wilson coefficients can be subsequently used to constrain those models' fundamental parameters after accounting for the renormalization group evolution from the high matching scale down to µ ∼ m b .

III. OBSERVABLES AND EXPERIMENTAL INPUT
In this section we describe changes of the experimental inputs that enter our global analysis with respect to our previous work [30]. We first introduce observables which are newly added to the global analysis and refer the reader for details of their theoretical treatment to app. A. Afterward, we summarize those observables whose measurements have been updated, or for which additional measurements have since become available. In general we employ the full set of observables listed in tab. I except for the last row and denote this as "full". Inclusion of the B → K * lattice points from the last row is denoted as "full (+FF)". For the sake of comparison with [47] we also repeat the analysis with a smaller subset called "selection" as specified in the same table.
In the following, all observables are understood to be CP-averaged unless noted otherwise. The dilepton invariant mass in inclusive and exclusive b → s + − decays is denoted by q 2 throughout.   [1,6] GeV 2 bin as part of the "selection" data set, but not the low recoil bins.

A. New observables
Measurements of the branching ratio of the inclusive radiative decay B → X s γ B 1.8 GeV = (3.36 ± 0.13 ± 0.25) · 10 −4 , [1] (III.1) are included with a lower cut on the photon energy E γ > 1.8 GeV. For the branching ratio of the inclusive semileptonic decay B → X s + − integrated over the lowq 2 region q 2 ∈ [1, 6] GeV 2 , we use A source of parametric uncertainty in inclusive decays arises from matrix elements of dimension-five operators, µ 2 π and µ 2 G , discussed in more detail in app. A 1. They appear in the heavy-quark expansion at order Λ 2 QCD /m 2 b , and µ 2 G enters also the B * -B mass splitting [54] M B * − M B = (4.578 ± 0.035) · 10 −2 GeV , (III.5) incorporated as an additional experimental constraint. Previous experimental angular analyses of B → K * (→ Kπ) + − were restricted to the measurements of the longitudinal K * -polarization fraction, F L , and the lepton forward-backward asymmetry, A FB . The CDF collaboration was the first to measure A (2) T and the CPasymmetry A 9 = A im [55]. Most recently, LHCb extended the angular analysis to measure A re T [16] as well as S 4,5,7,8 and their optimized analogues P 4,5,6,8 [19] in addition to the previously published results on A (2) T , S 3 , A 9 , and S 9 . The original definitions of the observables S i and P i can be found in [56] and [27], respectively. Here we include the measurements of A re T and P 4,5,6 in the q 2bins [1,6], [14. 18,16.0], and [> 16.0] GeV 2 . We replace S 3 data from LHCb by the corresponding A (2) T results.

B. Updated experimental input
We use the same experimental input as in our previous analysis [30], unless the experimental collaborations provide updated measurements. For some of the observables, additional measurement by further experimental collaborations have become available; they are added to the previous ones. Both types of updates are listed below.
The measurement of the time-integrated and CPaveraged branching ratio of the leptonic decay B s → µ + µ − has been recently updated by LHCb and measured with 4.0 σ and 4.3 σ signal significance, respectively. To faithfully model the physical constraint B ≥ 0 and the reported asymmetric uncertainties of the experimental probability distribution (PDF), we use the Amoroso distribution [30]. The LHCb measurement of the B + → K + + − branching ratio (CP-averaged) [15], based on 1 fb −1 integrated luminosity, is used in the q 2 bins [1,6], rather than a partial data set [57]. Very recently, LHCb reported a broad peaking structure in the branching ratio at high q 2 compatible with ψ(4160) [58] using the larger data set of 3 fb −1 . Within the picture of quark-hadron duality, an adapted larger bin [≥ 15] GeV 2 around the peak should satisfy the necessary conditions required by the theoretical framework [39,40] in the future. For the moment, we continue to use the binning provided by experiments and do not enlarge theoretical uncertainties in There are numerous updates on B → K * + − for which we use the three q 2 bins [1,6], [14. 18,16.0], and [> 16.0] GeV 2 . We add recent measurements of the branching ratio from CMS [17] as well as for F L and A FB from CMS [17] and ATLAS [18]. For the branching ratio, F L and A FB we now use the updated values [16] instead of [59] in the case of LHCb, and [14] instead of [55,57] in the case of CDF.
For B 0 → K * 0 (→ Kπ) + − LHCb provides results of "optimized observables". Combining these with both the observables F L and A FB can lead to double counting [25] in the strict limit of vanishing lepton masses that is well justified in the NP scenarios considered in this work 1 . For this reason, we replace the LHCb measurements of A FB [1,6] by A re T [1,6] . However, we continue to use A FB rather than A re T for the low recoil bins because neither is an optimized observable at high q 2 . It is difficult to include the LHCb measurement of A re T [14. 18,16.0] , since it implies that A re T (q 2 ) is constant and attains its theoretical maximum.
Overall, the measured q 2 dependence of individual observables is in quite good agreement with the SM predictions. The largest deviation of 3.7σ is reported for the optimized observable P 5 [4.3,8.68] [19] when compared to the SM prediction [47]. We do not use any of the [4.3, 8.68] GeV 2 bins since their theory predictions receive large contributions from cc-loops [35]. The [1,6] GeV 2 bins are less affected by these effects. The remaining uncertainty is accounted for by the parameters of subleading contributions at the level of the decay amplitudes [30] in our predictions. The SM predictions available in the literature for this bin (and the low-recoil bins) are compared with our results in tab. IX. Based on our prior input, we obtain central values as in [47] deviating by 2.5σ from the measurement whereas the analysis [38] has a different central value and larger errors with a 1.0σ deviation from experiment.
A second interesting deviation appears in the optimized observable P 4 [14. 18,16.0] . In the SM operator basis it is given by a ratio of form factors [31] up to strongly suppressed subleading corrections. The extrapolation of LCSR form-factor results [35] from low to high q 2 yields a much larger value compared to the measurement. This is also observed [49] with recent lattice QCD determinations of B → K * form factors at high q 2 [51]. They allow us to further constrain the form factor parameters a priori. However, these determinations do not reliably consider systematic uncertainties due to finite width effects [51] of the K * . For this reason, we provide our fit results for analyses with and without the B → K * lattice inputs. Note, however, that this issue does not affect the lattice determinations of B → K form factors.
A third deviation from the SM prediction is seen in the preliminary measurements of F L [1,6] from BaBar and ATLAS that are both too low by more than 3σ and 2σ, respectively. This stands in contrast to the published results of Belle, CDF, CMS, and LHCb that are all in good agreement with the SM at low q 2 . The BaBar results are an average of B 0 → K * 0 + − and B + → K * + + − . While the neutral mode yields F L consistent or close to the SM, the charged mode deviates strongly in the low q 2 region and points in principle to a large isospin asymmetry for the longitudinally polarized K * branching fraction [60]. The ATLAS measurement has been performed only for the neutral mode. Although preliminary and despite the isospin average in the case of BaBar, we include both measurements in the fit.

IV. RESULTS
In this section, we review briefly our statistical approach and summarize our fit results in several subsections, providing measures for the goodness of fit. We describe several solutions in the subspace of the Wilson coefficients for all four scenarios introduced in sec. II. Using Bayes factors, we compare models with non-SM Wilson coefficients to the SM(ν-only) fit. Finally, we present results for the nuisance parameters of the form factors and subleading corrections to the B → K * + − transition amplitudes in each of the scenarios. Throughout we will compare with recent similar analyses in the literature.

A. Statistical Approach
Our results are obtained from a Bayesian fit, similar to our previous work [30]. The main outputs are samples drawn from the posterior distribution using the EOS flavor program [61]. The samples are obtained using an algorithm that employs Markov chains, hierarchical clustering and adaptive importance sampling (for a detailed description we refer to [62]).
Throughout we denote by P ( θ |D, M ) the posterior, where D ∈ {full, full (+FF), selection} represents the data set, and θ all parameters (C i and nuisance parameters ν) of the model M ∈ {SM(ν-only), SM, SM+SM , SM+SM (9)} as defined in sec. II. The weighted posterior samples provide access to all marginal distributions and to the evidence where the integration extends over the whole prior volume V 0 spanned by the parameters θ. The likelihood and prior distribution are denoted by P (D| θ, M ) and P 0 ( θ |M ), respectively. For M ∈ {SM, SM+SM }, the posterior has numerous well separated local maxima. Most of these have a negligible impact, and we consider only those solutions with significant posterior mass. We require the ratio R of the local evidence -integration volume V 0 restricted to contain only a single solution -to the global evidence (IV.1) exceeds 0.001. We label the individual solutions as A in the SM(ν-only) scenario, A and B in the SM scenario, and A through D in the SM+SM scenario, whereas in SM+SM (9) only A appears. For each model, A ( ) denotes the solution in which the signature of (C 7 , C 9 , C 10 ) is (−, +, −) as predicted in the SM(ν-only), and B ( ) indicates flipped signs; i.e, (+, −, +).
To determine the goodness of fit, we first find the best-fit point, θ * , in each solution using Minuit [63]. Next, we calculate the pull value as in [30] for fixed M , θ * for each constraint, and finally χ 2 as the quadratic sum of pulls. From χ 2 , a p value follows assuming N dof degrees of freedom. Note that there are N experimental constraints and dim ν informative priors. For the goodness of fit, we consider each informative prior as one constraint. With K Wilson coefficients varied in M , we have degrees of freedom. For the full data set we have dim ν = 28 nuisance parameters, and include N = 93 constraints. In addition there are eleven or five theory constraints on the form factors, depending on whether we include the lattice results of the B → K * form factors or not. Below we denote both setups as "full (+FF)" and "full", respectively. For the selection data set, we have N = 27 inputs -including two theory constraints -and dim ν = 23. For this data set we do not include the lattice form factor results since their theory uncertainty, when extrapolated to low q 2 , is comparable to the uncertainty of the LCSR results.
We calculate the Bayes factor between two statistical models M 1 and M 2 and for a common data set D, The standard quantity to compare two models, the posterior odds, are defined as i.e., the product of the Bayes factor and the prior odds . It is important to note that the prior of the model parameters, P 0 ( θ|M ), is an integral part of the model M . Therefore, the Bayes factor penalizes M 2 versus M 1 if M 2 contains extra parameters because the evidence (IV.1) is just the likelihood weighted by the prior, and the average typically decreases when the same unit probability mass is smeared over a larger volume V 0 . This occurs, e.g., in the present analysis for M 1 = SM(νonly) and M 2 = SM as the Wilson coefficients are fixed in M 1 but variable with flat priors over a large volume covering multiple solutions in M 2 . With the evidence given separately for each solution in tab. II, the reader can, for example, compute the Bayes factor between M 1 and M 2 as though only one of the solutions had been allowed a priori by reducing the (flat) prior ranges of the Wilson coefficients and scaling the evidence accordingly. However, the penalty due to extra parameters can be overcome if M 2 provides a significantly better description of the data; i.e., higher likelihood values. In conclusion, B(D|M 1 , M 2 ) > 1 implies that the data favor M 1 . We stress that the evidence by itself is not meaningful because of the arbitrary likelihood normalization due to the fact that we do not have the actual events seen by an experiment but only a concise summary usually in the form of an observable's value maximizing the likelihood value plus uncertainties. Using a consistent normalization, at least ratios of evidences for identical data D -as in the Bayes factor -have a well defined interpretation.
The fit results regarding model comparison and goodness of fit are listed in tab. II, in which the local evidence is available at a relative precision of 1% on the linear scale. For an overview of all of the following results on the Wilson coefficients, we refer to tab. X. The pull values entering the p value are compiled for each experimental constraint in tab. XI for the solution A of SM and A of the SM+SM .

B. Fitting the Nuisance Parameters
Let us begin the summary of our results with the fit of the scenario SM(ν-only); i.e., the fit of nuisance   The main purpose is to check whether the SM(ν-only), including all theory uncertainties, provides a good description of the available data. This scenario serves as the reference point to compare with scenarios SM, SM+SM (9) and SM+SM later on. Within the SM(ν-only) scenario, we perform three fits to the data sets "full", "full (+FF)", and "selection". All fits exhibit good or satisfactory p values of 0.10, 0.04 and 0.75, respectively, and show that the posterior is unimodal. For the values of the evidence and χ 2 , we refer to tab. II. The large p value for the data set "selection" can be explained through the smaller overall number of measurements, and especially the absence of measurements deviating substantially from their SM predictions, such as ATLAS' and BaBar's results on F L [1,6] . The measurements with the largest pull values above 2σ are all in B → K * + − as can be seen in tab. XI: the aforementioned F L [1,6] from BaBar and ATLAS, B [16,19] from Belle, A FB [16,19] from ATLAS and the two optimized observables P 4 [14. 18,16] and P 5 [1,6] from LHCb (see also the caption of tab. XI).
We note that removing the ATLAS and BaBar measurements of F L [1,6] increases the p value substantially, from 0.10 to 0.38 and from 0.04 to 0.30 for the "full" and "full (+FF)" data sets, respectively. The smaller p value of the "full (+FF)" data set compared to the one of the "full" data set arises dominantly from a tension of the B(B → K * + − ) data at high q 2 with predictions based on B → K * lattice form factors [49,51].
We find that the impact of experimental measurements published after our previous analysis [30] does not change the main outcome; i.e., the data can be accurately described without resort to new physics beyond the SM. This result may appear surprising given the large tensions that were seen in [47]. Within our approach, however, the tension between SM prediction and measurement of P 5 [1,6] can be eased by shifts in the parameters ζ Lχ K * , χ =⊥, , 0 that parametrize the size of subleading contri-butions at large recoil in B → K * + − , see app. A 3 for their definition. The shifts of about −(15-20)% to ζ Lχ K * , χ =⊥, 0 and about +10% to ζ L K * are compatible with the power-counting expectation Λ QCD /m b . They suffice to increase the most likely value of P 5 [1,6] from −0.34 (nominal prior) to −0.23 (see tab. IX), thereby reducing the tension and explaining the B → K * + − "anomaly". The prior and posterior distributions of ζ Lχ K * are shown in fig. 1. We do not find any shifts in the parameters ζ Rχ K * , χ =⊥, , 0 exceeding a few percent. There are no significant differences between the "full" and "full (+FF)" fits, except for two shifts in subleading parameters. At low q 2 , ζ L0 K * reduces from −15% to about −5%, and at high q 2 , Λ changes from 0% to about −5% due to the tension between the data and predictions using lattice B → K * form factors for the branching ratio B (see also the B predictions in [49]). We find also substantial negative subleading contributions to the B → K + − amplitude at low q 2 , which lowers B [1,6] to accommodate the measurement by LHCb. This is in agreement with the observations of [36].
Beyond inference of the size of subleading contributions to the amplitudes, we extract information on the B → K and B → K * hadronic form factors. We confront our results for the various fit scenarios with the prior values in tab. III. Within the "full" data set, the ratio V (0)/A 1 (0) is in overall good agreement with the results labeled "SE2 LCSR" of [31], with less than 1σ tension for our scenarios SM(ν-only) and SM+SM , and 1σ deviation for our scenario SM. Our results for the ratio A 2 (0)/A 1 (0) are consistently higher than those of [31], with a 2σ variance, which can be attributed to our usage of the constraint on A 0 (0), see (A.3). In the case of the "full (+FF)" data set, the prior values of B → K * form factors at q 2 = 0 show two to three times smaller uncertainties when taking into account the lattice results [51], whereas the shape parameters b 1 are not affected except for the form factor A 2 . Although lattice results provide a reduction of the prior uncertainty, there is still agreement with [31] within 1σ for the "full (+FF)" data set. Except for a slight increase of the variance of V (0)/A 1 (0) in scenarios SM and SM+SM , the ratios do not change qualitatively.

C. Fit in the SM basis
Although the SM(ν-only) fit shows that the SM provides a reasonable description of the available data, we still extend our analysis to obtain model-independent constraints on NP couplings. In the SM scenario we fit the real-valued Wilson coefficients C 7,9,10 in addition to the nuisance parameters (see eq. (II.3)) to the data sets "full", "full (+FF)" and "selection". For all data sets we obtain two dominant solutions A and B with SM-like and flipped signs of the Wilson coefficients, and many more solutions with negligible posterior mass.
In the case of the "selection" data set, the p values of 0.88 and 0.89 are obtained for solutions A and B, respectively, depicted in fig. 2. They are larger compared to 0.75 obtained in the SM(ν-only) scenario, indicating that the additional parameters allow to further reduce the tension with the data by ∆χ 2 −5. Within solution A, the fit yields a deviation from the SM value of C 9 with 68% probability; see tab. X. However, we find no significant deviations in either C 7 or C 10 . This observation is compatible with the findings of [47], where the signflipped solution B had been discarded. However, in our results the 2D-marginalized posterior shows merely a 2σ deviation from the SM, in contrast to 3.2σ as in [47]. The ratio of posterior masses is R A : R B = 47% : 53%, slightly in favor of the flipped-sign solution.
As in the case of SM(ν-only), the p values are much smaller compared to the "selection" data, but still indicate a decent fit at 0.08 for both solutions A and B. Contrary to the "selection", solution A is now strongly favored over solution B: R A : R B = 74% : 26%. This underlines the importance of a combined analysis of all available experimental data rather than a selected subset.
As can be seen in fig. 2, the SM lies within the 1σ credibility regions of all 2D-marginalized posterior distributions. With the updated experimental data, the credibility regions are reduced in size by roughly a factor of two when compared to our previous results [30]. For the 1D credibility regions with both solutions A and B we refer to tab. X, whereas for the single solution A we find smaller 68% probability regions of The authors of [48] do not consider a scenario of simultaneous NP contributions to C 7,9,10 , but only single-Wilsoncoefficient scenarios C 7 and C 9 , the two-Wilson-coefficient scenario C 7,9 and the full set of Wilson coefficients of SM+SM . Their results show a decrease of |∆ 7 | once allowing NP contributions to C 9,10 in the ballpark of our findings 2 . The NP contributions ∆ 9 and ∆ 10 are also found to be preferentially negative.
The situation of the P 5 anomaly is the same as in the SM(ν-only) fit, and the modifications to the posterior distributions of ζ L(R)χ K * , χ =⊥, , 0 are of the same type and similar size for both data sets. The same applies to the postdiction P 5 [1,6] given in tab. IX. The pull value of P 5 [1,6] decreases only little from 2.1σ in the SM(ν-only) fit to 1.6σ in the SM fit when allowing NP contributions to C 7,9,10 . However, the tensions in other measurements are not eased, see tab. XI.  Assuming the prior range of each C i , i = 7, 9, 10, were shrunk to one half of the nominal range in the SM scenario such that solution A is still fully contained, the volume changes as V 0 → V 0 /8 and a fit would yield a In the absence of substantial improvements in the handling of subleading contributions to the B → K ( * ) + − amplitudes, we are forced to conclude that the SM interpretation of the data is more economical than the hypothesis of New Physics contributions to the SM Wilson coefficients.
The inclusion of lattice B → K * form factors in the "full (+FF)" analysis shifts the ratio of the probability mass of solutions A and B towards B, R A : R B = 0.70 : 0.30. The p values at the best fit points decrease to 0.03 and 0.02, respectively. Omitting F L from BaBar and ATLAS, the p values jump to 0.25 in A and B. The 1D-marginalized 68% probability regions of the Wilson coefficients in solution A remain the same when compared to the "full" data set, except for the central value of ∆ 10 , which moves even closer to the SM. Finally, the Bayes factor of solution A would be almost unchanged 1 : 107.

D. Fit in the extended SM+SM basis
We proceed with fitting the SM-like and chiralityflipped Wilson coefficients in the SM+SM scenario. Using the "full" data set we obtain a good fit with p values between 0.07 and 0.09 in four well separated solutions A through D , best seen in the 2D-marginalized (C 7 − C 7 ) plane in fig. 3. Here A and B denote solutions that show the same signs of the Wilson coefficients C 7,9,10 of the SM operator basis as the solutions A and B in the previous section, and C and D denote further solutions. The corresponding p values of the "full (+FF)" data set hardly differ, ranging from 0.06 to 0.09. Of all four solutions, A and B dominate over C and D in terms of the posterior mass: R A : R B : R C : R D = 39% : 41% : 5% : 15% .
The influence of the lattice results of the B → K * form factors roughly increases the posterior mass of solution A by 10% at the expense of B , while leaving C and D as is, see table tab. II.
The posterior distributions marginalized to the 2D (C i − C i ) planes (i = 7, 9, 10) are shown in fig. 3 with the SM point and the projection of the best-fit points in each solution A through D . Note that the projection of the best-fit point can deviate from the position of the modes of the marginalized distributions; compare to the 1D intervals in tab. X. Unlike in the SM scenario, it is not possible to disentangle the individual solutions A through D within the 1D marginalized posterior distributions. In order to compare our findings with [48] we choose those intervals that contain the SM-like signs for C 7,9,10 and find with 68% probability ∆ 7 = +0.01 +0.02 −0.05 , ∆ 9 = −0.8 +0.2 −0.5 , ∆ 10 = −0.1 +0.6 −0.5 , in agreement with [48]. Our results hardly change when taking B → K * lattice results into account. The best-fit points for C 7 ,9 ,10 of [48] fall into the intervals given in tab. X, with larger deviations from the modes of the 1D posterior distributions. The SM prediction C SM 7 = −0.01 and C SM 10 = 0 is contained in the smallest 68% region of the 1D marginalized posterior distributions, whereas C SM 9 = 0 is not. In the 2D marginalized (C 9 -C 9 ) plane, the SM point is just outside of the 95% region, dominantly due to a shift in C 9 , see fig. 3.
Focusing on the solution A and dividing the allowed range of each C i by four compared to the range listed in sec. II, the Bayes factor becomes In the SM+SM , the size of subleading contributions to transversity amplitudes χ = 0 ( ) reduces to about −5% (+5%) for ζ Lχ K * , in contrast to ζ L⊥ K * , which remains large, see fig. 1. The solutions A through D are clearly distinct when viewed in the ζ L⊥ K * − C 9 plane. The subleading parameter Λ at high q 2 decreases back to 0% for the "full (+FF)" data set. There are no differences between the "full" and "full (+FF)" data sets. Similarly, the subleading contributions to the B → K + − amplitude disappear. The small shifts we observe between the SM and SM+SM scenarios suggest a common property to ease the tensions between predictions and data, that is shared by the ζ Lχ K * and the chirality flipped Wilson coefficients. It is therefore desirable to better understand size, chirality structure and q 2 -dependence of the power corrections.
In view of the significantly enlarged Bayes factor (IV.8) in the scenario with chirality-flipped operators compared to (IV.7) in the SM, we investigate also the variant SM+SM (9). This scenario has only two additional new physics parameters C 9,9 compared to six in the SM+SM scenario, implying a "smaller punishment" due to the larger prior volume w.r.t. the SM(ν-only) case.  The 2D-marginalized (C 9 − C 9 ) plane in fig. 4 shows the SM point on the border of the 3σ region for the only solution A . The results do not change much when using the "full (+FF)" data set. The 1σ probability regions from the 1D marginalized posterior distributions are which is compatible with the results of [48]. With the same parameter ranges as in (IV.8), the Bayes factor with the "full" data set comparing the SM-like solution A to SM(ν-only) is P (full|SM+SM (9)) P (full|SM(ν-only)) A = 9 : 1 , (IV.10) slightly favoring the SM+SM (9) over the SM. With the "full (+FF)" data set, the Bayes factor increases to P (full (+FF)|SM+SM (9)) P (full (+FF)|SM(ν-only)) A = 933 : 1 . (IV.11)

E. Interpretation
We conclude this section with a remark on the model comparison. We emphasize again that, to derive the posterior odds, the Bayes factor has to be multiplied by the appropriate prior odds (see (IV.4)), the determination of which is beyond the scope of the present work. Given that the standard model successfully describes the vast majority of particle physics data, our prior odds are in strong favor of the SM(ν-only). This is true even for the considered rare decays despite their particular sensitivity to new physics because the majority of standard model predictions are in the ballpark of the other experimental measurements of ∆B = 1, 2 observables.
Currently the lack of experimental evidence for righthanded weak interactions significantly reduces our prior probabilities of scenarios with chirality-flipped Wilson coefficients SM+SM and SM+SM (9) compared to SM(νonly). Among the scenarios with chirality-flipped Wilson coefficients we would set prior odds in favor of SM+SM because SM+SM (9) is only a restricted subset of SM+SM .
It remains to be noted that the SM+SM (9) scenario can be realized in a model-independent way due to operator mixing with b → sff four fermion operators. Such scenarios have been considered previously for f = quarks [64][65][66] as well as f = τ [67] and recently discussed in the present context of the data [68] for f = b. In the case of f = quarks also hadronic charmless decays would be affected [69]. Explicit Z -models have been discussed recently in [70][71][72][73].
We find for nearly all combinations of scenarios and data sets, except the "selection" data set, satisfactory p values of about 5% to 10%. This indicates that it is possible to find a single parameter point (the best-fit point), at which the data is adequately described. The Bayes factor, in contrast, quantifies the ability to describe the data, averaged over the full parameter space. Concerning the SM scenario, the Bayes factors indicates that the additional parameters do not yield an advantage over the SM(ν-only) scenario. However, it is interesting to see that the Bayes factors are undecided or even in favor of the scenarios SM+SM and SM+SM (9) for the "full" data set. Taking lattice results into account, the Bayes factors increase roughly by a factor 100 for the "full (+FF)" data set.
We emphasize that this increase in the latter scenarios for the "full (+FF)" data set can be traced to three causes. First, the prior information based on the lattice results is in line with the information on form factor parameters as inferred from the data, see tab. III. Second, chirality-flipped Wilson coefficients can, at least partially, take on the role of subleading contributions to the B → K ( * ) + − amplitudes. Since changing the Wilson coefficients (with flat priors) is "less costly" than changing the subleading parameters (with Gaussian prior distributions), the Bayes factor penalizes the latter. Third, the goodness of fit decreases for SM(ν-only) while it stays constant for SM+SM as follows from the χ 2 values of tab. II.

V. CONCLUSIONS
Our Bayesian analysis indicates that the standard model provides an adequate description of the available measurements of rare leptonic, semileptonic, and radiative B decays. Compared to our previous analysis [30], we determine the Wilson coefficients C 7,9,10 more accurately, dominantly due to the reduction of the experimental uncertainties in the exclusive decays and the addition of the inclusive decay B → X s γ.
Among the information inferred from the data are constraints on the parameters of the B → K ( * ) form factors. We have performed all fits with and without the very recent lattice B → K * form factor predictions [51]. These form factors constitute very strong additional prior information leading to a tension with branching fraction measurements of B → K * + − at high q 2 [49]. It can be eased with the presence of chirality-flipped Wilson coefficients and currently disfavor new physics without them. Notably, in the fits without the lattice results the data still yields larger posterior form factor values in scenarios with chirality-flipped Wilson coefficients, such that independently of the inclusion of the lattice predictions, the posterior ranges of the Wilson coefficients differ only little in these scenarios.
The rough picture emerging from current data may be summarized as follows. The low-q 2 B → K * + − data prefer a negative new-physics contribution to C 9 [47], which is not supported by B → K + − data unless one allows a positive contribution to C 9 (or alternatively C 10 ) [48]. The additional C 9 would be in conflict with current measurements of the branching fraction of B → K * + − at high q 2 , unless one increases the B → K * form factors A 1,2 by about 15% (see tab. III), which is well within the theoretical uncertainties of their extrapolations of light-cone sum rule predictions from low to high q 2 . Perhaps not accidentally, the central values of recent lattice predictions for these form factors at high q 2 [49] are also higher by the same amount, whereas their theoretical uncertainties are below 10%. Including the lattice predictions, the model comparison suggests that scenarios with chirality-flipped Wilson coefficients can provide an explanation of the data as efficient or even better than the standard model.
A substantial reduction of uncertainties can be expected for LHCb, CMS, and ATLAS measurements of B 0 → K * 0 + − and B + → K + + − once they publish the analysis of their 2012 data sets. It should also be mentioned that B → K * γ and B → K ( * ) + − results from Belle are not based on the final reprocessed data set and that BaBar's angular analysis of B → K * + − is still preliminary. It remains to be seen whether these improved analyses further substantiate the present hints of a 2σ deviation from the SM prediction in the (C 9 -C 9 ) plane.
In our opinion, however, there remain two major challenges on the theory side. The first is to improve our analytic knowledge of the 1/m b corrections to the exclusive decay amplitudes. The second is to reduce the uncertainty from hadronic form factors, especially at low q 2 . Without improvements on either, there is little prospect to discern between small NP effects and large subleading corrections. Another point of concern are potentially large duality-violating effects that render the OPE at high q 2 invalid. They have been estimated, though model-dependently, to be small [40]. In this regard, the experimental verification of certain relations [28] among angular observables in B → K * + − that are predicted by the OPE would be very desirable. In the case that some of these relations are not fulfilled, the analysis of the breaking pattern can provide information on duality violation but also on additional new-physics scalar and tensor interactions. ACKNOWLEDGMENTS D.v.D is grateful to Thorsten Feldmann, Tobias Huber, Soumitra Nandi and the late Nikolai Uraltsev for useful discussions and suggestions. C.B. thanks David Straub for discussions. We acknowledge the support by the DFG Cluster of Excellence "Origin and Structure of the Universe". The analyses have been carried out on the computing facilities of the Computational Center for Particle and Astrophysics (C2PAP) and on the HORUS cluster at Siegen University to perform the fits. We further wish to thank Andrzej Buras, Thorsten Feldmann, Christian Hambrock, Gudrun Hiller, Alexander Khodjamirian and Stefan Schacht for comments on the manuscript. This work is supported by the Deutsche Forschungsgemeinschaft (DFG) within research unit FOR 1873 ("QFET"). C.B. received partial support from the ERC Advanced Grant project "FLAVOUR" (267104).

Appendix A: Theoretical Treatment
This appendix details the theoretical predictions for newly added observables, and their respective nuisance parameters are discussed. Further we explain changes to the choice of priors for some of the nuisance parameters, as well as a different parametrization of B → K * form factors, always with respect to our previous analysis [30]. We also list the updated values of input parameters whose uncertainty is neglected in tab. IV.
Concerning the set of common nuisance parameters -the Wolfenstein parameters of the CKM-quark-mixing matrix and the MS bottom-and charm-quark masses entering the majority of observables -we use the updated values given in tab. V based on the most recent PDG combinations [54] and the tree-level result of the UTfit collaboration [74].
We model asymmetric uncertainty intervals for the priors with LogGamma distributions (see [30] for details), while symmetric intervals are implemented as Gaussian priors. The branching ratio of the inclusive decay B → X s γ, B(B → X s γ) ∼ |C 7 | 2 + |C 7 | 2 , represents the most stringent constraint on the magnitude of the dipole Wilson coefficients C 7,7 . In our analysis we include the known corrections to next-to-leading order in α s [77,78] as well as α s Λ 2 QCD /m 2 b corrections [79]. Contrary to the common normalization to the semi-leptonic inclusive decay B → X c ν, we express the branching ratio in terms of  the averaged B meson life time for a 50:50 production ratio of B + B − to B 0B0 pairs at the Υ(4S), τ B +/0 given in tab. IV, and the bottom-quark pole mass. In order to avoid renormalon ambiguities we calculate the pole mass value from the MS mass m b (m b ) using the 3-loop result [80]. The MS mass is part of our set of common nuisance parameters and its uncertainty dominates the overall theory uncertainty of inclusive decays. At order Λ 2 QCD /m 2 b in the heavy quark expansion hadronic matrix elements of two dimension-five operators enter. They are parametrized in terms of µ 2 π and µ 2 G for the expectation values of the kinetic and the chromomagnetic operators, respectively. The parameters µ 2 π and µ 2 G enter the fit with priors according to [81], which are listed in tab. VI. The correlation between µ 2 G and the b-quark mass is accounted for by the B * -B mass splitting. When confronting the theoretical prediction of the mass splitting with the measurement (III.5), we include also the effect of dimension-6 operators as described in [81]. Our  SM prediction B(B → X s γ) = (3.14 +0.22 −0.19 ) · 10 −4 is in good agreement with the NNLO result [82]. The theory uncertainty of our prediction is determined from variation of the Wolfenstein parameters, m b (m b ), m c (m c ), µ 2 π , and µ 2 G . For the prediction of the branching ratio of the inclusive decay B → X s + − we work at NNLO in QCD and NLO in QED, including also Λ 2 QCD /m 2 b subleading corrections, as described in [52,53] except for the SM-SM interference terms. We adopt the same normalization in terms of τ B +/0 and bottom-quark pole mass as described for B → X s γ. The chirality-flipped operators are included following [83] and NLO QCD corrections to matrix elements are accounted for in case they can be derived easily within the SM operator basis. The overall theory uncertainty is determined as for B → X s γ from the variation of the same nuisance parameters. We obtain as the SM prediction B(B → X s + − ) [1,6] = (1.4 ± 0.1) · 10 −6 .

Form factors and decay constants
The most important change in the treatment of form factors is the consistent use of the parametrization as in [35], for both B → K and B → K * transitions. It has the merits of a) a convenient expansion in a small parameter z that respects unitarity, b) correct behavior at the BK ( * ) production threshold, c) correct asymptotic behavior for q 2 → ∞, and d) a convenient parametrization at q 2 = 0.
For B → K we have modified the prior of the nuisance parameter f + (0) of the f + form factor parametrization w.r.t our previous analysis but kept the slope parameter b + 1 as is. This change accounts for both LCSR results [35,84] that use the same approach of B-interpolating currents and on-shell K-mesons. As a result the prior on f + (0) is wider and the tension between the SM prediction of the B → K + − branching ratio at q 2 ∈ [1, 6] GeV 2 [36] and the LHCb measurement [15] is reduced. Moreover, the recent lattice predictions [50] of the form factor f + at high q 2 are included in our analysis as part of the likelihood for technical reasons. For this purpose we reproduced lattice predictions at three values of q 2 = 17, 20, 23 GeV 2 as well as their correlation matrix based on the parametrization given in [50], see tab. VII. (The q 2 values and number of points are chosen such that the correlation of neighboring points does not exceed 80%). This constraint is included in the likelihood by means of a multivariate Gaussian.
Due to the change of parametrization of the B → K * form factors V, A 1 and A 2 , their three respective nuisance parameters are replaced by the three form factor normalizations at q 2 = 0 (V (0), A 1,2 (0)) and three slope parameters (b V,A1, 2 1 ). The LCSR results of [35] are chosen as the priors for normalizations and slopes. We note that these priors are less precise than the results of [88] due to a novel LCSR setup involving an on-shell B me-son and interpolation of the K * final state. Beyond the informative priors we also include two additional constraints on B → K * form factors at q 2 = 0. First, the ratio V (0)/A 1 (0) is constrained in the large energy limit as given by [31] (see also references therein) where the uncertainty has been estimated based on power counting. Second, we make use of the relation is the LCSR result given in [35]. We note that (A.1) and (A.3) represent additional constraints on the nuisance parameters V (0) and A 1,2 (0). The motivation for this treatment is to tighten the constraints on A 1 (0), and avoid unphysical (i.e. negative) values of the form-factor combination A 0 . Similar to our treatment of B → K lattice results, we reproduce lattice predictions [51] for the B → K * form factors V , A 1 and A 12 at two kinematic points q 2 = 15, 19.21 GeV 2 . We obtain mean values, variances, and the correlation coefficients between neighboring points as given in tab. VIII. Note that ref. [51] does not provide information on cross-form factor correlation. We chose the number and spacing of q 2 values so that the correlation of neighboring points does not exceed 60%.
The updated prior of the B s decay constant f Bs , entering the branching ratio of B s → µ + µ − , takes into account recent lattice results, see tab. VI.

Subleading 1/m b
With increasing knowledge of the B → K ( * ) form factors and measurement of optimized -i.e., form factor insensitive -observables, the treatment of subleading contributions to the amplitudes of both B → K + − and B → K * + − decays has increased in relevance. Especially their analytic q 2 dependence is currently unknown, and their determination is not within the scope of this work. However, we strive to infer the size of contributions that go beyond the known QCDF and low recoil terms. In order to achieve this goal we keep the parametrization of subleading terms as in our previous work, except for the complex phases of the low recoil terms. Inference of these phases is not possible in the absence of data on CP asymmetries in B → K ( * ) + − decays at low recoil [30]. We therefore remove these phases from the analysis.
The overall theory uncertainty of our predictions in the region of large recoil differ substantially from those given in [38], due to the different treatment of subleading corrections to the form factor relations and the contributions from cc resonances. We keep the parametrization as in our previous work, and obtain similar uncertainties in predictions of observables as in [47].
We collect the experimental measurements as well as theoretical predictions from the literature and this work in tab. IX for the optimized observables P 4,5,6 . There we give predictions, i.e., before the fit, and postdictions that include experimental information, and proceed for this purpose as described in [30]. The predictions are restricted to the SM(ν-only) scenario and are based on the prior distributions of the nuisance parameters. Besides our nominal Gaussian prior choice with 1σ ranges of ±0.15 (15% at amplitude level) for subleading parameters at large recoil, ζ L(R)χ K * ,K , and also at low recoil, we also show the results for wider 1σ range of ±0.45. We do not include the recent lattice results of B → K * form factors [51].
The central values of our predictions agree within errors with [38], [47], and [49]. At large recoil our theoretical uncertainties are of the same size as in [47] and much smaller than in [38], who treat subleading corrections differently. Choosing wider prior ranges leads to small shifts in the central value and can double the theoretical uncertainty that is still smaller than the one in [38]. At low recoil, subleading corrections are less important and the wider prior ranges practically do not affect the overall uncertainties.
The postdictions are based on the posterior distributions of the fits for each scenario with nominal prior distributions. This includes NP effects in the Wilson coefficients in the scenarios SM and SM+SM . The overall uncertainties of the postdictions are smaller than the uncertainties of the predictions. This can be attributed to the improved knowledge of form factors and subleading nuisance parameters as witnessed by the prior-to-posterior compression in tab. III and fig. 1. The additional NP contributions in SM compared to SM(ν-only) do not change postdictions of P 4,5,6 noticeably. On the other hand, chirality-flipped operators in SM+SM can induce small shifts only at low recoil.      Only pull values for fits with the "full" data set are listed. The single CLEO measurement of B(B → K * γ) has a pull value +0.3σ in both the SM and SM+SM fits. The pull values for the SM(ν-only) fit are very similar to the SM fit for the "full" data set, with the exception of +2.1σ for the LHCb measurement of P 5 [1,6] . The pull values for SM+SM with the "full (+FF)" data set deviate by less than 0.2 from those given for SM+SM "full". The pull values for SM "full (+FF)" also agree with those of SM "full" except for notable deviations of ∼ 0.6σ for the high q 2 bins of the B → K * + − branching ratio.