New physics in $b\to s$ transitions after LHC run 1

We present results of global fits of all relevant experimental data on rare $b \to s$ decays. We observe significant tensions between the Standard Model predictions and the data. After critically reviewing the possible sources of theoretical uncertainties, we find that within the Standard Model, the tensions could be explained if there are unaccounted hadronic effects much larger than our estimates. Assuming hadronic uncertainties are estimated in a sufficiently conservative way, we discuss the implications of the experimental results on new physics, both model independently as well as in the context of the minimal supersymmetric standard model and models with flavour-changing $Z'$ bosons. We discuss in detail the violation of lepton flavour universality as hinted by the current data and make predictions for additional lepton flavour universality tests that can be performed in the future. We find that the ratio of the forward-backward asymmetries in $B \to K^* \mu^+\mu^-$ and $B \to K^* e^+e^-$ at low dilepton invariant mass is a particularly sensitive probe of lepton flavour universality and allows to distinguish between different new physics scenarios that give the best description of the current data.


Introduction
Rare decays based on the flavour-changing neutral current b → s transition are sensitive probes of physics beyond the Standard Model (SM). In recent years, a plethora of observables, including branching ratios, CP and angular asymmetries in inclusive and exclusive B decay modes, has been measured at the B factories and at LHC experiments. This wealth of data allows to investigate the helicity structure of flavour-changing interactions as well as possible new sources of CP violation.
In 2013, the observation by LHCb of a tension with the SM in B → K * µ + µ − angular observables [1] has received considerable attention from theorists and it was shown that the tension could be softened by assuming the presence of new physics (NP) [2][3][4][5]. In 2014, another tension with the SM has been observed by LHCb, namely a suppression of the ratio R K of B → Kµ + µ − and B → Ke + e − branching ratios at low dilepton invariant mass [6]. Assuming new physics in B → Kµ + µ − only, a consistent description of these anomalies seems possible [7][8][9][10]. Finally, also branching ratio measurements of B → K * µ + µ − and B s → φµ + µ − decays published recently [11,12] seem to be too low compared to the SM predictions when using state-of-the art form factors from lattice QCD or light-cone sum rules (LCSR) [13][14][15][16].
While the ratio R K is theoretically extremely clean, predicted to be 1 to an excellent accuracy in the SM [17], the other observables mentioned are plagued by sizable hadronic uncertainties. On the one hand, they require the knowledge of the QCD form factors; on the other hand, even if the form factors were known exactly, there would be uncertainties from contributions of the hadronic weak Hamiltonian that violate quark-hadron duality and/or break QCD factorization. These two sources of theoretical uncertainty have been discussed intensively in the recent literature [16,18,19] (see also the earlier work [20][21][22][23]).
Understanding how large these hadronic effects could be is crucial to disentangle potential new physics effects from underestimated non-perturbative QCD effects, if significant tensions from the SM expectations are observed in the data. The main aim of our present analysis is thus to perform a global analysis of all relevant experimental data to answer the following questions, 1. Is there a significant tension with SM expectations in the current data on b → s transitions?
2. Assuming the absence of NP, which QCD effects could have been underestimated and how large would they have to be to bring the data into agreement with predictions, assuming they are wholly responsible for an apparent tension?
3. Assuming the QCD uncertainties to be estimated sufficiently conservatively, what do the observations imply for NP, both model-independently and in specific NP models?
Our work builds on our previous global analyses of NP in b → s transitions [3,24,25], but we have built up our analysis chain from scratch to incorporate a host of improvements. Compared to our previous analyses and to comparable recent studies in the literature [2,4,5,8,9], the novel features of our analysis are as follows.
• In our global fits, we take into account all the correlations of theoretical uncertainties between different observables and between different bins. This has become crucial to assess the global significance of any tension, as the experimental data are performed in more and more observables in finer and finer bins.
• We assess the impact of different choices for the estimates of theoretical uncertainties on the preferred values for the Wilson coefficients.
• We use the information on B → K * and B s → φ form factors from the most precise LCSR calculation [13,16], taking into account all the correlations between the uncertainties of different form factors and at different q 2 values. This is particularly important to estimate the uncertainties in angular observables that involve ratios of form factors.
Our paper is organized as follows. In section 2, we define the effective Hamiltonian and discuss the most important experimental observables, detailing our treatment of theoretical uncertainties. In section 3, we perform the numerical analysis. We start by investigating which sources of theoretical uncertainties, if underestimated, could account for the tension even within the SM. We then proceed with a model-independent analysis beyond the SM, studying the allowed regions for the NP Wilson coefficients. In section 4, we discuss what the model-independent findings imply for the Minimal Supersymmetric Standard Model as well as for models with a new heavy neutral gauge boson. We summarize and conclude in section 5. Several appendices contain all our SM predictions for the observables of interest, details on our treatment of form factors, and plots of constraints on Wilson coefficients.

Effective Hamiltonian
The effective Hamiltonian for b → s transitions can be written as and we consider NP effects in the following set of dimension-6 operators, Of the complete set of dimension-6 operators invariant under the strong and electromagnetic gauge groups, this set does not include • Four-quark operators (including current-current, QCD penguin, and electroweak penguin operators). These operators only contribute to the observables considered in this analysis through mixing into the operators listed above and through higher order corrections.
Moreover, at low energies they are typically dominated by SM contributions. Consequently, we expect the impact of NP contributions to these operators on the observables of interested to be negligible. 1 • Chromomagnetic dipole operators. In the radiative and semi-leptonic decays we consider, their Wilson coefficients enter at leading order only through mixing with the electromagnetic dipoles and thus enter in a fixed linear combination, making their discussion redundant.
• Tensor operators. Our rationale for not considering these operators is that they do not appear in the dimension-6 operator product expansion of the Standard Model [27][28][29]. Consequently, they are expected to receive only small NP contributions unless the scale of new physics is very close to the electroweak scale, which is in tension with the absence of new light particles at the LHC.
• Scalar operators of the form (sP A b)(¯ P B ). The operators with AB = LL or RR do not appear in the dimension-6 operator product expansion of the Standard Model either. While the ones with AB = LR and RL do appear at dimension 6, their effects in semi-leptonic decays are completely negligible once constraints from B s → µ + µ − are imposed [29].

Observables
The differential decay distribution of B → Kµ + µ − in terms of the dimuon invariant mass squared q 2 and the angle between the K and µ − gives access to two angular observables, the so-called flat term F H and the forward-backward asymmetry A FB , in addition to the differential decay rate (or branching ratio). The observables A FB and F H only deviate significantly from zero in the presence of scalar or tensor operators [17]. Due to the argument given above, we do not consider NP contributions to these operators in semi-leptonic decays. While the direct CP asymmetry has been measured recently as well [30], we do not include it in our analysis since it is suppressed by small strong phases and therefore does not provide constraints on new physics at the current level of experimental accuracy. Consequently, the only observable we need to consider is the (CP-averaged) differential branching ratio of the charged B decay, and analogously for the neutral B decay.

Theoretical uncertainties
The theoretical analysis of the B → Kµ + µ − observables is complicated not only by the need to know the B → K form factors, but also by the fact that the "naive" factorization of the amplitude into a hadronic and a leptonic part is violated by contributions from the hadronic weak Hamiltonian, connecting to the lepton pair through a photon. Concretely, in the limit of vanishing lepton mass 2 , the decay rate can be written as where λ(a, b, c) = a 2 + b 2 + c 2 − 2(ab + bc + ac) , Here, f + and f T are the full QCD form factors and h K includes the non-factorizable contributions from the weak effective Hamiltonian. An additional form factor, f 0 , enters terms that are suppressed by the lepton mass. We now discuss our treatment of these quantities, which represent the main source of theoretical uncertainties in the B → Kµ + µ − observables.
For the form factors, we perform a combined fit of the recent lattice computation by the HPQCD collaboration [31], valid at large q 2 , and form factor values at q 2 = 0 obtained from light-cone sum rules (LCSR) [32,33], to a simplified series expansion. Details of the fit are discussed in appendix A. The results are 3-parameter (4-parameter) fit expressions for the form factors f +,T (f 0 ) as well as the full 10 × 10 covariance matrix. We retain the correlations among these uncertainties throughout our numerical analysis.
Concerning h K (q 2 ), we emphasize the following contributions.
• Virtual corrections to the matrix elements of the four-quark operators O 1 and O 2 . We include them to NNLL accuracy using the results of ref. [34].
• Contributions from weak annihilation and hard spectator scattering. These have been estimated in QCD factorization to be below a percent [33] and we neglect them.
• Soft gluon corrections to the virtual charm quark loop at low q 2 . This effect was computed recently in LCSR with B meson distribution amplitudes in ref. [20] and was found to be "unimportant at least up to q 2 ∼ 5 − 6 GeV 2 ." (See also [22]).
• Violation of quark-hadron duality at high q 2 , above the open charm threshold, due to the presence of broad charmonium resonances. Employing an OPE in inverse powers of the dilepton invariant mass, this effect has been found to be under control at a few percent in ref. [21].
Concerning the last two items, the uncertainties due to these effects have to be estimated in a consistent and conservative manner to draw robust conclusions about the compatibility of experimental measurements with the SM predictions. We do this by parametrizing our ignorance of sub-leading corrections to h K in the following way, where we used the leading contribution to the amplitude F V as an overall normalization factor. To obtain the theory uncertainties, we vary the strong phases φ a,b,c within (−π, π]. At low q 2 , since the main contribution is expected to come from the soft gluon correction to the charm loop, we vary a within [0, 0.02] and b within [0, 0.05]. In this way, the central value of the effect discussed in [20,22] is contained within our 1σ error band. Although (10) is just a very crude parametrization of the (unknown) q 2 dependence at low q 2 , we believe it is sufficiently general at the current level of experimental precision. At high q 2 , the presence of broad charmonium resonances means that h K (q 2 ) varies strongly with q 2 , but since we will only consider observables integrated over the whole high-q 2 region, we can ignore this fact and the parameter c simply parametrizes the violation of the OPE result. We estimate it by varying c within [0, 0.05], which corresponds to an uncertainty on the rate more than twice the uncertainty quoted in [21], to be conservative. In section 3.2, we will also discuss the consequences of increasing the ranges for these parameters.
The angular decay distribution ofB 0 →K * 0 µ + µ − contains in general 12 angular coefficient functions. In the presence of CP violation, the 12 angular coefficients of the CP-conjugate decay B 0 → K * 0 µ + µ − represent another 12 independent observables [35]. However, since scalar contributions are negligible in our setup and one can neglect the muon mass to a good approximation, there are only 9 independent observables in each decay. Moreover, the absence of large strong phases implies that several of the observables are hardly sensitive to new physics. In practice, the observables that are sensitive to new physics are • the CP-averaged differential branching ratio dBR/dq 2 , • the CP-averaged K * longitudinal polarization fraction F L and forward-backward asymmetry A FB , • the CP-averaged angular observables S 3,4,5 , • the T-odd CP-asymmetries A 7,8,9 .
All of these observables can be expressed in terms of angular coefficients and are functions of q 2 . Alternative bases have been considered in the literature (see e.g. [36][37][38][39][40]). Choosing different normalizations can reduce the sensitivity of the observables to the hadronic form factors, at least in the heavy quark limit and for naive factorization. In our analysis, the choice of basis is irrelevant for the impact of hadronic uncertainties, as we consistently take into account all the correlations between theoretical uncertainties. The only dependence on the choice of basis is then due to unknown experimental correlations. Where available, we also take correlations of experimental uncertainties into account.
In the case of B → K * γ, we consider the following observables: the branching ratio of B ± → K * ± γ, the branching ratio of B 0 → K * 0 γ, the direct CP asymmetry A CP and the mixing-induced CP asymmetry S K * γ in B 0 → K * 0 γ. Since we take all known correlations between the observables into account in our numerical analysis, including the branching ratios of the charged and neutral B decays is to a very good approximation equivalent to including one of these branching ratios and the isospin asymmetry.

Theoretical uncertainties
Similarly to the B → Kµ + µ − decay, the main challenges of B → K * µ + µ − are the form factors and the contributions of the hadronic weak Hamiltonian.
For the form factors, we use the preliminary results of a a combined fit [16] to a LCSR calculation of the full set of seven form factors [13] with correlated uncertainties as well as lattice results for these form factors [14]. This leads to strongly reduced uncertainties in angular observables.
The non-factorizable contributions from the hadronic weak Hamiltonian are more involved in B → K * µ + µ − compared to B → Kµ + µ − for several reasons. First, it contributes to three helicity amplitudes instead of just one; Second, the presence of the photon pole at q 2 = 0 enhances several of the contributions at low q 2 ; Third, since we do not only consider branching ratios but also a host of angular observables where form factor uncertainties partly cancel, we require a higher theoretical accuracy in the h λ . Concretely, we include the following contributions.
• The NNLL contributions to the matrix elements of O 1,2 as in the case of B → Kµ + µ − .
• At low q 2 , weak annihilation beyond the heavy quark limit as obtained from LCSR [43].
• At low q 2 , contributions from the matrix element of the chromomagnetic operator as obtained from LCSR [44].
As in B → Kµ + µ − , there are additional, sub-leading contributions, such as the soft gluon corrections to the charm loop [18,20,22,45]. We parametrize them at low q 2 by a correction relative to the leading contribution to the helicity amplitudes proportional to C eff 7 , The parameters a λ and b λ are allowed to be different for each of the three helicity amplitudes, λ = +, −, 0. We vary the a λ and b λ in the following ranges, Again, with this choice the effect discussed in [20,22] is within our 1σ uncertainty band. Although the normalization of the correction is arbitrary and could have also been written as a relative correction to C 9 , we choose C 7 as normalization in B → K * µ + µ − since the leading contribution proportional to C 9 vanishes at q 2 = 0 and does not contribute to B → K * γ. It is due to this choice that we need to allow for larger a 0 , b 0 since the C eff 7 contribution is not enhanced in the λ = 0 amplitude.
At high q 2 , as in the case of B → Kµ + µ − , we do not have to consider a q 2 dependent correction as we are only considering observables integrated over the full high q 2 region. Analogous to B → Kµ + µ − , we parametrize the sub-leading uncertainties by a relative correction to C 9 . To be conservative, we allow it to be up to 7.5% in magnitude, independently for the three helicity amplitudes, with an arbitrary strong phase.

Direct CP asymmetry in B → K * γ
While direct CP asymmetries in the B decays considered by us are suppressed by small strong phases and so typically do not lead to strong constraints on NP, the direct CP asymmetry in B → K * γ is a special case since the measurements by the B factories and LHCb are so precise that this suppression could be overcome. The world average reads 3 Allowing for general NP contributions in C 7 , we find the following central value for the asymmetry, where we have neglected contributions from NP in C 7 and C 8 . We observe that the experimental bound (13) can constrain an imaginary part of the Wilson coefficient C 7 at the m b scale at the level of 0.1, which is still allowed by all other measurements as we will see. The problem with using this observable as a constraint on NP is that it is proportional to a strong phase that appears only at sub-leading order and is afflicted with a considerable uncertainty. With our error treatment described above, we find an overall relative uncertainty of 20% in the presence of a large imaginary C 7 . However, to be conservative, we will not include A CP (B 0 → K * 0 γ) in our global fits, but we will discuss the impact of including it separately in section 3.3.

2.4.
The decay B s → φµ + µ − is similar to the B → K * µ + µ − decay and our treatment of theory uncertainties is analogous. In particular, we use the form factors and their correlated uncertainties that have been obtained in [16] from the combined fit of lattice and LCSR results. The sub-leading non-factorizable corrections are parametrized as in the case of B → K * µ + µ − , and the coefficients a λ , b λ and c λ are varied in the same ranges. We assume the uncertainty in these coefficients to be 90% correlated between B s → φµ + µ − and B → K * µ + µ − since we do not see a physical reason why they should be drastically different.
An important difference with respect to B → K * µ + µ − is that the B s → φµ + µ − decay is not self-tagging. Therefore, the only observables among the ones mentioned at the beginning of section 2.3.1 that are experimentally accessible in a straightforward way at a hadron collider are • the differential branching ratio dBR/dq 2 , • the CP-averaged angular observables F L and S 4 , • the angular CP asymmetry A 9 .
An additional novelty is the impact of the sizable B s width difference. As shown in [16], this effect is small in the SM and we have checked that it is also negligible in the presence of NP at the current level of experimental precision, unless the Wilson coefficients assume extreme values that are already excluded by other constraints.

Fit methodology
More and more experimental data on b → sµ + µ − transitions becomes available and many observables are measured with a fine binning. Therefore, in order to determine the values of the Wilson coefficients preferred by the data it becomes more and more important to include the correlation of theoretical uncertainties between different observables as well as between different bins of the same observable. One possibility to achieve this is to perform a global Bayesian analysis where all the uncertainties are parametrized by nuisance parameters that are marginalized over by sophisticated numerical tools like Markov Chain Monte Carlos. This approach has been applied recently e.g. in [4]. A drawback of this approach is that it is timeconsuming and the computing time increases with the number of parameters. Here, we follow a different approach. We construct a χ 2 function that only depends on the Wilson coefficients and take into account the theoretical and experimental uncertainties in terms of covariance matrices, Here, O exp are the experimentally measured central values of all observables of interest, O th are the corresponding theory predictions that depend on the (NP contributions to the) Wilson coefficients, C exp is the covariance matrix of the experimental measurements and C th is the covariance matrix of the theory predictions that contains the theory uncertainties and their correlations. In writing (15), we have made two main approximations. First, we have assumed all the experimental and theoretical uncertainties to be Gaussian. Second, we have neglected the dependence of the theory uncertainties on the new physics contributions to the Wilson coefficients. This means that the theory uncertainties and their correlations have been evaluated for the Wilson coefficients fixed to their SM values. We believe that this assumption is well justified in view of the fact that no drastic deviations from the SM expectations have been observed so far. The only possible exception are observables that vanish in the SM but could receive NP contributions much larger than the current experimental bounds. As we will discuss below, the only such observable at present is the direct CP asymmetry in B → K * γ. We determine C th by evaluating all observables of interest for a large set of the parameters parametrizing the theory uncertainties, randomly distributed according to the uncertainties and correlations described above. In this way, we retain not only correlated uncertainties between different observables, but also between different bins of the same observable. We find these correlations to have a large impact on our numerical results. Concerning C exp , we symmetrize the experimental error bars and use the experimental correlations when available. Where they are not available, we include a rough guess of the correlations by assuming the statistical uncertainties to be uncorrelated and the systematic uncertainties to be fully correlated for measurements of the same observable by a single experiment. We have checked that the treatment of experimental correlations has only a small impact on the overall fit at the current level of experimental accuracy.
We do not include the additional results on b → s transitions from BaBar [58,59] and Belle [60,61], as they are only available as an average of µ + µ − and e + e − modes. As already mentioned in section 2, in the fit we do not explicitly include isospin asymmetries, but instead use results on the charged and neutral modes separately. As we take into account all known error correlations, this approach is essentially equivalent. We would like to stress that for none of the observables, we use low q 2 bins that extend into the region above the perturbative charm threshold q 2 > 6 GeV, where hadronic uncertainties cannot be estimated reliably. This applies in particular to the bin [4.3, 8.68] GeV 2 that has been used in several fits in the past [2,5,9]. For the B 0 → K * 0 µ + µ − observables at low q 2 , we choose the smallest available bins satisfying this constraint, since they are most sensitive to the non-trivial q 2 dependence of the angular observables. For B s → φµ + µ − , we use the [1, 6] GeV 2 bin, since the branching ratio does not vary strongly with q 2 and since the statistics is limited. In the high q 2 region, we always consider the largest q 2 bins available that extend to values close to the kinematical end point. All the experimental measurements used in our global fits are listed in appendix B along with their theory predictions.

Compatibility of the data with the SM
Evaluating (15) with the Wilson coefficients fixed to their SM values, we obtain the total χ 2 of the SM. Including both b → sµ + µ − and b → se + e − observables, we find χ 2 SM ≡ χ 2 ( 0) = 106.1 for 81 independent measurements. This corresponds to a p-value of 3.6%. Including only b → sµ + µ − observables, we find χ 2 SM = 97.2 for 78 independent measurements, corresponding to a p-value of 6.9%. In table 1, we list the observables with the largest deviation from the SM expectation. The full list of observables entering the χ 2 , together with the SM predictions and experimental measurements, is given in appendix B. We note that some of these observables have strongly correlated uncertainties and that for two of the observables, A FB and F L , there is some tension between different experiments. Still, there does seem to be a systematic suppression of branching ratios in different decay modes and we will see in section 3.3 that the quality of the fit can be improved substantially in the presence of new physics. An important questions is whether these tensions could be due to underestimated theory uncertainties and we will investigate this question in the following paragraphs. It should be kept in mind that none of these sources of uncertainties can account for violation of lepton flavour universality.

Underestimated hadronic effects?
We will see in section 3.3 that the agreement of the theory predictions with the experimental data is improved considerably assuming non-standard values for the Wilson coefficient C 9 . Decay obs. q 2 bin SM pred. measurement pull BaBar −1.8 Table 1: Observables where a single measurement deviates from the SM by 1.8σ or more. The full list of observables is given in appendix B.
Since this coefficient corresponds to a left-handed quark current and a leptonic vector current, it is conceivable that a NP effect in C 9 is mimicked by a hadronic SM effect that couples to the lepton current via a virtual photon, e.g. the charm loop effects at low q 2 and the resonance effects at high q 2 as discussed in section 2 (see e.g. [18]). In our numerical analysis, in addition to the known non-factorizable contributions taken into account as described in section 2, subleading effects of this type are parametrized by the parameters a i , b i , c i in (10), (11), and analogously for B s → φµ + µ − . Since they parametrize unknown sub-leading uncertainties, the central values of these parameters are 0 in our SM predictions. Any underestimation of a non-perturbative QCD effect (not related to form factors) should then manifest itself as a drastic reduction of the χ 2 for a sizable value of one of the parameters, when treating them as completely free. To investigate this question, we have constructed a χ 2 function analogous to (15), but writing the central values O th as functions of the parameters a i , b i , c i instead of the Wilson coefficients.
In fig. 1, we show the reduction of the χ 2 compared to our SM central value under variation of pairs of these parameters, while treating two of them at a time as free parameters and fixing all the others to 0. We show the cases of varying the coefficients entering the B → K + − amplitude at low and high q 2 (top); the coefficients entering the λ = − and λ = 0 B → K * + − helicity amplitudes at low q 2 (bottom left) and high q 2 (bottom right). Corrections to the λ = + helicity amplitude are expected to be suppressed [23] and we checked explicitly that they have a weak impact. On the green dashed contours, the χ 2 is the same as for the central value, so there is no improvement of the fit. In the green shaded area, the fit is improved, with the solid contours showing ∆χ 2 ≡ χ 2 − χ 2 SM = 1, 4, 9, etc. In the unshaded region to the other side of the dashed contour, the fit is worsened compared to the central value. The blue circles show our 1 and 2σ assumptions for the uncertainties on the parameters in question, as discussed in section 2. We stress that these assumptions have not been used as priors to determine the green contours. We make the following observations.
• The χ 2 can be reduced by up to 4 when pushing the parameter b K , parametrizing subleading corrections in B → Kµ + µ − at low q 2 , to the border of our estimated uncertainty. The fit does not improve significantly when changing the parameter c K from 0, i.e. when assuming large violations of quark-hadron duality in the global (integrated) high q 2 observables in B → Kµ + µ − , unless b K is shifted at the same time.
• A simultaneous positive shift in the sub-leading corrections to the λ = − and 0 helicity amplitudes in B → K * µ + µ − can significantly reduce the χ 2 as well. ∆χ 2 = 9 requires a shift in both parameters that is four times larger than our error estimate.
• Corrections to quark-hadron duality in the global high q 2 observables in B → K * µ + µ − do not lead to a reduction of the χ 2 by more than 1.
We conclude that the agreement of the data with the predictions cannot be improved by assuming (unexpectedly) large violations of quark-hadron duality in integrated observables at high q 2 alone, while sizable corrections to B → Kµ + µ − and B → K * µ + µ − at low q 2 could improve the agreement with the data. We stress however that fig. 1 should not be misinterpreted as a determination of the size of subleading QCD effects from the data. Indeed, the regions where the χ 2 is significantly reduced correspond to values that are larger than any known hadronic effect.
We will see in section 3.3 that a good fit to the data can be obtained assuming a large negative NP contribution to the Wilson coefficient C 9 . We find it instructive to consider the size of the sub-leading parameters that would make them "mimic" a NP effect. Experimentally, it would be difficult to distinguish between the cases i) where C 9 = C SM 9 + ∆ 9 and all as well as and all other a i , b i , c i equal to zero. This pattern of effects is indeed similar to what is seen in fig. 1. Distinguishing such a scenario from a NP effect is straightforward if the NP effect is not lepton-flavour universal. If it is lepton-flavour universal, a correlated analysis of exclusive and inclusive observables, of the q 2 dependence, and of consistency relations among observables valid in the SM (see e.g. [62]) could help to disentangle QCD and NP.

Underestimated parametric uncertainties?
While the angular observables in B → K * µ + µ − are almost free from parametric uncertainties 4 , the apparent systematic suppression of branching ratios could also be due to an underestimated overall parametric uncertainty. The uncertainties of the B u,d,s meson lifetimes quoted by the PDG [63] are well below 1% and are therefore very unlikely to be responsible. The dominant parametric uncertainty is the CKM factor |V tb V * ts | 2 to which all branching ratios are proportional and which itself is dominated by the uncertainty of the measurement of |V cb |. The relative uncertainty of all b → s branching ratios due to |V cb | is twice the relative uncertainty of |V cb |. In our numerical analysis, we use which leads to an uncertainty of 4.9% on the branching ratios. In fact there is a long standing tension between determinations of |V cb | from inclusive and exclusive decays. The PDG [63] quotes which are at a 2.5σ tension with each other. Choosing the inclusive value instead of (19) would increase the central values of all our branching ratios by 6.5% and would worsen the agreement with the data. Choosing the exclusive value instead would lead to a reduction of the branching ratios by 6.7%. To see whether this has an impact on the significance of the tensions, we multiply all branching ratios by a scale factor η BR and fit this scale factor to the data. We find η BR = 0.81 ± 0.08, i.e. a 19% reduction of the branching ratios with respect to our central values is preferred. This central value would correspond to |V cb | = 3.7 × 10 −2 , which is in tension with both the inclusive and exclusive determinations.
We conclude that underestimated parametric uncertainties are unlikely to be responsible for the observed tensions in the branching ratio measurements. Needless to say, the angular observables and R K would be unaffected by a shift in |V cb | anyway.

Underestimated form factor uncertainties?
The tensions between data and SM predictions could also be due to underestimated uncertainties in the form factor predictions from LCSR, lattice, or both. A first relevant observation in this respect is that the tensions in table 1 include observables in decays involving B → K, B → K * , and B s → φ transitions, both at low q 2 (where LCSR calculations are valid) and at high q 2 (where the lattice predictions are valid). Explaining all of them would imply underestimated uncertainties in several completely independent theoretical form factor determinations.
In the case of B → Kµ + µ − and B s → φµ + µ − , tensions are present only in branching ratios, which seem to be systematically below the SM predictions. This could be straightforwardly explained if the form factor predictions were systematically too high.
The case of B → K * µ + µ − is less trivial due to the tensions in angular observables, which cannot simply be due to an overall rescaling of the form factors. To investigate this case, we have parametrized all seven B → K * form factors by a two-parameter z expansion 5 and constructed a χ 2 function analogous to (15), but writing the central values O th as functions of the 14 z expansion parameters instead of the Wilson coefficients. We have then conducted a global Markov Chain Monte Carlo fit of all 14 parameters to the data and compared the obtained posterior probability distribution to the priors (obtained from a combined fit to LCSR and lattice results). We have found that the most significant shift, i.e. preference for a nonstandard value, occurred in the form factor 6 A 12 . In fig. 2, we show the improvement in the χ 2 obtained when changing the A 12 form factor, while fixing all the other form factors to their  fig. 1. We observe that an improvement of ∆χ 2 ∼ 4 can be obtained if the value at q 2 = 0 is significantly lower than what is obtained from LCSR. This improvement is quite limited compared to the improvement obtained in the presence of NP discussed below or in the presence of large non-form factor corrections discussed above.
Finally, an important observation in the case of B → K * µ + µ − angular observables is that the tensions are only present at low q 2 , where the seven form factors can be expressed in terms of two independent "soft" form factors up to power corrections of naive order Λ QCD /m b . It is then possible to construct angular observables that do not depend on the soft form factors, but only on the power corrections [40]. The tensions can then be seen even without any input from LCSR or lattice, but estimating the power corrections by dimensional analysis [19]. This shows that an explanation of the tensions by underestimated form factor uncertainties would imply a violation of the form factor relations in the heavy quark limit that is much larger than what LCSR calculations predict.

New physics in a single Wilson coefficient
We now investigate whether new physics could account for the tension of the data with the SM predictions. We start by discussing the preferred ranges for individual Wilson coefficients assuming our nominal size of hadronic uncertainties. We determine the 1σ (2σ) ranges by computing ∆χ 2 = 1 (4) while fixing all the other coefficients to their SM values. We also set the imaginary part of the respective coefficient to 0. In addition to the Wilson coefficients C ( ) 7,9,10 , we also consider the case where the NP contributions to C ( ) 9 and C ( ) 10 are equal up to a sign, since this pattern of effects is generated by SU (2) L -invariant four fermion operators in the dimension-6 SM effective theory.
Our results are shown in table 2. We summarize the most important points. Coeff.
best fit 1σ 2σ  • A negative NP contribution to C 9 , approximately −30% of C SM 9 , leads to a sizable decrease in the χ 2 . The best fit point corresponds to a p-value of 21.5%, compared to 6.9% for the SM. This was already found in fits of low-q 2 angular observables only [2] and in global fits not including data released this year [3,5,19], as well as in a recent fit to a subset of the available data [9]. We find that the significance of this solution has increased substantially. This is due in part to the reduced theory uncertainties, in particular the form factors, as well as due to the new measurements by LHCb.
• A significant improvement is also obtained in the SU (2) L invariant direction C NP 9 = −C NP 10 , corresponding to an operator with left-handed muons.
• A positive NP contribution to C 10 alone can also improve the fit, although to a lesser extent.
• NP contributions to individual right-handed Wilson coefficients hardly lead to improvements of the fit.
While table 2 assumed the Wilson coefficients to be real, i.e. aligned in phase with the SM, in general the NP contributions to the Wilson coefficients are complex numbers. Since measurements in semi-leptonic decays are currently restricted to CP-averaged observables or direct CP asymmetries that are suppressed by small strong phases 7 , the constraints on the imaginary parts are generally weaker than on the real parts, since they do not interfere with the SM contribution.
An interesting special case is the direct CP asymmetry in B → K * γ. As discussed in section 2.3.3, this observable is precisely measured and very sensitive to the imaginary part of C 7 , but we do not include it in our default χ 2 since it is proportional to a strong phase that is afflicted with a considerable uncertainty. In fig. 3, we show how the allowed region for the NP contribution to C 7 would change by including this observable. The red (green) contours correspond to the 1 and 2σ regions (∆χ 2 = 2.3 and 6 while fixing all other coefficients to their SM values) allowed by the global fit including A CP (B 0 → K * 0 γ) with a relative uncertainty of 50% (25%), while the blue contours correspond to the fit without the CP asymmetry. We observe that the constraint on the imaginary part of C 7 improves by a factor of ∼ 2 even with our conservative estimate for the theory error.
The global constraints in the complex planes of all Wilson coefficients are shown in fig. 11 of appendix C.

Constraints on pairs of Wilson coefficients
We proceed by analysing the constraints in scenarios where two Wilson coefficients are allowed to differ from their SM values. In this section we exemplarily allow for real NP in either C 9 and C 9 or C 9 and C 10 . With our nominal values for the theory uncertainties, the best fit values for the Wilson coefficients and the corresponding ∆χ 2 read in the two cases The best fit points correspond to p-values of 21.6% and 19.5%, respectively. This is comparable to the 21.5% obtained in section 3.3 in the scenario with new physics only in C 9 . In fig. 4, we show the allowed regions in the Re(C NP 9 )-Re(C 9 ) and Re(C NP 9 )-Re(C NP 10 ) planes. The blue contours correspond to the 1 and 2σ regions (∆χ 2 = 2.3 and 6 while fixing all other coefficients to their SM values) allowed by the global fit. In addition, we also show the 2σ allowed regions for two scenarios with inflated theory uncertainties. For the green short-dashed contours, we have doubled all the form factor uncertainties. For the red short-dashed contours, we have doubled all the hadronic uncertainties not related to form factors, i.e. the ones that are parametrized as in (10) and (11). We observe that the negative value preferred for C NP 9 is above the 2σ level even for these conservative assumptions. We also observe that C 9 and C NP 10 are preferentially positive, although they deviate from 0 less significantly than C NP 9 . The corresponding plots for all interesting combinations of real Wilson coefficients are collected in fig. 12 of appendix C, together with the ∆χ 2 values of the corresponding best fit points.
It is also interesting to investigate which observables drive the tensions. In fig. 5, we compare the global constraints in the Re(C NP 9 )-Re(C 9 ) and Re(C NP 9 )-Re(C NP 10 ) planes to the constraints one gets only using branching ratios (green) or only using B → K * µ + µ − angular observables (red). We observe that the angular observables strongly prefer a negative C 9 but are not very sensitive to C 9 or C 10 . The branching ratio constraints have an approximate flat direction C NP 9 ∼ −C 9 and show a preference for C NP 10 > 0 in particular if C NP 9 > 0. In fact, from branching ratios alone, one could get a good fit to the data with SM-like C 9 and C NP 10 > 0.

Testing lepton flavour universality
So far, in our numerical analysis we have only considered the muonic b → sµ + µ − modes and the lepton flavour independent radiative b → sγ modes to probe the Wilson coefficients C  (4) only muons are considered. In this section we will extend our analysis and include also semileptonic operators that contain electrons. In particular, we will allow new physics in the Wilson coefficients C e 9 and C e 10 and confront them with the available data on B → Ke + e − from LHCb [6] and B → X s e + e − from BaBar [57].
As mentioned already in the introduction, the recent measurement of the ratio R K of B → Kµ + µ − and B → Ke + e − branching ratios in the q 2 bin [1, 6] GeV 2 by LHCb [6] shows a 2.6σ )-Re(C 9 ) plane (left) and the Re(C NP 9 )-Re(C NP 10 ) plane (right). The blue contours correspond to the 1 and 2σ best fit regions from the global fit. The green and red contours correspond to the 1 and 2σ regions if only branching ratio data or only data on B → K * µ + µ − angular observables is taken into account.
tension with the SM prediction The theoretical error of the SM prediction is completely negligible compared to the current experimental uncertainties. The tension between the SM prediction and the experimental data is driven by the reduced B → Kµ + µ − branching ratio, while the measured B → Ke + e − branching ratio is in good agreement with the SM. In our extended global fit we do not use the R K measurement directly but instead include the B → Kµ + µ − and B → Ke + e − branching rations separately, taking into account the correlations of their theory uncertainties. As the theory uncertainties of BR(B → Kµ + µ − ) and BR(B → Ke + e − ) are essentially 100% correlated, our approach is to a good approximation equivalent to using R K . In fig. 6 we show the result of two fits that allow for new physics in C µ 9 and C e 9 (left plot) and new physics along the SU (2) L invariant directions C µ 9 = −C µ 10 and C e 9 = −C e 10 . Recall that in section 3.3 we found that new physics in these scenarios gives the by far best description of the experimental b → sµ + µ − data. As expected, we again find that a C µ 9 significantly smaller than in the SM is clearly preferred by the fits. The best fit regions for C µ 9 and C µ 9 = −C µ 10 approximately coincide with the regions found for C 9 and C 9 = −C 10 in section 3.3. The Wilson coefficients C e 9 and C e 9 = −C e 10 on the other hand are perfectly consistent with the SM prediction. Lepton flavour universality, i.e. C µ 9 = C e 9 and C µ 10 = C e 10 as indicated by the diagonal line in the plots is clearly disfavoured by the data. Our results are consistent with similar findings in recent fits to part of the available experimental data [8,9].
Working under the assumption that the electron modes are indeed SM like, we can make predictions for ratios of observables that test lepton flavour universality using the best fit regions for the muonic Wilson coefficients from our global fit. We consider ratios of branching decays, both at low and high q 2 . Moreover, we also predict ratios of the B → K * angular observables F L , A FB and S 5 at low and high q 2 . The results are shown in table 3. The four columns correspond to the following scenarios: • new physics only in C µ 9 ; • new physics in C µ 9 and C µ 9 ; • new physics along the SU (2) L invariant direction C µ 9 = −C µ 10 ; • new physics independently in C µ 9 and C µ 10 .
The Standard Model prediction for all the shown ratios is 1, with negligible uncertainties. In all scenarios all branching ratio ratios are predicted around 0.75 both at low and high dimuon invariant mass. A similar ratio is seen for S 5 at low q 2 . Only very small deviations from the SM are predicted for S 5 and A FB at high q 2 as well as F L at low and high q 2 . The most interesting observable turns out to be the ratio of the forward-backward asymmetries in B → K * µ + µ − and B → K * e + e − in the q 2 bin [4, 6] GeV [4,6] A FB (B → K * e + e − ) [4,6] .
Assuming that the electron mode is SM like, R A FB is extremely sensitive to the value of C µ 9 . For the considered values of C µ 9 it deviates drastically from the SM prediction and a precise measurement would even allow to distinguish between the considered scenarios.

Constraints on new physics models
The results from the model-independent fit of the Wilson coefficients in the effective Hamiltonian can be interpreted in the context of new physics models. Here we discuss implications for the minimal supersymmetric standard model (MSSM) and models that contain massive Z gauge bosons with flavour-changing couplings.

General MSSM
Experimental data on flavour-changing neutral current processes generically lead to strong constraints on new sources of flavour violation that can be present in the MSSM [64,65].
In particular, the experimental information on rare b → sµ + µ − decays can be used to put constraints on flavour-violating trilinear couplings in the up squark sector, that are only poorly constrained otherwise [66][67][68][69][70]. In principle, the general MSSM also allows for lepton-flavour non-universality effects and we will comment to which extend the R K measurement can be accommodated.
The flavour-changing trilinears give contributions to the effective Hamiltonian in (1) at the one loop level. Contributions can arise from boxes, photon penguins, and Z penguins and example Feynman diagrams are shown in fig. 7. A straightforward flavour spurion analysis shows the following points: • contributions to C 7,8 , are suppressed by m s /m b with respect to contributions to C 7,8 ; • contributions to C 9,10 are suppressed by m s m b /m 2 t with respect to contributions to C 9,10 ; • contributions proportional to A tc are suppressed by m c /m t compared to contributions proportional to A ct .
We therefore concentrate on the Wilson coefficients C 7 , C 8 , C 9 , and C 10 in the presence of a non-zero A ct . To illustrate the main parameter dependence, in the following we give simple approximate expressions for the Wilson coefficients that are obtained at leading order in an expansion in m 2 EW /m 2 SUSY . The most important SUSY masses involved are the Wino mass M 2 , the Higgsino mass µ, the left-handed slepton mass m˜ , the stop masses mt L and mt R , as well as the left-handed charm squark mass mc L . The largest effects in b → s transitions can obviously be achieved if the SUSY spectrum is as light as possible. To keep the expressions compact, we set for simplicity M 2 = µ = m˜ ≡ M , mt L = mc L ≡ m L , mt R ≡ m R . We also work in the limit M m R m L which is least constrained by collider searches and therefore allows to maximize the new physics contributions to the Wilson coefficients. Note also that a light Higgsino and light stops are well motivated by naturalness arguments [71][72][73]. For the dipole coefficients we find in a leading log approximation The contributions to C 7 and C 8 from A ct arise first at the dimension 8 level, i.e. they are suppressed by m 4 EW /m 4 SUSY . The last terms in (26a) and (26b) are the leading irreducible MFV contributions to C 7 and C 8 from Higgsino stop loops. They arise already at dimension 6 and are typically much larger than the contributions proportional to A ct .
For the box contributions, C box 9,10 , and the photon penguin contribution, C γ 9 , to the semileptonic operators we find where s W = sin θ W and θ W is the weak mixing angle. Again we find that these contributions arise at the dimension 8 level. For a TeV scale SUSY spectrum, they are completely negligible. In the considered scenario, only the Z penguin contributions, C Z 9,10 , arise already at the dimension 6 level. We find This suggests that there are regions of MSSM parameter space, where a contribution to C Z 10 of O(1) is indeed possible. MSSM contributions to C Z 9 on the other hand are suppressed by the accidentally small vector coupling of the Z boson to leptons, (4s 2 W − 1) ∼ −0.08, and therefore negligible.
Recalling the model independent results from section 3, a positive new physics contribution to the Wilson coefficient C NP 10 O(1), can improve the agreement with the current experimental b → sµ + µ − data significantly (albeit to a lesser extent than NP in C 9 ). Negative NP contributions to C 10 on the other hand are strongly disfavoured with the current data. We use these results to probe regions of MSSM parameter space with sizable flavour-changing trilinear couplings.
Bounds on flavour-changing trilinear couplings can also be obtained from vacuum stability considerations. As is well known, sizable trilinear couplings can lead to charge and color breaking minima in the MSSM scalar potential [74,75]. Requiring that the electro-weak minimum be the deepest gives upper bounds on the trilinear couplings. Taking into account non-zero expectation values for the left and right handed stops, the left handed charm squark, as well as the up-type Higgs, we find the following necessary condition to ensure absolute stability of the electro-weak vacuum [76,77] (|A t | + |A ct | tan θ) 2 3 + tan 2 θ m 2 This inequality has to hold for all values of θ, that parametrizes the angle in field space between the left handed top and charm squarks. In the limit θ = 0 one recovers a well known bound on A t given e.g. in [74]; for θ = π/2 one recovers the bound on A ct found in [75].
In principle, additional constraints on A ct can be obtained from the experimental bounds on electric dipole moments (EDMs). In particular, if A ct and A t contain a relative phase, a strange quark EDM and chromo EDM will be induced analogous to the new physics contributions to C 7 and C 8 . However, predicting an experimentally accessible EDM of a hadronic system, like the neutron, given a strange quark EDM or chromo EDM involves large theoretical uncertainties [78,79]. Due to these uncertainties, existing EDM bounds do not give appreciable constraints in our setup. Note also that bounds on the charm quark chromo EDM [80] do not constrain the parameter space of our scenario. A sizable charm quark chromo EDM would be generated in the presence of both A ct and A tc couplings, but here we only consider a non-zero A ct .
We now describe the SUSY spectrum that we chose to illustrate the bounds on the trilinear couplings from the b → sµ + µ − data. The soft masses for the left-handed stop and charm squark are set to a common value mt L = mc L = 1 TeV. The soft mass of the right-handed stop is set to mt R = 500 GeV. All other squarks and sleptons as well as the gluino are assumed to be heavy, with masses of 2 TeV. Concerning the trilinear couplings, we only consider nonzero A t and A ct . Due to these trilinear couplings, the lightest up-squark mass eigenstate can have a mass mt 1 < 500 GeV and is potentially subject to strong bounds from direct stop searches. Higgsinos, Winos and Binos are assumed to have mass parameters mB = 250 GeV, mW = 300 GeV, µ = 350 GeV. In that way the mass of the lightest neutralino is given by mχ0 1 225 GeV and the mass of the lightest chargino is mχ± 1 250 GeV. Such a charginoneutralino spectrum is heavy enough to avoid the bounds from the direct stop searches [81][82][83][84] 8 as well as bounds from electro-weakino searches [86,87]. Finally, we set tan β = 3 to minimize contributions to the dipole Wilson coefficients. bounds in the A t -A ct plane, assuming real trilinears. Right: bounds in the Re(A ct ) -Im(A ct ) plane, assuming a fixed A t = −1.5 TeV. The red region is excluded by the b → sµ + µ − data by more than 2σ with respect to the SM. In the blue region the agreement between the theory predictions and the experimental b → sµ + µ − data is improved by more then 1σ. Outside the dashed contours there exist charge and color breaking minima in the MSSM scalar potential that are deeper than the electro-weak minimum. In the black corners, the lightest up-squark mass eigenstate is the LSP.
In fig. 8 we show bounds on the trilinear couplings that can be derived from the b → sµ + µ − data in the described scenario. We evaluate all MSSM 1-loop contributions to the Wilson coefficients C ( ) 7,8,9,10 and compute the χ 2 as defined in (15) as a function of the trilinear couplings. For the numerical evaluation of the Wilson coefficients in the MSSM, we use an adapted version of the SUSY_FLAVOR code [88][89][90]. The plot on the left hand side of fig. 8 shows constraints in the A t -A ct plane, assuming real trilinears. The plot on the right hand side shows constraints in the Re(A ct ) -Im(A ct ) plane, for a fixed A t = −1.5 TeV. The red region is excluded by the b → sµ + µ − data by more than 2σ with respect to the SM (χ 2 > χ 2 SM + 6). In the blue region the agreement between the theory predictions and the experimental b → sµ + µ − data is improved by more than 1σ with respect to the SM (χ 2 < χ 2 SM − 2.3). In the black corners, the lightest up-squark mass eigenstate is lighter than the lightest neutralino. Outside the dashed contours there exist charge and color breaking minima in the MSSM scalar potential that are deeper than the electro-weak minimum. Note that the regions outside of these vacuum stability contours are not necessarily excluded. Even though a deep charge and color breaking minimum exists in these regions, the electro-weak vacuum might be meta-stable with a live time longer than the age of the universe. Studies show that requiring only meta-stability, relaxes the stability bounds on the trilinear couplings slightly [91][92][93][94][95]. A detailed analysis of vacuum meta-stability is beyond the scope of the present work.

Lepton flavour non-universality in the MSSM
The Z penguin effects discussed above are lepton flavour universal, i.e. they lead to the same effects in b → se + e − and b → sµ + µ − decays. Breaking of e-µ universality as hinted by the R K measurement can only come from box contributions as they involve sleptons of different flavours. If there are large mass splittings between the first and second generations of sleptons, or more precisely, if the selectrons are decoupled but smuons are kept light, Wino box diagrams (and to a lesser extent also Bino box diagrams) can contribute to C µ 9 and C µ 10 but not to C e 9 and C e 10 . Box contributions are, however, typically rather modest in size. As discussed above, boxes that are induced by flavour-changing trilinears arise only at the dimension 8 level and are completely negligible. Non-negligible box contributions (at the dimension 6 level) are only possible in the presence of flavour violation in the squark soft masses. However, even allowing for maximal mixing of left-handed bottom and strange squarks, it was found in [3] that Winos and smuons close to the LEP bound of ∼ 100 GeV as well as bottom and strange squarks with masses of few hundred GeV would be required to obtain contributions to C µ 9 and C µ 10 of 0.5, that could give R K ∼ 0.75. A careful collider analysis would be required to ascertain if there are holes in the LHC searches for stops [81][82][83][84], sbottoms [96][97][98], sleptons [86,99,100] and electro-weakinos [86,87] that would allow such an extremely light spectrum. We also note that a sizable splitting between the left-handed smuon and selectron masses required to break e-µ universality is only possible if the slepton mass matrix is exactly diagonal in the same basis as the charged lepton mass matrix, since even a tiny misalignment would lead to an excessive µ → eγ decay rate.

Flavour changing Z bosons
A massive Z gauge boson with flavour-changing couplings to quarks is an obvious candidate that can lead to large effects in b → s decays [3,[101][102][103][104][105][106]. Instead of discussing a complete model that contains such a Z boson, we will take a bottom up approach and ask which properties a Z has to have in order to explain the discrepancies observed in the b → s data. To this end we treat the mass of the Z as well as its couplings to SM quarks and leptons as free parameters. Following the notation of [107], we parametrize the Z couplings as In the presence of ∆ bs L/R and ∆ µµ L/R couplings, the Z boson will contribute to the Wilson coefficients C ( ) 9 and C ( ) 10 at tree level. As the primed Wilson coefficients hardly improve the agreement of the experimental b → sµ + µ − data with the theory predictions, we will not consider them here and set the right-handed bs couplings to zero, ∆ bs R = 0. The Z couplings ∆ bs L and ∆ µµ L/R are subject to various constraints that bound the maximal effect a Z prime can have in C 9 and C 10 . In particular, a Z boson with flavour-changing b ↔ s couplings will inevitably also contribute to B s -B s mixing at the tree level. One finds the following modification of the mixing amplitude where v = 246 GeV is the Higgs vev, and the SM loop function is given by S 0 2.3. If we allow for maximally 10% new physics contribution to the mixing amplitude, i.e. |M 12 /M SM 12 −1| < 0.1, we obtain the following stringent bound on the Z mass and the flavour-changing coupling, Concerning the couplings of the Z to leptons, we will start with the least constrained case, where the Z only couples to muons, but not to electrons and consider a coupling to lefthanded muons only. Subsequently, we will discuss how our conclusions change if we assume a vector-like coupling to muons or a lepton-flavour universal coupling.

Z with coupling to left-handed muons
The only non-zero coupling to charged leptons we consider here is ∆ µµ L . Such a Z is very poorly constrained. Over a very broad range of Z masses, the strongest constraint on ∆ µµ L comes from neutrino trident production [105,108], i.e. the production of a muon pair in the scattering of a muon-neutrino in the Coulomb field of a heavy nucleus 9 . The relative correction of the trident cross section in the presence of the considered Z is given by We use the CCFR measurement of the trident cross section, σ CCFR /σ SM = 0.82 ± 0.28 [109], to set bounds on the Z mass and its coupling to muons. At the 2σ level we find Combining this result with the bound on the flavour-changing quark coupling from B s mixing, eq. (32), we can derive an upper bound on the possible size of new physics contributions to the Wilson coefficients C 9 and C 10 that can be achieved in the considered setup. For the Wilson coefficients we have This implies |C NP 9 | = |C NP 10 | < 5.4 .
The best fit values in the C NP 9 = −C NP 10 scenario found in section 3.3 are well within this bound.
Although the explanation of the tensions in b → sµ + µ − transitions does not require a coupling of the Z to first generation quarks, it is interesting to investigate what happens in models where such couplings are present, which could lead to Z signals at the LHC. Fixing the Wilson coefficients C 9 and C 10 to their best fit values and assuming the flavour-changing  [112], assuming a Z → µ + µ − branching ratio of 50%. The region above the red curve is excluded by searches for quark-lepton contact interactions [111]. The upper plot axis shows the minimal value of the Z coupling to left-handed muons (37), required to obtain the best fit values for C 9 and C 10 .
coupling to have its maximal value (32) allowed by B s mixing, we find a lower bound on the muon coupling, Adopting the lower end of this range, ATLAS and CMS searches for quark-lepton contact interactions [110,111] can be used to put an upper bound on the Z coupling to the left-handed first-generation quark doublet. Using the CMS results [111], we find M Z |∆ qq L | 11 TeV (7 TeV) (38) for constructive (destructive) interference with the SM q LqL → µ + µ − amplitude. Comparing this to (32), we conclude that models with a rough scaling |∆ bs L | ∼ |V tb V * ts ∆ qq L | are compatible with these bounds.
For a Z mass between 200 GeV and 3.5 TeV, also LHC searches for resonances [112,113] in the dimuon mass spectrum can be used to put an upper bound on the Z coupling to first-generation quarks as a function of M Z . In fig. 9 we show the bound on ∆ qq L using the results from the ATLAS search [112] (shaded blue region). For the branching ratios of the Z we assume BR(Z → µ + µ − ) = BR(Z → ν µνµ ) = 1 2 , which approximately holds as long as the ∆ µµ L coupling is sufficiently large compared to couplings to other states. The bound from resonance searches could be weaker if the Z has e.g. a sizable branching ratio into a dark sector. In the same plot, we also show the bound from quark-lepton contact interaction searches from CMS [111], assuming (37) (red line). Below 3.5 TeV, we show this bound as a dashed line, because for such light Z masses the contact interaction approximation becomes invalid.
We conclude that, in order to lead to visible effects in b → sµ + µ − transitions, a heavy Z with M Z 3 TeV can have weak-interaction strength couplings to first-generation quarks without being in conflict with the bounds from contact interactions. Such a heavy Z must have strong couplings to muons (∆ µµ L 1). A lighter Z can be weakly coupled to muons, but requires a suppression of the coupling to first-generation quarks by roughly two orders of magnitude to avoid the bounds from direct searches.

Z with vector-like coupling to muons
If the couplings of the Z to muons are purely vector-like we can define ∆ µµ L = ∆ µµ R ≡ ∆ µµ V /2. In this case, the correction to the neutrino trident cross section reads and we obtain the following bound using the CCFR measurement Now the NP contribution to the Wilson coefficient C 10 vanishes, while for C 9 one has Again, one finds that sizable effects are possible: adopting the maximum allowed values for the couplings (40) and (32), we find |C NP 9 | < 9.3. The bounds on first-generation quark couplings from contact interaction and dimuon resonance searches are qualitatively similar to the left-handed case.

Z with universal coupling to leptons
If the Z coupling to leptons is flavour-universal, stringent bounds on ∆ can be obtained from LEP2 searches for four lepton contact interactions [114]. Depending on whether the coupling is to left-handed leptons only or is vector-like, we find where, for the last step, the flavour-changing coupling has been assumed to saturate the upper bound in (32) coming from B s mixing. We observe that the effects in b → sµ + µ − transitions are now much more limited, but, in particular for left-handed couplings, can still come close to the best-fit values in section 3.3 (of course, the anomaly in R K cannot be explained in this scenario.) Interestingly, this also implies that the effect in B s mixing is necessarily close to the current experimental bounds. Future improvements of the B s mixing constraints will then allow to test the lepton flavour universal scenario.
Concerning collider searches, the new feature of the lepton universal case is that there is an absolute lower bound on the Z mass from LEP2, M Z > 209 GeV. LHC bounds on the coupling to first generation quarks, on the other hand, are qualitatively similar to the non-universal case discussed above.

Summary and conclusions
Several recent results on rare B decays by the LHCb collaboration show tensions with standard model predictions. Those include discrepancies in angular observables in the B → K * µ + µ − decay, a suppression in the branching ratios of B → K * µ + µ − and B s → φµ + µ − , as well as a hint for the violation of lepton flavour universality in the form of a B → Kµ + µ − branching ratio that is suppressed not only with respect to the SM prediction but also with respect to B → Ke + e − . In this paper we performed global fits of the experimental data within the SM and in the context of new physics.
For our SM predictions we use state-of-the-art B → K, B → K * and B s → φ form factors taking into account results from lattice and light cone sum rule calculations. All relevant nonfactorizable corrections to the B → Kµ + µ − , B → K * µ + µ − and B s → φµ + µ − amplitudes that are known are included in our analysis. Additional unknown contributions are parametrized in a conservative manner, such that existing estimates of their size are within the 1σ range of our parametrization. We take into account all the correlations of theoretical uncertainties between different observables and between different bins of dilepton invariant mass. As experimental data is available for more and more observables in finer and finer bins, the theory error correlations have strong impact on the result of the fits.
Making use of all relevant experimental data on radiative, leptonic and semi-leptonic b → s decays we find that there is on overall tension between the SM predictions and the experimental results. Assuming the absence of new physics, we investigated to which extent nonperturbative QCD effects can be responsible for the apparent disagreement. We find that large non-factorizable corrections, a factor of 4 above our error estimate, could improve the agreement for the B → K * µµ angular observables and the branching ratios considerably. Alternatively, the branching ratio predictions could also be brought into better agreement with the experimental data, if the involved form factors were all systematically below the theoretical determinations from the lattice and from LCSR. On the other hand, we find that non-standard values of the form factors could at most lead to a modest improvement of B → K * µ + µ − angular observables. In both cases however, the hint for violation of lepton flavour universality cannot be explained.
Assuming that in our global fits the hadronic uncertainties are estimated in a sufficiently conservative way, we discussed the implications of the experimental results on new physics. Effects from new physics at short distances can be described model independently by an effective Hamiltonian and the experimental data can be used to obtain allowed regions for the new physics contributions to the Wilson coefficients. We find that the by far largest decrease in the χ 2 can be obtained either by a negative new physics contribution to C 9 (with C NP 9 ∼ −30% × C SM 9 ), or by new physics in the SU (2) L invariant direction C NP 9 = −C NP 10 , (with C NP 9 ∼ −12% × C SM 9 ). A positive NP contribution to C 10 alone would also improve the fit, although to a lesser extent.
Concerning the hint for violation of lepton flavour universality, we observe that new physics exclusively in the muonic decay modes leads to an excellent description of the data. We do not find any preference for new physics in the electron modes. We provide predictions for other lepton flavour universality test. We find that the ratio R A FB of the forward-backward asymmetries in B → K * µ + µ − and B → K * e + e − at low dilepton invariant mass is a particularly sensitive probe of new physics in C µ 9 . A precise measurement of R A FB would allow to distinguish the new physics scenarios that give the best description of the current data.
Finally we also discussed the implications of the model independent fits for the minimal supersymmetric standard model and models that contain Z gauge bosons with flavour-changing couplings. In the MSSM, large flavor changing trilinear couplings in the up-squark sector can give sizable contributions to the Wilson coefficient C 10 and we identified regions of MSSM parameter space that are favoured or disfavoured by the current experimental data. Heavy Z bosons can have the required properties to explain the discrepancies observed in the b → s data. If the Z couples to muons but not to electrons (as preferred by the data), it is only weakly constrained by indirect probes. On the other hand, if the Z couplings to leptons are flavour universal, LEP constraints on 4 lepton contact interactions imply that an explanation of the b → s discrepancies results in new physics effects in B s mixing of at least ∼ 10%. In all scenarios, the couplings of the Z to first generation quarks are strongly constrained by ATLAS and CMS measurements of dilepton production.
We look forward to the updated experimental results using the full LHCb data set, which will be crucial in helping to establish or to refute the exciting possibility of new physics in b → s transitions.

C. Constraints on pairs of Wilson coefficients
Figs. 11 and 12 shows the constraints in the planes of the complex Wilson coefficients or of various pairs of real Wilson coefficients. The blue contours correspond to the 1 and 2σ regions allowed by the global fit. The green short-dashed and the red short-dashed contours correspond to the 2σ allowed regions for scenarios with doubled form factor uncertainties and doubled uncertainties related to sub-leading non-factorizable corrections, respectively. The ∆χ 2 of the best fit point with respect to the SM is also given in the plots.  fig. 4