Optimal observable analysis for the decay $b \to s$ plus missing energy

The decay $b\to s\nu\bar\nu$ has been a neglected sibling of $b\to s\ell^+\ell^-$ because the neutrinos pass undetected and hence the process offers fewer observables. We show how the decay $b\to s~+$~invisible(s) can shed light, even with a limited number of observables, on possible new physics beyond the Standard Model, and we show quantitatively the reach of future $B$ factories like SuperBelle to uncover such new physics. Depending on the operator structure of the new physics, different channels may act as the best possible probe. We show, using the Optimal Observable technique, how almost the entire currently allowed parameter space can be probed successfully at a high-luminosity $B$ factory.


I. INTRODUCTION
While the semileptonic decay $B \to K^{(*)}\ell^+\ell^-$, mediated by the flavor-changing neutral current (FCNC) transition $b \to s$, has received a lot of attention as a sensitive probe of new physics (NP) beyond the Standard Model (SM), much less light has been shed on the analogous process $b \to s\nu\bar\nu$, either in the exclusive channels $B \to K^{(*)}\nu\bar\nu$ or in the semi-inclusive one $B \to X_s\nu\bar\nu$. There are three main reasons for this. First, the decay is yet to be observed; there are only upper limits on the branching ratios (BR) of such processes [1,2]. This is not unexpected, considering that the experimental sensitivity is about one order of magnitude above the SM predictions. Second, the number of observables is smaller than for the processes involving charged leptons, because the neutrinos escape undetected. Third, the theoretical uncertainties, like those coming from the form factors, are more serious than in relatively cleaner channels like $K \to \pi\nu\bar\nu$.
One can pose counter-arguments too. For example, the Belle upgrade with a much enhanced integrated luminosity (or any other future $e^+e^-$ $B$ factory) will almost definitely observe this process even if there is no NP involved. The limited number of observables can actually make the analysis cleaner. As we will show quantitatively, one can successfully use these few observables not only to differentiate some well-motivated NP models from the SM but also to have a glimpse of the possible operator structure of those models. This remains true even when one takes into account all the theoretical uncertainties, like those in the form factors, the elements of the Cabibbo-Kobayashi-Maskawa (CKM) matrix, the running quark masses, or the higher-order corrections.
The conception that whatever may be inferred from the neutrino channels can also be inferred in a much cleaner way from charged lepton final states, because of the $SU(2)_L$ conjugate nature of the corresponding operators, is also not entirely correct. Consider, for example, an $SU(2)_L$ singlet current of the form $\epsilon_{ab}\,\bar{L}^a_L \gamma^\mu Q^b_L$, involving both quark and lepton doublets, that couples to a vector leptoquark. The charged lepton final states, obviously, come only from the anomalous top decays and not from $B$ decays. Thus, the neutrino channels are worth studying in their own right. There may be other light invisible particles in the final state (we will show an example later) which have nothing to do with charged leptons. As neutrinos or other such invisibles go undetected, this channel offers an effective probe for such models. Thus, one must treat the decay channel $b \to s$ + invisible(s) as a source of information independent of $b \to s\ell^+\ell^-$, although there can be correlations in some beyond-SM models.
* zaineb.calcuttawala@gmail.com; † anirban.kundu.cu@gmail.com; ‡ soumitra.nandi@gmail.com; § sunando.patra@gmail.com
As an example of what we have just said, let us note that apart from neutrinos coming from non-SM operators, the FCNC transition b → s can involve light invisible scalars in the final state. If they are singlet under all the SM gauge groups, they can very well be a candidate for cold dark matter (DM). Although there are strong constraints on such light DM particles from the direct detection experiments like LUX, XENON or PANDAX [3], one may avoid them if the DM is nonthermal in origin. The only limit comes from the invisible decay width of the Higgs boson at the Large Hadron Collider (LHC), which one can easily keep within the tolerable limit of ∼ 10% if the Higgs-DM coupling is small. Thus, an analysis of the decay b → s + invisibles will also act as a complementary probe to the DM direct search.
We will use the Optimal Observable (OO) technique for the analysis. The OO technique helps one to identify the observables in which the NP can be differentiated from the SM with the highest confidence level. Of course, which observables are to be used depends on the part of the parameter space in which the NP falls. While this technique has been more widely used for collider studies [4,5], Ref. [6] shows how it can be applied to semileptonic $B$ decays as well. Note that in the absence of any data, one must work only with statistical uncertainties. The systematic uncertainties will somewhat reduce the reach and the confidence levels.
The OO technique shows not only the regions of the NP parameter space where differentiation from the SM will be easy, but also the variables that one should look at to achieve such a differentiation. In other words, a simultaneous study of all the relevant observables can effectively pin down the region of the parameter space where any beyond-SM physics may lie. In the next Section, we provide a brief discussion of the OO technique. In Section III, we discuss two popular NP models: one with neutrinos in the final state but with the effective operator basis augmented by some NP operators, and the second with light scalars in the final state along with the SM neutrinos. Sections IV and V discuss our results for these two models respectively. We summarize and conclude in Section VI.

II. THE OPTIMAL OBSERVABLE TECHNIQUE
This section is rather brief and follows the notation of Refs. [4,5]. Suppose there is an observable $O$ which depends on the variable $\phi$ as

$$O(\phi) = \sum_i c_i f_i(\phi),$$

where the $c_i$ are model-dependent coefficients, like the Wilson coefficients (WC), and the $f_i(\phi)$ are known functions of $\phi$. For our case, $\phi$ can be identified with the momentum transfer (to the invisible particles) squared, $q^2 = (p_B - p_{K^{(*)}})^2$, where $p_a$ denotes the four-momentum of the particle $a$. To get the $c_i$, one can fold with weighting functions $w_i(\phi)$ such that

$$\int w_i(\phi)\, O(\phi)\, d\phi = c_i .$$

There happens to be a unique choice of $w_i(\phi)$ such that the statistical errors in the $c_i$ are minimized. For this choice, the covariance matrix $V$, defined as

$$V_{ij} \propto \int w_i(\phi)\, w_j(\phi)\, O(\phi)\, d\phi ,$$

is at a stationary point with respect to variations of the weighting functions: $\delta V_{ij} = 0$. This happens if we choose

$$w_i(\phi) = \frac{\sum_j X_{ij} f_j(\phi)}{O(\phi)},$$

where

$$M_{ij} = \int \frac{f_i(\phi)\, f_j(\phi)}{O(\phi)}\, d\phi, \qquad X_{ij} = (M^{-1})_{ij}. \tag{5}$$

In this case, $V_{ij} \propto X_{ij}$. For only this choice of weighting functions, the covariance matrix is

$$V_{ij} = \langle \Delta c_i\, \Delta c_j \rangle = \frac{(M^{-1})_{ij}\, \sigma_T}{N}, \qquad \sigma_T = \int O(\phi)\, d\phi, \tag{7}$$

where $N$ is the total number of events, given by the integrated cross-section times the total luminosity times the efficiencies. This result holds even if cuts are applied.

The minimum of the statistical uncertainty in the extraction of a parameter gives the maximum significance of that parameter over the others. Therefore, using this technique we can test the significance of a specific NP model over the other models, including the SM. In other words, given the data, one can say with what significance some observable may differentiate a particular type of NP from the SM. This significance, as one should emphasize here, depends on the observable chosen, on the parameters of the NP model, and on the integrated luminosity, all of which is intuitively obvious.
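As a purely illustrative numerical sketch of this construction (not the authors' code), the snippet below assumes the standard optimal-observable result $V = M^{-1}\sigma_T/N$, with $M_{ij} = \int f_i f_j / O\, d\phi$ and $\sigma_T = \int O\, d\phi$. The basis functions, the coefficients and the event count are hypothetical toy inputs chosen only so that the example runs.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule, written out so it does not depend on the NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def oo_covariance(basis, coeffs, phi, n_events):
    """Covariance V = M^-1 * sigma_T / N for O(phi) = sum_i c_i f_i(phi)."""
    f = np.array([fi(phi) for fi in basis])      # shape (n_coeff, n_grid)
    obs = np.tensordot(coeffs, f, axes=1)        # O(phi) evaluated on the grid
    n = len(basis)
    m = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # M_ij = integral of f_i f_j / O over phi
            m[i, j] = trapezoid(f[i] * f[j] / obs, phi)
    sigma_t = trapezoid(obs, phi)                # sigma_T = integral of O
    return np.linalg.inv(m) * sigma_t / n_events

# Hypothetical toy model: O(phi) = 1.0 * 1 + 0.5 * phi on phi in [0, 1]
basis = [lambda p: np.ones_like(p), lambda p: p]
phi = np.linspace(0.0, 1.0, 2001)
cov = oo_covariance(basis, np.array([1.0, 0.5]), phi, n_events=1000)
errors = np.sqrt(np.diag(cov))                   # statistical errors on c_1, c_2
```

Because $M$ depends on $O(\phi)$, and hence on the assumed coefficients, rerunning the sketch with different `coeffs` changes `cov`.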
As an example, suppose one is looking at the branching fractions of a $B$ meson to several final states. For the final state $f$, the branching fraction can be expressed as

$$\mathcal{B}(B \to f) = \frac{1}{\Gamma} \int \frac{d\Gamma(B \to f)}{dq^2}\, dq^2 ,$$

where $\Gamma$ is the total decay width. The statistical uncertainties in the $c_i$ extracted from the branching fractions can be written as [6]

$$|\delta c_i| = \sqrt{V_{ii}} = \sqrt{\frac{X_{ii}\, \sigma_T}{N}}, \qquad N = \sigma_P\, \mathcal{B}(B \to f)\, L_{\rm eff}. \tag{9}$$

As given in Eq. (9), the errors are also related to the total production cross section $\sigma_P$ ($= \sigma_{B \to f}/\mathcal{B}(B \to f)$) and the effective luminosity $L_{\rm eff} = L_{\rm int}\, \epsilon_s$, where $L_{\rm int}$ and $\epsilon_s$ are the integrated luminosity and the reconstruction efficiency respectively. When the number of nonzero NP parameters is small, the analysis can also be done by defining a quantity analogous to $\chi^2$, such as

$$\chi^2 = \sum_{i,j} (c_i - c_i^0)\, (c_j - c_j^0)\, (V^{-1})_{ij} .$$

The $c_i^0$ are called the seed values, which can be considered as model inputs. Thus, they are the values of the $c_i$ with the parameter values chosen for the reference model. The $V_{ij}$ are defined in Eq. (7).
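The $\chi^2$-like quantity can be sketched numerically as follows. This is a minimal illustration that assumes the quadratic form $\chi^2 = \Delta c^T V^{-1} \Delta c$ with $\Delta c = c - c^0$. The covariance matrix and the test point are hypothetical numbers, not results of the analysis.

```python
import numpy as np

def chi2(c, c_seed, cov):
    """Quadratic form (c - c0)^T V^-1 (c - c0) for a given covariance matrix V."""
    d = np.asarray(c, dtype=float) - np.asarray(c_seed, dtype=float)
    return float(d @ np.linalg.inv(cov) @ d)

# Hypothetical covariance matrix for two Wilson coefficients (C1, C2)
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
seed = [0.0, 0.0]                      # reference model: SM, C1 = C2 = 0

# Separation, in units of sigma, of a hypothetical NP point from the seed
n_sigma = np.sqrt(chi2([0.4, 0.0], seed, cov))
```

By construction the function vanishes at the seed point, and a value of $n^2$ corresponds to a separation of $n\sigma$.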
As mentioned earlier, our goal is to study the decay $b \to s$ + invisible(s), which includes exclusive modes like $B \to K^{(*)}\nu\bar\nu$. In such decays, the major sources of uncertainty are the hadronic form factors, like $F^{B \to K^{(*)}}(q^2)$. Thus, it is important to differentiate between the SM and any possible NP taking into account all these uncertainties, and to check whether the future experimental statistics will allow a clear separation of the two. Let us now explain briefly how to use a $\chi^2$ statistic test to pinpoint such a differentiation.
There are a few things one should take note of while interpreting the results of the OO technique.
(1) A regular $\chi^2$ statistic is a function of the parameters of a model and 'measures' the deviation of those from the observed values of some experimental quantities. The one we need to construct, however, should take one specific model (e.g. the SM) as the reference, in place of experimental results, and should be a function of some parameters, each set of values of which indicates one comparison-worthy model (e.g. NP). For example, if we have a set of new operators $O_i$ with $c_i$ being the corresponding WCs, $c_i = 0\ \forall\, i$ is the SM and any other set is an NP model that may be compared with the SM. The $\chi^2$ should also be a measure of the 'separation' (deviation) between any NP model and the reference one. In the rest of this analysis, the reference model will always be the SM. By construction, we ensure that $\chi^2|_{\rm SM} = 0$, and $\chi^2 = n^2$ denotes a separation of $n\sigma$ from the SM.
(2) Projections of the constant $\chi^2 = 1$ contours on each parameter axis will give us the corresponding $\delta c_i$. Following the point above, the constructed $\chi^2$ has no measurements or data points in it, and thus the $\delta c_i$ obtained are not statistical uncertainties on the $c_i$, nor are they predictions for them. They are uncertainties of the reference model or, in other words, a measure of the region in parameter space where the reference model is indistinguishable from models parametrized by the surrounding points. Thus, points on the $1\sigma$ contour parametrize models that can be distinguished from the SM at the $1\sigma$ level only.
(3) When varying the $\chi^2$ over the allowed parameter space, $V^{-1}_{ij}$ also depends on the parameter values through $O(\phi)$, which appears in the denominator of $M_{ij}$ (Eq. (5)). This is known as the seed dependence.
(4) The covariance matrix $V_{ij}$ as well as the $\chi^2$ are obtained using the central values of all the parameters. So any separation obtained from the analysis, though qualitatively correct, has to be modified after inclusion of the SM errors.
(5) If considered, theoretical uncertainties in $O(\phi)$ will in turn introduce an uncertainty in the $\chi^2$. In other words, in the presence of the uncertainties, the $n\sigma$ contours will become bands of nonzero width in the parameter space.
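Point (2) can be checked numerically. The sketch below assumes a quadratic $\chi^2$ built from a covariance matrix $V$, traces the $\chi^2 = 1$ contour, and compares its extent along each axis with $\sqrt{V_{ii}}$. The matrix entries are hypothetical.

```python
import numpy as np

# Hypothetical covariance matrix of two coefficients
cov = np.array([[0.04, 0.018],
                [0.018, 0.09]])

# With V = L L^T (Cholesky), the points d = L u for unit vectors u
# satisfy d^T V^-1 d = 1, i.e. they lie on the chi^2 = 1 contour.
chol = np.linalg.cholesky(cov)
theta = np.linspace(0.0, 2.0 * np.pi, 100001)
unit = np.vstack([np.cos(theta), np.sin(theta)])
contour = chol @ unit                      # shape (2, n_points)

# Projection of the contour on each parameter axis
extent = contour.max(axis=1)
delta_c = np.sqrt(np.diag(cov))            # expected: sqrt(V_ii)
```

Within the resolution of the grid in `theta`, `extent` and `delta_c` agree, which is the statement that the axis projections of the $\chi^2 = 1$ contour give the $\delta c_i$.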
For completeness, we will also provide the decay rate distributions of the processes under consideration, and test their usefulness in differentiating the NP models from the SM.

III. NEW PHYSICS MODELS

A. Only neutrinos as invisible
The first NP model treats neutrinos as the only carriers of missing energy in $b \to s$ decays. For simplicity, we assume not only the absence of lepton flavor violation (LFV) but also lepton flavor universality (LFU). This means that all three flavors of $\nu\bar\nu$ pairs are produced in equal numbers even by the NP operators, and there are no $\nu_i\bar\nu_j$ final states with $i \neq j$. Note that both these assumptions can be violated in specific NP models.
The effective Hamiltonian for $b \to s\nu_i\bar\nu_i$ can be written as

[Eq. (11): equation missing in source]

Note that $O_{\rm SM}$ and $O_{V1}$ are identical only with our assumption of LFU and no LFV. NP with LFU violation can mean, in an extreme case, that only one flavor of neutrino will be present; with LFV, the two neutrinos can be of different flavor. Similar considerations apply for $O_{V2}$. Under our simplifying assumptions, one can write Eq. (11) as

[Eq. (13): equation missing in source]

in terms of the scaled Wilson coefficients, defined as

[equation missing in source]

If the NP is also at the loop level, we expect $|C_1|, |C_2| \sim O(1)$. If it is at tree level, $|C_1|, |C_2| \gg 1$. At the leading order, the Inami-Lim function $X_t$ is given by

$$X_t = \frac{x_t}{8} \left[ \frac{x_t + 2}{x_t - 1} + \frac{3 x_t - 6}{(x_t - 1)^2} \ln x_t \right], \qquad x_t = \frac{m_t^2}{m_W^2} .$$

We will use the following numbers for our subsequent analysis:

[equation missing in source]

where $\tau_B$ is the lifetime of the $B$ meson.
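For orientation, the leading-order Inami-Lim function can be evaluated numerically. The sketch below assumes the standard leading-order expression $X_t = \frac{x_t}{8}\left[\frac{x_t+2}{x_t-1} + \frac{3x_t-6}{(x_t-1)^2}\ln x_t\right]$ with $x_t = m_t^2/m_W^2$. The mass values are illustrative round numbers and are not the inputs used in the analysis.

```python
import math

def inami_lim_x(x_t):
    """Leading-order Inami-Lim function X(x_t), with x_t = m_t^2 / m_W^2.

    The expression has a removable singularity at x_t = 1, which this
    simple sketch does not handle.
    """
    log_x = math.log(x_t)
    return (x_t / 8.0) * ((x_t + 2.0) / (x_t - 1.0)
                          + (3.0 * x_t - 6.0) / (x_t - 1.0) ** 2 * log_x)

# Illustrative inputs in GeV (assumptions, not taken from the paper)
m_top = 163.0
m_w = 80.4
x_t = (m_top / m_w) ** 2
x_value = inami_lim_x(x_t)
```

With these inputs `x_value` comes out close to 1.5, so the SM coefficient is a number of order one.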
The exclusive differential decay distributions are given by [7,8]

[equations missing in source]

and the inclusive distribution by

[equation missing in source]

where

[equation missing in source]

and $\kappa(0) = 0.83$ is the QCD correction factor. Note that the structure of the interference term in $|1 + C_1|^2$ changes if $O_{V1}$ has LFV or non-LFU nature.
FIG. 1: Allowed ranges of $C_1$ and $C_2$. Also shown is the truncated region allowed by the recent Belle data [2]. [Legend: Our Analysis; New Data]
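For $B \to K^*$, the form factors are defined in terms of the conventional set as

[equation missing in source]

where $M = m_B + m_{K^*}$. To get the form factors, one first defines the function

$$z(t) = \frac{\sqrt{t_+ - t} - \sqrt{t_+ - t_0}}{\sqrt{t_+ - t} + \sqrt{t_+ - t_0}},$$

where

$$t_\pm = (m_B \pm m_{K^*})^2, \qquad t_0 = t_+ \left( 1 - \sqrt{1 - t_-/t_+} \right). \tag{22}$$

Then we define the generic structure as

$$F_i(q^2) = \frac{1}{1 - q^2/m_P^2} \sum_k \alpha^i_k \left[ z(q^2) - z(0) \right]^k, \tag{23}$$

where the pole masses $m_P$ are

$$V: 5.415~{\rm GeV}, \qquad A_0: 5.366~{\rm GeV}, \qquad A_1, A_{12}: 5.829~{\rm GeV}, \tag{24}$$

and [9]

[table of expansion coefficients missing in source]

Here $A_2$ has been replaced by $A_{12}$, given by

[equation missing in source]

The form factor $A_0$ will be needed when we discuss the decays to light invisible scalars.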
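The structure of such a $z$-expansion can be sketched as follows. The code assumes the simplified series expansion commonly used for $B \to K^*$ form factors, with a single pole factor multiplying a power series in $z(q^2) - z(0)$. The meson masses are approximate values, and the expansion coefficients are hypothetical placeholders, not the fitted values of Ref. [9].

```python
import math

M_B = 5.280       # GeV, approximate B meson mass
M_KSTAR = 0.896   # GeV, approximate K* mass

T_PLUS = (M_B + M_KSTAR) ** 2
T_MINUS = (M_B - M_KSTAR) ** 2
T_ZERO = T_PLUS * (1.0 - math.sqrt(1.0 - T_MINUS / T_PLUS))

def z(q2):
    """Conformal variable z(q^2); it vanishes at q^2 = t_0."""
    a = math.sqrt(T_PLUS - q2)
    b = math.sqrt(T_PLUS - T_ZERO)
    return (a - b) / (a + b)

def form_factor(q2, alphas, m_pole):
    """Pole factor times a power series in z(q^2) - z(0)."""
    pole = 1.0 / (1.0 - q2 / m_pole ** 2)
    dz = z(q2) - z(0.0)
    return pole * sum(alpha * dz ** k for k, alpha in enumerate(alphas))

# Hypothetical coefficients, with the pole mass quoted for V in Eq. (24)
alphas_v = [0.3, -1.0, 2.0]
v_at_zero = form_factor(0.0, alphas_v, 5.415)   # equals alphas_v[0]
v_at_ten = form_factor(10.0, alphas_v, 5.415)
```

With this parametrization the first coefficient is the value of the form factor at $q^2 = 0$, and $|z|$ stays small over the whole physical range, which is why a few terms of the series suffice.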
The scalar form factors $f_0$ and $f_+$ for $B \to K$ are given by [10]

$$f_0(q^2) = \sum_{k=0}^{K} a^0_k\, z(q^2)^k, \tag{27}$$

[expression for $f_+$ and accompanying definitions missing in source]

Another observable that we may use is the modified transverse polarization fraction of $K^*$ in $B \to K^*\nu\bar\nu$ decays, defined as

[equation missing in source]

Note that the denominator has been integrated over, to give an overall normalization. It can be shown easily that

[equation missing in source]

From the experimental bounds on the branching fractions, namely,

[equation missing in source]

at 90% CL, we get the following approximate constraints on the scaled Wilson coefficients, assuming them to be real (but not necessarily positive):

[equation missing in source]

The allowed parameter spaces are shown in Fig. 1. Belle has a recent update [2], mostly on the $B \to K^*\nu\bar\nu$ mode:

[equation missing in source]

While our analysis uses the old parameter space, the conclusions, as we will see, carry over to the new results. Fig. 1 also shows the updated parameter space.
We will show with the OO technique how much of the parameter space can be successfully differentiated from the SM, and with what confidence level. In our analysis, we have noted that the errors extracted on the new WCs are independent of the choices of the seed values. As mentioned earlier, these seed values can be chosen from the allowed NP parameter space. Obviously, depending on the data, different observables will have different power to differentiate NP effects from the SM. As the values are not known a priori, one has to look at all the observables and the pattern of the signal to have an idea of the underlying model.

B. Light invisible scalar
Another possibility is to consider the decay $b \to sSS$, where $S$ is some gauge singlet scalar, which can be a cold DM candidate. In the Higgs portal DM models, $S$ couples only to the SM doublet $\Phi$ through a term like $a_2 S^2 \Phi^\dagger\Phi \to \frac{1}{2} a_2 S^2 h^2$ in the Lagrangian, where $h$ is the SM Higgs field. If $m_S < m_h/2$, the invisible decay $h \to SS$ opens up, and one must keep $a_2$ sufficiently small to avoid the LHC bound on such invisible decay channels: BR($h \to$ invisible) < 10%. Although this is in contradiction with a thermalized cold DM giving the correct relic density of the universe, the singlets can form only a part of the relic density and may even be non-thermal in nature.
One has to be careful about the construction of effective operators. At first sight, it may appear that an effective dimension-6 operator $\bar{s}_L b_R \Phi S^2$ may lead to the decay $b \to sSS$ when $\Phi$ is replaced by its vacuum expectation value (VEV). This is indeed the case if $S$ does not have any VEV, which is essential if $S$ is a DM candidate (otherwise it will mix with $h$ and decay to SM final states). On the other hand, Higgs penguin diagrams like $b \to sh^*$, $h^* \to SS$, as discussed in some of the literature [11,12], cannot be present if the electroweak symmetry is broken by a single Higgs field. The reason is that the effective off-diagonal Yukawa coupling $y_{bsh}$ is proportional to the off-diagonal mass term $m_{bs}$ in the mass matrix, and once one goes to the stationary (mass) basis, such off-diagonal Yukawa couplings must vanish. This restriction can be evaded if more than one field is responsible for symmetry breaking, or if there are higher-dimensional operators that are quadratic or of higher order in $\Phi$, so that the proportionality of the Yukawa matrix and the mass matrix gets spoiled. Here, we will just assume the existence of a set of effective operators and explore the consequences.
We start with an effective Lagrangian of the form

[equation missing in source]

and assume only the SM operator to be present for the $b \to s\nu\bar\nu$ decay, so that

[equation missing in source]

Following Ref. [7], one gets

[equations missing in source]

where the form factors $f_0(q^2)$ and $A_0(q^2)$ can be obtained from Eq. (27) and Eq. (23) respectively.

FIG. 2: Allowed ranges of $C_{S1}$ and $C_{S2}$ for $m_S = 0.5$ GeV and $m_S = 1.8$ GeV. Also shown is the truncated region allowed by the recent Belle data [2].
For the decay $B \to K^*SS$, all $K^*$s are longitudinally polarized. We define a modified longitudinal polarization fraction

[Eq. (38): equation missing in source]

which comes out to be

[equation missing in source]

Obviously, the allowed range of the WCs depends on the scalar mass $m_S$, as shown in Fig. 2. Thus, apart from the new Wilson coefficients, $m_S$ is another a priori unknown quantity.

IV. RESULTS: ONLY NEUTRINOS
Our results are shown for the projected SuperBelle integrated luminosity $L_{\rm int} = 50$ ab$^{-1}$. However, to motivate experimentalists, we also show, for some cases, the results with $L_{\rm int} = 2$ ab$^{-1}$, just to bring home the message that there might be reasons to feel excited even within the first year of running. We have taken the production cross-sections for $B^0$ and $B^+$ to be the same, which is known to be an excellent approximation. The detection efficiencies for different channels are taken from Ref. [14]:

[equation missing in source]

and we use the $SU(2)$ averaged detection efficiency for $B \to K^*\nu\bar\nu$. We also take the detection efficiency for the semi-inclusive $B \to X_s$ channel to be the same as that of $B \to K^*$. These numbers will probably be slightly modified for the next generation detectors. However, in the absence of a detailed simulation study, it is impossible to include the systematic errors, so we have to work with the statistical errors only.

From the definition of the OOs, it is clear that one needs to have at least two different $c_i$ for this technique to work; otherwise it is just a simple scaling of the SM expectation. With the assumption of LFU and no LFV, this is what happens for the decay $B \to K\nu\bar\nu$; the SM factor of 1 is replaced by $|1 + C_1 + C_2|^2$. On the other hand, the decays $B \to K^*\nu\bar\nu$, $B \to X_s\nu\bar\nu$, and the scaled transverse polarization fraction $F_T$ all have more than one combination of the new WCs.
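The event count $N$ that enters the statistical errors is the production cross-section times the integrated luminosity times the efficiency times the branching fraction. The sketch below only shows the unit bookkeeping. The cross-section, efficiency and branching fraction are hypothetical round numbers, not the values of Ref. [14] or of this analysis, and the function name is illustrative.

```python
NB_INV_PER_AB_INV = 1.0e9   # 1 ab^-1 = 10^9 nb^-1

def expected_events(sigma_nb, lumi_ab_inv, efficiency, branching_fraction):
    """N = sigma * L_int * efficiency * BR, with sigma in nb and L_int in ab^-1."""
    lumi_nb_inv = lumi_ab_inv * NB_INV_PER_AB_INV
    return sigma_nb * lumi_nb_inv * efficiency * branching_fraction

# Hypothetical inputs: sigma ~ 1.1 nb, efficiency 1e-3, BR 1e-5
n_high = expected_events(1.1, 50.0, 1.0e-3, 1.0e-5)   # high-luminosity option
n_low = expected_events(1.1, 2.0, 1.0e-3, 1.0e-5)     # low-luminosity option
```

With these placeholder inputs the two luminosity options differ by a factor of 25 in the number of events.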
In Figs. 3a and 3b, we show the results from the OO analysis of the decays $B \to K^*\nu\bar\nu$ and $B \to X_s\nu\bar\nu$ respectively. These plots use the $q^2$-integrated data, and show how far the NP can be differentiated from the SM depending on the precise values of $C_1$ and $C_2$. The $\chi^2 = n^2$ (with $n = 1, 3, 5, 7, 9$) lines are obtained in the $C_1$-$C_2$ plane with $\chi^2|_{\rm SM} = 0$, so that each line represents a deviation of $n\sigma$ from the SM. Obviously, $C_1 = C_2 = 0$ is the SM, and close to that the chances of separation are the weakest, as shown by the $1\sigma$ band. Note that $C_1 = -2$ and $C_2 = 0$ is also SM-like, because of the destructive interference between the two amplitudes, keeping $|1 + C_1|^2 = 1$. As can be seen, with $L_{\rm int} = 50$ ab$^{-1}$ (left panels), both decays can differentiate NP from the SM over most of the allowed parameter space with a high confidence level. Even small NP contributions, with $|C_1|$ and/or $|C_2|$ of order $10^{-1}$, can be differentiated from the SM at more than $5\sigma$ confidence level. The point $C_1 = -1$, $C_2 = 0$ denotes completely destructive interference with the SM and no signal events, and this is obviously far from the SM expectation. The separations are expectedly worse for $L_{\rm int} = 2$ ab$^{-1}$, as shown in the right panels of Figs. 3a and 3b, but even then, there are regions in the parameter space that can show some interesting trends. As an example, we note that it is possible to separate NP contributions like $|C_1|, |C_2| \approx 1$ from the SM at $5\sigma$ confidence level or more.
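The loss of separation at lower luminosity follows a simple rule when only statistical errors are kept: the covariance matrix scales as $1/N$, so $\chi^2$ grows linearly with $L_{\rm int}$ and the separation $n$ grows as $\sqrt{L_{\rm int}}$. A minimal sketch under that assumption (the function name is illustrative):

```python
import math

def rescale_significance(n_sigma, lumi_from, lumi_to):
    """Rescale a purely statistical separation from one luminosity to another."""
    return n_sigma * math.sqrt(lumi_to / lumi_from)

# A point separated at 5 sigma with 50 ab^-1, rescaled to 2 ab^-1
n_low_lumi = rescale_significance(5.0, 50.0, 2.0)
```

Under this scaling a point on the $5\sigma$ contour at 50 ab$^{-1}$ sits on the $1\sigma$ contour at 2 ab$^{-1}$. Systematic and theoretical uncertainties, which do not scale this way, are ignored here.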
With enough data, one may even measure the differential decay distribution $d\Gamma/dq^2$. In Figs. 3c and 3d, we show the differences in the $d\Gamma/dq^2$ profiles between the SM and the NP for a couple of benchmark points, shown as NP-1 and NP-2, for the decays $B \to K^*\nu\bar\nu$ and $B \to X_s\nu\bar\nu$. Note that both the benchmark points are allowed even by the new Belle data (Fig. 1). The integrated branching fractions of these modes in the SM and for the selected benchmark points are listed in Eq. (41).
The present data almost rule out $|C_i| \gg 1$ (the NP has to be either loop-mediated, or the new particles have to be so massive as to lie outside the direct detection range of the LHC), and so we concentrate on small-$C_i$ points. The bars shown in the plots represent the combined errors due to the various theory inputs, mostly coming from the form factors. In the case of NP, if we treat the $\delta C_i$ coming from the OO analysis of the respective decay modes as a measure of the future statistical uncertainties on the NP WCs, then both $d{\rm Br}/dq^2$ and the total branching fraction will have additional errors coming from them. We note that the NP sensitivities of the $q^2$ distributions of the exclusive and inclusive decay modes are different. For example, the distribution for $B \to K^*\nu\bar\nu$ is highly sensitive to NP in the region 10 GeV$^2 < q^2 <$ 15 GeV$^2$, while that for the decay $B \to X_s\nu\bar\nu$ is more sensitive in the low-$q^2$ region, $q^2 <$ 10 GeV$^2$. Therefore, a study of $d\Gamma/dq^2$ for these exclusive and inclusive channels may be quite useful to pin down the parameters of the NP. For most beyond-SM theories, there should be a corroborative signature from channels with charged lepton final states but, as we pointed out, this may not always be true.
Other observables are expected to yield different confidence level contours. This is shown in Fig. 4 for the modified transverse polarization fraction $F_T$, both for low and high $L_{\rm int}$. The NP sensitivities of this observable are similar to those of the decay $B \to K^*\nu\bar\nu$. Note, however, that the separation for this observable may not go beyond the $3\sigma$ confidence level for the low-$L_{\rm int}$ option.

V. RESULTS: LIGHT INVISIBLE SCALARS
With invisible light scalars escaping the detector, one gets a signal identical to that of $b \to s\nu\bar\nu$. We will assume two such identical light scalars produced in the decay, i.e., $b \to sSS$. The differential decay distributions depend on the mass of $S$, which we take to be either 0.5 GeV (called the light scalar or LS option) or 1.8 GeV (called the heavy scalar or HS option). For both these options, we show our results taking $L_{\rm int} = 50$ ab$^{-1}$ and 2 ab$^{-1}$, just as before. Obviously, the separation from the SM will be better for lighter scalars: for heavier scalars, the low-$q^2$ region will be populated only by the SM, and hence those bins will be irrelevant for the analysis.
We show the confidence levels in Figs. 5a, 5b and 5c for the decays $B \to KSS$, $B \to K^*SS$ and $B \to X_sSS$ respectively, for the LS option. The shapes of the contours are intuitively obvious from the expressions for $d\Gamma/dq^2$. For example, the mode $B \to KSS$ is not of much use if $C_{S1} \approx -C_{S2}$. A complementary set of information can be obtained from the $B \to K^*SS$ mode. As expected, only regions close to the SM point $C_{S1} = C_{S2} = 0$ may not be differentiable from the SM itself. Roughly speaking, one can have a $5\sigma$ separation from the SM in at least one channel with $L_{\rm int} = 50$ ab$^{-1}$ even if $|C_{S1}|$ and/or $|C_{S2}|$ are as small as 0.03. The inclusive channel $B \to X_sSS$ is even more powerful, as the branching fraction depends on the combination $|C_{S1}|^2 + |C_{S2}|^2$. This leads to circular contours around the origin. There is a subleading term proportional to the strange quark mass $m_s$ which breaks this symmetry, and so the contours appear slightly deformed. The low-luminosity option, displayed in the right panels, shows that the differentiation is harder for the exclusive modes, while the inclusive mode fares better. Points like $|C_{S1}|$ and/or $|C_{S2}| \approx 0.05$ can be differentiated from the SM at more than $5\sigma$ confidence level. The integrated branching fractions of these modes in the SM and for the selected benchmark point (Fig. 6) are

[equation missing in source]

The $q^2$ distributions for the decay rates of $B \to KSS$, $B \to K^*SS$ and $B \to X_sSS$ are shown in Figs. 6a, 6b, and 6c respectively for $L_{\rm int} = 50$ ab$^{-1}$ and the LS option. While for $B \to KSS$ and $B \to K^*SS$ the $q^2$ distributions are sensitive to NP over the entire $q^2$ region except for very high ($>$ 15 GeV$^2$) and very low ($\approx 0$) values, for the semi-inclusive decay the NP sensitivity is larger in the region 2 GeV$^2 < q^2 <$ 10 GeV$^2$. Similar plots for $m_S = 1.8$ GeV are shown in Fig. 8. We note that though $q^2_{\rm min}$ is much higher for this case, the nature of the $q^2$ distributions, and therefore the NP sensitivities, are similar to those obtained for the LS case.
As defined in Eq. (38), the decay $B \to K^*SS$ has another observable, namely, the longitudinal polarization fraction $F_L$. This is because the $K^*$ mesons appearing with scalars are completely longitudinally polarized. The confidence level contours for $F_L$ obtained from the OO analysis are shown in Figs. 9a and 9b for the LS and the HS options respectively. This observable has a similar kind of NP sensitivity to that of $B \to K^*SS$.

VI. SUMMARY
We have analysed the NP sensitivities of the different observables in the decays b → s + invisibles using the Optimal Observables technique. We consider two NP models: (1) only neutrinos as the carrier of missing energy but with a new operator involving right-handed quark current; and (2) apart from the SM neutrinos, light invisible scalars as the carrier of missing energy. The analysis takes into account all the new effective operators and their effects on several observables, namely, the total decay width for inclusive and exclusive modes, the differential decay distributions, and the modified transverse and longitudinal polarization fractions as defined in the text.
We show our results for both the high- and low-luminosity options of Belle-II, namely, $L_{\rm int} = 50$ ab$^{-1}$ and 2 ab$^{-1}$ respectively. All the observables are sensitive to NP effects, and even small NP effects might be detectable at the future high-luminosity Belle-II. The differentiation of the NP from the SM is obviously harder for the low-luminosity option, except for observables like the inclusive branching fractions.
The NP sensitivities of $d\Gamma/dq^2$ for the exclusive and inclusive channels are different. As the data on these distributions will possibly come after the branching fraction data, they will serve as an additional check on the operator structure and parameter values of the NP. Note that the exclusive distributions are more or less similar for both the NP models, but the inclusive distributions are different, so the latter may serve as a good discriminator. Thus, we encourage our experimental colleagues to investigate both the $q^2$-integrated branching fractions and the differential distributions.