1 Introduction

While the semileptonic decay \(B\rightarrow K^{(*)}\ell ^+\ell ^-\) mediated by the flavor-changing neutral current (FCNC) transition \(b\rightarrow s\) has received a lot of attention as a sensitive probe of new physics (NP) beyond the Standard Model (SM), much less discussion is available in the literature on the analogous process \(b\rightarrow s\nu \bar{\nu }\), in either the exclusive channels \(B\rightarrow K^{(*)}\nu {\bar{\nu }}\) or the semi-inclusive one \(B\rightarrow X_s\nu {\bar{\nu }}\). There are three main reasons for this. First, the decay is yet to be observed; there is only an upper limit on the branching ratio (BR) of such processes [1, 2]. This is not unexpected considering that the experimental sensitivity is about one order of magnitude above the SM predictions. Second, there are fewer observables than for the processes involving charged leptons, because the neutrinos escape undetected. Third, the theoretical uncertainties, such as those coming from the form factors, are more serious than in relatively cleaner channels like \(K\rightarrow \pi \nu {\bar{\nu }}\).

One can pose counter-arguments too. For example, the Belle upgrade with a much enhanced integrated luminosity (or any other future \(e^+e^- B\) factory) will almost definitely observe this process even if there is no NP involved. It is true that, with a small number of observables, one may extract only a limited amount of information as regards possible beyond-SM contributions to this decay. However, as we will show quantitatively, one can successfully use even these few observables not only to differentiate some well-motivated NP models from the SM but also to have a glimpse of the possible operator structure of those models. This remains true even when one takes into account all the theoretical uncertainties, such as those from the form factors, the elements of the Cabibbo–Kobayashi–Maskawa (CKM) matrix, the running quark masses, or the higher-order corrections.

The notion that whatever may be inferred from the neutrino channels can also be inferred in a much cleaner way from charged lepton final states, because of the \(SU(2)_L\) conjugate nature of the corresponding operators, is also not entirely correct. Consider, for example, an \(SU(2)_L\) singlet current of the form \(\epsilon _{ab} \bar{L}_L^a \gamma ^\mu Q_L^b\), involving both quark and lepton doublets that couple to a vector leptoquark. The charged lepton final states, obviously, come only from the anomalous top decays and not from B decays. Thus, the neutrino channels are worth studying in their own right. There may also be other light invisible particles in the final state (we will show an example later) which have nothing to do with charged leptons. As neutrinos or other such invisibles go undetected, this channel offers an effective probe for such models. Thus, one must treat the decay channel \(b\rightarrow s~+\) invisible(s) as a source of information independent of \(b\rightarrow s\ell ^+\ell ^-\), although there can be correlations in some beyond-SM models. A typical case of how crucial the invisible channel can be for the explanation of the apparent anomalies in the semileptonic decays of the B meson has been exemplified in Ref. [3].

As an example of what we have just said, let us note that apart from neutrinos coming from non-SM operators, the FCNC transition \(b\rightarrow s\) can involve light invisible scalars in the final state. If they are singlets under all the SM gauge groups, they can very well be candidates for cold dark matter (DM). Although there are strong constraints on such light DM particles from direct detection experiments like LUX, XENON or PandaX [4,5,6], one may avoid them if the DM is nonthermal in origin. The only remaining limit comes from the invisible decay width of the Higgs boson at the Large Hadron Collider (LHC), which one can easily keep within the tolerable limit of \(\sim 10\%\) if the Higgs-DM coupling is small. Thus, an analysis of the decay \(b\rightarrow s~+\) invisibles will also act as a complementary probe to the direct DM searches.

At this point, let us emphasize that a particular observable, like the differential branching ratio, need not be equally sensitive to different types of NP operators. Therefore it is useful to know how an observable can be optimized to guarantee the maximal sensitivity to a particular type of NP interaction, which in turn will help us select observables suitable for the extraction of a particular type of coupling. Hence, from a phenomenological point of view, it is important to find the significance of different types of NP interaction to an observable. To achieve this goal, we will use the optimal observable (OO) technique for the analysis. The OO technique helps one to identify observables where the NP can be differentiated from the SM with the highest confidence level. Of course, which observables are to be used depends on which part of the parameter space the NP falls in. While this technique has been more widely used for collider studies [7, 8], Ref. [9] shows how it can be applied to semileptonic B decays as well. Note that in the absence of any data, one must work only with statistical uncertainties. The systematic uncertainties will somewhat relax the reaches and the confidence levels. In the OO technique, the NP sensitivities are decided on the basis of the statistical uncertainties in the extraction of the relevant new model parameters in comparison to a reference model. In general, the smaller the uncertainty, the better the expected sensitivity. Any prediction or conclusion that is obtained as a result of this analysis takes only three things into consideration: (a) the theoretical expression involving all (un)known parameters, (b) the seed values of the parameters (which can be taken to be zero), and (c) the projected effective luminosity.
Unlike any post-data statistical analysis, like the maximum likelihood method, this technique focuses on the predictive power of the theoretical expression of the observable itself and quantifies the sensitivity of that observable in distinguishing between competing models. In other words, given two points A and B in the allowed parameter space, we can say with what confidence level we can differentiate model A from model B depending on the observables chosen. The method is obviously even more useful for cases where there is no data available yet.

The OO technique shows not only the regions of the NP parameter space where differentiation from the SM will be easy but also the variables that one should look at to achieve such a differentiation. In other words, a simultaneous study of all the relevant observables can effectively pin down the region of the parameter space where any beyond-SM physics may lie. In the next section, we will provide a brief discussion of the OO technique. In Sect. 3, we discuss two popular NP models: one with neutrinos in the final state but the effective operator basis augmented by some NP operators, and the other with light scalars in the final state along with SM neutrinos. Sections 4 and 5 discuss our results for these two models, respectively. We summarize and conclude in Sect. 6.

2 The optimal observable technique

This section is rather sketchy and follows the notation of Refs. [7, 8]. Suppose there is an observable O which depends on the variable \(\phi \) as

$$\begin{aligned} O(\phi ) = \sum _i c_i f_i(\phi ), \end{aligned}$$
(1)

where \(c_i\)s are model-dependent coefficients, like the Wilson coefficients (WC), and \(f_i(\phi )\) are known functions of \(\phi \). For our case, \(\phi \) can be identified with the momentum transfer (to the invisible particles) squared, \(q^2=(p_B - p_{K^{(*)}})^2\), where \(p_a\) denotes the four-momentum of the particle a. To get \(c_i\), one can fold with weighting functions \(w_i(\phi )\) such that

$$\begin{aligned} \int w_i(\phi ) O(\phi ) \, \mathrm{d}\phi = c_i. \end{aligned}$$
(2)

There happens to be a unique choice of \(w_i(\phi )\) such that the statistical errors in the \(c_i\)s are minimized. For this choice, the covariance matrix V, defined as

$$\begin{aligned} V_{ij} \propto \int w_i(\phi ) w_j(\phi ) O(\phi )\, \mathrm{d}\phi , \end{aligned}$$
(3)

is at a stationary point with respect to variations of the weighting functions \(w_i(\phi )\): \(\delta V_{ij} = 0\). This happens if we choose

$$\begin{aligned} w_i(\phi ) = \frac{ \sum _j X_{ij} f_j(\phi ) }{ O(\phi )}, \end{aligned}$$
(4)

where

$$\begin{aligned} X_{ij} = (M^{-1})_{ij},\ \ M_{ij} = \int \frac{f_i(\phi )f_j(\phi ) }{O(\phi )} \, \mathrm{d}\phi . \end{aligned}$$
(5)

In this case,

$$\begin{aligned} c_i = \sum _j X_{ij} I_j = \sum _j (M^{-1})_{ij} I_j, \ \ I_j = \int f_j(\phi )\, \mathrm{d}\phi . \end{aligned}$$
(6)

For only this choice of weighting functions, the covariance matrix is

$$\begin{aligned} V_{ij} = \langle \varDelta c_i \varDelta c_j\rangle = \frac{ (M^{-1})_{ij} \sigma _T}{N}, \end{aligned}$$
(7)

where \(\sigma _T = \int O(\phi )\, \mathrm{d}\phi \). (If \(O(q^2) = \mathrm{d}\varGamma /\mathrm{d}q^2, \sigma _T=\varGamma \).) N is the total number of events, given by the integrated cross section times total luminosity times the efficiencies. This result holds even if there are applied cuts. The minimum of statistical uncertainty in the extraction of a parameter gives the maximum significance of that parameter over the others. Therefore, using this technique we can test the significance of a specific NP model over the other models, including the SM. In other words, given the data, one can say with what significance some observable may differentiate a particular type of NP from the SM. This significance, as one should emphasize here, depends on the observable chosen, on the parameters of the NP model, and on the integrated luminosity, all of which are intuitively obvious.
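As an illustration, Eqs. (1) to (7) can be sketched numerically as follows. This is a minimal toy example, not the analysis code: the two basis functions, the coefficients, the grid in \(\phi \) and the event number N are all hypothetical choices made only to show that the weighting functions of Eq. (4) return the input coefficients through Eq. (2).

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule on a fixed grid (written out to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def optimal_observables(f, c, phi):
    """Return (M, X, w, c_rec) for O(phi) = sum_i c_i f_i(phi).

    f   : array of shape (n, len(phi)) holding the known functions f_i(phi)
    c   : array of shape (n,) holding the coefficients c_i (the seed point)
    phi : 1D grid of the kinematic variable
    """
    O = c @ f                                                  # Eq. (1)
    n = len(c)
    M = np.array([[integrate(f[i] * f[j] / O, phi) for j in range(n)]
                  for i in range(n)])                          # Eq. (5)
    X = np.linalg.inv(M)
    w = (X @ f) / O                                            # Eq. (4)
    c_rec = np.array([integrate(w[i] * O, phi) for i in range(n)])  # Eq. (2)
    return M, X, w, c_rec

# Toy input (hypothetical): two positive basis functions on phi in [0, 1].
phi = np.linspace(0.0, 1.0, 2001)
f = np.vstack([np.ones_like(phi), 1.0 + phi**2])
c = np.array([1.0, 0.5])

M, X, w, c_rec = optimal_observables(f, c, phi)
sigma_T = integrate(c @ f, phi)
N = 1.0e4                      # hypothetical number of events
V = X * sigma_T / N            # Eq. (7)
print(c_rec)
print(V)
```

The recovered coefficients `c_rec` should agree with the input `c` up to rounding, and `V` is the covariance matrix that enters the \(\chi ^2\)-like quantity discussed below.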

As an example, suppose one is looking at the branching fractions of a B meson to several final states. For the final state f, the branching fraction can be expressed as

$$\begin{aligned} \mathcal{B}(B\rightarrow f)^{\mathrm{exp}} = \frac{1}{\varGamma } \, \int \frac{\mathrm{d}\varGamma }{\mathrm{d}q^2}\, \mathrm{d}q^2, \end{aligned}$$
(8)

where \(\varGamma \) is the total decay width. The uncertainties of \(c_i\)s in the parameter space, extracted from the branching fractions, can be written as [9]

$$\begin{aligned} |\delta c_i| = \sqrt{\frac{X_{ii} {\mathcal{B}(B\rightarrow f)}^{\mathrm{exp}}}{N}} = \sqrt{\frac{X_{ii}}{\sigma _P \mathcal{L}_{\mathrm{eff}}}}. \end{aligned}$$
(9)

As given in Eq. (9), the errors are also related to the total production cross section \(\sigma _P\) ( = \(\sigma _{B \rightarrow f}/\mathcal{B}(B \rightarrow f)\)), and the effective luminosity \(\mathcal{L}_{\mathrm{eff}} = \mathcal{L}_{\mathrm{int}} \epsilon _s\), where \(\mathcal{L}_{\mathrm{int}}\) and \(\epsilon _s\) are the integrated luminosity and reconstruction efficiency, respectively.
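The scaling in Eq. (9) can be made explicit with a short sketch. The function name and the numerical inputs below are hypothetical; the only assumption is that \(\sigma _P\) and \(\mathcal{L}_{\mathrm{int}}\) are given in mutually inverse units, so that their product is a pure number.

```python
import math

def delta_c(X_ii, sigma_P, lumi_int, efficiency):
    """Eq. (9): |delta c_i| = sqrt(X_ii / (sigma_P * L_eff)), with L_eff = L_int * efficiency.

    sigma_P and lumi_int must be in inverse units (e.g. fb and fb^-1),
    so that sigma_P * L_eff is dimensionless.
    """
    lumi_eff = lumi_int * efficiency
    return math.sqrt(X_ii / (sigma_P * lumi_eff))

# Hypothetical inputs, for illustration only.
small = delta_c(X_ii=4.0, sigma_P=1.0, lumi_int=2.0e3, efficiency=1.0e-3)
large = delta_c(X_ii=4.0, sigma_P=1.0, lumi_int=50.0e3, efficiency=1.0e-3)
print(small / large)
```

Since the uncertainty falls as \(1/\sqrt{\mathcal{L}_{\mathrm{eff}}}\), increasing the integrated luminosity by a factor of 25 at fixed efficiency reduces \(|\delta c_i|\) by a factor of 5.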

When the number of nonzero NP parameters is small, the analysis can also be done by defining a quantity analogous to \(\chi ^2\), such as

$$\begin{aligned} \chi ^2 = \sum _{i,j} (c_i - c_i^0) (c_j - c_j^0) V_{ij}^{-1}. \end{aligned}$$
(10)

The \(c_i^0\)s are called the seed values; they can be considered as model inputs. Thus, they are the values of \(c_i\)s with the parameter values chosen for the reference model. \(V_{ij}\)s are defined in Eq. (7).
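A minimal sketch of Eq. (10) is given below. The covariance matrix and the test points are hypothetical numbers; in an actual analysis V would come from Eq. (7).

```python
import numpy as np

def chi2(c, c0, V):
    """Eq. (10): chi^2 = sum_ij (c_i - c_i^0)(c_j - c_j^0) (V^-1)_ij."""
    d = np.asarray(c, dtype=float) - np.asarray(c0, dtype=float)
    # Solve V x = d rather than forming V^-1 explicitly.
    return float(d @ np.linalg.solve(np.asarray(V, dtype=float), d))

# Hypothetical covariance matrix; seed point with all NP coefficients set to zero.
V = np.array([[0.04, 0.01],
              [0.01, 0.09]])
c0 = np.zeros(2)
print(chi2([0.0, 0.0], c0, V))   # the seed point itself
print(chi2([0.4, 0.0], c0, V))   # a displaced point
```

By construction the function vanishes at the seed point, and for an uncorrelated parameter a displacement of n standard deviations gives \(\chi ^2 = n^2\).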

In the absence of data and with only an estimate of the effective luminosity (\({\mathcal {L}}_\mathrm{eff}\)), the only prediction one can make is a model-dependent prediction of the event distribution. The strength of the OO technique lies in the fact that, even in the absence of any data, it can predict the minimum uncertainty with which competing models can be separated from each other, given any experimentally measurable observable. For example, if we consider the SM (i.e. vanishing NP parameters), there would be regions in the NP parameter space which will be indistinguishable from the SM under a certain confidence level. The OO technique finds the highest precision (minimum-area error ellipse for any confidence level) with which any NP model (a specific point in the parameter space) can be distinguished from the SM. The technique works not only for the SM, but also for any other specific reference model, specified by the seed values \(c^0_i\) in Eq. (10).

As mentioned earlier, our goal is to study the decay \(b\rightarrow s~+\) invisible(s), which includes the exclusive modes like \(B \rightarrow K^{(*)} \nu {{\bar{\nu }}}\) (see footnote 1). In such decays, the major sources of uncertainties are the hadronic form factors, like \(F^{B\rightarrow K^{(*)}}(q^2)\). Thus, it is important to differentiate between SM and any possible NP taking into account all these uncertainties, and check whether the future experimental statistics will allow a clear separation of the two. Let us now explain briefly how to use a \(\chi ^2\) statistic test to pinpoint such a differentiation.

There are a few things one should take note of while interpreting the results of the OO technique.

  1. As is explained in detail in Sect. 4 of Ref. [10], the OO technique was introduced to measure the values of physical parameters \(g_i(i = 1,\ldots , m)\) giving small contributions to the differential cross section in a reaction. Expanding the differential cross section to first order in \(g_i\) gives \(\frac{\mathrm{d}\sigma }{\mathrm{d}\phi } = S_0 + \sum \nolimits _{i} S_{1,i} g_i\). Reference [10] goes on to show that the observables \({\mathcal {O}}_i = S_{1,i} / S_0\) are optimal in the sense that the error ellipsoids in the parameter space are of the smallest volume. These are equivalent to the \(f_i(\phi )\)s in our analysis (see Eq. (1)). Optimizing the covariance matrix of the \(f_i\)s in turn optimizes the covariance matrix of the \(c_i\)s. We use precisely this fact in our analysis in predicting the optimal isolation of competing models, along the lines of [7]. The ‘observables’ that we talk about, and those that the experimentalists measure, are not the same as those used in the context of OO.

  2. A regular \(\chi ^2\) statistic is a function of parameters of a model and ‘measures’ the deviation of those from observed values of some experimental quantities. The one we need to concoct, however, should take one specific model (e.g. SM) as reference, in place of experimental results, and should be a function of some parameters, each set of values of which indicates one comparison-worthy model (e.g. NP). For example, if we have a set of new operators \(O_i\) with \(c_i\) being the corresponding WCs, \(c_i=0\, \forall \, i\) is the SM and any other set is a NP model that may be compared with the SM. The \(\chi ^2\) should also be a measure of ‘separation’ (deviation) between any NP model and the reference one. In the rest of this analysis, the reference model will always be SM. By construction, we ensure that \(\chi ^2|_\mathrm{SM} = 0\) and \(\chi ^2 = n^2\) denotes a separation of \(n~\sigma \) from the SM.

  3. Projections of the constant \(\chi ^2 = 1\) contours on each parameter axis will give us the corresponding \(\delta c_i\)s. Following the point above, the constructed \(\chi ^2\) has no measurements or data points in it, and thus the \(\delta c_i\)s obtained are not statistical uncertainties on the \(c_i\)s, nor are they predictions for them. These are uncertainties of the reference model, or, in other words, a measure of the region in parameter space where the reference model is indistinguishable from models parametrized by surrounding points in the parameter space. Thus, points on the \(1\sigma \) contour parametrize models that can be distinguished from the SM at \(1\sigma \) level only.

  4. When varying the \(\chi ^2\) over the allowed parameter space, \(V_{i j}^{-1}\) also depends on the parameter values of the reference model through \(O(\phi )\), which comes in the denominator of \(M_{i j}\) (Eq. (5)). This we will call the seed dependence.

  5. The covariance matrix \(V_{i j}\) as well as the \(\chi ^2\) is obtained using the central values of all the parameters. So any separation obtained after the analysis, though qualitatively correct, has to be modified after inclusion of the SM errors.

  6. If considered, theoretical uncertainties in \(O(\phi )\) will in turn introduce an uncertainty in the \(\chi ^2\). In other words, in the presence of the uncertainties, the \(n~\sigma \) contours will become bands of nonzero width in the parametric space.
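The statement that the projections of the \(\chi ^2 = 1\) contour onto the parameter axes give the \(\delta c_i\)s can be checked numerically. The sketch below is restricted to two parameters and uses a hypothetical covariance matrix; it samples the contour and compares its extent along each axis with \(\sqrt{V_{ii}}\).

```python
import numpy as np

def contour_extent(V, n_sigma=1.0, n_points=100000):
    """Largest |c_i - c_i^0| reached on the contour chi^2 = n_sigma^2, for a 2x2 V."""
    L = np.linalg.cholesky(V)                    # V = L L^T
    t = np.linspace(0.0, 2.0 * np.pi, n_points)
    unit_circle = np.vstack([np.cos(t), np.sin(t)])
    points = n_sigma * (L @ unit_circle)         # each column satisfies chi^2 = n_sigma^2
    return np.max(np.abs(points), axis=1)

# Hypothetical covariance matrix with a nonzero correlation.
V = np.array([[0.04, 0.01],
              [0.01, 0.09]])
print(contour_extent(V))
print(np.sqrt(np.diag(V)))
```

The two printed arrays should agree to good accuracy, and the extent scales linearly with `n_sigma`, in line with \(\chi ^2 = n^2\) denoting an \(n~\sigma \) separation.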

For completeness, we will also provide the decay rate distributions of the processes under consideration and test their usefulness in differentiating the NP models from the SM.

3 New physics models and observables

3.1 Only neutrinos as invisible

The first NP model treats neutrinos as the only carriers of missing energy in \(b\rightarrow s\) decays. For simplicity, we assume both the absence of lepton flavor violation (LFV) and lepton flavor universality (LFU). This means that all three flavors of \(\nu {\bar{\nu }}\) pairs are produced in equal numbers even by the NP operators, and there are no \(\nu _i{\bar{\nu }}_j\) final states with \(i\not =j\). Note that both these assumptions can be violated in specific NP models.

The effective Hamiltonian for \(b\rightarrow s \nu _i{\bar{\nu }}_i\) can be written as

$$\begin{aligned} \mathcal{H}_{\mathrm{eff}} = \frac{4G_F}{\sqrt{2}} V_{tb} V^*_{ts} \left[ C_\mathrm{SM} O_\mathrm{SM} + C_{V_1}O_{V_1} + C_{V_2} O_{V_2}\right] , \nonumber \\ \end{aligned}$$
(11)

where

$$\begin{aligned} O_\mathrm{SM}= & {} O_{V_1} = \left( \bar{s}_L \gamma ^\mu b_L\right) \left( {\bar{\nu }}_{iL}\gamma _\mu \nu _{iL}\right) , \nonumber \\ O_{V_2}= & {} \left( \bar{s}_R \gamma ^\mu b_R\right) \left( {\bar{\nu }}_{iL}\gamma _\mu \nu _{iL}\right) . \end{aligned}$$
(12)

Note that \(O_\mathrm{SM}\) and \(O_{V_1}\) are identical only with our assumption of LFU and no LFV. NP without LFU can mean, in an extreme case, that only one flavor of neutrino will be present; with LFV, the two neutrinos can be of different flavor. Similar considerations apply for \(O_{V_2}\). Under our simplifying assumptions, one can write Eq. (11) as

$$\begin{aligned} \mathcal{H}_{\mathrm{eff}} = \frac{4G_F}{\sqrt{2}} V_{tb} V^*_{ts}C_\mathrm{SM} \left[ (1+C'_1) O_{V_1} + C'_2 O_{V_2}\right] \end{aligned}$$
(13)

in terms of the scaled Wilson coefficients defined as \(C'_{1,2} \equiv C_{V_{1,2}}/C_\mathrm{SM}\), with

$$\begin{aligned} C_\mathrm{SM} = \frac{\alpha }{2\pi \sin ^2\theta _W} X_t(x_t). \end{aligned}$$
(14)

If the NP is also at the loop level, we expect \(|C'_1|, |C'_2| \sim \mathcal{O}(1)\). If it is at tree level, \(|C'_1|, |C'_2| \gg 1\). At the leading order, the Inami–Lim function \(X_t\) is given by

$$\begin{aligned} X_t^\mathrm{LO} = \frac{x_t}{8} \left[ \frac{x_t+2}{x_t-1} + 3 \frac{x_t-2}{(x_t-1)^2} \ln x_t\right] , \end{aligned}$$
(15)

with \(x_t = m_t^2/m_W^2\).
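As a quick numerical check, the leading-order Inami–Lim function and the coefficient of Eq. (14) can be evaluated as sketched below. The code uses the standard form of the function, in which the logarithmic term enters with a plus sign. The masses \(m_t\) and \(m_W\) are not quoted in the text; the values used here are assumptions made only for illustration.

```python
import math

def X_t_LO(x_t):
    """Leading-order Inami-Lim function X_0(x_t); the log term carries a plus sign."""
    return (x_t / 8.0) * ((x_t + 2.0) / (x_t - 1.0)
                          + 3.0 * (x_t - 2.0) / (x_t - 1.0) ** 2 * math.log(x_t))

def C_SM(alpha, sin2_theta_W, X_t):
    """Eq. (14): C_SM = alpha / (2 pi sin^2(theta_W)) * X_t."""
    return alpha / (2.0 * math.pi * sin2_theta_W) * X_t

# Assumed masses (not given in the text): m_t ~ 163 GeV, m_W ~ 80.4 GeV.
m_t, m_W = 163.0, 80.4
x_t = (m_t / m_W) ** 2
print(X_t_LO(x_t))
print(C_SM(alpha=1.0 / 127.925, sin2_theta_W=0.2313, X_t=1.469))
```

With these assumed masses the leading-order value comes out close to, but not identical with, the value of \(X_t\) adopted for the numerical analysis, which presumably includes higher-order corrections.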

We will use the following numbers for our subsequent analysis:

$$\begin{aligned} m_B= & {} 5.280~\mathrm{GeV},\ \ m_{K^*}=0.896~\mathrm{GeV},\nonumber \\ m_{K}= & {} 0.498~\mathrm{GeV}, \ \ m_s = 0.096~\mathrm{GeV} ,\nonumber \\ |V_{tb}V_{ts}^*|= & {} 0.0401,\ \ \sin ^2\theta _W = 0.2313,\nonumber \\ X_t= & {} 1.469,\ \ \tau _B = 1.519~\mathrm{ps},\nonumber \\ G_F= & {} 1.166\times 10^{-5}~\mathrm{GeV}^{-2},\ \ \alpha = 1/127.925, \end{aligned}$$
(16)

where \(\tau _B\) is the lifetime of the B meson.

The exclusive differential decay distributions are given by [11, 12]

$$\begin{aligned} \frac{\mathrm{d}\varGamma _{B \rightarrow K\nu {\bar{\nu }}}}{\mathrm{d}q^2}= & {} \frac{G_F^2 \alpha ^2}{256\pi ^5}\, \frac{|V_{tb} V_{ts}^*|^2 X_t^2}{m_B^3 \sin ^4\theta _W} \, \lambda ^{3/2}(m_B^2,m_K^2,q^2)\nonumber \\&\times \left[ f_+^K(q^2)\right] ^2 \left| 1+ C'_1 + C'_2\right| ^2,\nonumber \\ \frac{\mathrm{d}\varGamma _{B \rightarrow K^*\nu {\bar{\nu }}}}{\mathrm{d}q^2}= & {} \frac{G_F^2 \alpha ^2}{256\pi ^5}\, \frac{|V_{tb} V_{ts}^*|^2 X_t^2}{m_B^3 \sin ^4\theta _W} \, q^2\lambda ^{1/2}(m_B^2,m_{K^*}^2,q^2) \nonumber \\&\times \left[ \left( |1+C'_1|^2+|C'_2|^2 \right) \right. \nonumber \\&\times \left( H^2_{V,+} + H^2_{V,-} + H^2_{V,0} \right) \nonumber \\&-\,2 \mathrm{Re}[(1+C'_1){C'_2}^*] \nonumber \\&\times \left. \left( H^2_{V,0} + 2 H_{V,+}H_{V,-} \right) \right] , \end{aligned}$$
(17)

and the inclusive distribution by

$$\begin{aligned} \frac{\mathrm{d}\varGamma _{B \rightarrow X_s\nu {\bar{\nu }}}}{\mathrm{d}q^2}= & {} \frac{G_F^2 \alpha ^2}{128 \pi ^5}\, \frac{|V_{tb} V_{ts}^*|^2 X_t^2}{m_b^3 \sin ^4\theta _W} \kappa (0) \left( |1+C'_1|^2 + |C'_2|^2\right) \nonumber \\&\times \lambda ^{1/2}(m_b^2,m_s^2,q^2) \nonumber \\&\times \left[ 3q^2 \left( \phantom {\left. \frac{ \mathrm{Re}[(1+C'_1){C'_2}^*]}{ |1+C'_1|^2 + |C'_2|^2} \right) }m_b^2+m_s^2 - q^2 -4m_sm_b \right. \right. \nonumber \\&\times \left. \left. \frac{ \mathrm{Re}[(1+C'_1){C'_2}^*]}{ |1+C'_1|^2 + |C'_2|^2} \right) + \lambda (m_b^2,m_s^2,q^2)\right] ,\nonumber \\ \end{aligned}$$
(18)

where

$$\begin{aligned} \lambda (a,b,c) = a^2+b^2+c^2-2\left( ab+bc+ca\right) , \end{aligned}$$
(19)

and \(\kappa (0) = 0.83\) is the QCD correction factor. Note that the structure of the interference term in \(|1+C'_1|^2\) changes if \(O_{V_1}\) has LFV or non-LFU nature.
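The rates in Eqs. (17) and (18) depend on the scaled Wilson coefficients only through two combinations, \(|1+C'_1|^2+|C'_2|^2\) and \(\mathrm{Re}[(1+C'_1){C'_2}^*]\), which is the linear structure assumed in Eq. (1). A minimal sketch follows; the function names are hypothetical, and the helicity amplitudes are passed in as plain numbers rather than computed.

```python
def np_combinations(C1, C2):
    """Return (a, b) with a = |1+C1'|^2 + |C2'|^2 and b = Re[(1+C1') C2'^*]."""
    z1 = 1.0 + complex(C1)
    z2 = complex(C2)
    a = abs(z1) ** 2 + abs(z2) ** 2
    b = (z1 * z2.conjugate()).real
    return a, b

def kstar_bracket(C1, C2, H_plus, H_minus, H_zero):
    """Square bracket of the B -> K* nu nubar rate in Eq. (17)."""
    a, b = np_combinations(C1, C2)
    return (a * (H_plus**2 + H_minus**2 + H_zero**2)
            - 2.0 * b * (H_zero**2 + 2.0 * H_plus * H_minus))

def k_factor(C1, C2):
    """NP factor |1 + C1' + C2'|^2 of the B -> K nu nubar rate in Eq. (17)."""
    return abs(1.0 + complex(C1) + complex(C2)) ** 2

# SM point: a = 1, b = 0, so the bracket reduces to the sum of squared amplitudes.
print(np_combinations(0.0, 0.0))
print(kstar_bracket(0.0, 0.0, 1.0, 2.0, 3.0))
```

Note that \(|1+C'_1+C'_2|^2 = a + 2b\), so the \(B\rightarrow K\) mode probes a different linear combination of the same two quantities than the \(B\rightarrow K^*\) mode does.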

For \(B\rightarrow K^*\), the form factors are defined in terms of the conventional set as

$$\begin{aligned} H_{V,\pm }(q^2)&= M A_1(q^2) \mp \frac{\lambda ^{1/2}(m_B^2, m_{K^*}^2,q^2) V(q^2)}{M},\nonumber \\ H_{V,0}(q^2)&= \frac{M}{2m_{K^*}\sqrt{q^2}} \left[ \frac{ \lambda (m_B^2, m_{K^*}^2,q^2)}{M^2} A_2(q^2) \right. \nonumber \\&\quad \left. \phantom {\left[ \frac{ \lambda (m_B^2, m_{K^*}^2,q^2)}{M^2} A_2(q^2) \right. }- \left( m_B^2 - m_{K^*}^2-q^2\right) A_1(q^2)\right] , \end{aligned}$$
(20)

where \(M = m_B+m_{K^*}\). To get the form factors, one first defines the function

$$\begin{aligned} z(q^2) = \frac{ \sqrt{t_+ - q^2} - \sqrt{t_+ - t_0}}{\sqrt{t_+ - q^2} + \sqrt{t_+ - t_0}}, \end{aligned}$$
(21)

where

$$\begin{aligned} t_{\pm } = \left( m_B \pm m_{K^{(*)}}\right) ^2, \ \ t_0 = t_+ \left( 1 - \sqrt{1-t_- / t_+}\right) . \end{aligned}$$
(22)

Then we define the generic structure as

$$\begin{aligned} F_i(q^2) = \frac{1}{1-q^2/m_P^2} \sum _k \alpha ^i_k \left[ z(q^2) - z(0)\right] ^k, \end{aligned}$$
(23)

where the pole masses \(m_P\) are

$$\begin{aligned} V: 5.415~\mathrm{GeV},\ \ A_0: 5.366~\mathrm{GeV},\ \ A_1,A_{12}: 5.829~\mathrm{GeV},\nonumber \\ \end{aligned}$$
(24)

and [13]

$$\begin{aligned} \alpha _0^V&= 0.38(3),\ \ \alpha _1^V = -1.17(26),\ \ \alpha _2^V = 2.42\pm 1.53,\nonumber \\ \alpha _0^{A_0}&= 0.37(3),\ \alpha _1^{A_0} = -1.37(26),\ \alpha _2^{A_0} = 0.13\pm 1.63,\nonumber \\ \alpha _0^{A_1}&= 0.30(3),\ \alpha _1^{A_1} = 0.39(19),\ \ \ \alpha _2^{A_1} = 1.19\pm 1.03,\nonumber \\ \alpha _0^{A_{12}}&= 0.27(2),\alpha _1^{A_{12}} = 0.53(13),\ \ \alpha _2^{A_{12}} = 0.48\pm 0.66. \end{aligned}$$
(25)

Here \(A_2\) has been replaced by \(A_{12}\), given by

$$\begin{aligned} A_{12}(q^2) =\frac{ M^2\left( m_B^2-m_{K^*}^2-q^2\right) A_1 - \lambda (m_B^2,m_{K^*}^2,q^2) A_2}{16 M m_B m_{K^*}^2}. \end{aligned}$$
(26)

The form factor \(A_0\) will be needed when we discuss the decays to light invisible scalars.
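The form factors of Eq. (23) and the helicity amplitudes of Eq. (20) can be evaluated as sketched below, using only the central values of Eqs. (24) and (25) and ignoring their uncertainties and correlations. Substituting Eq. (26) into the expression for \(H_{V,0}\) in Eq. (20) gives \(H_{V,0} = -8 m_B m_{K^*} A_{12}/\sqrt{q^2}\), which is the form used in the code; the function and variable names are hypothetical.

```python
import math

M_B, M_KSTAR = 5.280, 0.896          # GeV, Eq. (16)
T_PLUS = (M_B + M_KSTAR) ** 2
T_MINUS = (M_B - M_KSTAR) ** 2
T_ZERO = T_PLUS * (1.0 - math.sqrt(1.0 - T_MINUS / T_PLUS))   # Eq. (22)

# Central values of Eq. (25) and pole masses of Eq. (24); uncertainties are ignored.
ALPHA = {"V": (0.38, -1.17, 2.42), "A0": (0.37, -1.37, 0.13),
         "A1": (0.30, 0.39, 1.19), "A12": (0.27, 0.53, 0.48)}
M_POLE = {"V": 5.415, "A0": 5.366, "A1": 5.829, "A12": 5.829}

def z(q2):
    """Conformal variable of Eq. (21)."""
    a, b = math.sqrt(T_PLUS - q2), math.sqrt(T_PLUS - T_ZERO)
    return (a - b) / (a + b)

def form_factor(name, q2):
    """Eq. (23) with the central values of the expansion coefficients."""
    dz = z(q2) - z(0.0)
    series = sum(alpha * dz**k for k, alpha in enumerate(ALPHA[name]))
    return series / (1.0 - q2 / M_POLE[name] ** 2)

def kallen(a, b, c):
    """Eq. (19)."""
    return a*a + b*b + c*c - 2.0 * (a*b + b*c + c*a)

def helicity_amplitudes(q2):
    """H_{V,+}, H_{V,-}, H_{V,0} of Eq. (20), with A_2 eliminated through Eq. (26).

    Valid for 0 < q2 <= (m_B - m_K*)^2, where the Kallen function is non-negative.
    """
    M = M_B + M_KSTAR
    lam = kallen(M_B**2, M_KSTAR**2, q2)
    A1, V = form_factor("A1", q2), form_factor("V", q2)
    H_plus = M * A1 - math.sqrt(lam) * V / M
    H_minus = M * A1 + math.sqrt(lam) * V / M
    H_zero = -8.0 * M_B * M_KSTAR * form_factor("A12", q2) / math.sqrt(q2)
    return H_plus, H_minus, H_zero

print(form_factor("V", 0.0))
print(helicity_amplitudes(5.0))
```

At \(q^2=0\) each form factor reduces to its leading coefficient \(\alpha _0^i\), since \(z(q^2)-z(0)\) vanishes there.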

The scalar and vector form factors, \(f_0\) and \(f_+\), for \(B \rightarrow K\) are given by [14]

$$\begin{aligned} f_0(q^2)&= \sum _{k=0}^{K}a^0_k z(q^2)^k, \nonumber \\ f_+(q^2)&= \frac{1}{1-q^2/m_P^2} \nonumber \\&\quad \times \sum _{k=0}^{K-1}a^+_k \left[ z(q^2)^k-(-1)^{k-K}\frac{k}{K}z(q^2)^K\right] , \end{aligned}$$
(27)

where

$$\begin{aligned} m_P = m_B + \varDelta ^*_+ \end{aligned}$$
(28)

and

$$\begin{aligned} a_0^0= & {} 0.550(20),\ \ a_1^0=-1.89(23),\ \ a_2^0=1.98(1.24),\nonumber \\ a_3^0= & {} -0.02(2.00),\ \ a_0^+=0.432(15),\ \ a_1^+=-0.65(23),\nonumber \\ a_2^+= & {} -0.97(1.24),\ \ \varDelta ^*_+ =0.04578(35). \end{aligned}$$
(29)
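A minimal sketch of how Eqs. (27) to (29) feed into the \(B\rightarrow K\nu {\bar{\nu }}\) rate of Eq. (17) is given below. It uses central values only, takes \(t_0\) from Eq. (22), and integrates with a simple midpoint rule; the conversion constant \(\hbar = 6.582\times 10^{-25}\) GeV s and all function names are assumptions introduced for this illustration.

```python
import math

M_B, M_K = 5.280, 0.498                        # GeV, Eq. (16)
G_F, ALPHA_EM = 1.166e-5, 1.0 / 127.925        # Eq. (16)
V_TBTS, SIN2_TW, X_T = 0.0401, 0.2313, 1.469   # Eq. (16)
TAU_B = 1.519e-12 / 6.582e-25                  # lifetime in GeV^-1 (hbar = 6.582e-25 GeV s)

T_PLUS, T_MINUS = (M_B + M_K) ** 2, (M_B - M_K) ** 2
T_ZERO = T_PLUS * (1.0 - math.sqrt(1.0 - T_MINUS / T_PLUS))   # Eq. (22)

A0 = (0.550, -1.89, 1.98, -0.02)               # a_k^0, central values of Eq. (29)
AP = (0.432, -0.65, -0.97)                     # a_k^+, central values of Eq. (29)
M_POLE = M_B + 0.04578                         # Eq. (28)

def z(q2):
    a, b = math.sqrt(T_PLUS - q2), math.sqrt(T_PLUS - T_ZERO)
    return (a - b) / (a + b)

def f_zero(q2):
    return sum(a * z(q2) ** k for k, a in enumerate(A0))

def f_plus(q2):
    K, zq = len(AP), z(q2)
    # (-1)**(K - k) equals (-1)**(k - K) and keeps the exponent non-negative.
    series = sum(a * (zq**k - (-1) ** (K - k) * (k / K) * zq**K)
                 for k, a in enumerate(AP))
    return series / (1.0 - q2 / M_POLE**2)

def kallen(a, b, c):
    return a*a + b*b + c*c - 2.0 * (a*b + b*c + c*a)

def dGamma_dq2_K(q2, C1=0.0, C2=0.0):
    """B -> K nu nubar rate of Eq. (17), in GeV^-1."""
    pref = (G_F**2 * ALPHA_EM**2 * V_TBTS**2 * X_T**2
            / (256.0 * math.pi**5 * M_B**3 * SIN2_TW**2))
    lam = max(kallen(M_B**2, M_K**2, q2), 0.0)
    return pref * lam**1.5 * f_plus(q2) ** 2 * abs(1.0 + C1 + C2) ** 2

def branching_ratio_K(C1=0.0, C2=0.0, n=2000):
    """Midpoint-rule integral over 0 < q^2 < (m_B - m_K)^2, multiplied by tau_B."""
    h = T_MINUS / n
    return TAU_B * h * sum(dGamma_dq2_K((i + 0.5) * h, C1, C2) for i in range(n))

print(f_plus(0.0), f_zero(0.0))
print(branching_ratio_K())
```

Two simple consistency checks are that \(f_+(0)\) and \(f_0(0)\) nearly coincide, and that the branching ratio scales with the overall factor \(|1+C'_1+C'_2|^2\).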

Another observable that we may use is the modified transverse polarization fraction of \(K^*\) in \(B\rightarrow K^*\nu {\bar{\nu }}\) decays, defined as

$$\begin{aligned} F'_T \equiv \frac{\mathrm{d}\varGamma _T/\mathrm{d}q^2}{\int (\mathrm{d}\varGamma /\mathrm{d}q^2)\, \mathrm{d}q^2} = \tau _B \frac{\mathrm{d}\varGamma _T}{\mathrm{d}q^2}. \end{aligned}$$
(30)

Note that the denominator has been integrated over, to give an overall normalization. It can easily be shown that

$$\begin{aligned} \frac{\mathrm{d}\varGamma _T}{\mathrm{d}q^2}&= \frac{G_F^2 \alpha ^2}{256\pi ^5}\, \frac{|V_{tb} V_{ts}^*|^2 X_t^2}{m_B^3 \sin ^4\theta _W} \, q^2\lambda ^{1/2}(m_B^2,m_{K^*}^2,q^2) \nonumber \\&\quad \times \left[ \left( |1+C'_1|^2+|C'_2|^2 \right) \left( H^2_{V,+} + H^2_{V,-} \right) \right. \nonumber \\&\quad \left. -\,4 \mathrm{Re}[(1+C'_1){C'_2}^*] H_{V,+}H_{V,-} \right] . \end{aligned}$$
(31)

From the experimental bounds on the branching fractions, namely,

$$\begin{aligned} \mathrm{Br}(B\rightarrow K\nu {\bar{\nu }})&< 1.7\times 10^{-5},\ \nonumber \\ \mathrm{Br}(B\rightarrow K^*\nu {\bar{\nu }})&< 7.6\times 10^{-5}, \end{aligned}$$
(32)

at 90% CL, we get the following approximate constraints on the scaled Wilson coefficients, assuming them to be real (but not necessarily positive):

$$\begin{aligned}&-3.0 \, \le C'_1+C'_2 \le \, 1.0,\ \nonumber \\&\quad \left( |1+{C'_1}|^2 + |C'_2|^2\right) -1.3 \left( 1+C'_1\right) C'_2 \, \le \, 8.0. \end{aligned}$$
(33)
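The two conditions of Eq. (33) are easy to encode when scanning the parameter space. The sketch below assumes real coefficients, as in the text; the function name is hypothetical. The first condition is equivalent to \(|1+C'_1+C'_2|\le 2\).

```python
def allowed_by_eq33(C1, C2):
    """Check real scaled Wilson coefficients against the two inequalities of Eq. (33)."""
    k_bound = -3.0 <= C1 + C2 <= 1.0
    kstar_bound = (1.0 + C1) ** 2 + C2**2 - 1.3 * (1.0 + C1) * C2 <= 8.0
    return k_bound and kstar_bound

print(allowed_by_eq33(0.0, 0.0))    # SM point
print(allowed_by_eq33(2.0, 0.0))    # violates the first condition
```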

The allowed parameter spaces are shown in Fig. 1. Belle has a recent update [2], mostly on the \(B\rightarrow K^*\nu {\bar{\nu }}\) mode:

$$\begin{aligned} \mathrm{Br}(B\rightarrow K\nu {\bar{\nu }})&< 1.6\times 10^{-5},\ \nonumber \\ \mathrm{Br}(B\rightarrow K^*\nu {\bar{\nu }})&< 2.7\times 10^{-5}. \end{aligned}$$
(34)

While our analysis uses the old parameter space, the conclusions, as we will see, carry over straightforwardly to the new data. Figure 1 also shows the updated parameter space.

We will show (with the OO technique) how much of the parameter space can be successfully differentiated from the SM, and with what confidence level. In our analysis, we have noted that the errors extracted on the new WCs are independent of the choices of the seed values. As mentioned earlier, these seed values can be chosen from the allowed NP parameter space. Obviously, depending on the data, different observables will have different powers to differentiate NP effects from the SM. As the values are not known a priori, one has to look at all the observables and the pattern of the signal to have an idea of the underlying model.

Fig. 1 Allowed ranges of \(C'_1\) and \(C'_2\). Also shown is the truncated region allowed by the recent Belle data [2]

3.2 Light invisible scalar

Another possibility is to consider the decay \(b \rightarrow sSS\) where S is some gauge singlet scalar, which can be a cold DM candidate. In the Higgs portal DM models, S couples only to the SM doublet \(\varPhi \) through a term like \(a_2 S^2\varPhi ^\dag \varPhi \rightarrow \frac{1}{2} a_2 S^2 h^2\) in the Lagrangian, where h is the SM Higgs field. If \(m_S < m_h/2\), the invisible decay \(h\rightarrow SS\) opens up, and one must keep \(a_2\) to be sufficiently small to avoid the LHC bound on such invisible decay channels: \(\mathrm{BR}(h\rightarrow \mathrm{invisible}) < 10\%\). Although this is in contradiction to a thermalized cold DM giving the correct relic density of the universe, the singlets can form only a part of the relic density and may even be nonthermal in nature.

One has to be careful about the construction of effective operators. At first sight, it may appear that an effective dimension-6 operator \(\bar{s}_L b_R \varPhi S^2\) may lead to the decay \(b\rightarrow sSS\) when \(\varPhi \) is replaced by its vacuum expectation value (VEV). This is indeed the case if S does not have any VEV, which is essential if S is a DM candidate (otherwise it will mix with h and decay to SM final states). On the other hand, the Higgs penguin diagrams like \(b\rightarrow sh^*, h^*\rightarrow SS\), as discussed in some literature [15, 16], cannot be there if the electroweak symmetry is broken by a single Higgs field. The reason is that the effective off-diagonal Yukawa coupling \(y_{bsh}\) is proportional to the off-diagonal mass term \(m_{bs}\) in the mass matrix, and once one goes to the mass (stationary) basis, such off-diagonal Yukawa couplings must vanish. This obstacle can be circumvented if more than one field is responsible for symmetry breaking, or if there are higher dimensional operators that are quadratic or of higher order in \(\varPhi \), so that the proportionality of the Yukawa matrix and the mass matrix gets spoiled (see footnote 2). Here, we will just assume the existence of a set of effective operators and explore the consequences.

Fig. 2 Allowed ranges of \(C_{S_1}\) and \(C_{S_2}\) for \(m_S = 0.5\) GeV and \(m_S=1.8\) GeV, taking \(\varLambda =1\) TeV. Also shown is the truncated region allowed by the recent Belle data [2]

We start with an effective Lagrangian of the form

$$\begin{aligned}&\mathcal{L}_{b\rightarrow sSS} = \frac{1}{\varLambda ^2} \left[ C_{S_1} m_b \bar{s}_L b_R S^2 + C_{S_2} m_b \bar{b}_L s_R S^2 + \mathrm{H.c.}\right] ,\nonumber \\ \end{aligned}$$
(35)

and assume only the SM operator to be present for the \(b\rightarrow s\nu {\bar{\nu }}\) decay, so that

$$\begin{aligned} \left. \frac{\mathrm{d}\varGamma }{\mathrm{d}q^2} \right| _{b\rightarrow s+\mathrm{invis}} = \left. \frac{\mathrm{d}\varGamma }{\mathrm{d}q^2} \right| _{b\rightarrow s\nu {\bar{\nu }}} + \left. \frac{\mathrm{d}\varGamma }{\mathrm{d}q^2} \right| _{b\rightarrow sSS}. \end{aligned}$$
(36)

Here, \(C_{S_1}\) and \(C_{S_2}\) are dimensionless numbers, typically of the order of unity or less, and \(\varLambda \) is the generic mass scale of new physics which gives rise to such operators. Unless otherwise mentioned, we will take \(\varLambda =1\) TeV for our analysis, reminding the reader that both \(C_{S_1}\) and \(C_{S_2}\) scale as \(\varLambda ^2\). Following Ref. [11], one gets

$$\begin{aligned} \frac{\mathrm{d}\varGamma _{B \rightarrow KSS}}{\mathrm{d}q^2}= & {} \frac{ f_0^2(q^2) (m_B^2-m_K^2)^2 \left| C_{S_1}+C_{S_2}\right| ^2}{512\pi ^3 m_B^3 \varLambda ^4} \nonumber \\&\times \, \sqrt{1-\frac{4m_S^2}{q^2}} \lambda ^{1/2}(m_B^2,m_{K}^2,q^2),\nonumber \\ \frac{\mathrm{d}^2\varGamma _{B \rightarrow K^*SS}}{\mathrm{d}q^2\, \mathrm{d}\cos \theta }= & {} \frac{ 3 A_0^2(q^2) \left| C_{S_1}-C_{S_2}\right| ^2}{1024\pi ^3 m_B^3 \varLambda ^4} \nonumber \\&\times \,\sqrt{1-\frac{4m_S^2}{q^2}} \lambda ^{3/2}(m_B^2,m_{K^*}^2,q^2) \, \cos ^2\theta ,\nonumber \\ \frac{\mathrm{d}\varGamma _{B \rightarrow X_sSS}}{\mathrm{d}q^2}= & {} \frac{|C_{S_1}|^2 + |C_{S_2}|^2}{128 \pi ^3 m_b \varLambda ^4}\, \nonumber \\&\times \, \sqrt{1-\frac{4m_S^2}{q^2}} \lambda ^{1/2}(m_b^2,m_s^2,q^2) \nonumber \\&\times \left[ (m_b^2+m_s^2-q^2) - 4 m_s m_b \frac{ \mathrm{Re}[C_{S_1}C_{S_2}^*]}{ |C_{S_1}|^2 + |C_{S_2}|^2}\right] , \nonumber \\ \end{aligned}$$
(37)

where the form factors \(f_0(q^2)\) and \(A_0(q^2)\) can be obtained from Eqs. (27) and (23), respectively.
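The \(B\rightarrow KSS\) rate of Eq. (37) can be sketched as follows. The value of \(f_0(q^2)\) is passed in as a number, and the inputs used in the example call are hypothetical; the function name is likewise an assumption. The scalar pair can be produced only for \(4m_S^2 \le q^2 \le (m_B-m_K)^2\), which the code enforces explicitly.

```python
import math

M_B, M_K = 5.280, 0.498   # GeV, Eq. (16)

def kallen(a, b, c):
    return a*a + b*b + c*c - 2.0 * (a*b + b*c + c*a)

def dGamma_dq2_KSS(q2, f0, C_S1, C_S2, m_S, Lambda=1000.0):
    """B -> K S S rate of Eq. (37), in GeV^-1.

    f0 is the value of f_0(q^2); m_S and Lambda are in GeV (Lambda = 1 TeV by default).
    """
    if q2 <= 4.0 * m_S**2 or q2 >= (M_B - M_K) ** 2:
        return 0.0                       # outside the kinematically allowed range
    lam = kallen(M_B**2, M_K**2, q2)
    phase_space = math.sqrt(1.0 - 4.0 * m_S**2 / q2) * math.sqrt(lam)
    return (f0**2 * (M_B**2 - M_K**2) ** 2 * abs(C_S1 + C_S2) ** 2
            / (512.0 * math.pi**3 * M_B**3 * Lambda**4)) * phase_space

# Hypothetical inputs, for illustration only.
print(dGamma_dq2_KSS(10.0, f0=0.4, C_S1=1.0, C_S2=0.5, m_S=0.5))
```

The rate vanishes for \(C_{S_1}=-C_{S_2}\), so this mode is blind to that direction in the parameter space, while the \(B\rightarrow K^*SS\) mode depends on \(|C_{S_1}-C_{S_2}|^2\) instead.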

Fig. 3 (a) and (b) The SM-NP differentiating \(\chi ^2\) contours for the exclusive and the inclusive channels coming from \(b\rightarrow s \nu {\bar{\nu }}\), where the left and the right panels are for \(\mathcal{L}_{\mathrm{int}}= 50\,\hbox {ab}^{-1}\) and 2 \(\hbox {ab}^{-1}\), respectively. The \(q^2\) (in \(\hbox {GeV}^2\)) distributions of the decay rates are shown in (c) and (d), respectively, with \(\mathcal{L}_{\mathrm{int}}= 50\,\hbox {ab}^{-1}\) for two benchmark scenarios of NP. For these and subsequent plots, we have not shown anything beyond \(9\sigma \)

Fig. 4 The contours from the measurement of \(F^{\prime }_T\) for \(\mathcal{L}_{\mathrm{int}}=50\,\hbox {ab}^{-1}\) (left), 2 \(\hbox {ab}^{-1}\) (right)

For the decay \(B\rightarrow K^* SS\), all \(K^*\)s are longitudinally polarized. We define a modified longitudinal polarization fraction

$$\begin{aligned} F'_L\equiv & {} \frac{\mathrm{d}\varGamma _L/\mathrm{d}q^2}{\int (\mathrm{d}\varGamma /\mathrm{d}q^2)\,\mathrm{d}q^2} = \tau _B \frac{\mathrm{d}\varGamma _L}{\mathrm{d}q^2} \nonumber \\= & {} \tau _B \left[ \frac{\mathrm{d}\varGamma _L}{\mathrm{d}q^2}_{B\rightarrow K^*\nu {\bar{\nu }}} + \frac{\mathrm{d}\varGamma }{\mathrm{d}q^2}_{B\rightarrow K^*SS} \right] , \end{aligned}$$
(38)

which turns out to be

$$\begin{aligned} F'_L&= \frac{G_F^2 \alpha ^2}{256\pi ^5}\, \frac{|V_{tb} V_{ts}^*|^2 X_t^2}{m_B^3 \sin ^4\theta _W} \tau _B \, q^2\lambda ^{1/2}(m_B^2,m_{K^*}^2,q^2) \, H_{V,0}^2 \nonumber \\&\quad + \left| C_{S_1}-C_{S_2}\right| ^2 \frac{A_0^2(q^2)}{512 \pi ^3 m_B^3 \varLambda ^4} \tau _B \, \nonumber \\&\quad \times \sqrt{1- \frac{4m_S^2}{q^2} }\, \lambda ^{3/2}(m_B^2,m_{K^*}^2,q^2). \end{aligned}$$
(39)
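The scalar term in Eq. (39) can be cross-checked against the second line of Eq. (37). A short worked step, assuming that \(\cos \theta \) is integrated over its full range \([-1,1]\):

$$\begin{aligned} \int _{-1}^{1}\cos ^2\theta \,\mathrm{d}\cos \theta = \frac{2}{3} \quad \Rightarrow \quad \frac{\mathrm{d}\varGamma _{B \rightarrow K^*SS}}{\mathrm{d}q^2} = \frac{A_0^2(q^2)\left| C_{S_1}-C_{S_2}\right| ^2}{512\pi ^3 m_B^3 \varLambda ^4} \sqrt{1-\frac{4m_S^2}{q^2}}\, \lambda ^{3/2}(m_B^2,m_{K^*}^2,q^2), \end{aligned}$$

since \(\frac{3}{1024}\times \frac{2}{3} = \frac{1}{512}\). Multiplying by \(\tau _B\) reproduces the second term of Eq. (39).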

The allowed range of the WCs depends on the scalar mass \(m_S\), as shown in Fig. 2. Thus, apart from the new WCs, \(m_S\) is another a priori unknown quantity.

Fig. 5

The differentiation contours for \(b\rightarrow sSS\) with \(m_S=0.5\) GeV and \(\mathcal{L}_{\mathrm{int}}=50\,\hbox {ab}^{-1}\) (left panels), 2 \(\hbox {ab}^{-1}\) (right panels). We take \(\varLambda =1\) TeV

4 Results: only neutrinos

Our results are shown for the projected SuperBelle integrated luminosity \(\mathcal{L}_{\mathrm{int}} = 50\,\hbox {ab}^{-1}\). However, to motivate experimentalists, we also show, for some cases, the results with \(\mathcal{L}_{\mathrm{int}} = 2\,\hbox {ab}^{-1}\), just to bring home the message that there might be reasons to feel excited even within the first year of running. We have taken the production cross section for \(B^0\) and \(B^+\) to be the same, which is known to be an excellent approximation. The detection efficiencies for different channels are taken from Ref. [18], which is an update over Ref. [19] (see footnote 3):

$$\begin{aligned} \epsilon (B^+\rightarrow K^+\nu {\bar{\nu }})= & {} 5.68\times 10^{-4},\nonumber \\ \epsilon (B^0\rightarrow K_S\nu {\bar{\nu }})= & {} 0.84\times 10^{-4},\nonumber \\ \epsilon (B\rightarrow K^*\nu {\bar{\nu }})= & {} 1.46\times 10^{-4}, \end{aligned}$$
(40)

and we use the SU(2) averaged detection efficiency for \(B\rightarrow K^*\nu {\bar{\nu }}\). We also take the detection efficiency for the semi-inclusive \(B\rightarrow X_s\) channel to be the same as that of \(B\rightarrow K^*\). These numbers will probably be slightly modified for the next generation detectors. Note that the detection efficiencies include a large part of systematic errors too. However, in the absence of a detailed simulation study for Belle-II, it is impossible to include all the systematic errors, so we have to work with the statistical error only. As is obvious, with higher detection efficiencies, the number of events goes up, which can be parametrized by a higher effective luminosity.
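The relation between efficiency, event yield and effective luminosity described above can be sketched numerically. This is a minimal illustration, not the event-count formula used in the analysis: the \(e^+e^-\rightarrow B{\bar{B}}\) cross section of about 1.1 nb and all function names are assumptions introduced here. With equal \(B^0\) and \(B^+\) production, each \(B{\bar{B}}\) pair contributes on average one meson of a given charge class, which is the default used below.

```python
# Assumed input, not taken from the text: sigma(e+e- -> B Bbar) at the Upsilon(4S).
SIGMA_BB_NB = 1.1
# Unit conversion: 1 ab^-1 = 1e9 nb^-1.
NB_INV_PER_AB_INV = 1.0e9


def n_bb_pairs(lumi_ab):
    """Number of B Bbar pairs produced for an integrated luminosity in ab^-1."""
    return lumi_ab * NB_INV_PER_AB_INV * SIGMA_BB_NB


def expected_signal(lumi_ab, branching_ratio, efficiency, mesons_per_pair=1.0):
    """Expected reconstructed signal events (statistical estimate only)."""
    return n_bb_pairs(lumi_ab) * mesons_per_pair * branching_ratio * efficiency


def effective_luminosity(lumi_ab, eff_new, eff_ref):
    """Luminosity that gives, with eff_ref, the same yield as lumi_ab with eff_new."""
    return lumi_ab * eff_new / eff_ref
```

For instance, `effective_luminosity(50.0, 2.92e-4, 1.46e-4)` returns 100 (up to floating-point rounding): doubling the efficiency of Eq. (40) for \(B\rightarrow K^*\nu {\bar{\nu }}\) is equivalent, in event yield, to doubling the integrated luminosity.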

Fig. 6

Comparison of the \(q^2\) (in \(\hbox {GeV}^2\)) distributions of the decay rates for several \(b\rightarrow sSS\) channels, with \(\mathcal{L}_{\mathrm{int}}=50\,\hbox {ab}^{-1}\) and \(m_S = 0.5\) GeV

Fig. 7

Same as Fig. 5 with \(m_S=1.8\) GeV

Fig. 8

Same as Fig. 6 with \(m_S=1.8\) GeV

From the definition of the OOs, it is clear that one needs at least two different \(c_i\)s for this technique to work; otherwise it is just a simple scaling of the SM expectation. With the assumption of LFU and no LFV, this is what happens for the decay \(B\rightarrow K\nu {\bar{\nu }}\); the SM factor of 1 is replaced by \(|1+C'_1+C'_2|^2\). On the other hand, the decays \(B\rightarrow K^*\nu {\bar{\nu }}\) and \(B\rightarrow X_s\nu {\bar{\nu }}\), and the scaled transverse polarization fraction \(F'_T\), all involve more than one combination of the new WCs.

In Fig. 3a, b, we show the results from the OO analysis of the decays \(B\rightarrow K^*\nu {\bar{\nu }}\) and \(B\rightarrow X_s\nu {\bar{\nu }}\), respectively. These plots use the \(q^2\)-integrated data and show how far the NP can be differentiated from the SM, depending on the precise values of \(C'_1\) and \(C'_2\). The \(\chi ^2 = n^2\) (with \({n} = 1,3,5,7,9\)) lines are obtained in the \(C_1^{\prime }, C_2^{\prime }\) basis with \(\chi ^2|_\mathrm{SM} = 0\), so that each line represents a deviation of \(n\sigma \) from the SM. Obviously, \(C'_1=C'_2=0\) is the SM, and close to that point the chances of separation are the weakest, as shown by the \(1\sigma \) band. Note that the point \(C'_1=-2,\, C'_2=0\) is also SM-like, because the destructive interference between the two amplitudes keeps \(|1+C'_1|^2 = 1\). As can be seen, with \(\mathcal{L}_{\mathrm{int}}=50\,\hbox {ab}^{-1}\) (left panels), both decays can differentiate NP from the SM over most of the allowed parameter space with a high confidence level. Even small NP contributions, with \(|C_1^{\prime }|\) and/or \(|C_2^{\prime }|\) of order \(10^{-1}\), can be differentiated from the SM at more than the 5\(\sigma \) confidence level. The point \(C'_1=-1,\, C'_2=0\) denotes completely destructive interference with the SM and no signal events, and this is obviously far from the SM expectation. The separations are, as expected, worse for \(\mathcal{L}_{\mathrm{int}}=2\,\hbox {ab}^{-1}\), as shown in the right panels of Fig. 3a, b, but even then there are regions of the parameter space that show an interesting trend. As an example, we note that it is possible to separate NP contributions like \(|C_1^{\prime }|, |C_2^{\prime }| \approx 1\) from the SM at the 9\(\sigma \) confidence level or more.
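The two special points quoted above follow from simple arithmetic on the interference factor. A minimal sketch, using the overall scaling \(|1+C'_1+C'_2|^2\) quoted earlier for \(B\rightarrow K\nu {\bar{\nu }}\), which reduces to \(|1+C'_1|^2\) for \(C'_2=0\); the function name is introduced here only for illustration:

```python
def sm_rescaling(c1_prime, c2_prime=0.0):
    """Factor |1 + C'_1 + C'_2|^2 multiplying the SM rate (SM value is 1)."""
    return abs(1.0 + c1_prime + c2_prime) ** 2


sm_point = sm_rescaling(0.0)       # SM: C'_1 = C'_2 = 0
mirror_point = sm_rescaling(-2.0)  # C'_1 = -2: amplitude flips sign, rate unchanged
null_point = sm_rescaling(-1.0)    # C'_1 = -1: complete cancellation, no signal
```

The first two values coincide, which is why the point \(C'_1=-2\) cannot be separated from the SM by a rate measurement alone, while the third vanishes.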

With enough data, one may even measure the differential decay distribution \(\mathrm{d}\varGamma /\mathrm{d}q^2\). In Fig. 3c, d, we show the differences in the \(\mathrm{d}\varGamma /\mathrm{d}q^2\) profiles between the SM and the NP for a couple of benchmark points, shown as NP-1 and NP-2, for the decays \(B\rightarrow K^{*}\nu {\bar{\nu }}\) and \(B\rightarrow X_s\nu {\bar{\nu }}\). Note that both benchmark points are allowed even by the new Belle data (Fig. 1). The integrated branching fractions of these modes in the SM and for the selected benchmark points are listed in Eq. (41):

$$\begin{aligned} \mathrm{Br}(B\rightarrow K^*\nu {\bar{\nu }})_{\mathrm{SM}}&= (9.43 \pm 1.48) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow K^*\nu {\bar{\nu }})_{\mathrm{NP - 1}}&= (17.77 \pm 2.86) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow K^*\nu {\bar{\nu }})_{\mathrm{NP - 2}}&= (3.99 \pm 0.70) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow X_s\nu {\bar{\nu }})_{\mathrm{SM}}&= (28.88 \pm 1.90) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow X_s\nu {\bar{\nu }})_{\mathrm{NP - 1}}&= (49.40 \pm 3.24) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow X_s\nu {\bar{\nu }})_{\mathrm{NP - 2}}&= (8.61 \pm 0.56) \times 10^{-6}.\ \end{aligned}$$
(41)

The present data almost rule out \(|C'_i|\gg 1\) (the NP has to be either loop-mediated, or the new particles have to be so massive as to lie outside the direct detection range of the LHC), and so we concentrate on small-\(C'_i\) points. The bars shown in the plots represent the combined errors due to the various theory inputs, mostly coming from the form factors. In the case of NP, if we treat the \(\delta C'_i\)s coming from the OO analysis of the respective decay modes as a measure of future statistical uncertainties on the NP WCs, then both \(\mathrm{d}\mathrm{Br}/\mathrm{d} q^2\) and the total branching fraction will have additional errors coming from them. We note that the NP sensitivities of the \(q^2\) distributions of the exclusive and inclusive decay modes are different. For example, the distribution for \(B\rightarrow K^{*}\nu {\bar{\nu }}\) is highly sensitive to NP in the region \(10~\mathrm{GeV}^2<q^2 < 15~\mathrm{GeV}^2\), while that for the decay \(B\rightarrow X_s\nu {\bar{\nu }}\) is more sensitive to the low-\(q^2\) region, \(q^2 < 10\,\hbox {GeV}^2\). Therefore, a study of \(\mathrm{d}\varGamma /\mathrm{d}q^2\) for these exclusive and inclusive channels may be quite useful to pin down the parameters of the NP. For most of the beyond-SM theories, there should be a corroborative signature from charged lepton final state channels, but, as we pointed out, this may not always be true.

Other observables are expected to yield different confidence level contours. This is shown in Fig. 4 for the modified transverse polarization fraction \(F'_T\), both for low and high \(\mathcal{L}_{\mathrm{int}}\). The NP sensitivities of this observable are similar to those of the decay \(B\rightarrow K^{*}\nu {\bar{\nu }}\), although the separation may not go beyond the \(7\sigma \) confidence level for the low-\(\mathcal{L}_{\mathrm{int}}\) option.

5 Results: invisible light scalars

Fig. 9

The contours from the measurement of \(F'_L\) (Eq. (38)), with \(m_S=0.5\) and 1.8 GeV, \(\mathcal{L}_{\mathrm{int}}=50\,\hbox {ab}^{-1}\) (left panels), 2 \(\hbox {ab}^{-1}\) (right panels)

With invisible light scalars escaping the detector, one gets a signal identical to that of \(b\rightarrow s\nu {\bar{\nu }}\). We will assume that two such identical light scalars are produced in the decay, i.e., \(b\rightarrow sSS\). The differential decay distributions depend on the mass of S, which we take to be either 0.5 GeV (called the light scalar or LS option), or 1.8 GeV (called the heavy scalar or HS option). For both these options, we show our results taking \(\mathcal{L}_{\mathrm{int}} = 50\,\hbox {ab}^{-1}\) and 2 \(\hbox {ab}^{-1}\), just as before. Obviously, separation from the SM will be better for lighter scalars: for heavier scalars, the region \(q^2 < 4m_S^2\) is kinematically closed to the \(SS\) pair and is populated only by the SM, and hence those bins will be irrelevant for the analysis.
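The kinematic argument can be made concrete by computing the \(q^2\) window open to the \(SS\) pair, \(4m_S^2 \le q^2 \le (m_B - m_{K^{(*)}})^2\), for the two scalar masses. This is a rough sketch; the meson masses are approximate values assumed here, and the function names are illustrative only.

```python
# Approximate meson masses in GeV (assumed here for illustration).
M_B = 5.279
M_K = 0.494
M_KSTAR = 0.892


def q2_window(m_scalar, m_hadron, m_parent=M_B):
    """Return (q2_min, q2_max) in GeV^2 for B -> hadron + S S."""
    return 4.0 * m_scalar**2, (m_parent - m_hadron) ** 2


def open_fraction(m_scalar, m_hadron, m_parent=M_B):
    """Fraction of the full range 0 <= q^2 <= q2_max accessible to the SS pair."""
    q2_min, q2_max = q2_window(m_scalar, m_hadron, m_parent)
    return max(q2_max - q2_min, 0.0) / q2_max


ls_window = q2_window(0.5, M_KSTAR)  # light scalar option
hs_window = q2_window(1.8, M_KSTAR)  # heavy scalar option
```

With these inputs the threshold moves from \(1\,\hbox {GeV}^2\) for the LS option to about \(13\,\hbox {GeV}^2\) for the HS option, so that only the upper part of the spectrum can receive a scalar contribution in the latter case.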

We show the confidence levels in Fig. 5a–c for the decays \(B\rightarrow KSS, B\rightarrow K^{*}SS\) and \(B\rightarrow X_s SS\), respectively, for the LS option. The shapes of the contours are intuitively obvious from the expressions for \(\mathrm{d}\varGamma /\mathrm{d}q^2\). For example, the mode \(B\rightarrow KSS\) is not of much use if \(C_{S_1} \approx - C_{S_2}\). A complementary set of information can be obtained from the \(B\rightarrow K^*SS\) mode. As expected, only regions close to the SM point \(C_{S_1}=C_{S_2}=0\) may not be differentiable from the SM itself. Roughly speaking, one can have a \(5\sigma \) separation from the SM in at least one channel with \(\mathcal{L}_{\mathrm{int}}=50\,\hbox {ab}^{-1}\) even if \(|C_{S_1}|\) and/or \(|C_{S_2}|\) are as small as 0.01. The inclusive channel \(B\rightarrow X_s SS\) is even more powerful, as the branching fraction depends on the combination \(|C_{S_1}|^2 + |C_{S_2}|^2\). This leads to circular contours around the origin. There is a subleading term proportional to the strange quark mass \(m_s\) which breaks this symmetry, and so the contours appear to be slightly deformed. The low-luminosity option, as displayed in the right panels, shows that the differentiation is harder for the exclusive modes, while the inclusive mode fares better. Points like \(|C_{S_1}|\) and/or \(|C_{S_2}| \approx 0.01\) can be differentiated from the SM at more than the 5\(\sigma \) confidence level. The integrated branching fractions of these modes in the SM and for the selected benchmark point (Fig. 6) are listed in Eq. (42):

$$\begin{aligned} \mathrm{Br}(B\rightarrow K + \mathrm{invis})_{\mathrm{SM}}&= (3.86 \pm 0.53) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow K + \mathrm{invis})_{\mathrm{NP - 1}}&= (8.89 \pm 0.79) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow K^* + \mathrm{invis})_{\mathrm{SM}}&= (9.43 \pm 1.48) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow K^* + \mathrm{invis})_{\mathrm{NP - 1}}&= (23.91 \pm 2.77) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow X_s + \mathrm{invis})_{\mathrm{SM}}&= (28.88 \pm 1.90) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow X_s + \mathrm{invis})_{\mathrm{NP - 1}}&= (12.39 \pm 0.45) \times 10^{-5}. \end{aligned}$$
(42)
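The contour shapes described above for the three \(b\rightarrow sSS\) modes can be traced back to the WC combinations appearing in Eq. (37). The following sketch evaluates only those combinations, not the full rates; the quark masses \(m_b = 4.8\) GeV and \(m_s = 0.1\) GeV are illustrative assumptions, as are the function names.

```python
def combo_k(c_s1, c_s2):
    """WC dependence of B -> K S S in Eq. (37): |C_S1 + C_S2|^2."""
    return abs(complex(c_s1) + complex(c_s2)) ** 2


def combo_kstar(c_s1, c_s2):
    """WC dependence of B -> K* S S in Eq. (37): |C_S1 - C_S2|^2."""
    return abs(complex(c_s1) - complex(c_s2)) ** 2


def combo_xs(c_s1, c_s2, q2, m_b=4.8, m_s=0.1):
    """WC-dependent factor of B -> X_s S S in Eq. (37).

    Equals (|C_S1|^2 + |C_S2|^2) times the square bracket of Eq. (37).
    The quark masses (in GeV) are assumed values, for illustration only.
    """
    c1, c2 = complex(c_s1), complex(c_s2)
    norm = abs(c1) ** 2 + abs(c2) ** 2
    interference = (c1 * c2.conjugate()).real
    return norm * (m_b**2 + m_s**2 - q2) - 4.0 * m_s * m_b * interference
```

For \(C_{S_1} = -C_{S_2}\) the function `combo_k` vanishes while `combo_kstar` does not, and vice versa for \(C_{S_1} = C_{S_2}\), which is the complementarity of the two exclusive modes. In `combo_xs`, the term proportional to \(m_s\) makes the value slightly different for same-sign and opposite-sign WCs, which is the small deformation of the otherwise circular contours.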

A similar set of plots for the HS option is shown in Fig. 7. The nature of the plots is identical to that of Fig. 5. However, to reach the same sensitivity, one needs larger WCs than in the LS option, as the NP effects are visible only in the high-\(q^2\) bins above the threshold \(q^2 = 4m_S^2\). The integrated branching fractions of these modes for the selected benchmark point (Fig. 8) are listed in Eq. (43):

$$\begin{aligned} \mathrm{Br}(B\rightarrow K + \mathrm{invis})_{\mathrm{NP - 1}}&= (5.79 \pm 0.52) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow K^* + \mathrm{invis})_{\mathrm{NP - 1}}&= (11.20 \pm 1.11) \times 10^{-6},\ \nonumber \\ \mathrm{Br}(B\rightarrow X_s + \mathrm{invis})_{\mathrm{NP - 1}}&= (34.79 \pm 1.71) \times 10^{-6}. \end{aligned}$$
(43)

The \(q^2\) distributions for the decay rates of \(B\rightarrow KSS, B\rightarrow K^{*}SS\) and \(B\rightarrow X_s SS\) are shown in Fig. 6a–c, respectively, for \(\mathcal{L}_{\mathrm{int}} = 50\,\hbox {ab}^{-1}\) and for the LS option. While, for \(B\rightarrow KSS\) and \(B\rightarrow K^{*}SS\), the \(q^2\) distributions are sensitive to NP over the entire \(q^2\) region except for the very high (\(>15\,\hbox {GeV}^2\)) and very low (\(\approx 0\)) regions, for the semi-inclusive decay the NP sensitivity is greater in the region \(2~\mathrm{GeV}^2< q^2 < 10~\mathrm{GeV}^2\). Similar plots for \(m_S=1.8\) GeV are shown in Fig. 8. We note that though \(q^2_{\mathrm{min}}\) is much higher for this case, the nature of the \(q^2\) distributions, and therefore the NP sensitivities, is similar to that obtained for the LS case.

As defined in Eq. (38), the decay \(B\rightarrow K^*SS\) has another observable, namely, the longitudinal polarization \(F'_L\). This is because the \(K^*\) mesons appearing with scalars are completely longitudinally polarized. The confidence level contours for \(F'_L\) obtained from the OO analysis are shown in Fig. 9a, b for the LS and the HS options, respectively. This observable has an NP sensitivity similar to that of \(B\rightarrow K^{*} SS\).

6 Summary

We have analyzed the NP sensitivities of the different observables in the decays \(b\rightarrow s~+\) invisibles using the Optimal Observables technique. We consider two NP models: (1) only neutrinos as the carriers of missing energy but with a new operator involving a right-handed quark current; and (2) apart from the SM neutrinos, light invisible scalars as the carriers of missing energy. The analysis takes into account all the new effective operators and their effects on several observables, namely, the total decay width for inclusive and exclusive modes, the differential decay distributions, and the modified transverse and longitudinal polarization fractions as defined in the text.

We show our results both for the high- and low-luminosity options of Belle-II, namely, \(\mathcal{L}_{\mathrm{int}} = 50\,\hbox {ab}^{-1}\) and 2 \(\hbox {ab}^{-1}\), respectively. All the observables are sensitive to NP effects, and even small NP effects might be detectable at the future high-luminosity Belle-II. The differentiation of the NP from the SM is obviously harder for the low-luminosity option, except for observables like the inclusive branching fractions.

The NP sensitivities of \(\mathrm{d}\varGamma /\mathrm{d}q^2\) for exclusive and inclusive channels are different. As the data on that will possibly come after the branching fraction data, they will serve as an additional check on the operator structure and parameter values of the NP. Note that the exclusive distributions are more or less similar for both the NP models, but the inclusive distributions are different, so that may serve as a good discriminator. Thus, we encourage our experimental colleagues to investigate both the \(q^2\)-integrated branching fractions and the differential distributions.

We would also like to point out that when the data come, and one has a better idea of backgrounds and systematic errors, it is easy to extrapolate our findings by taking the appropriate efficiency into account. The number of events will change, which can be parametrized by a change of the effective luminosity \(\mathcal{L}_{\mathrm{eff}}\) in the denominator of Eq. (9). A lower effective luminosity will increase the volumes enclosed by constant \(\chi ^2\) surfaces in the parameter space. Thus, two points that are differentiable at \(a\sigma \) in our analysis will only be differentiable at \(b\sigma \), where \(b < a\). Still, our conclusions are expected to be valid, as we have shown large parts of the parameter spaces differentiable up to \(\sim 9 \sigma \) from the SM, for the higher luminosity case.
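The extrapolation from \(a\sigma \) to \(b\sigma \) can be sketched under a simple assumption: if the analysis is purely statistical and \(\mathcal{L}_{\mathrm{eff}}\) enters Eq. (9) such that \(\chi ^2\) grows linearly with it, then the separation in units of \(\sigma \) scales as \(\sqrt{\mathcal{L}_{\mathrm{eff}}}\). The function below, whose name is introduced here only for illustration, encodes just this assumed scaling and nothing else from the analysis.

```python
import math


def rescaled_sigma(n_sigma, lumi_ref, lumi_new):
    """Rescale an n-sigma separation from lumi_ref to lumi_new.

    Assumes chi^2 is proportional to the effective luminosity (statistical
    errors only), so that n_sigma = sqrt(chi^2) scales as sqrt(luminosity).
    """
    if lumi_ref <= 0.0 or lumi_new < 0.0:
        raise ValueError("luminosities must be positive")
    return n_sigma * math.sqrt(lumi_new / lumi_ref)
```

Under this assumption, a point separated at \(9\sigma \) with \(50\,\hbox {ab}^{-1}\) would be separated at about \(1.8\sigma \) with 2 \(\hbox {ab}^{-1}\), and a 20% loss in effective luminosity would reduce any separation by roughly 10%.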