The effect of S-wave interference on the $B^0 \to K^{\ast 0}\ell^+\ell^-$ angular observables

The rare decay $B^0 \to K^{\ast 0}\ell^+\ell^-$ is a flavour-changing neutral-current decay with a high sensitivity to physics beyond the Standard Model. Nearly all theoretical predictions and all experimental measurements so far have assumed a $K^{\ast 0}$ P-wave that decays into the $K^+\pi^-$ final state. In this paper we explore the addition of an S-wave within the $K^+\pi^-$ system of $B^0 \to K^{\ast 0}\ell^+\ell^-$ and its impact on the angular distribution of the final-state particles. The inclusion of the S-wave causes a distinction between the values of the angular observables obtained from counting experiments and those obtained from fits to the angular distribution. The effect of a non-zero S-wave on an angular analysis of $B^0 \to K^{\ast 0}\ell^+\ell^-$ is assessed as a function of dataset size and of the relative size of the S-wave amplitude. An S-wave contribution equivalent to that measured in $B^0 \to J/\psi K^{\ast 0}$ at BaBar leads to a significant bias on the angular observables for datasets of more than 200 signal decays. Any future experimental analysis of the $K^+\pi^-\ell^+\ell^-$ final state will have to take the S-wave contribution into account.


Introduction
The description of flavour physics in the Standard Model (SM) has so far accurately matched the observations in the data from the B factories, the Tevatron and the LHC. However, there are several fundamental questions which do not have an explanation within the SM, such as the mass hierarchy of the quarks and why there are three generations. To avoid creating large flavour-changing neutral currents, any physics beyond the SM that contains new degrees of freedom that couple to the flavour sector is required to be at an energy scale of multiple TeV or to have small couplings between the generations, i.e. couplings that closely mimic those of the SM. The measurement of the inclusive $b \to s\gamma$ width [1] is one of the strongest constraints on new physics from the flavour sector; for the exclusive decays, $B^0 \to K^{\ast 0}\ell^+\ell^-$ is of major importance.
The analysis of $B^0 \to K^{\ast 0}\ell^+\ell^-$ is based on evaluating the angular distribution of the daughter particles [2]. How to extract the maximal amount of information from the decay while keeping uncertainties from QCD minimal has recently attracted much interest [3][4][5][6][7][8]. The experimental analyses of $B^0 \to K^{\ast 0}\ell^+\ell^-$ [9][10][11][12] have focused on the forward-backward asymmetry of the dimuon system ($A_{\mathrm{FB}}$) and the fraction of longitudinal polarisation of the $K^{\ast 0}$ ($F_L$) as a function of the dimuon invariant mass.
With the acquisition of large data sets of $B^0 \to K^{\ast 0}\ell^+\ell^-$ decays, scrutiny is required of assumptions that have been made in current experiments. Nearly all theoretical papers to date use the narrow-width assumption for the $K^+\pi^-$ system, meaning that the natural width of the $K^{\ast 0}(892)$ is ignored. This means there is no interference with other $K^+\pi^-$ resonances. Existing $B^0 \to K^{\ast 0}\ell^+\ell^-$ analyses consider $B^0 \to K^{\ast 0}\ell^+\ell^-$ signal with $K^+\pi^-$ candidates in a narrow mass window around the $K^{\ast 0}(892)$. However, in this region there is evidence of a broad S-wave below the $K^{\ast 0}(892)$ and of higher mass states which decay strongly to $K^+\pi^-$, such as the S-wave $K^{\ast 0}_0(1430)$ and the D-wave $K^{\ast 0}_2(1430)$ [13]. The best understanding of the low-mass S-wave contribution comes from the analysis of $K^+\pi^-$ scattering at the LASS experiment [14].
The interference of an S-wave in a predominantly P-wave system has previously been used to disambiguate otherwise equivalent solutions for the value of the CP-violating phase in $B^0$ [15] and $B^0_s$ [16] oscillations. In the determination of $\phi_s$ in the $B^0_s \to J/\psi\,\phi$ decay it was also shown that the S-wave contribution has to be taken into account [17], and this has subsequently been done for the experimental measurements [18][19][20]. The interference of a $K^+\pi^-$ S-wave in the angular analysis of $B^0 \to K^{\ast 0}\mu^+\mu^-$ has previously been considered in Refs. [21,22]. In both references, the authors show that the presence of the S-wave can introduce significant biases to angular observables in the decay. We extend these studies to explore the consequences of the S-wave contribution for present and future experimental analyses. Further, we explore the interplay between statistical and systematic uncertainties for different analysis approaches.
In this paper, we detail how a generic $K^+\pi^-$ S-wave contribution to $B^0 \to K^{\ast 0}\ell^+\ell^-$ can be included in the angular analysis. Firstly, we develop the formalism set out in Ref. [23] to explicitly include a spin-0 S-wave and a spin-1 P-wave state in the $B^0 \to K^+\pi^-\ell^+\ell^-$ angular distribution. Here $K^{\ast 0}$ is used for any neutral kaon state which decays to $K^+\pi^-$.
The impact of an S-wave contribution on the determination of the theoretical observables is evaluated in two ways. Firstly, we look for the minimum sample size at which an S-wave contribution (such as that measured in Ref. [15]) significantly biases the angular observables; secondly, we determine, for a given sample size, the minimum S-wave contribution needed to bias the angular observables. We then demonstrate how the S-wave contribution can be correctly taken into account and evaluate the effect of this on the statistical precision that can be obtained on the angular observables with a given number of signal events.
2 The $B^0 \to K^{\ast 0}\ell^+\ell^-$ angular distribution
The differential angular distribution for $B^0 \to K^{\ast 0}\ell^+\ell^-$ is expressed as a function of five kinematic variables ($\cos\theta_l$, $\cos\theta_K$, $\phi$, $p^2$ and $q^2$). The angle $\theta_K$ is defined as the angle between the $K^+$ and the $B^0$ momentum vector in the rest frame of the $K^{\ast 0}$. The angle $\theta_l$ is similarly defined between the $\ell^+$ in the rest frame of the dilepton pair and the momentum vector of the $B^0$. The angle $\phi$ is defined as the signed angle between the planes, in the rest frame of the $B^0$, formed by the dilepton pair and the $K^+\pi^-$ pair respectively. The mass squared of the $K^+\pi^-$ system is denoted $p^2$ and the mass squared of the dilepton pair $q^2$.
The angular distribution as a function of $\cos\theta_l$, $\cos\theta_K$ and $\phi$ is given in Eq. 2.1. Ignoring scalar and tensor contributions, the complete set of angular terms is given in Eq. 2.2, where $A_{H(0,\parallel,\perp,t)}$ are the $K^{\ast 0}$ helicity amplitudes and $\beta_l^2 = 1 - 4m_l^2/q^2$ [2]. In this paper the lepton mass is assumed to be negligible, such that the angular terms with $m_l^2/q^2$ dependence can be neglected and $\beta_l = 1$, so that $I_1$ and $I_2$ are related by $I_2^c = -I_1^c$ and $I_2^s = \frac{1}{3} I_1^s$. For a $K^+\pi^-$ state which is a combination of different spin states, the amplitudes for a given handedness ($H = L, R$) can be expressed as a sum over the resonances ($J$), as in Eq. 2.3.
3 Angular distribution of $B^0 \to K^+\pi^-\ell^+\ell^-$ for a combined S- and P-wave
For $K^+\pi^-$ masses below 1200 MeV (natural units are assumed throughout this paper), the contribution to the amplitudes from the D-wave $K^{\ast 0}_2(1430)$ is so small that it can be ignored [14], and only the $J = 0, 1$ terms in the sums of Eq. 2.3 will be considered. The S-wave contribution to these amplitudes only enters in $A_0$. Once the spherical harmonics have been expanded out, the propagator and the matrix element remain as part of the spin-dependent amplitudes, where the first index denotes the spin and the normalisation from the three-body phase space factor is omitted.
The propagator for the P-wave is described by a relativistic Breit-Wigner distribution, in which $m_{K^{\ast 0}_1}$ is the resonant mass and the width is a running width. Here $t$ is the $K^+$ momentum in the rest frame of the $K^+\pi^-$ system and $t_0$ is $t$ evaluated at the $K^+\pi^-$ pole mass. $B$ is the Blatt-Weisskopf damping factor [25] with a radius $R_P$. The amplitude can be defined in terms of a phase ($\delta$), through a substitution for $\cot\delta$, to give the polar form of the relativistic Breit-Wigner propagator.
Table 1: Parameters of the $K^+\pi^-$ resonances used to generate toy data sets. The $K^{\ast}$ masses and widths are taken from Ref. [13], and the $K^{\ast 0}_1$ Blatt-Weisskopf radius and the LASS parameters are taken from Ref. [26].
The LASS parametrisation of the S-wave [14] can be used to describe a generic $K^+\pi^-$ S-wave. In this parametrisation, the S-wave propagator (Eq. 3.7) is the sum of two terms: the first term is an empirical term from inelastic scattering and the second term is the resonant contribution with a phase factor to retain unitarity. The first phase factor (Eq. 3.8) depends on $r$ and $a$, which are free parameters, and on $t$ as defined previously, while the second phase factor (Eq. 3.9) describes the $K^{\ast 0}_0(1430)$. Here, $m_S$ is the S-wave pole mass and $\Gamma_S$ is the running width using the pole mass of the $K^{\ast 0}_0(1430)$. The overall strong phase shift between the results from the LASS scattering experiment and measured values for $B^0 \to J/\psi K^+\pi^-$ has been found to be consistent with $\pi$ [15]. The parameters for the $p^2$ spectrum used in this paper are given in Table 1.
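The two line shapes described above can be sketched numerically. This is a minimal sketch, not the paper's implementation: it assumes the commonly quoted polar form $1/(\cot\delta - i)$ with $\cot\delta = (m^2 - p^2)/(m\,\Gamma(p))$ for the Breit-Wigner, the usual LASS effective-range phase $\cot\delta_B = 1/(at) + rt/2$, and one common convention for the running widths and the Blatt-Weisskopf factor. The function names and all numerical parameter values are illustrative placeholders and are not the Table 1 values.

```python
import math

M_K, M_PI = 0.4937, 0.1396  # charged kaon and pion masses in GeV


def kaon_momentum(p):
    """K+ momentum t in the K+ pi- rest frame for a K+ pi- mass p (GeV)."""
    s = p * p
    lam = (s - (M_K + M_PI) ** 2) * (s - (M_K - M_PI) ** 2)
    return math.sqrt(max(lam, 0.0)) / (2.0 * p)


def polar_amplitude(cot_delta):
    """Return 1 / (cot(delta) - i), which equals sin(delta) * exp(i * delta)."""
    return 1.0 / complex(cot_delta, -1.0)


def p_wave(p, m0=0.896, gamma0=0.0503, radius=3.0):
    """Relativistic Breit-Wigner in polar form (placeholder parameters).

    Valid above the K pi threshold only; radius is in GeV^-1.
    """
    t, t0 = kaon_momentum(p), kaon_momentum(m0)
    # Squared Blatt-Weisskopf ratio for L = 1 (assumed convention)
    bw2 = (1.0 + (radius * t0) ** 2) / (1.0 + (radius * t) ** 2)
    gamma = gamma0 * (t / t0) ** 3 * (m0 / p) * bw2
    return polar_amplitude((m0 * m0 - p * p) / (m0 * gamma))


def s_wave(p, a=1.94, r=1.76, m_s=1.425, gamma_s0=0.270):
    """LASS-style S-wave: effective-range term plus resonant term."""
    t, t0 = kaon_momentum(p), kaon_momentum(m_s)
    cot_b = 1.0 / (a * t) + 0.5 * r * t
    gamma = gamma_s0 * (t / t0) * (m_s / p)
    cot_r = (m_s * m_s - p * p) / (m_s * gamma)
    # exp(2 i delta_B), written in terms of cot(delta_B)
    phase = complex(cot_b, 1.0) / complex(cot_b, -1.0)
    return polar_amplitude(cot_b) + phase * polar_amplitude(cot_r)
```

Evaluating `abs(p_wave(p)) ** 2` and `abs(s_wave(p)) ** 2` on a grid of masses gives the shapes of the two components. Their relative normalisation depends on the matrix elements and phase-space factors, which this sketch does not include.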
The angular terms modified by the inclusion of the S-wave are $I_{1,2,4,5,7,8}$, and the complete set of angular terms can be expressed in terms of the spin-dependent amplitudes. The interference term of $I_1$ shows how this parametrisation encompasses the strong phase difference between the S- and P-wave states. In the left-handed part of the interference term for $I_1$, $\delta_{M_{JL0}}$ is the phase of the longitudinal matrix element and $\delta_{P_J}$ is the phase of the propagator. The phases in the interference terms for $I_{4,5,7,8}$ can be similarly defined. For real matrix elements, which is nearly true in the Standard Model, the phases are equal for both handed interference terms, $\delta_L = \delta_R$. The phase difference between the S-wave and the P-wave propagators can be expressed as a single strong phase, $\delta_S$. The $p^2$ spectrum for the $B^0 \to K^+\pi^-\ell^+\ell^-$ angular distribution can be calculated by summing over the S- and P-waves and integrating out the $\cos\theta_l$, $\cos\theta_K$ and $\phi$ dependence. This is illustrated in Fig. 1, where the matrix elements from Refs. [4,6] at a $q^2$ value of 6 GeV$^2$ are used. Here the S-wave amplitude is assumed to be equivalent to the longitudinal P-wave amplitude. The S-wave fraction in the $800 < p < 1000$ MeV window around the P-wave is calculated to be 16% when using this approximation. As will be seen later, there are no interference terms left in the angular distribution after the integral over $\cos\theta_K$.
Figure 1: An illustration of the $p^2$ spectrum for the P-wave (dashed) and the S-wave (dash-dotted). The total distribution from both states is the solid line. The values were calculated at $q^2 = 6$ GeV$^2$ by integrating out the angular distribution of $B^0 \to K^+\pi^-\ell^+\ell^-$ using equal matrix elements for each state. The S-wave fraction here is 16% for $800 < p < 1000$ MeV.
4 The effect on $B^0 \to K^+\pi^-\ell^+\ell^-$ observables
So far the forward-backward asymmetry ($A_{\mathrm{FB}}$), the fraction of the $K^{\ast 0}$ longitudinal polarisation ($F_L$) and two combinations of the transverse amplitudes ($A_T^2$ and $A_{\mathrm{Im}}$) have been measured. As such, these are the observables that will be concentrated on here.
For a pure P-wave state, $A_{\mathrm{FB}}$ is defined in terms of the amplitudes in Eq. 4.1, where the generic combination of amplitudes $A_{Ji}A^{*}_{Ji}$ is defined for $i \in \{0, \parallel, \perp, t\}$ and $J = 0, 1$. The factorisation of the amplitudes into matrix elements and propagators removes the $p^2$ dependence from the theoretical observables.
In a similar way, $F_L$, $A_T^2$ and $A_{\mathrm{Im}}$ are defined in terms of the amplitudes. These theoretical observables are normalised to the sum of the spin-1 amplitudes. In terms of the angular distribution, $A_{\mathrm{FB}}$ can also be expressed as the difference between the number of 'forward-going' $\mu^+$ and the number of 'backward-going' $\mu^+$ in the rest frame of the $B^0$ (Eq. 4.4), which explains the name of the observable. In Ref. [24], this expression was used to determine the zero-crossing point of $A_{\mathrm{FB}}$. The inclusion of the S-wave in the complete angular distribution means that $A_{\mathrm{FB}}$ can no longer be determined by experimentally counting the number of events with forward-going and backward-going leptons, as Eqs. 4.1 and 4.4 are no longer equivalent. However, as the S-wave has no forward-backward asymmetry, no bias occurs in the determination of the zero-crossing point by ignoring the S-wave.
The total normalisation for the angular distribution changes to the sum of S- and P-wave amplitudes, such that there is a factor of $F_P$ between the pure P-wave and the admixture of the S- and P-wave. This is the fraction of the yield coming from the P-wave at a given value of $p^2$ and $q^2$. The S-wave fraction, $F_S$, and the interference between the S-wave and the P-wave, $A_S$, are defined similarly. Substituting the above observables into the angular terms gives Eq. 4.9.
For the purpose of this paper, a simplification of the angular distribution can be achieved by folding the distribution in $\phi$ such that $\phi \to \phi - \pi$ for $\phi < 0$ [27]. The $I_{4,5,7,8}$ angular terms, which depend on $\cos\phi$ or $\sin\phi$, cancel, leaving $I_{1,2,3,6,9}$ in the angular distribution (Eq. 4.10), which ends with the terms
$$\ldots + 2 I_6 \cos\theta_l + 2\sqrt{2}\, I_9 \sin^2\theta_l \sin 2\phi .$$
Combining Eq. 4.10 with Eq. 4.9 gives the differential decay distribution,
$$\frac{1}{\Gamma}\frac{d^5\Gamma}{dq^2\,dp^2\,d\cos\theta_K\,d\cos\theta_l\,d\phi} = \frac{9}{16\pi}\Big(\ldots\Big) \qquad (4.11)$$
The angular distribution as a function of $\cos\theta_l$ and $\cos\theta_K$ is given by integrating over $\phi$ in Eq. 4.11,
$$\frac{1}{\Gamma}\frac{d^4\Gamma}{dq^2\,dp^2\,d\cos\theta_K\,d\cos\theta_l} = \frac{9}{16}\Big(\ldots\Big) \qquad (4.12)$$
and further integration from Eq. 4.11 yields the angular distribution for each of the angles (Eq. 4.13).
The angular distribution can be integrated over $p^2$ using the weighted integral
$$O(q^2) = \frac{\int O(p^2, q^2)\,\frac{d^2\Gamma}{dp^2\,dq^2}\,dp^2}{\int \frac{d^2\Gamma}{dp^2\,dq^2}\,dp^2} \qquad (4.14)$$
for the value of the observables integrated over a given region in $p^2$. This leads to the integrated observables $F_P$, $F_S$ and $A_S$, which are solely dependent on $q^2$. By definition, the fractions of the S-wave and the P-wave sum to one, $F_S + F_P = 1$. The complete angular distribution without any $p^2$ dependence is given by Eq. 4.15, together with the corresponding normalisation of the angular distribution. The 'dilution' effect of the S-wave can clearly be seen from the factor of $(1 - F_S)$ that appears in front of the observables in Eq. 4.15.
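The weighted integral over $p^2$ can be illustrated numerically. The sketch below assumes the observable and the differential rate are available as plain Python callables of $p^2$; the function name `weighted_average` and the midpoint-rule integration are illustrative choices, not taken from the paper.

```python
def weighted_average(observable, weight, lo, hi, steps=2000):
    """Midpoint-rule estimate of int O(x) w(x) dx / int w(x) dx on [lo, hi].

    observable: callable returning O(p^2, q^2) at fixed q^2
    weight:     callable returning d^2 Gamma / dp^2 dq^2 at fixed q^2
    lo, hi:     integration limits in p^2
    """
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h  # midpoint of the k-th slice
        w = weight(x)
        num += observable(x) * w
        den += w
    return num / den
```

An observable that does not depend on $p^2$ is returned unchanged by this average, which is why the factorised theoretical observables keep their meaning after the integration, while $F_S$, $F_P$ and $A_S$ become averages over the chosen $p^2$ window.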
The effect of an S-wave on the angular distribution as a function of $\cos\theta_K$, $\cos\theta_l$ and $\phi$ is illustrated in Figure 2. Here it is possible to see that the asymmetry in $\cos\theta_l$, given by $A_{\mathrm{FB}}$, has decreased and that there is an asymmetry in $\cos\theta_K$ introduced by the interference term.
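The difference between counting and the theoretical $A_{\mathrm{FB}}$ can be checked with a short numerical sketch. It assumes the commonly used one-dimensional $\cos\theta_l$ projection, in which the P-wave part is $\frac{3}{4}F_L(1-\cos^2\theta_l) + \frac{3}{8}(1-F_L)(1+\cos^2\theta_l) + A_{\mathrm{FB}}\cos\theta_l$ and the S-wave part is $\frac{3}{4}(1-\cos^2\theta_l)$, weighted by $(1-F_S)$ and $F_S$. The function names and the numerical inputs in the usage note are illustrative only.

```python
def cos_theta_l_pdf(c, afb, fl, fs):
    """Assumed cos(theta_l) projection for a P-wave plus S-wave mixture."""
    p_wave = 0.75 * fl * (1 - c * c) + 0.375 * (1 - fl) * (1 + c * c) + afb * c
    s_wave = 0.75 * (1 - c * c)  # the S-wave carries no forward-backward term
    return (1 - fs) * p_wave + fs * s_wave


def counted_afb(afb, fl, fs, steps=20000):
    """Forward minus backward yield over the total, by midpoint integration."""
    h = 1.0 / steps
    forward = 0.0
    backward = 0.0
    for k in range(steps):
        c = (k + 0.5) * h
        forward += cos_theta_l_pdf(c, afb, fl, fs) * h
        backward += cos_theta_l_pdf(-c, afb, fl, fs) * h
    return (forward - backward) / (forward + backward)
```

Calling `counted_afb(-0.06, 0.66, 0.07)` with these illustrative inputs returns the input asymmetry scaled by $(1 - F_S)$, which is the dilution discussed above. The zero-crossing point is unaffected because a vanishing $A_{\mathrm{FB}}$ stays zero under this scaling.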

Effect of an S-wave on the angular analysis
In an angular analysis of $B^0 \to K^+\pi^-\ell^+\ell^-$, the S-wave can be considered to be a systematic effect that could bias the results of the angular observables. The implications of this systematic effect are tested by generating toy Monte Carlo experiments and fitting the angular distribution to them. The results of the fit to the observables are evaluated for multiple toy datasets. The effect of the S-wave is evaluated for two different cases. Firstly, the effect of S-wave interference is examined as a function of the size of the dataset used. The aim of this is to give an idea of the current situation and the possible implications for future measurements of $B^0 \to K^+\pi^-\ell^+\ell^-$. Datasets of sizes between 50 and 1000 events are tested. For comparison, the latest results from LHCb [24] have between 20 and 200 signal events in the six different $q^2$ bins considered. Secondly, the effect of different levels of S-wave contribution is examined. At present, the only information about the S-wave fraction is the measurement of $F_S$ of approximately 7% in the decay $B^0 \to J/\psi K^+\pi^-$ from Ref. [15] for the range $800 < p < 1000$ MeV. As the value may be different in $B^0 \to K^+\pi^-\ell^+\ell^-$, we consider values of $F_S$ in this region ranging from 1% to 40%. The fraction of the S-wave, $F_S$, is expected to have some $q^2$ dependence because of the $q^2$ dependence of the transverse P-wave amplitudes.
Table 2: … $A_T^2$ and $A_{\mathrm{Im}}$ are taken from Ref. [24] in the $1 < q^2 < 6$ GeV$^2$ bin. The $F_S$ value is taken from Ref. [15].
The parameters used to generate the toy datasets are summarised in Tables 1 and 2. The values of the angular observables used to generate toy Monte Carlo simulations are taken from the LHCb angular analysis of $B^0 \to K^{\ast 0}\mu^+\mu^-$ in the $1 < q^2 < 6$ GeV$^2$ bin [24]. Within errors, these measurements are compatible with the Standard Model prediction for $B^0 \to K^{\ast 0}\ell^+\ell^-$ and the central value of the measurement is used. The nominal magnitude and phase difference of the S-wave contribution are taken from the angular analysis of $B^0 \to J/\psi K^+\pi^-$ [15].
The toy datasets are generated as a function of $\cos\theta_l$, $\cos\theta_K$, $\phi$ and $p^2$ using the angular distribution given in Eq. 4.11. For each set of input parameters, 1000 toy datasets were generated. For each of these toy datasets, an unbinned log-likelihood fit is performed that returns the best-fit value of the observables and an estimate of their error. The expected experimental resolution is obtained by plotting the best-fit values of an observable for the ensemble of toy simulations, as illustrated for $A_{\mathrm{FB}}$ in Fig. 3 (left). The pull value for an observable ($O$) is defined as $(O^i - O_{\mathrm{true}})/\sigma^i_O$, where $O^i$ is the fitted value in toy dataset $i$, $O_{\mathrm{true}}$ is the generated value and $\sigma^i_O$ is the estimated error on the fit to the observable $O^i$. This distribution is seen in Fig. 3 (right). The mean and the width are extracted from a Gaussian fit. For a well-performing fit without bias, the pull distribution should have zero mean and unit width. A negative pull value implies that the result is underestimated and a positive pull value implies overestimation of the true observable.
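The pull procedure can be sketched with a deliberately simple stand-in for the angular fit. In this sketch each toy dataset is a set of Gaussian-distributed numbers and the "fit" is the sample mean with its standard error; the full analysis instead uses an unbinned likelihood fit to the angular distribution and extracts the pull mean and width from a Gaussian fit. The function names, the seed and the toy model are assumptions made for illustration.

```python
import math
import random
import statistics


def fit_mean(sample):
    """Stand-in 'fit': estimate a mean and its uncertainty from one toy."""
    est = statistics.fmean(sample)
    err = statistics.stdev(sample) / math.sqrt(len(sample))
    return est, err


def pull_summary(true_value, n_toys=1000, n_events=200, seed=1):
    """Return the sample mean and width of the pulls over an ensemble of toys."""
    rng = random.Random(seed)
    pulls = []
    for _ in range(n_toys):
        toy = [rng.gauss(true_value, 1.0) for _ in range(n_events)]
        est, err = fit_mean(toy)
        # pull = (fitted value - generated value) / estimated error
        pulls.append((est - true_value) / err)
    return statistics.fmean(pulls), statistics.stdev(pulls)
```

For an unbiased estimator with correctly estimated errors, as in this stand-in, the returned mean is compatible with zero and the width with one. A model that omits a component present in the generated data would show up as a shift of the mean.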

5.1 The impact of ignoring the S-wave in an angular analysis of $B^0 \to K^{\ast 0}\ell^+\ell^-$
Firstly, the effect of an S-wave was tested as a function of dataset size in order to find the minimum dataset size at which the bias from the S-wave in the angular observables becomes significant. Datasets were generated for sample sizes ranging from 50 to 1000 events and analysed assuming a pure P-wave state. The results are shown in Fig. 4. From Eq. 4.12, it can be seen that $A_T^2$ has a factor of $(1 - F_L)$ in front of it. The large value of $F_L$ used in generating the datasets in turn causes $A_T^2$ to have a much worse resolution than $A_{\mathrm{FB}}$, $F_L$ and $A_{\mathrm{Im}}$. There is significant bias (non-zero mean of the pull distribution) for all observables when the S-wave is ignored for datasets of more than 200 events. This corresponds to a change of $0.2\sigma$ in $F_L$ for a dataset of 200 events. The behaviour can be understood in terms of the $(1 - F_S)$ factor in Eq. 4.12. It gives an offset to the fitted value of the observables which is proportional to the value of $F_S$.
Figure 4: Resolution (left) and pull mean (right) of 1000 toy datasets analysed as a pure P-wave state as a function of dataset size. It can be seen that the bias on the observable increases dramatically as the sample size increases. This is because the statistical error decreases, increasing the sensitivity to the S-wave contribution. The bias of $A_{\mathrm{FB}}$ is positive because $A_{\mathrm{FB}}$ is negative in the $q^2$ bin chosen.
Secondly, the angular fit was performed on toy datasets with an increasing S-wave contribution. Datasets of 500 events were generated with a varying S-wave contribution in the narrow mass window ($800 < p < 1000$ MeV), from no S-wave up to an $F_S$ value of 0.6. The resolution, and the mean and width of the pull distribution, for each of the four observables ($A_{\mathrm{FB}}$, $F_L$, $A_T^2$, $A_{\mathrm{Im}}$) were calculated and the results are shown in Fig. 5. Significant bias is seen in the angular observables for an S-wave magnitude of greater than 5%. The linear increase in the bias is another consequence of the $(1 - F_S)$ factor.
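The growth of the bias with dataset size can be sketched with a simple scaling argument. The sketch assumes that ignoring the S-wave shifts an observable by a fixed absolute amount, independent of the number of events, while the statistical error falls as $1/\sqrt{N}$. The size of the pull mean is then anchored at the $0.2\sigma$ quoted above for $F_L$ at 200 events. The function name and the pure $\sqrt{N}$ scaling are assumptions; the toy studies themselves are the reference.

```python
import math


def expected_pull_mean(n_events, ref_pull=0.2, ref_events=200):
    """Size of the pull mean if the bias is fixed and sigma scales as 1/sqrt(N).

    ref_pull and ref_events anchor the scaling at a known point; the sign of
    the bias is not modelled here.
    """
    return ref_pull * math.sqrt(n_events / ref_events)
```

Under these assumptions, evaluating `expected_pull_mean(n)` for `n` between 50 and 1000 shows a bias that keeps growing relative to the statistical error, in line with the qualitative trend described for Fig. 4.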

Measuring the S-wave in
Obtaining unbiased values for the angular observables beyond the limits shown requires a measurement of the S-wave contribution rather than ignoring it. With the formalism developed in Sect. 4, three options are explored for measuring this. The first option is to ignore the $p^2$ dependence and simply fit for $p^2$-averaged values of $F_S$ and $A_S$. The second and third options fit the $p^2$ line-shape simultaneously with the angular distribution, either in a small $p$ window between 800 and 1000 MeV or in the region from the lower kinematic threshold to 1200 MeV. In all cases the datasets used to perform the studies are identical to those used in Sect. 5.1. The difference is in how the fit is performed. In each case, the dataset and the S-wave sizes refer to the number of events in the smaller $p^2$ window. The angular distribution without $p^2$ dependence is given in Eq. 4.15. For each set of samples, we look at the resolution, the mean and the width of the pull distribution of the angular observables.
The change in the resolution obtained on the angular observables for the three methods of including the S-wave in the angular distribution is demonstrated by plotting the ratio with respect to the resolution obtained when a single P-wave state is assumed.
The resolutions and the means of the pull distributions for the three different fit methods (ignoring the $p^2$ dependence, fitting a narrow $p^2$ window and fitting a wide $p^2$ window) are evaluated relative to the resolution and mean obtained using the assumption of a pure P-wave state. The ratio between the fit methods including the S-wave in the angular distribution and the fit assuming a pure P-wave state is shown as a function of dataset size in Fig. 6.
Figure 6: Resolutions for three different methods to incorporate the S-wave relative to the resolution obtained when the S-wave is ignored. It can be seen that the best resolution is obtained when using the largest $p^2$ window. The original resolution is recovered to within 10%.
For all observables, it can be seen that the resolution degrades when the S-wave is included and the $p^2$ dependence is ignored. The resolution degrades by a smaller amount when the $p^2$ dependence is included in a small bin, and the original resolution is recovered to within 10% when using the large $p^2$ range. There are two effects contributing to the improvement of the resolution: there are more P-wave events in the larger range, and the wider mass window allows the S-wave to be constrained by using the information from above and below the P-wave resonance. This results in the best resolution when the S-wave is included in the angular distribution.
For all the observables, the pull mean approaches zero for datasets of more than 300 events, implying that the bias present in all the observables when a pure P-wave state is assumed is removed when an S-wave is included in the angular distribution. This means that the inclusion of the S-wave component will be mandatory for all future experimental analyses.
Figure 7: Pull mean for the three different methods to incorporate the S-wave and when the S-wave is ignored. There is a slight bias when the S-wave is included for datasets of less than 200 events, but this bias is removed from all the observables when the S-wave is included in the fit for datasets of over 500 events.
Another approach to reduce the bias from the S-wave is to ignore it in fits but to include only data from a narrower window in $p$ around the $K^{\ast 0}(892)$ resonance. By reducing the window from 200 MeV to 100 MeV, the P-wave component is reduced by 20% while the S-wave component is roughly halved. Conducting the same tests as described above shows, as expected, a 10% increase in the statistical error of the observables, while the bias for a given dataset is reduced by a factor of two. Given what has been shown in this paper, the experimental datasets will in the future be so large that the best approach is to fit the S-wave rather than to halve the bias and accept an increased statistical uncertainty.
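The trade-off of the narrower window can be sketched with simple yield arithmetic. The sketch assumes that the statistical error scales as one over the square root of the P-wave yield, and that the retained fractions (80% of the P-wave, 50% of the S-wave) apply uniformly. The function name and the starting $F_S$ used in the usage note are illustrative.

```python
import math


def narrow_window(fs, p_keep=0.8, s_keep=0.5):
    """Rescale the S-wave fraction and the statistical error for a narrower window.

    fs:     S-wave fraction in the original window
    p_keep: fraction of the P-wave yield retained (assumed 0.8)
    s_keep: fraction of the S-wave yield retained (assumed 0.5)
    """
    p_yield = p_keep * (1.0 - fs)
    s_yield = s_keep * fs
    new_fs = s_yield / (p_yield + s_yield)
    # relative growth of the statistical error if it is driven by the P-wave yield
    error_scale = 1.0 / math.sqrt(p_keep)
    return new_fs, error_scale
```

Calling `narrow_window(0.07)` returns the S-wave fraction in the narrower window and the factor by which the statistical error grows under these assumptions. The toy studies described above remain the reference for the actual size of both effects.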
Until now the lineshape of the S-wave has been parameterised according to the LASS model (Eqs. 3.7-3.9). We assess the model dependence of this assumption by using the alternative isobar model [28] for generating the S-wave component while keeping the same fit model. This only has an effect on the fits where a fit is performed to the $p^2$ dependence. The results show that the systematic uncertainty due to the model dependence is much smaller than the statistical error for all observables and for all sample sizes we studied.

Conclusion
In summary, the inclusion of a resonant $K^+\pi^-$ S-wave in the angular analysis of $B^0 \to K^{\ast 0}\ell^+\ell^-$ has been formalised and the complete angular distribution for both an S- and P-wave state described. We find that the inclusion of an S-wave state has an overall dilution effect on the theoretical observables. The impact of an S-wave on an angular analysis is evaluated using toy Monte Carlo datasets. We find that the S-wave contribution can only be ignored for datasets of less than 200 events. The bias on the angular observables incurred by assuming a pure P-wave $K^+\pi^-$ state can be removed by including the S-wave in the angular distribution. The degradation in resolution on the angular observables from fitting a more complicated angular distribution can be minimised by performing the fit in a wide region around the $K^{\ast 0}(892)$ resonance. The systematic uncertainty introduced by the model dependence of the S-wave lineshape is minimal and can be ignored.