A complete description of P- and S-wave contributions to the $B^0\to K^{+}\pi^{-}\ell^{+}\ell^{-}$ decay

In this paper we present a detailed study of the four-body decay $B^0\to K^{+}\pi^{-}\ell^{+}\ell^{-}$, where tensions with the Standard Model predictions have been observed. Our analysis of the decay with P- and S-wave contributions to the $K^{+}\pi^{-}$ system develops a complete understanding of the symmetries of the distribution, in the case of massless and massive leptons. In both cases, the symmetries determine relations between the observables in the $B^0\to K^{+}\pi^{-}\ell^{+}\ell^{-}$ decay distribution. This enables us to define the complete set of observables accessible to experiments, including several that have not previously been identified. The new observables arise when the decay rate is written differentially with respect to $m_{K\pi}$. We demonstrate that experiments will be able to fit this full decay distribution with currently available data sets and investigate the sensitivity to new physics scenarios given the experimental precision that is expected in the future. The symmetry relations provide a unique handle to explore the behaviour of S-wave observables by expressing them in terms of P-wave observables, therefore minimising the dependence on poorly-known S-wave form factors. Using this approach, we construct two theoretically clean S-wave observables and explore their sensitivity to new physics. By further exploiting the symmetry relations, we obtain the first bounds on the S-wave observables using two different methods and highlight how these relations may be used as cross-checks of the experimental methodology. We identify a zero-crossing point that would be at a common dilepton invariant mass for a subset of P- and S-wave observables, and explore the information on new physics and hadronic effects that this zero point can provide.


Contents
1 Introduction and Motivation
2 Structure of the differential decay rate: P and S waves
2.1 P-wave massive observables
2.2 Definition of S-wave observables: massless and massive case
3 Symmetries of the distribution
3.1 Counting degrees of freedom: massive and massless cases
3.2 P-wave and S-wave symmetry relations among observables
4 Bounds on S-wave observables and $W_{1,2}$ observables
5 Common zeroes of P- and S-wave observables
5.1 A closer look at the observable $X_2$: from New Physics to hadronic contributions
6 Experimental prospects and precision

1 Introduction and Motivation
Recent years have witnessed rising interest in the B-flavour anomalies as potential hints of New Physics (NP). This interest is driven on the one side quantitatively, by the observation of an increasing number of observables deviating from their Standard Model (SM) predictions, and on the other side qualitatively, by an enhancement of the statistical significance of the NP hypotheses in $b\to s\ell^+\ell^-$ global analyses. Recent analyses [1,2] (see also [3-6]) show that some NP hypotheses exhibit a pull with respect to the SM of more than $7\sigma$ and point to a NP contribution that is dominantly left-handed with a vector (or axial-vector) coupling to muons that breaks Lepton Flavour Universality. Solutions with additional small NP contributions from right-handed currents or Lepton Flavour Universal (LFU) NP contributions [7] are also compatible with the data. Measurements of $B^0\to K^{*0}(\to K^+\pi^-)\ell^+\ell^-$ decays with the $K^+\pi^-$ system in a P-wave configuration give rise to several of the anomalies observed, and an improved understanding of these decays is essential to distinguish between the SM and possible NP scenarios. The LHCb collaboration has observed the presence of a large $K^+\pi^-$ S-wave component in $B^0\to K^+\pi^-\ell^+\ell^-$ decays [8,9]. However, the lack of reliable $B\to K^+\pi^-$ S-wave form factors means that the physics potential of this component remains untapped.
In this paper we present the potential of $B^0\to K^+\pi^-\ell^+\ell^-$ transitions to search for physics beyond the SM, considering both P- and S-wave contributions to the $K^+\pi^-$ system. For other works studying the impact of the S-wave contribution we refer the reader to Refs. [10-17]. Key to this work is the identification of the symmetries of the five-dimensional decay rate that underpin the complete set of $B^0\to K^+\pi^-\ell^+\ell^-$ observables and the relations between them. In particular, we identify new observables related to the interference between the S- and P-wave amplitudes of the $K^+\pi^-$ system, and use the symmetry relations to investigate the potential of S-wave observables as precision probes of NP. We work under the hypothesis of no scalar or tensor NP contributions in our study of the symmetries. In addition, we present a new and robust way to extract information on both NP scenarios and non-perturbative hadronic contributions by studying the common position in dilepton mass squared at which a subset of P- and S-wave observables cross zero.
Using pseudoexperiments that account for both background and detector effects, we make the first study of the capability of the LHCb experiment to extract the complete set of P- and S-wave observables from a single fit to the five-dimensional differential decay rate of $B^0\to K^+\pi^-\ell^+\ell^-$ decays. We also investigate the potential of combinations of the new S-wave observables to separate between relevant NP scenarios, in light of the current anomalies, for both current and future data sets. The complexities of both experimental and theoretical techniques to study $B^0\to K^+\pi^-\ell^+\ell^-$ transitions lend themselves to systemic errors. We therefore use the symmetry relations to devise stringent and model-independent cross-checks of the validity of both experiment and theory methodologies.
The paper is organised as follows. In section 2, we discuss the structure of the differential angular distribution including P- and S-wave contributions. In the case of P-wave observables with massive leptons, we study the sensitivity of previously identified observables to new scalar and pseudoscalar contributions. In the case of the S wave, we define new observables. In section 3 we first perform an analysis of the degrees of freedom required to fully describe the angular distribution, then identify the symmetries of the angular distribution and derive a set of relations between P- and S-wave observables that are a consequence of these transformation symmetries. These relations offer control tests for both experimental and theoretical analyses. Significantly, given the lack of knowledge of S-wave form factors, these relations also enable predictions for some combinations of S-wave observables in terms of P-wave observables. In section 4, the relations are used to obtain the first bounds on the complete set of S-wave observables and the potential to observe NP with some of these observables is discussed. In section 5, a set of P- and S-wave observables that share a zero at the same position in dilepton invariant mass is highlighted and the resulting information on both NP scenarios and on hadronic effects is discussed. The experimental prospects for determining all of the P- and S-wave observables discussed, in both massless and massive cases, are presented in section 6. Finally, a summary and conclusion are presented in section 7.
2 Structure of the differential decay rate: P and S waves
The differential decay rate of the four-body transition $B\to K\pi\ell^+\ell^-$ receives contributions from the amplitude of the P-wave decay $B\to K^*(\to K\pi)\ell^+\ell^-$, as well as from the amplitude of the S-wave decay $B\to K^*_0(\to K\pi)\ell^+\ell^-$, with $K^*_0$ being a broad scalar resonance. The rate, differential in $q^2$, $m_{K\pi}$ and $\Omega$, can then be decomposed into
$$\mathrm{d}^5\Gamma = \mathrm{d}^5\Gamma_P + \mathrm{d}^5\Gamma_S, \qquad (2.1)$$
where $\mathrm{d}\Omega = \mathrm{d}\cos\theta_\ell\,\mathrm{d}\cos\theta_K\,\mathrm{d}\phi$, $\Gamma_P$ contains the pure P-wave contribution and $\Gamma_S$ contains the contributions from pure S-wave exchange, as well as from S-P interference.
Here, $q^2$ denotes the square of the invariant mass of the lepton pair and $m_{K\pi}$ the invariant mass of the $K\pi$ system. The angles $\theta_\ell$, $\theta_K$ describe the relative directions of flight of the final-state particles, while $\phi$ is the angle between the dilepton and the dimeson plane (see Ref. [18] for definitions). The differential rate for a $B^0$ decay to a final state in the P-wave configuration is, up to an overall normalisation,
$$\mathrm{d}^5\Gamma_P \propto \Big[ J_{1s}\sin^2\theta_K + J_{1c}\cos^2\theta_K + J_{2s}\sin^2\theta_K\cos 2\theta_\ell + J_{2c}\cos^2\theta_K\cos 2\theta_\ell + J_3\sin^2\theta_K\sin^2\theta_\ell\cos 2\phi + J_4\sin 2\theta_K\sin 2\theta_\ell\cos\phi + J_5\sin 2\theta_K\sin\theta_\ell\cos\phi + J_{6s}\sin^2\theta_K\cos\theta_\ell + J_{6c}\cos^2\theta_K\cos\theta_\ell + J_7\sin 2\theta_K\sin\theta_\ell\sin\phi + J_8\sin 2\theta_K\sin 2\theta_\ell\sin\phi + J_9\sin^2\theta_K\sin^2\theta_\ell\sin 2\phi \Big] \times |BW_P(m_{K\pi})|^2, \qquad (2.2)$$
with a similar form for the $\bar B^0$ rate. The $m_{K\pi}$ dependence, denoted by $BW_P(m_{K\pi})$, can be modelled by a relativistic Breit-Wigner amplitude describing the $K^{*0}$ resonance, including the apposite angular momentum and phase-space factors. The Breit-Wigner amplitude is normalised such that the integral of the modulus squared of the amplitude over the $m_{K\pi}$ region of the analysis is one. For the exact form of the Breit-Wigner functions $BW_i(m_{K\pi})$ we refer the reader to Ref. [10].
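As a minimal sketch of how the P-wave angular structure given above can be evaluated numerically, the following Python function sums the twelve angular terms for a chosen set of coefficients. The function name `p_wave_angular`, the dictionary labels and the omission of both the overall normalisation and the $|BW_P|^2$ factor are illustrative assumptions, not part of the analysis described in this paper.

```python
import math


def p_wave_angular(J, cos_theta_l, cos_theta_k, phi):
    """Sum the P-wave angular terms for coefficients J (hypothetical helper).

    J maps labels such as "1s" or "6c" to numbers; missing labels count as zero.
    The overall normalisation and the Breit-Wigner factor are left out.
    """
    def j(key):
        return J.get(key, 0.0)

    sin2_k = 1.0 - cos_theta_k ** 2                    # sin^2(theta_K)
    cos2_k = cos_theta_k ** 2                          # cos^2(theta_K)
    sin_2k = 2.0 * math.sqrt(sin2_k) * cos_theta_k     # sin(2 theta_K)
    sin2_l = 1.0 - cos_theta_l ** 2                    # sin^2(theta_l)
    sin_l = math.sqrt(sin2_l)                          # sin(theta_l) >= 0
    sin_2l = 2.0 * sin_l * cos_theta_l                 # sin(2 theta_l)
    cos_2l = 2.0 * cos_theta_l ** 2 - 1.0              # cos(2 theta_l)

    return (
        j("1s") * sin2_k + j("1c") * cos2_k
        + j("2s") * sin2_k * cos_2l + j("2c") * cos2_k * cos_2l
        + j("3") * sin2_k * sin2_l * math.cos(2.0 * phi)
        + j("4") * sin_2k * sin_2l * math.cos(phi)
        + j("5") * sin_2k * sin_l * math.cos(phi)
        + j("6s") * sin2_k * cos_theta_l + j("6c") * cos2_k * cos_theta_l
        + j("7") * sin_2k * sin_l * math.sin(phi)
        + j("8") * sin_2k * sin_2l * math.sin(phi)
        + j("9") * sin2_k * sin2_l * math.sin(2.0 * phi)
    )
```

With only $J_{1s}=J_{1c}=1$ switched on, the function returns $\sin^2\theta_K+\cos^2\theta_K=1$ at any phase-space point, which is a convenient sanity check of the angular factors.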
The differential rate of the S-wave final state configuration is
$$\ldots + \tilde J^{c}_{1b}\cos\theta_K + \tilde J^{c}_{2b}\cos 2\theta_\ell\cos\theta_K + \tilde J_4\sin 2\theta_\ell\sin\theta_K\cos\phi + \tilde J_5\sin\theta_\ell\sin\theta_K\cos\phi + \tilde J_7\sin\theta_\ell\sin\theta_K\sin\phi + \tilde J_8\sin 2\theta_\ell\sin\theta_K\sin\phi. \qquad (2.3)$$
[19,20]. Similarly to the P wave, the S-wave $m_{K\pi}$ amplitude is normalised such that the integral of the modulus squared of the amplitude over the analysed $m_{K\pi}$ range is one.
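The normalisation condition on the $m_{K\pi}$ lineshapes can be sketched numerically as follows. This is only an illustration under simplifying assumptions: the Breit-Wigner below uses a constant width and omits the angular momentum and phase-space factors of the lineshapes in Ref. [10]; the function names, the trapezoidal integration, and the approximate $K^{*0}$ mass and width quoted in the usage note are all choices made for this example.

```python
def breit_wigner(m, m0, gamma0):
    """Relativistic Breit-Wigner amplitude with a constant width (assumption)."""
    return 1.0 / complex(m0 ** 2 - m ** 2, -m0 * gamma0)


def trapezoid(values, step):
    """Trapezoidal integral of equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def normalised_lineshape(m0, gamma0, m_lo, m_hi, n_points=2001):
    """Return (grid, amplitudes, step) with the integral of |amplitude|^2 equal to one."""
    step = (m_hi - m_lo) / (n_points - 1)
    grid = [m_lo + i * step for i in range(n_points)]
    amps = [breit_wigner(m, m0, gamma0) for m in grid]
    norm = trapezoid([abs(a) ** 2 for a in amps], step)
    scale = norm ** -0.5
    return grid, [scale * a for a in amps], step
```

For instance, `normalised_lineshape(0.8955, 0.0473, 0.795, 0.995)` (masses in GeV, approximate $K^{*0}$ parameters) returns amplitudes whose modulus squared integrates to one over that window on the same grid.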
If not explicitly stated otherwise, we will not consider the presence of scalar or tensor contributions in the following (this implies, in particular, that $J_{6c}$ in Eq. (2.2) is taken to be zero). The decays $B\to K^*\ell^+\ell^-$ and $B\to K^*_0\ell^+\ell^-$ are described by seven complex amplitudes $A^{L,R}_{\parallel,\perp,0}$, $A_t$ and three complex amplitudes $A'^{L,R}_0$, $A'_t$, respectively, where the upper index $L$, $R$ refers to the chirality of the outgoing lepton current, while in the case of the P wave the lower index $\parallel$, $\perp$, $0$ indicates the transversity amplitude of the $K^*$ meson.
Since the distribution is summed over the spins of the leptons, the observables $J_i$ and $\tilde J_i$ are described in terms of spin-summed squared amplitudes of the form $A^{L*}_i A^L_j \pm A^{R*}_i A^R_j$. This structure suggests that the amplitudes can be arranged in a set of two-component complex vectors, Eq. (2.4). Two vectors, $n_S$ and $n'_S$, are needed to parametrize the L and R components of the $A'_0$ amplitude, and the $A_t$ and $A'_t$ amplitudes are not expressed in terms of two-component complex vectors. Except for the lepton mass terms that mix the L and R components and include the $A_t$ (or $A'_t$) amplitudes, one can express the coefficients of the distribution in terms of these vectors. The expressions for the coefficients in the P-wave terms can be found in Refs. [18] and [21]. Analogous expressions hold for the S-wave terms, for the P-S (real) interference terms, Eq. (2.6), and finally for the P-S (imaginary) interference terms, Eq. (2.7), where $\beta = \sqrt{1-4m_\ell^2/q^2}$ and the superscript indices $r$ and $i$ (here and for the rest of the paper) refer to the real and imaginary parts of the bilinears, respectively.

The study of the S-wave observables presented in this paper is the first to consider the complete set of observables that arise when the decay rate is written differentially with respect to $m_{K\pi}$. As a consequence, the interference between the S- and P-wave $m_{K\pi}$ lineshapes projects out additional bilinear combinations of S- and P-wave amplitudes, giving rise to the 12 new observables $\tilde J^{r,i}_i$ given in Eqs. (2.6) and (2.7). Previous studies, such as those of Ref. [22], only considered the differential decay rate integrated over $m_{K\pi}$. In this case one obtains the six well-known S-P interference observables $\tilde J_i$ that can be described by a single two-dimensional S-wave amplitude vector $n_S$, without the need for $n'_S$.

2.1 P-wave massive observables
The so-called optimized observables are designed to reduce form factor uncertainties. The set of such observables that describes the P-wave $K\pi$ system has been discussed at length in a series of papers [21,23,24]. However, due to improvements in experimental precision, there is increasing sensitivity to observables that are suppressed by factors of the lepton mass. For the optimized observables, $P_i$, the impact that lepton masses have in the very low $q^2$ region via the kinematical prefactor $\beta$ is well known.
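To give a feeling for the size of this effect, the kinematical prefactor $\beta=\sqrt{1-4m_\ell^2/q^2}$ can be evaluated directly. The sketch below is illustrative only: the function name `beta_lepton` and the approximate muon mass constant are assumptions introduced for this example.

```python
import math

MUON_MASS_GEV = 0.1056584  # approximate muon mass in GeV (illustrative input)


def beta_lepton(q2, m_lepton):
    """Return beta = sqrt(1 - 4 m^2 / q^2) for q2 in GeV^2 and m_lepton in GeV."""
    threshold = 4.0 * m_lepton ** 2
    if q2 <= 0.0 or q2 < threshold:
        raise ValueError("q2 must lie at or above the dilepton threshold 4 m^2")
    return math.sqrt(1.0 - threshold / q2)
```

For muons $\beta$ is already close to 0.98 at $q^2 = 1$ GeV$^2$, while it vanishes at threshold, $q^2 = 4m_\mu^2$, so lepton-mass effects are concentrated in the very low $q^2$ region.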
Our interest here is to explore two further optimized observables $M_1$ and $M_2$, introduced in Ref. [21], that can be neglected in the massless limit. These observables are defined in terms of the coefficients of the distribution.$^1$ For this specific type of observable it makes sense to explore the impact from NP scalar and pseudoscalar contributions. Therefore, we will relax in this section the hypothesis of no scalar or pseudoscalar contributions. Even considering a large set of NP scenarios, the observable $M_1$ is found to be practically insensitive to NP and is not analysed further. By contrast, $M_2$ can potentially provide information on scalar and/or pseudoscalar NP scenarios. In order to explore reasonable values of (pseudo)scalar contributions, we constrain the range for the coefficients $C_{P,S}$ by considering only those values allowed by the experimental measurement of $\mathcal{B}(B_s\to\mu\mu)$. Thus, we use the ratio $R_{B_s\to\mu\mu}$ [25], Eq. (2.10), to define the $1\sigma$ region from $\mathcal{B}^{\rm exp}(B_s\to\mu\mu)$, where the quantities $S$, $P$ $^2$ contain the different NP contributions and are given in Eq. (2.11). Fig. 1 shows the allowed region for $S$ and $P$ once the latest experimental value $\mathcal{B}(B_s\to\mu\mu) = (2.85 \pm 0.34)\times 10^{-9}$ [26] is included, corresponding to $R_{B_s\to\mu\mu} = (0.78 \pm 0.10)$. In this analysis we have not allowed for the presence of right-handed currents.
We perform an analysis of the behaviour of $M_2$ under different hypotheses for (pseudo)scalar NP contributions that are compatible with Fig. 1. The case $S = 0$, $P = 1$ corresponds to the SM, as can be seen from Eqs. (2.10) and (2.11). We consider three other possible scenarios, corresponding to maximal values of $S$, $P$: iii) $S = 0$, $P = -0.94$.

$^1$ In order to make the comparison with experimental prospects easier, in this work we have slightly changed the definition of $M_{1,2}$ by removing the constant terms appearing in Ref. [21].

$^2$ Not to be confused with the P- and S-wave components of the decay, these $S$, $P$ refer to Scalar and Pseudoscalar NP contributions entering $R_{B_s\to\mu\mu}$. The latter includes the SM axial-vector contribution.

Figure 1. Region of allowed values for $S$, $P$ that fulfill the condition $|R^{\rm SM}_{B_s\to\mu\mu} - R^{\rm NP}_{B_s\to\mu\mu}| \le 0.10$. In order to illustrate the sensitivity of this observable to NP contributions, we display its value in the SM (black star) and in one of the favoured scenarios from Ref. [2] (blue dot). Only the dependence on $C_{10\mu}$ is displayed in the plot. The tiny difference of this scenario with the SM illustrates that $M_2$ is an observable with low sensitivity to the preferred scenarios of present global fits. For this reason we explore its sensitivity under other types of NP, namely scalars and pseudoscalars.
These three benchmark cases are: i) only a scalar contribution (with two possible signs) and no pseudoscalar NP, ii) both $S$ and $P$ contributions present and equal in magnitude and iii) the opposite sign of the SM case with a negative pseudoscalar contribution. Fig. 2 shows the theoretical prediction of the large- and low-recoil bins of $M_2$ in the four scenarios mentioned above.
It is evident from Fig. 2 that the rather small sensitivity of $M_2$ to (pseudo)scalar contributions makes it difficult to get a significant distinction between the different scenarios. This is especially the case in the very low $q^2$ region, where the uncertainties associated with the theoretical prediction of this observable are larger. Only for the $S = 0$, $P = -0.94$ scenario in the large-recoil region is a clean separation between hypotheses possible, given suitably high precision measurements. The situation is somewhat better in the low-recoil $q^2$ region, where the theoretical errors are smaller but an even higher experimental resolution will be required. The experimental prospects for such a separation of NP hypotheses are outlined in section 6.3.1.

2.2 Definition of S-wave observables: massless and massive case
In this section we define the list of S-wave observables that can be constructed using the coefficients of the distribution. They follow from the previous section including P and S waves in the massless case but also taking into account lepton mass terms. The S-wave observables that were mostly treated as nuisance parameters thus far will become an interesting target for future experimental analyses.
Our goal here will be to define the S-wave observables but it is beyond the scope of this paper to provide SM predictions and enter into a discussion of the form factors or other hadronic uncertainties. Our first interest is to determine how many of the observables are genuinely independent. The question of the number of degrees of freedom is critical for the stability of experimental fits and is discussed further in section 3.
As discussed in section 2, the new S-P interference observables defined in Eqs. (2.6) and (2.7) can be expressed in terms of the vectors in Eq. (2.4), see Eq. (2.12). Here the prime stands for the differential distribution. Note that once we include lepton mass terms, $F_S$ should be extracted from $\tilde J^c_{2a}$ and not from the combination with $\tilde J^c_{1a}$, see Eq. (2.14). In order not to overload the notation, it should be understood that from Eq. (2.14) to Eq. (2.23) each explicit $J$ or $\tilde J$ is accompanied by its CP-conjugate partner. In the case that $B^0$ and $\bar B^0$ decays were experimentally separated, a set of CP asymmetries corresponding to each $J$ and $\tilde J$ observable would also become accessible (see section 6.1).
In terms of these observables, the angular distribution in the massless limit (taking $\beta \to 1$ in Eq. (2.12)) is given in Eq. (2.15). The corresponding angular distribution in the massive case can be obtained from Eq. (2.3) using optimized S-wave observables and mass terms, together with the extra S-P interference massive optimized terms. Notice that in the massless limit ($M^{(\prime)}_i \to 0$, $\beta \to 1$) Eq. (2.18) reduces to Eq. (2.15). Finally, in order to write the whole distribution with massive terms and optimized observables, a substitution is needed that involves the optimised observables for the interference terms in all $q^2$ bins. Using the expressions$^3$ of Eq. (2.23), where $N_{1,2} = \beta^2 F_{T,L}\,\Gamma_P$, $N_3 = \beta^2 \sqrt{F_T F_L}\,\Gamma_P$ (and the addition of the CP conjugate in $\Gamma_P$ is implicit), and including the definitions of $M_{1,2}$, one finds the full massive distribution, where a global pre-factor $\beta^2$ has been absorbed inside the re-definition $F_{T,L} \to \beta^2 F_{T,L}$.

$^3$ One may add to this list another observable, related to the presence of scalars, associated with the coefficient $J_{6c}$. Given that in the present paper we only allow for scalars when analyzing the observable $M_2$, we direct the reader to Ref. [21], where this case is discussed.

3 Symmetries of the distribution
In this section we present the explicit form of the symmetry transformations of the amplitudes that leave the full distribution (including P and S wave) invariant, and obtain explicitly the relations among the observables. The massless and the massive cases are discussed separately.
The number of symmetries of the distribution is determined by performing an infinitesimal transformation $A' = A + \delta$, where $A$ is a vector collecting the real and imaginary parts of all the amplitudes entering the distribution (the vector $A$ depends on whether the massless or massive hypothesis is taken), and the condition to be a symmetry is that the vector $\delta$ is perpendicular to the hyperplane spanned by the set of gradient vectors:
$$\nabla X_j \perp \delta \quad \text{for all } X_j \in \{J_i, \tilde J_i\}. \qquad (3.1)$$
The gradients are defined by the derivatives of the coefficients with respect to the real and imaginary parts of all the amplitudes. The difference between the dimension of the hyperplane that the gradient vectors would span if they were all independent (equal to the number of coefficients of the distribution) and the dimension of the hyperplane that they effectively span tells us the number of relations that exist among the coefficients. By relations we will refer only to non-trivial relations. We will discuss these relations in the following subsections.

For completeness, we first find explicitly the form of the continuous symmetries. In Ref. [21], the massless and massive symmetries were discussed for the P wave. There it was found that, in the massless case, four symmetries (two phase transformations for the left and right components and two "angle rotations") leave the P-wave part of the distribution invariant. Alternatively, using the vectors $n_i$ we can implement the four symmetry transformations by means of a $2\times2$ unitary matrix, i.e., $n'_i = U n_i$ with $i = \perp, \parallel, 0$.
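The gradient-rank argument can be illustrated with a toy example. The sketch below is not the matrix used in this paper: as a stand-in for the massless P-wave coefficients it takes the nine real bilinears $|n_i|^2$, $\mathrm{Re}(n_i^\dagger n_j)$, $\mathrm{Im}(n_i^\dagger n_j)$ of three two-component complex vectors (12 real parameters), builds the matrix of gradients by central differences, and determines its rank by Gaussian elimination. All function names, the random seed and the tolerance are assumptions made for this example.

```python
import random


def gram_observables(x):
    """Nine real bilinears of three 2-component complex vectors packed in x (12 reals)."""
    n = [[complex(x[4 * i + 2 * k], x[4 * i + 2 * k + 1]) for k in range(2)]
         for i in range(3)]

    def dot(a, b):
        return sum(ak.conjugate() * bk for ak, bk in zip(a, b))

    obs = [dot(n[i], n[i]).real for i in range(3)]
    for i in range(3):
        for j in range(i + 1, 3):
            g = dot(n[i], n[j])
            obs += [g.real, g.imag]
    return obs


def jacobian(f, x, h=0.5):
    """Rows = observables, columns = parameters; central differences are exact for bilinears."""
    cols = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        cols.append([(a - b) / (2.0 * h) for a, b in zip(f(xp), f(xm))])
    return [[col[i] for col in cols] for i in range(len(cols[0]))]


def rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # this column adds no new independent direction
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            for j in range(c, len(m[0])):
                m[i][j] -= factor * m[r][j]
        r += 1
    return r


rng = random.Random(1)
point = [rng.gauss(0.0, 1.0) for _ in range(12)]
toy_rank = rank(jacobian(gram_observables, point))
```

In this toy setting the 12 real parameters and the four-parameter $U(2)$ invariance leave 8 independent combinations, so the nine bilinears must satisfy one relation; this mirrors the P-wave counting of 9 coefficients minus one relation.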
However, the inclusion of the S wave, which requires two different vectors $n_S$ and $n'_S$, breaks two of the symmetries$^4$ and only the two independent phase transformations survive, see Eq. (3.2). The massive case is relatively similar and again only two phase transformations survive. However, the existence of interference terms between left and right components fixes $\phi_L = \phi_R = \phi$, but this is compensated by the independent transformation of the extra amplitudes $A^{(\prime)}_t$, see Eq. (3.3), with $i = 0, \perp, \parallel$.$^5$

3.1 Counting degrees of freedom: massive and massless cases
One important question is how many degrees of freedom there are or, in other words, how many observables in the set discussed in section 2.2 are independent. The number of independent observables to fully describe the distribution depends on whether massless or massive leptons are considered. We again work under the hypothesis that there are no scalar contributions but pseudoscalar ones are allowed in the massive case.
The number of observables that can be constructed out of the complex amplitudes is given by
$$n_{\rm obs} = 2 n_A - n_{\rm sym},$$
where $n_A$ is the number of complex amplitudes and $n_{\rm sym}$ the number of symmetries. Each symmetry transformation of the amplitudes that leaves the distribution invariant reduces the number of independent observables. In the following, we determine the number of relations for the massless and massive case and consequently the number of independent observables required to have a full description of the corresponding distribution.

Massless case:
Assuming the absence of scalars, we have 11 coefficients for the P-wave and 14 coefficients for the S-wave distribution. Under the approximation of negligible lepton masses, there are two trivial relations for the P-wave coefficients and three trivial relations for the S-wave coefficients, reducing the number of coefficients to $n_c = 20$. The vector $A$ in the massless case is given in Eq. (3.7). Using Eq. (3.1), we find that the dimension of the space spanned by the gradient vectors (given by the rank of the matrix $M_{ij} = \nabla_i X_j$ with $X = J, \tilde J$ and $i$ being the elements of $A$ in Eq. (3.7)) is $n_{\rm rank} = 14$. This rank gives the number of independent observables $n_{\rm obs}$. According to the discussion above, the number of relations fulfills
$$n_{\rm rel} = n_c - n_{\rm obs}. \qquad (3.8)$$
Therefore for the massless case $n_{\rm rel} = 6$. There is one well-known relation among the coefficients for the P wave (see Refs. [18,28]) and five, previously unknown, relations for the S wave. An independent cross-check of the rank of the matrix is provided by the fact that the number of degrees of freedom obtained by counting amplitudes minus symmetries, or coefficients minus relations, should agree. This implies the equation
$$2 n_A - n_{\rm sym} = n_c - n_{\rm rel}.$$
The number of complex amplitudes is $n_A = 8$ and the number of symmetries of the full distribution (P and S wave) is $n_{\rm sym} = 2$ (see Eq. (3.2)). The set of 14 independent observables consists of 8 (9 coefficients minus one relation) independent observables for the P wave and 6 (11 coefficients minus 5 relations) independent observables for the S wave. This implies that in the massless case the basis of 20 observables, $O_{m_\ell=0} = \{\Gamma', F_L, P_1, P_2, P_3, P'_4, P'_5, P'_6, P'_8, \ldots\}$, has some redundancy. Among these 20 observables there are 6 relations leading to only 14 independent observables. The set of 6 massless relations can be obtained from the 6 massive expressions given below, after taking the massless limit. Notice that the seventh relation, given in the appendix, is exactly zero in the massless limit.

Massive case:
The counting in this case, following the same steps as in the massless case, goes as follows. Our starting point is the same number of coefficients, 11 (14) for the P wave (S wave), but now there are no trivial relations, i.e., $n_c = 25$. Here the vector $A$ also includes the amplitudes $A_t$ and $A'_t$. Notice that pseudoscalar contributions are included in the amplitude $A_t$. Evaluating the rank of the corresponding matrix $M_{ij}$, one finds $n_{\rm rank} = 18$, indicating that in the massive case the number of independent observables is $n_{\rm obs} = 18$. Following Eq. (3.8), one immediately finds that the number of relations should be 7. These relations are discussed and presented in the next subsection. As in the previous case, we can repeat the counting using the amplitudes that build the observables. The number of complex amplitudes is $n_A = 10$ with the same number of symmetries $n_{\rm sym} = 2$ (see Eq. (3.3)) as in the massless case, such that we confirm that there are 18 independent observables.
The set of 18 independent observables in the massive case consists of 10 (11 coefficients minus one relation) independent observables for the P wave and 8 (14 coefficients minus 6 relations) independent observables for the S wave. Therefore, among the corresponding basis of 25 observables there are 7 relations and only 18 observables are independent.
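The two ways of counting used in this subsection (amplitudes minus symmetries, and coefficients minus relations) can be summarised in a few lines. The helper names below are hypothetical; the input numbers are those quoted in the text for the massless and massive cases.

```python
def independent_observables(n_amplitudes, n_symmetries):
    """Twice the number of complex amplitudes minus the number of symmetries."""
    return 2 * n_amplitudes - n_symmetries


def number_of_relations(n_coefficients, n_independent):
    """Non-trivial relations = coefficients minus independent observables."""
    return n_coefficients - n_independent


# Massless case: 8 complex amplitudes, 2 symmetries, 20 coefficients.
massless_obs = independent_observables(8, 2)
massless_rel = number_of_relations(20, massless_obs)

# Massive case: 10 complex amplitudes, 2 symmetries, 25 coefficients.
massive_obs = independent_observables(10, 2)
massive_rel = number_of_relations(25, massive_obs)
```

Both counts reproduce the ranks quoted above: 14 independent observables with 6 relations in the massless case, and 18 independent observables with 7 relations in the massive case.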

3.2 P-wave and S-wave symmetry relations among observables
In this subsection we present for the first time the full set of symmetry relations of the P and S wave in the massive case. These complete the previous partial results given in Refs. [18,21,22,28]. It is helpful to express the observables $J_i$ and $\tilde J_i$ in terms of scalar products $n_i^\dagger n_j$, as shown in Eq. (2.12). All the relations found in this section are functions of $J_i$ and $\tilde J_i$, and an equivalent set of relations in terms of the CP-conjugate partners $\bar J_i$ and $\bar{\tilde J}_i$ can be written. However, the observables are functions of the coefficients and their CP partners. This means that when writing one of these relations in terms of observables the substitution $J_j \to a P_i$ is, strictly speaking, $J_j \to a (P_i + P^{CP}_i)/2$ (with $a$ being some normalization factor). The observable $P^{CP}_i$ is the CP asymmetry associated with the observable $P_i$, defined in Refs. [23,28], and similarly for $\tilde J_i$. For the following analysis and for simplicity, we will neglect the CP asymmetries for both the P and S wave. This is a very good approximation, given that such asymmetries are tiny both in the SM and in the presence of NP models that do not have large NP phases.
Following the strategy in Ref. [22], we exploit the fact that a pair of $n_i$ vectors (with $i = \perp, \parallel, 0, S$ or $i = \perp, \parallel, 0, S'$) spans the space of complex 2-component vectors. We therefore express the other vectors as linear combinations of these vectors. For instance,
$$n_i = a_i\, n_\parallel + b_i\, n_\perp, \qquad i = 0, S. \qquad (3.12)$$
Contracting with the vectors $n_\parallel$ and $n_\perp$, we obtain a system of linear equations [22] which can be solved for $a_i$, $b_i$, see Eq. (3.14). Using the decomposition of $n_0$, $n_S$ in terms of $n_\parallel$, $n_\perp$ (Eq. (3.12)) to calculate the scalar products $|n_0|^2$, $|n_S|^2$, $n_0^\dagger n_S$, the first three relations are obtained. We leave the expressions explicitly in terms of $J_i$ to let the reader choose between different bases or conventions to write the P-wave observables.
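The decomposition strategy described above can be checked numerically with generic vectors. The sketch below is an illustration, not the paper's derivation: it solves the $2\times2$ linear system obtained by contracting $n_i = a_i n_\parallel + b_i n_\perp$ with $n_\parallel$ and $n_\perp$ using Cramer's rule, and then verifies the scalar-product identity $|n_i|^2 = a_i (n_i^\dagger n_\parallel) + b_i (n_i^\dagger n_\perp)$. The function names and the random test vectors are assumptions made for this example.

```python
import random


def dot(a, b):
    """Scalar product a^dagger b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def decompose(n_i, n_par, n_perp):
    """Coefficients (a, b) with n_i = a * n_par + b * n_perp, via Cramer's rule."""
    g_pp = dot(n_par, n_par)      # |n_par|^2
    g_tt = dot(n_perp, n_perp)    # |n_perp|^2
    g_pt = dot(n_par, n_perp)     # n_par^dagger n_perp
    g_tp = dot(n_perp, n_par)     # n_perp^dagger n_par
    r_p = dot(n_par, n_i)         # n_par^dagger n_i
    r_t = dot(n_perp, n_i)        # n_perp^dagger n_i
    det = g_pp * g_tt - g_pt * g_tp
    a = (r_p * g_tt - g_pt * r_t) / det
    b = (g_pp * r_t - g_tp * r_p) / det
    return a, b


def random_vector(rng):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]


rng = random.Random(7)
n_par, n_perp, n_0 = (random_vector(rng) for _ in range(3))
a_0, b_0 = decompose(n_0, n_par, n_perp)
lhs = dot(n_0, n_0)
rhs = a_0 * dot(n_0, n_par) + b_0 * dot(n_0, n_perp)
```

Once $a_i$ and $b_i$ are written in terms of scalar products, and hence in terms of the coefficients of the distribution, identities of this type become relations among observables.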
I. For $i = 0$ in Eq. (3.12) one finds $|n_0|^2 = a_0 (n_0^\dagger n_\parallel) + b_0 (n_0^\dagger n_\perp)$, yielding the first relation. This first relation was found in the massless case in Ref. [18] and in the massive case in Ref. [23], and its consequences were discussed in Ref. [28] once re-expressed in terms of optimized observables, with the parameters $k_1$ and $\delta_i$ (with $i = 1, \ldots, 4$) defined in Ref. [28].
II. Similarly, for $i = S$ in Eq. (3.12) one finds $|n_S|^2 = a_S (n_S^\dagger n_\parallel) + b_S (n_S^\dagger n_\perp)$, and this translates to Eq. (3.17) once expressed in terms of S-wave observables.
III. Finally, the scalar product $n_0^\dagger n_S$ leads to the third relation, Eq. (3.18).

Eq. (3.17) and Eq. (3.18) are the generalizations of the massless limit ($\beta \to 1$) expressions found in Ref. [22]. Following the same methodology but using instead the vector $n'_S$ yields three new relations. Expressing $n'_S$ in terms of $n_\perp$ and $n_\parallel$,
$$n'_S = a'_S\, n_\parallel + b'_S\, n_\perp, \qquad (3.19)$$
and contracting with $n_\parallel$ and $n_\perp$, we get a system of linear equations, Eq. (3.20), from which we can determine $a'_S$ and $b'_S$. Using the properties of the vector $n'_S$ we then obtain the following three relations:

IV. The equality of the modulus of both vectors $n_S$ and $n'_S$, $|n_S|^2 = |n'_S|^2$, implies the fourth relation.

V. Above we focus on relations constructed from the real part of the product of vectors. The imaginary parts provide additional new relations, which lead to the fifth relation.

VI. Finally, combining the vectors $n_S$ and $n'_S$ one finds the sixth relation.

These six relations are common to the massive and massless case, and they reduce to the massless case when taking the limit $\beta \to 1$. There is a very long seventh relation that applies only in the massive case, i.e. it is zero in the limit of massless leptons. For this reason, and given that it is difficult to extract information from such a long relation, we refrain from writing it explicitly and, instead, provide only the main steps to obtain this relation in the Appendix.

4 Bounds on S-wave observables and $W_{1,2}$ observables
Following the strategy of Ref. [22], the relations found in the previous section enable bounds to be placed on the $S^r_{Si}$ observables and the newly defined $S^i_{Si}$ observables. For instance, solving for $S^r_{S2}$ and imposing a real solution in relation II gives a condition (see Ref. [22]) in which the first three terms are negative definite and each of them separately has to be smaller than the last, positive definite, term.

Figure 4. Illustration of the sensitivity of the central value of the bound to the preferred NP scenarios for two observables, $S^{r,i}_{S2}$ and $S^{r,i}_{S3}$.
We have checked explicitly that the variation of the bound in the most significant NP scenarios amounts to at most a 20-25% enhancement.
In a similar way, solving for $S^r_{S3}$ and imposing a real solution yields an analogous condition. Together these imply constraints for $S^r_{S2,3}$ and for $S^i_{S4,5}$. Similarly, using relation IV one finds two further conditions, which lead to the corresponding bounds. All the bounds above can alternatively be obtained using the Cauchy-Schwarz inequalities. For the observables $S^{r,i}_{S1}$ this is the only way to obtain the bounds. For instance, from $|n_0^\dagger n_S|^2 \le |n_0|^2 |n_S|^2$ and a corresponding inequality with $n'_S$, using the properties of the vectors in Eq. (2.4), one arrives at the bounds on $S^{r,i}_{S1}$. All the bounds on the other observables can be re-derived using four analogous inequalities.

We have computed explicitly the bounds of the $S^{r,i}_{Si}$ observables in the SM in Fig. 3. The relatively low sensitivity of the central value of the bound for $S^{r,i}_{S2,3}$ to the dominant NP scenarios is illustrated in Fig. 4. We work under the approximation of substituting $q^2$-dependent observables by their binned equivalents, where we denote the latter using angular brackets. This introduces some uncertainty but, as shown in Ref. [28], this uncertainty is negligible, especially for slowly varying observables like those involved in the bounds. To compute the binned form of the bounds from Eq. (4.9) we consider the theoretical prediction for the observables $F_{L,T}$, $P_1$, taking into account the $1\sigma$ ranges of such observables. Therefore Fig. 3 shows the maximum value allowed for such constraints. For $F_S$ we extract the value from a reduced $m_{K\pi}$ resonance window, $0.795 < m_{K\pi} < 0.995$ GeV. In Fig. 4 we evaluate $F_{L,T}$ and $P_1$ in the corresponding NP scenarios, while taking the SM prediction for $F_S$. The computation of $F_S$ is the only place where we use S-wave form factors. If $F_S$ is taken as an experimental input, then no S-wave form factors are required. Finally, notice that the bounds include a term $(1 - F_S)$. However, in evaluating these bounds we have neglected a small lepton-mass-dependent term (see Eq. (2.20)), taking the massless-limit $F_S$ instead of its massive counterpart.
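The Cauchy-Schwarz inequality underlying these bounds is easy to verify for two-component complex vectors. The sketch below is purely illustrative: the function names are hypothetical and the vectors are arbitrary numbers rather than physical amplitudes. It returns the slack $|n_a|^2 |n_b|^2 - |n_a^\dagger n_b|^2$, which is non-negative and vanishes only when the two vectors are parallel.

```python
def dot(a, b):
    """Scalar product a^dagger b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def cauchy_schwarz_slack(n_a, n_b):
    """Return |n_a|^2 |n_b|^2 - |n_a^dagger n_b|^2, which is >= 0."""
    return dot(n_a, n_a).real * dot(n_b, n_b).real - abs(dot(n_a, n_b)) ** 2


# Orthogonal vectors: the bilinear vanishes and the slack equals |n_a|^2 |n_b|^2.
orthogonal = cauchy_schwarz_slack([1 + 0j, 0j], [0j, 1j])

# Parallel vectors (n_b = 2i n_a): the bound is saturated and the slack is zero.
n_a = [1 + 1j, 2 + 0j]
parallel = cauchy_schwarz_slack(n_a, [2j * x for x in n_a])
```

A bound on an interference bilinear is therefore saturated only when the two amplitude vectors are aligned; in all other configurations the bilinear lies strictly inside the allowed region.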
The third term in Eqs. (4.1), (4.2), (4.5), (4.6) should tend to zero when $x(q_1^2) \to 0$, in order not to violate the condition of a real solution. Indeed, we can repeat the same procedure using relation II but imposing a real solution for $\Delta(S^i_{S4})$ and $\Delta(S^i_{S5})$, and for relation IV imposing a real solution for $\Delta(S^i_{S2})$ and $\Delta(S^i_{S3})$. Neglecting quadratically suppressed terms, $P_3 S^i_{Sj} \ll P_2 S^r_{Sj}$ with $j = 2 \ldots 5$, the resulting equations can be combined, and from $x(q_1^2) = 0$, neglecting $P_3^2$, one finds a further relation valid at this $q^2$.

Figure 5. SM and NP predictions for the observables $W_1$ and $W_2$ as continuous functions of $q^2$ and binned in $q^2$.
Another example of the information that can be extracted from the relations, neglecting quadratic terms of the type O(P 3 S i Sj , P 2 3 ), are the following expressions. These are valid for all q 2 and derive from relations II and IV, respectively. They can be tested as a cross-check of the experimental analyses: with i = 2...5, p = (1 − P 1 )/(1 + P 1 ) and q = −8β 2 P 2 /(1 + P 1 ). Similarly, where p = (1 + P 1 )/(1 − P 1 ) and q = −8β 2 P 2 /(1 − P 1 ).
Eq.(4.14) is particularly interesting because at the zero ofŜ r S3 (or equivalently P S r 3 ) one can predict the absolute value ofŜ r S2 (or P S r 2 ) as a function of P-wave observables with no need to rely on any S-wave form factors. In the case of Eq.(4.16), at the zero ofŜ r S4 one can predict the absolute value ofŜ r S5 at this particular value of q 2 . These are valuable tests to compare with future predictions using calculations of the form factors.
Given that Eq.(4.14) and Eq.(4.16) are functions of P- and S-wave optimized observables (P S r i and P 1,2 ), W 1,2 are also optimized observables. We can compute SM and NP predictions for these two observables using the right hand side of Eq.(4.14) and Eq.(4.16), respectively. These relations then give access to the n ⊥,∥,0 components inside the new S-wave observables, cancelling the dependence on n S and n′ S , and hence their predictions do not require the S-wave form factors. The W 1,2 observables bring new information that can help to disentangle the SM from different NP scenarios, as illustrated in Fig. 5. For W 1 in the region above 4 GeV 2 , the SM and the C NP 9µ = −C NP 10µ scenario are not distinguishable, but all the other scenarios shown can in principle be distinguished from the SM. The expected experimental precision for such measurements is detailed in section 6.
Finally we can use relation III, again neglecting all terms including quadratic products of observables sensitive to imaginary parts of bilinears (P 3 , P 6 ,8 and S i S3,5 ), to find: however, this does not give any additional experimental insight.

Common zeroes of P-and S-wave observables
The optimized observable P 2 can be rewritten in terms of the q 2 -dependent complex-vectors n ⊥ and n in the following way: In the absence of right-handed currents, the maximum of P 2 , denoted P max 2 , occurs at a certain value of q 2 , which we denote q 2 1 . At the maximum, P max 2 (q 2 1 ) 1/(2β). To a very good approximation, this maximum occurs when where in principle a different q 2 is involved. This is because this expression is in fact four equations (two for the real and two for the imaginary part) and, moreover, they have to be combined with their CP conjugated equivalents. Strictly speaking this would require that real and imaginary parts and left and right handed parts have the zero at the same point in q 2 , which is not the case. If we restrict ourselves to only Re(A L ⊥ (q 2 0 )) = Re(A L (q 2 0 )) the obtained position of the zero q 2 0 is in very good agreement with the position of the maximum given by q 2 1 , as illustrated in Table 1. In the presence of right-handed currents the condition n ⊥ (q 2 0 ) n (q 2 0 ) can only be fulfilled if a very concrete combination of Wilson coefficients is realized in Nature: One of the NP scenarios that presently has the highest pull with respect to the SM, (C 7 = 0, C NP 9µ , C 9 µ = −C 10 µ ) indeed fulfills this condition. From now on we will refer to this combination (Eq.(5.3)) as condition R .
In the SM, in the absence of right-handed currents, or in the presence of right-handed currents that fulfill condition R , Table 1 illustrates that q 2 0 and q 2 1 are within 1% of each other. This can be understood due to the small phases entering, but also because the equation: Re(A R ⊥ (q 2 0 )) = −Re(A R (q 2 0 )), (5.4) is exactly fulfilled in the large recoil limit in the absence of right handed currents, or if such currents are present but obey condition R . Under these conditions, deviations from this relation are then due to departures from the large recoil limit. We can parametrize these tiny deviations and the effect of imaginary terms in the following form 6 :

6 Besides the fact that we can compute δ, L and R, these quantities can be bounded experimentally using Eq.(5.1) and Eq.(5.7) and rewriting P2 at the point of its maximum (again, for new physics scenarios with right-handed currents that satisfy condition R , or in the absence of right handed currents) as: This implies that the tiny difference between 1/(2β) and the maximum imposes a bound on each term |δ|, | L| and | R| separately:

Table 1. Position of the zero evaluated from: a) Re(A ⊥ (q 2 0 )) = Re(A (q 2 0 )), b) position of P max 2 (q 2 1 ) and c) the exact position given by X 2 (q 2 2 ). This shows that only in the presence of right handed currents that do not fulfill condition R , as in Hypothesis 1 [2], do the zero points differ significantly from one another.
where N is the normalization factor defined: For new physics scenarios with right handed currents that satisfy condition R or in the absence of right handed currents, a number of other observables are zero at the same point in q 2 at which P 2 is maximal. The relevant observables are formed from pairs of P-and S-wave angular observables: In Fig. 6 the dependence of the position of the zero for several P-wave observables is shown for different NP scenarios. The observables P 3 and P 6,8 would also in principle give a further zero. However, given that they are numerically small over the entire low-q 2 region, they are difficult to determine experimentally. Moreover, the small contribution coming from the λ u = V ub V * us piece of the Hamiltonian distorts the position of the zero for such observables, which motivates their omission from the list above. For the same reason, the S i Si observables and P 1 in the absence of right handed currents are also not included. The X 2,3,4 observables then offer the possibility of looking at the compatibility of multiple zeros, rather than just the zero of single variables such as A FB . In the presence of sizeable right handed currents that do not fulfill condition R , P 2 does not reach the maximal value 1/(2β), and a small difference between the X i observables should be observed. Misalignment between the zeroes of the X i observables could then help confirm a right handed current scenario, although another possible reason for a tiny misalignment is the presence of scalar or pseudoscalar contributions. The observable X 1 is not included in the list above because it is difficult to identify precisely the position of the maximum experimentally.
The point where P 5 and P 4 cross gives the zero of X 2 , as shown in Fig. 6. Unfortunately, when including theory uncertainties using the KMPW computation [29] of the form factors V, A 0,1,2 , T 1,2,3 and long-distance charm, the overlap between the zeroes of different NP scenarios is as shown in Fig. 7. This implies that further efforts are required to improve on the theoretical uncertainty of the observables.
We use the complete perpendicular and parallel amplitudes given by: 5.11) and the longitudinal amplitude: where T ⊥, are defined in Ref. [30], r 1,2,3 (s) correspond to the different types of nonfactorizable power corrections included in our analysis [31], and c long ⊥, ,0 (s) is a parametrization of long distance charm contribution (see Refs. [32,33] for the definition of the parameters): .

(5.13)
Finally, the corresponding theoretical position of the zero is the solution of the following implicit equation: where C eff 9 (q 2 ) collects all pieces and, in order to simplify the expression, we take all nonfactorizable power corrections at their central values but keep long distance charm explicit inside C eff 9 (taking c long ⊥ = c long = c long ): The form factors include soft form factors, α s and power corrections and T ⊥ also includes the non-factorizable QCDF contribution. Eq.(5.14) offers an interesting combined test of form factors, Wilson coefficients and long-distance charm at a specific point in q 2 .

A closer look at the observable X 2 : from New Physics to hadronic contributions
In this section the properties of the observable X 2 = βP 5 −P 4 are analyzed in detail, focusing on the q 2 bin where the zeroes fall both in the SM and in the NP scenarios considered [1,2]. While all the relevant observable information is already included inside global fits, analyzing particular observables like X 2 can provide guidance on how to disentangle NP effects in the longer term. This observable has a simple structure in terms of Wilson coefficients when evaluated in the q 2 bin [1.8,2.5]: 16) where in this equation refers to a tiny contribution that is non-zero only in the presence of right handed currents, in particular contributing to C 9 µ , that can be cast as −0.02 C 9 µ (1 + 2(C 9 µ − C NP 9µ )). As can be seen immediately from this equation, X 2 SM [1.8,2.5] ∼ −0.14. Independent of the details of the physics model, almost all NP scenarios with C NP 9µ = 0 yield 0.88 < C NP 10µ − C NP 9µ < 1.26, implying 0.05 ≤ X 2 [1.8,2.5] ≤ 0.14. One relevant exception is Scenario 8, corresponding to X 2 [1.8,2.5] = 0.19. This scenario contains a LFU contribution in C 9 , which would imply a contribution to the electronic mode too, X 2e [1.8,2.5] = βP 5,e − P 4,e [1.8,2.5] 0.07.
In summary, given that X 2 [1.8,2.5] is predicted to be approximately −0.1 in the SM and up to +0.2 in some relevant NP scenarios, an experimental precision of ±0.1 would allow some of the NP scenarios to be disentangled from the SM. However, as shown above, with the present theoretical accuracy the theory predictions in q 2 bins yield a large overlap, preventing any clear discrimination. This is not surprising, because the deviation of P 5 in the [1.8, 2.5] bin is not so large compared to the anomalies in the bins [4,6] or [6,8]. Moreover, given that P 4 is quite SM-like (see discussion below), it is expected that the largest deviation for this observable will occur in the [4,6] and [6,8] bins. This is confirmed in Fig. 8.
Due to the stability of X 2 under most NP scenarios, it is essential to improve on its theoretical uncertainties. In parallel we can explore the sensitivity that P 5 and P 4 may offer Figure 8. SM and NP predictions for X 2 binned in q 2 .
individually in the [1.8, 2.5] bin. For completeness, we provide the relevant expressions here:  Table 1 and Refs. [1,2]). This is different in the case of P 5 [1.8,2.5] , which exhibits an enhanced sensitivity to C NP 9µ that drives the wider range. Moreover, the current size of the theory uncertainty of P 4 [1.8,2.5] erases any possibility of discrimination between the SM and NP scenarios, but in the case of P 5 [1.8,2.5] the smaller size of the error leaves some discrimination power.
In order to discern hadronic contributions, the following strategy can be employed. The best fit point from a global fit that excludes P 5 and P 4 can be used to predict the NP contributions entering X 2 [1.8,2.5] , as well as P 5 and P 4 individually. These predictions can be contrasted with the experimental results in order to assess the SM contributions to P 5 and P 4 . As noted above, the SM predicts X 2 [1.8,2.5] = −0.14, but P 5 [1.8,2.5] = −0.01 and P 4 [1.8,2.5] = 0.13. Such values arise from a complex interplay between several SM sources, among them the hadronic form factors, T ⊥ , T ∥ (these pieces encode, in particular, the non-factorizable power corrections), the value of the Wilson coefficients in the SM but also perturbative charm-loop contributions. Here we parametrize the remaining charm loop long-distance contributions in a manner that matches the non-perturbative computation from Ref. [29]. In practice, when quoting long-distance charm loops we refer to Eq.(5.13) for the transverse and perpendicular components and for the longitudinal one.
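The SM numbers quoted above are mutually consistent with the definition X 2 = βP 5 − P 4 , as the following arithmetic sketch shows. It assumes the usual lepton velocity β = sqrt(1 − 4m 2 /q 2 ) evaluated at the bin centre, and it treats the binned observable as the combination of the binned values, which is only approximately true; the function names are illustrative.

```python
import math

M_MU = 0.1056584  # muon mass in GeV

def beta(q2, m_lepton=M_MU):
    """Lepton velocity factor, assumed to be sqrt(1 - 4 m^2 / q^2)."""
    return math.sqrt(1.0 - 4.0 * m_lepton ** 2 / q2)

def x2(p5, p4, q2):
    """X_2 = beta * P_5 - P_4 evaluated at a single q^2 value."""
    return beta(q2) * p5 - p4

# Centre of the [1.8, 2.5] GeV^2 bin and the SM values quoted in the text.
q2_centre = 0.5 * (1.8 + 2.5)
x2_sm = x2(p5=-0.01, p4=0.13, q2=q2_centre)
```

Because β is within about 1% of unity in this bin, the result is dominated by the difference of the two inputs.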
Using Eqs.(5.10-5.12) we can write the observables as follows 7 : The first point to notice is that both P 4 [1.8,2.5] and P 5 [1.8,2.5] are dominated by ReT ⊥ and the dominant long distance comes from s 0 , in both cases with a very similar magnitude. Subleading contributions arise from T and s ⊥, . Secondly, X 2 [1.8,2.5] has a negligible sensitivity to T and s 0 , and the first long-distance piece enters via subleading contributions from s ⊥, . Thus this observable is basically dominated by ReT ⊥ and proves to be quite robust against long-distance charm loop contributions in this bin.
Finally, recalling the stability of X 2 [1.8,2.5] under different NP scenarios, we can parametrize this observable to a very good approximation as: 19) where the interplay between NP and the non-factorizable QCDF hadronic contributions is clearly encoded. This implies that a measurement of X 2 [1.8,2.5] could provide an experimental constraint on ReT ⊥ in [1.8, 2.5], correlated with the NP scenario used, to be confronted with the SM prediction. The determination of ReT ⊥ can be seen as a non-trivial test of QCDF. Notice also that, as discussed at the beginning of this section, ReT ⊥ has a significant impact on the position of the zero of X 2 . As soon as T ⊥ is experimentally determined, the correlated measurement of the individual observables P 4 [1.8,2.5] and P 5 [1.8,2.5] will provide a handle on s 0 , the dominant long-distance charm loop in this bin.
The size of such effects should be clearly seen with the precision that should be attained during Run 4 of the LHC.

7 We neglect tiny contributions from ImT ⊥

Experimental prospects and precision
The angular observables in B 0 → K + π − µ + µ − decays are usually extracted by means of a maximum likelihood fit of the decay rate in Eqs.(2.1), (2.2) and (2.3) to experimental data in bins of q 2 [35]. Practically, such a fit is achieved by the minimisation of a negative log-likelihood. Section 3 has elucidated that from the combination of the decay amplitudes not all of the angular observables are independent. In a fit to experimental data however, each observable is simply the coefficient of an angular term and is therefore an independent parameter. In the massless lepton case one can impose relations, for example the trivial S c 1 = −S c 2 , where the CP -averaged observables are defined by S i = (J i +J i )/(Γ P +Γ P ). However, these do not apply in the low q 2 region where the leptons should be treated as massive. In principle, the new relations of section 3 may also be used to reduce the number of free parameters in the fit. However they are too complex to be implemented in a minimisation procedure, as they cause discontinuities in the negative log-likelihood. Therefore the full bases of angular observables must be fitted in each of the massive and massless cases. The symmetry relations may instead be checked after the fit has been carried out in order to ensure that physically reasonable results have been obtained.
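The fitting strategy described above can be illustrated with a deliberately reduced toy. The sketch assumes the one-dimensional cos θ l projection for massless leptons, (3/4) F L (1 − c 2 ) + (3/8)(1 − F L )(1 + c 2 ) + A FB c, rather than the full decay rate of Eqs.(2.1)-(2.3), and it replaces a proper minimiser with a coarse grid scan. Sample size, seed and true values are hypothetical.

```python
import math
import random

def pdf(c, fl, afb):
    """Assumed cos(theta_l) projection for massless leptons (illustrative)."""
    return 0.75 * fl * (1 - c * c) + 0.375 * (1 - fl) * (1 + c * c) + afb * c

def nll(data, fl, afb):
    """Negative log-likelihood; infinite if the pdf is non-positive anywhere."""
    total = 0.0
    for c in data:
        p = pdf(c, fl, afb)
        if p <= 0.0:
            return math.inf
        total -= math.log(p)
    return total

def generate(n, fl, afb, rng):
    """Accept-reject sampling; 1.5 bounds the pdf for physical parameters."""
    data = []
    while len(data) < n:
        c = rng.uniform(-1.0, 1.0)
        if rng.uniform(0.0, 1.5) < pdf(c, fl, afb):
            data.append(c)
    return data

def fit(data, step=0.05):
    """Coarse grid scan of the NLL over (F_L, A_FB)."""
    best = (math.inf, None, None)
    n_fl = int(round(1.0 / step))
    n_afb = int(round(0.75 / step))
    for i in range(n_fl + 1):
        fl = i * step
        for j in range(-n_afb, n_afb + 1):
            afb = j * step
            value = nll(data, fl, afb)
            if value < best[0]:
                best = (value, fl, afb)
    return best

rng = random.Random(42)
toy = generate(1000, fl=0.7, afb=0.1, rng=rng)
nll_min, fl_hat, afb_hat = fit(toy)
```

Returning an infinite likelihood outside the positive region is the simplest way to respect the physical boundary; it also shows why a minimum near that boundary produces a distorted likelihood.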
For a given data set there is no guarantee that all observables may be determined in a single maximum likelihood analysis. There may be large correlations between fit observables, apparent degeneracies due to the limited statistics, detector resolution effects and physical boundaries that distort the likelihood. Furthermore, the new interference observables that arise in the complete five-dimensional description of the decay can be distinguished only by making use of the m Kπ line shape, which has not been done experimentally before. All of these effects could impinge on the success of any new experimental analysis of the five-dimensional decay rate. In order to study the stability of a fit to all the angular observables and to obtain an estimate of the experimental precision, a simple simulation was used to perform LHCb-like pseudo-experiments. As the dominant effects on the experimental fits are statistical, rather than contingent on the experimental details, the results presented here will apply equally to future Belle II analyses.

Experimental setup
Data sets are generated with the expected sample sizes collected by the LHCb collaboration at various points in time. The data the experiment currently has in hand, referred to as the Run 2 data set, is the combination of the Run 1 and Run 2 data with integrated luminosity of 9 fb −1 . Projections are made for future LHCb runs: Run 3 with 23 fb −1 , Run 4 with 50 fb −1 [36], and Run 5 representing the total data collected by the proposed Upgrade II with 300 fb −1 [37]. The signal yields are extrapolated from those in Ref. [35], scaling for the integrated luminosity, the B production increase from Run 1 to Run 2, and an enlarged m Kπ window of 0.750 < m Kπ < 1.200 GeV that is used to help determine the additional S-wave terms. The expected combinatorial background yields are similarly scaled. In a real analysis such a large m Kπ window would bring in extra partially-reconstructed backgrounds as well as more significant contributions from other P-and D-wave transitions, which would need to be accounted for in the systematic uncertainties. Furthermore, the exact form of the S-wave lineshape becomes more important in a wider window as the interference observables gain greater significance. Again, this would need to be accounted for in the systematic uncertainties which are beyond the scope of this paper.
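A rough feeling for how the statistical precision evolves with these data sets can be obtained from a simple yield scaling. The sketch below is not the extrapolation used in this study: it assumes that the first 3 fb −1 of the 9 fb −1 total were recorded at Run 1 conditions, that the B production rate per unit luminosity is a factor of two higher afterwards (a hypothetical round number), and that uncertainties scale with the inverse square root of the yield.

```python
import math

RUN1_LUMI = 3.0   # fb^-1 at Run 1 conditions (assumed split of the 9 fb^-1)
XSEC_RATIO = 2.0  # assumed production increase after Run 1 (hypothetical)

def relative_yield(total_lumi):
    """Yield in units of the yield per fb^-1 at Run 1 conditions."""
    first = min(total_lumi, RUN1_LUMI)
    later = max(total_lumi - RUN1_LUMI, 0.0)
    return first + XSEC_RATIO * later

def uncertainty_scale(total_lumi, reference_lumi=9.0):
    """Statistical uncertainty relative to the reference data set."""
    return math.sqrt(relative_yield(reference_lumi) / relative_yield(total_lumi))

# Integrated luminosities quoted in the text: Run 2, Run 3, Run 4, Upgrade II.
scales = {lumi: uncertainty_scale(lumi) for lumi in (9.0, 23.0, 50.0, 300.0)}
```

Under these assumptions the uncertainty with 50 fb −1 is about 0.39 times that of the Run 2 data set; changes in trigger or selection efficiency between runs are ignored.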
The effect of the detector reconstruction and selection criteria is modelled using an angular acceptance function approximated to that in Ref. [38]. For the m Kπ window considered, the acceptance is assumed to be constant with m Kπ , following Ref. [39].
The q 2 bins used are the same as those in Ref. [35]. An alternative configuration with each bin split in half is also trialled. In contrast to previous experimental analyses, the fit is performed simultaneously with both B flavours, in principle allowing the CP -symmetric and CP -asymmetric observables 8 to be determined from a single fit. In order to fit the complete set of CP asymmetries (including those for the S-wave and interference observables) one cannot solely rely on the angular description. The overall scale of the decay rate needs to be constrained with an extended term in the likelihood. The constraint may be the CP asymmetry of the total decay rate, or the branching fraction of the average of the B 0 and B 0 decays. The latter is preferred. It gives complementary information for use in global fits to the Wilson Coefficients that describe these decays and is thus of interest in its own right even when the CP -asymmetries are not being extracted. An angular analysis is the only way to measure it in a model independent way, as the experimental efficiency can be corrected over all the kinematic variables of the K + π − µ + µ − system (within a q 2 bin these are the three angles and m Kπ ).
Measuring absolute branching fractions is difficult due to systematic uncertainties that are hard to control. Instead the total P+S-wave rate relative to the mode B 0 → J/ψ K + π − is taken, with the normalisation decay finishing in the same K + π − µ + µ − final state as the rare mode signal 9 . Using the measured S-wave fractions the P-wave relative branching fractions in each q 2 bin may be readily ascertained 10 .
The SM values of the angular observables are used in the generation of the pseudo-data, except where stated. For the P-wave observables (and only for this experimental sensitivity study), the B → K * form factors are taken from Ref. [42] and rely on a combination of Light Cone Sum Rules and Lattice QCD calculations. For the S-wave observables, the B → K * 0 form factors are taken from Ref. [15]. For all observables the non-local charm contribution is taken from Ref. [43], with the longitudinal and S-wave phase difference for all J P C = 1 −− dimuon resonances relative to the rare mode set to zero. The exact choice of these parameters has no impact on the conclusions of this study. The stability of the fit and the experimental precision on the P-wave observables are largely independent of the details of the model. Background events are simulated using a representative PDF constructed as the product of second order polynomials for each angular fit variable and an exponential function in m B 0 . As is customary, the observable F L is used, here defined by S 2c = −β 2 F L such that in the limit of no CP violation F L is exactly representative of the longitudinal polarisation fraction divided by the total P-wave rate. The CP asymmetry observable AF L is defined by (J 2c − J 2c )/(Γ P +Γ P ) = −β 2 AF L . Furthermore the forward-backward asymmetry is used as a fit parameter, with the customary definition Similarly for the S-wave contribution the observable F S is used, defined byS c Again, F S is a direct representation of the relevant transversity amplitude (|n S | 2 ), divided by the total P-and S-wave rate. In the massless-lepton limit the integral of the S-wave component is therefore F S (the S-wave fraction). The corresponding CP asymmetry observable, AF S is defined by (J c 2a −J c 2a )/Γ = − 3 8 β 2 AF S . For the q 2 bins in which the leptons are considered massless, which are those where q 2 > 1.1 GeV 2 , there are 19 CP -averaged (8 P-wave, 1 S-wave and 10 interference split into real and imaginary parts) angular observables to be fitted as listed in Eq.(3.10), plus the total P+S-wave rate. In these bins β 2 = 1. For the q 2 bins in which the lepton is treated as massive, where q 2 < 1.1 GeV 2 , there are 24 CP -averaged observables, as per Eq.(3.11); and one may again choose to fit the total rate as well. For these bins q 2 is evaluated at the centre of the bins.

8 The CP -asymmetries are defined as $A_i = (J_i - \bar{J}_i)/(\Gamma_P + \bar{\Gamma}_P)$ for the P-wave observables and AS for interference observables.

9 Making the measurement relative to the B 0 → J/ψ K + π − mode ensures that, to a large extent, nuisance production and detection asymmetries cancel. However, in an analysis of real data the normalisation mode is affected by contributions from exotic J/ψ π − states [40], both in terms of the signal yield and the angular distribution. Corrections to the fitted results will therefore need to be ascertained to produce the correct relative P-wave only rate and are beyond the scope of this paper.

10 Recent developments in calculations of B 0 → K + π − form factors in a P-wave configuration [41] rely on a model to describe the lineshape of the K + π − system. For a correct comparison between measurements and predictions of the branching fraction, any differences in the lineshape models used both in theory and experiment must be taken into account.

Results with massless leptons
Initially only the CP -averaged observables are free to vary in the fit; those for the CP asymmetry are fixed to 0. For both the P ( ) i and S i basis, despite the inclusion of all the S-P-wave interference terms, > 99% of the pseudoexperiment fits for massless leptons in q 2 bins with q 2 > 1.1 GeV 2 converge and the P -wave CP -averaged observables are determined without any significant bias (approximately 20% of the statistical uncertainty or less) and with good statistical coverage. A summary of the distribution of the pulls resulting from the pseudoexperiment fits of the angular observables in two q 2 bins is shown in Fig. 9, including fits using the optimised (P ( ) i ) P-wave observable basis. For an observable the pull is defined as the difference between the fitted value and true value, divided by the statistical uncertainty estimated in the fit. It should be noted that both the real and imaginary parts of the interference observables can be determined by the fit.
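The pull defined above, and the bias criterion quoted in terms of it, can be written compactly. The toy fit results in this sketch are hypothetical numbers chosen for illustration and are not taken from the pseudo-experiments of this study.

```python
import statistics

def pull(fitted, true, sigma):
    """Pull = (fitted value - true value) / estimated statistical uncertainty."""
    return (fitted - true) / sigma

def summarise(pulls):
    """Mean (bias in units of sigma) and sample width of a set of pulls."""
    return statistics.mean(pulls), statistics.stdev(pulls)

# Hypothetical (fitted value, uncertainty) pairs for a true value of 0.70.
TRUE_VALUE = 0.70
toys = [(0.72, 0.03), (0.66, 0.03), (0.71, 0.03), (0.69, 0.03)]
pulls = [pull(fitted, TRUE_VALUE, sigma) for fitted, sigma in toys]
mean_pull, width = summarise(pulls)

# Bias criterion used in the text: within 20% of the statistical uncertainty.
unbiased = abs(mean_pull) <= 0.2
```

A mean pull compatible with zero indicates an unbiased estimate, while a width compatible with one indicates correctly estimated uncertainties; a width above one corresponds to the underestimated uncertainties discussed for the narrow bins.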
Furthermore the optimised interference observables, P S i , may also be readily determined with the data that LHCb already has in hand. Summaries of the fit behaviour with this configuration are shown in Fig. 10. The estimated statistical uncertainties for these new observables as a function of the integrated luminosity collected are shown in Fig. 11. The points show the expected luminosity for future LHCb runs.

Figure 11. Estimated statistical uncertainty for the (left) real and (right) imaginary optimised interference observables P S i as a function of integrated luminosity for the q 2 bin 4.0 < q 2 < 6.0 GeV 2 .
For the alternative narrower q 2 binning, the situation is not so ideal; summaries for two q 2 bins are in Fig. 12. In general, the central fit values for all the variables do not show biases above 20% of statistical uncertainties. The exception is the two bins in the region 1.1 < q 2 < 2.5 GeV 2 (the bins are 1.1 − 1.8 GeV 2 , shown on the left of Fig. 12 and 1.8 − 2.5 GeV 2 ), where the predicted values of F L and A FB lie close to the edge of the physically allowed parameter space. For the narrower bins this boundary distorts the likelihood close to where the minimum should be. The result is an imperfect determination of these variables, or P 2 in the optimised basis, as the fit crosses into the unphysical region. However, as the other observables in these bins behave well and all the other bins behave well, there is motivation to use the finer q 2 bins even with the Run 2 data set. As the uncertainties are shown to be too small by the pull distributions the Feldman-Cousins method [44] will need to be employed to establish confidence intervals. The problems of bias and error determination are readily ameliorated with more data and even by the end of Run 3 the fit behaviour will be much improved.
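The physical boundary referred to above can be made explicit in a simplified setting. Assuming the massless-lepton cos θ l projection (3/4) F L (1 − c 2 ) + (3/8)(1 − F L )(1 + c 2 ) + A FB c, positivity at c = ±1 requires |A FB | ≤ (3/4)(1 − F L ). The sketch below checks this condition only; the full five-dimensional distribution imposes further constraints that are not included here, and the function names are illustrative.

```python
def is_physical(fl, afb):
    """Positivity of the assumed cos(theta_l) projection for massless leptons."""
    return 0.0 <= fl <= 1.0 and abs(afb) <= 0.75 * (1.0 - fl)

def distance_to_boundary(fl, afb):
    """Margin in A_FB before the projected rate turns negative (>= 0 if allowed)."""
    return 0.75 * (1.0 - fl) - abs(afb)

# Hypothetical parameter points: one just inside and one outside the boundary.
inside = is_physical(0.7, 0.2)
outside = is_physical(0.8, 0.2)
margin = distance_to_boundary(0.7, 0.2)
```

When the margin is comparable to the statistical uncertainty on A FB , a sizeable fraction of pseudo-experiments will have their minimum at or beyond the boundary.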
When including the CP -asymmetry observables as free parameters in the fit it is again found that the fits converge successfully. These parameters themselves are found to be unbiased, although the estimated uncertainties are in general too small. Examples are shown in Fig. 13. The extracted relative branching fraction is also found to be unbiased. However, the extra free parameters lead to larger biases in the CP -averaged observables of up to 40-50% in some cases. With more data the situation will be improved and after 50 fb −1 it will be possible to extract all CP -averaged and CP -asymmetry observables in a single fit with minimal biases and good coverage.

In the 0.1 < q 2 < 0.98 GeV 2 bin for massive leptons, the situation is more complex. For the basis fitting only unoptimised observables (both P-and S-wave) the fit in general gives unbiased pull distributions. The exception is for the observables F L and S 1c which have a large anticorrelation between them. For the regular optimised P-wave observables the fit also does not converge well, likely due to the small value of F L in this bin, which appears in the denominator of the optimised observables. The Feldman-Cousins method will therefore be required to obtain the correct confidence intervals for all observables in this q 2 bin.
For the new optimised P-wave observables in the massive lepton q 2 bin, M 1 and M 2 , good behaviour is only obtained with the integrated luminosities expected from the LHCb upgrade. These observables are problematic for the fits as they are essentially the ratio of two angular coefficients with the same m Kπ dependence. Therefore they are almost completely anti-correlated and the fit struggles to converge, as shown in Fig. 14. However, with enough data the fit will improve and even by the end of LHCb Upgrade I reasonable behaviour for these observables can be expected, as shown in Fig. 15. For similar reasons the S-wave only optimised observable M 3 is poorly behaved. Due to the small S-wave contribution an even larger data set, such as the 300 fb −1 expected with LHCb Upgrade II, would be required for its successful extraction. This is shown in Fig. 16.
Finally the optimised P-and S-wave interference observables that occur when accounting for the lepton mass, M 4 and M 5 , have been considered. These can be extracted with the Run 2 data set as displayed in Fig. 17. The observables are not straightforward ratios of two other observables, which lessens the correlations in the fit. Furthermore they are functions of P-wave and S-wave observables, which are likely to be well constrained by the rest of the angular PDF; in particular they have different m Kπ shapes.
For a judicious choice of observable quantities, future experimental analyses should therefore be able to use the full angular distribution, including the additional S-wave terms and allowing for massive leptons.

Results with a possible scalar amplitude
To fit the data without assuming the absence of scalar amplitudes, one must introduce the observables S c 6 and S c 1 in all bins. For maximal theoretical reach S c 1 would ideally be replaced with its optimised equivalent M 2 (see the discussion in section 2.1). The precision on this has been estimated for various future integrated luminosity scenarios, as shown in Fig. 18. Even with 300 fb −1 of data, the expected statistical uncertainty is much larger than that required to measure significant scalar new physics.

Figure 18. The expected statistical uncertainty of the observable M 2 as a function of q 2 for various future integrated luminosities.

Symmetry relations
The six symmetry relations may be applied to the results of the binned fits as an independent check of the robustness of the experimental methodology. As the fitted observables are averaged over a q 2 bin the relations are not exact in this experimental context. This is particularly apparent in the lowest q 2 bin, where the changes in the variables with q 2 are most notable. Furthermore, as only the bins for q 2 < 1 GeV 2 are treated as having massive leptons there is some small imprecision in the symmetry relations for the bins immediately above 1 GeV 2 due to residual effects of the massless lepton treatment. Example distributions of the relations are shown in Fig. 19. These distributions of the symmetry relations may be used by an experimenter as a ready check of their fit to real data. If the relation calculated from the data lies outside these distributions, the fit can be discounted and the method should be re-examined. Care must be taken however as the experimental relations are calculated with q 2 averaged observables. This introduces some model dependence in the distributions of the pseudo-experiments.

Figure 20. Example distributions of three observables in q 2 . Overlaid in red is the result of fits of second-order polynomials with a common zero between the three observables. The blue line is a reference for 0. The observables are taken from pseudo-experiment fits for the estimated LHCb Run 2 yields. The value of β is taken to be 1.

Zero points
The zero points of the observables X 2 , X 3 and X 4 provide a good test of the SM. From the experimental results they are found by taking the independent fit results from each q 2 bin for the relevant observables. The three observables are plotted in q 2 and a χ 2 fit of second-order polynomials is carried out simultaneously for each observable. The point at which the polynomials are zero is a common fit parameter. The correlations between the fitted observables within a q 2 bin are included in the χ 2 fit. As the q 2 dependence of the observables is of most interest, it makes sense to employ the half-sized binning, doubling the number of q 2 points. The fits are found to behave well in these finer q 2 bins with the expected yield for the LHCb Run 2 data set for those variables of interest.
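The common-zero fit described above can be sketched as follows. Each observable is written as (q 2 − q 0 )(a + b(q 2 − q 0 )), so that for a fixed common zero q 0 the coefficients follow from a linear weighted least-squares problem, and q 0 is then scanned. This is a simplified illustration: it assumes uncorrelated Gaussian uncertainties, whereas the fit in the text includes the correlations within each q 2 bin, and the data points and coefficients are hypothetical.

```python
def chi2_fixed_zero(q2, y, sigma, q0):
    """Weighted least squares for y = a*t + b*t**2 with t = q2 - q0."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for x, yi, si in zip(q2, y, sigma):
        t = x - q0
        w = 1.0 / si ** 2
        s11 += w * t ** 2
        s12 += w * t ** 3
        s22 += w * t ** 4
        r1 += w * t * yi
        r2 += w * t ** 2 * yi
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return sum(((yi - a * (x - q0) - b * (x - q0) ** 2) / si) ** 2
               for x, yi, si in zip(q2, y, sigma))

def fit_common_zero(q2, observables, sigma, lo=1.0, hi=4.0, step=0.01):
    """Scan the common zero q0 and return (best q0, minimum total chi2)."""
    best_q0, best_chi2 = None, float("inf")
    n_steps = int(round((hi - lo) / step))
    for k in range(n_steps + 1):
        q0 = lo + k * step
        chi2 = sum(chi2_fixed_zero(q2, y, sigma, q0) for y in observables)
        if chi2 < best_chi2:
            best_q0, best_chi2 = q0, chi2
    return best_q0, best_chi2

# Hypothetical noise-free inputs: three quadratics sharing a zero at 2.0 GeV^2.
Q2 = [1.45, 2.15, 3.25, 5.0, 7.0]
SIGMA = [0.05] * len(Q2)
TRUE_ZERO = 2.0
COEFFS = [(0.20, -0.010), (0.10, 0.005), (-0.15, 0.020)]
OBS = [[(x - TRUE_ZERO) * (a + b * (x - TRUE_ZERO)) for x in Q2]
       for a, b in COEFFS]
q0_hat, chi2_min = fit_common_zero(Q2, OBS, SIGMA)
```

Running the same scan separately for each observable gives the independent-zero fit, and the difference between the summed minimum χ 2 values of the two configurations is the quantity used to test the common-zero hypothesis.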
Example fits of the q 2 distributions for the expected LHCb Run 2 data set are shown in Fig. 20. Alternative fits are performed with no common zero between the observables and the change in χ 2 determined in order to test the hypothesis of a common zero crossing point. Three hypotheses have been tested: the SM and two NP models from the fits to current experimental results in Ref. [2]. The two NP scenarios are: i) 'Scenario 8', which corresponds to only left-handed new physics and includes a LFU new physics contribution; and ii) 'Hypothesis 1', which introduces right handed currents that do not satisfy condition R (defined by Eq. (5.3)), and should lead to the three X observables not having a common zero. With 300 fb −1 , as displayed in the right of Fig. 21, it can clearly be seen that the χ 2 of the fit with a common zero is worse than that for independent zeroes, giving discrimination between Hypothesis 1 and the SM.
Even if the common-zero $\chi^2$ fit is unable to distinguish between the three physics hypotheses with the available data, the position of the zero may enable them to be separated. The expected precision on the common zero-crossing point with 9 fb$^{-1}$ of data is $\sim 0.18$ GeV$^2$, becoming 0.07 GeV$^2$ with 50 fb$^{-1}$. For comparison, the estimated uncertainty on the zero using the regular $q^2$ binning is found to be marginally worse: $\approx 0.19$ GeV$^2$ for the Run 2 data set. The uncertainty is completely dominated by the precision with which the P 5 − P 4 observable is determined. The distribution of measured zeros for the three observables fitted independently is shown in Fig. 22. It is clear that the S-wave interference observables are comparatively imprecise, which is to be expected given their small simulated contributions of $\approx 10\%$. Fig. 22 suggests that with just the Run 2 data set there is little discrimination between the SM and the trialled NP hypotheses from the position of the zero point. However, Fig. 23 demonstrates that with 50 fb$^{-1}$ there is a clear distinction between the SM and the Scenario 8 NP model.

S wave in the global fits
Equations (4.14) and (4.16) allow us to include S-wave interference observables in Wilson coefficient fits for new physics without having to calculate the S-wave form factors. The expected precision on $W_1$ and $W_2$ has been assessed using only the P-wave observables, using the interference observables, and using the combination of the two. Pseudo-experiments are run with the SM hypothesis and using the new optimised interference observables, P S    calculated along with their uncertainties, accounting for the correlations between the fitted parameters. The correlation between the expressions involving only P-wave observables and those including the interference observables is assessed for each of $W_1$ and $W_2$. Subsequently, the average and statistical uncertainty obtained when combining the P-wave-only part with the interference part are found for each observable. For the Run 2 data set the narrow bins cannot reliably be used to extract the optimised observables, so the wider $q^2$ bins are used here. The expected precision of $W_1$ and $W_2$ is shown in Fig. 24. It can be seen that the combination of the P-wave-only observables with the P- and S-wave observables is only marginally more precise than the P-wave-only determination alone. This is to be expected, both because the simulated S-wave contribution is small and because P-wave parameters also enter the expressions with the interference observables, so that the contribution of the S wave is not statistically independent.
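The effect of the correlation on such a combination can be illustrated with the standard minimum-variance average of two correlated estimates. This is a sketch only: the text does not specify the averaging prescription used, and the function name and the input numbers below are hypothetical.

```python
def combine_correlated(x1, s1, x2, s2, rho):
    """Minimum-variance linear combination of two estimates of the same
    quantity with uncertainties s1, s2 and correlation coefficient rho."""
    cov = rho * s1 * s2
    denom = s1 ** 2 + s2 ** 2 - 2.0 * cov
    w1 = (s2 ** 2 - cov) / denom
    w2 = 1.0 - w1
    mean = w1 * x1 + w2 * x2
    variance = (s1 ** 2) * (s2 ** 2) * (1.0 - rho ** 2) / denom
    return mean, variance ** 0.5


# Illustrative numbers: a precise P-wave-only estimate and a three times
# less precise, positively correlated interference-based estimate.
mean, sigma = combine_correlated(0.50, 0.10, 0.55, 0.30, 0.3)
print(mean, sigma)
```

With these inputs the combined uncertainty is only slightly below the uncertainty of the more precise estimate, and the gain disappears entirely when the correlation coefficient equals the ratio of the two uncertainties. This is the qualitative behaviour described for Fig. 24.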
In the future the size of the data sets will become sufficient for the narrower bins to be readily used. An example for the putative LHCb Run 4 data set with 50 fb$^{-1}$ is shown in Fig. 25.

Summary and Conclusions
This paper presents the fully differential decay rate of $B^0\to K^{+}\pi^{-}\ell^{+}\ell^{-}$ transitions, with the $K^{+}\pi^{-}$ system in a P- or S-wave configuration, which can be used to analyse such decays in current and future experiments. This work paves the way for the next step in the analysis of this decay, going beyond previous analyses by identifying and exploring the experimental prospects of massive and S-wave observables that were previously neglected or treated as nuisance parameters. Our analysis relies on a complete description of the symmetries that apply to the full distribution. This enables us to define the complete set of observables that describe the decay and the relations between them, excluding only the presence of NP scalar or tensor contributions.
Our study shows, in particular, that the symmetries of the $B^0\to K^{+}\pi^{-}\ell^{+}\ell^{-}$ decay rate give rise to relations that allow a combination of S-wave observables, $W_{1,2}$, to be expressed in terms of P-wave only observables. These combined observables then have no dependence on the poorly known S-wave form factors and therefore offer genuine probes of physics beyond the SM. This opens a new seam in the phenomenology and, for the first time, will allow S-wave events in the data to contribute to global fits for the underlying physics coefficients.
We also present strong bounds on the set of new S-wave observables using two different methods: the relations themselves, and Cauchy-Schwarz inequalities that rely only on the structure of the observables in terms of 2D complex vectors. These bounds serve as important experimental cross-checks.
From the point of view of experimental analyses, it has been shown that all of the P- and S-wave angular observables for the $B^0\to K^{+}\pi^{-}\mu^{+}\mu^{-}$ decay may be extracted with a five-dimensional fit to the data sample that the LHCb collaboration already has in hand. Our analysis includes the complete description of the $m_{K\pi}$ dependence of the differential decay rate for the first time, as well as the treatment of the leptons as massive at low values of $q^2$. The exploitation of the symmetry relations for the observables will allow an immediate test of the veracity of the fits to data without resorting to theoretical predictions. Finally, the common zero-crossing point of a set of P-wave and S-wave interference observables may contribute to discrimination between the SM and NP independently of global fits, and can offer insight into the hadronic contributions.

A Appendix: The 7th massive relation
In this Appendix we provide the steps needed to determine the last relation. This relation vanishes in the massless limit and is particularly lengthy. For both reasons, especially the latter, it is of limited practical use. We therefore present the steps to derive this relation but do not write it out explicitly. The derivation proceeds in five steps.
Step 1: Our starting point is a particular combination of the 2D vectors that allows us to introduce the structure of the observable $M_1$ for the first time.
(n † n S + n † n S ) × (n † n S − n † n S )
In order to avoid repeating the coefficient $4m^2/q^2$ of $M_1$, we introduce a reduced version, which we call $m_1$, defined by
We will use the freedom given by the symmetry (see section 3) to choose the phase such that A L 0 has only a real component. Then we solve Eq. (A.1) for $m_1$ and its imaginary counterpart:
Step 2: Using n 0 = en S + f n S and multiplying this equation by σ.n S , σ.n S and σ.n 0 , where σ = ((0, 1), (1, 0)), one can show that all terms A ( )L 0 A ( )R 0 can be written in terms of
where
e = [ (n † n S )(n † ⊥ n 0 ) − (n † n 0 )(n † ⊥ n S ) ] / [ (n † n S )(n † ⊥ n S ) − (n † n S )(n † ⊥ n S ) ] ,
f = [ (n † n S )(n † ⊥ n 0 ) − (n † n 0 )(n † ⊥ n S ) ] / [ (n † n S )(n † ⊥ n S ) − (n † n S )(n † ⊥ n S ) ] . (A.7)
Both coefficients e and f can be trivially rewritten in terms of P- and S-wave observables, as in Eq. (A.5).
Step 3: We define a set of reduced observables related to the corresponding remaining massive observables:
We can combine them in one single equation, cancelling the dependence on A ( ) t :
and, using Eqs. (A.6), we can rewrite this equation in terms of only A L 0 A R * 0 :
giving the desired relation, but involving A L 0 and A R 0 amplitudes that still need to be expressed in terms of observables.
Step 4: Using the decomposition n ⊥ = gn S + hn S and after determining g and h by multiplying by n ⊥ and n , we find the following relation:
(h * 2 − g * 2 )n S † n S = h * n † ⊥ n S − g * n † ⊥ n S , (A.11)
where
g = [ |n ⊥ | 2 (n † n S ) − (n † n ⊥ )(n † ⊥ n S ) ] / [ (n † n S )(n † ⊥ n S ) − (n † n S )(n † ⊥ n S ) ] ,
h = [ |n ⊥ | 2 (n † n S ) − (n † n ⊥ )(n † ⊥ n S ) ] / [ (n † n S )(n † ⊥ n S ) − (n † n S )(n † ⊥ n S ) ] . (A.12)
Then, combining the previous equation with the observable F S , one can determine |A L 0 | 2 and |A R 0 | 2 (recall that A L 0 is taken to be real using the symmetry properties) by solving the system:
(A.14)
Now we have all the necessary ingredients to arrive at the relation. If we define
we have two equations in terms of x and y (using Eq. (A.3) and Eqs. (A.13) and (A.14)):
m 1 = −by + ax 4(x 2 + y 2 )
x 2 + y 2 = 1 4 (F S Γ ) 2 − ∆ 2 (A.16)
These two equations can be solved to determine x and y in terms of observables.
Step 5: The last step consists of expressing A L 0 , A R 0 in Eq. (A.10) in terms of x and y (all other quantities, such as the m i and the coefficients e and f, are already direct functions of observables). After solving the system of Eq. (A.16) for x and y, the result is inserted into Eq. (A.10) to obtain a final, lengthy expression written entirely in terms of observables.
Notice that, in order to relate the reduced observables to the measured massive observables M 1,2,3 ,4 ,5 , one needs to multiply both sides of the previous relations involving the $m_i$ by factors of $4m^2/q^2$. For this reason in particular, Eq. (A.10) vanishes exactly in the massless limit.
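The size of the factor $4m^2/q^2$ can be illustrated numerically for muons. This is a small sketch for orientation only: the muon mass is hard-coded to its PDG value and the chosen $q^2$ points are arbitrary examples, not the bin edges used in the analysis.

```python
MUON_MASS_GEV = 0.1056584  # PDG muon mass in GeV (hard-coded assumption)


def mass_factor(q2, lepton_mass=MUON_MASS_GEV):
    """Return the coefficient 4 m^2 / q^2 that multiplies the reduced
    observables m_i; q2 is in GeV^2 and must be positive."""
    return 4.0 * lepton_mass ** 2 / q2


# The factor falls as 1/q^2, so the massive observables are suppressed
# away from the lowest q^2 region.
for q2 in (0.1, 1.0, 6.0):
    print(q2, mass_factor(q2))
```

At $q^2 = 1$ GeV$^2$ the factor is already below 5%, which gives a sense of why the lepton mass matters mainly at the lowest $q^2$ values.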