Toward extracting γ from B → DK without binning

B ± → DK ± transitions are known to provide theoretically clean information about the CKM angle γ . The best available methods exploit the cascade decay of D into CP self-conjugate states, employing binning in the D decay Dalitz plot. In this paper, we present a proof of principle for an extraction of γ that does not rely on such binning. In particular, this new method does not require any assumptions about the amplitude and strong phase variation over the Dalitz plot. Reminiscent of the binned method, for which the ultimate statistical error depends on optimizing the binning, the highest sensitivity to γ in the unbinned method is achieved through an optimal choice of statistical method. With future optimizations of the test statistic, this newly introduced unbinned method has the potential to be competitive with corresponding binned analyses.


I. INTRODUCTION
The B meson decays B ± → DK ± are mediated through a combination of b → cūs and b → ucs transitions (and their conjugates).Both receive tree-level contributions with comparable suppression factors from the Cabibbo-Kobayashi-Maskawa (CKM) matrix.It has been recognized for over four decades that the interference of the resulting diagrams allows for efficient access to CP violation in the CKM matrix [1,2].Specifically, these decay transitions provide sensitivity to the phase γ (also referred to as ϕ 3 ) [3,4] with negligible theoretical error [5], where The method of Refs.[3,4] uses two-body D decays to CP eigenstates, which is conceptually simple but suffers from suppressed sensitivity to γ due to a large hierarchy between the two interfering amplitudes.In Refs.[6,7], this complication was ameliorated through the use of interference between Cabbibo-allowed and doubly-Cabbibo-suppressed flavor eigenstates of the neutral D mesons.However, the single best measurement of γ is currently due to the BPGGSZ method [8][9][10][11][12], where D 0 and s D 0 decay into multi-body CP self-conjugate states (such as K S π − π + ) with large interference between the two parton-level weak transitions.All of these methods have by now received many years of experimental effort [10,[13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32], with the resulting error on γ at the several percent level.
In the BPGGSZ method, the D decays are binned in the Dalitz plane in order to convert dependence on the full amplitude and phase into a finite number of bin-averaged values.
Symmetry properties of the CP -conjugate decay ensure that the number of independent quantities is smaller than the number of bins, provided the binning is symmetric with respect to CP conjugation.Given the leading role of the BPGGSZ method in the overall uncertainty of the measurement of γ, it is perhaps not surprising that the method has received significant experimental and theoretical attention, resulting in several improvements over the years.
Arguably the most significant is the optimized choice of binning, pioneered in Refs.[33,34].
There, the shape of the bins in the D → K S π − π + Dalitz plot is initialized to the isocontours of the strong phase difference ∆δ D = δ 13,12 − δ 12,13 (see Eqs. (7) and (8) for definitions) between CP symmetric points in the D 0 and D0 Dalitz plots.The bins are then continuously deformed in order to optimize sensitivity to γ.This procedure minimizes the washout of sensitivity to γ that occurs when ∆δ D is bin-averaged.Binning in this manner is extremely powerful and is estimated to lead to only ∼ 10 % lower statistical sensitivity to γ for the choice of 2 × 8 bins than if the unbinned amplitude model were used [34].(The amplitude model provides the smallest statistical error, but induces difficult-to-quantify systematical errors.) An alternative approach, eliminating the bin dependence of the BPGGSZ method, was presented in Ref. [35].In that work, the binning of the D decay Dalitz plot was replaced by a Fourier transform in phase space variables.Truncation of the series gives a setup with a finite number of unknown parameters, such that it is possible to extract γ.The choice of bins is thus replaced by the choice of variables in which to perform the Fourier transform, as well as of the order at which to truncate the series.That is, some information from the Dalitz plane (in this case, the higher frequency components in the Fourier expansion) is still removed at the data processing step.Moreover, choosing a coordinate basis for the Fourier transform is driven by the same modeling of the strong phase that is used in the determination of optimized Dalitz plot bins.While the Fourier transform method has not yet been implemented on experimental data, the study of Ref. [35] indicates that it can be highly competitive with the binned BPGGSZ method.
In this manuscript, we present a proof of principle for a novel γ extraction method that requires neither binning nor truncation.Given the highly optimized binning procedure currently used in experimental implementations of the BPGGSZ method, our main goal here is conceptual.Our hope is that, eventually, this new method will be able to improve on the BPGGSZ method and/or the method of Ref. [35].This is because the existing methods all require removal of a certain amount of information during data processing.While this step can be optimized, the lost information cannot be recovered by any statistical treatment.
We replace this with an alternative data processing procedure that, in principle, removes no decay information.(Note, however, that not all relevant information is used in the current implementation.)Our method makes the problem of γ determination equivalent to the question of whether two inequivalent measurements are sampling the same function.
This structure allows for the application of a wide range of nonparametric methods to the problem, replacing the optimization of the data processing step with the conceptually different procedure of test statistic optimization.
The explicit analysis we introduce in the main part of the paper indeed falls short in terms of its statistical sensitivity to γ compared to the methods above.Nevertheless, we find that a toy example of a method without binning or truncation is still useful, as it makes clear that such approximations can be replaced with the problem of finding an optimal test statistic (that is, an observable optimally sensitive to γ).We wish to stress that while each of the available methods -BPGGSZ, Ref. [35], and the methods presented in this paperrequire some optimization in order to reduce the statistical error on γ, these optimizations take on qualitatively different forms.For the BPGGSZ method, this is the number of bins and their shapes.For Ref. [35], it is the choice of Fourier transform variables and the order of truncation.For our method, it is the choice of test statistic.The question of which method can ultimately give the smallest statistical error on measurements of γ depends critically on this optimization step.We hope to improve on our proof-of-principle implementation with a follow-up work in which we perform such an optimization for our method.
The paper is structured as follows.In Sec.II A, we review the dependence of the B ± → (K S π − π + ) D K ± decay rates on the CKM angle γ.In Sec.II B, we present a carefully chosen combination of reduced differential partial decay widths in order to extract cot 2 γ.In Sec.II C, we derive the implications for cumulative reduced partial decay widths and present the fundamental idea of our algorithm.Namely, we extract γ as a parameter that brings two functions of empirical cumulative probability distributions into as much agreement as possible.
We implement this idea by employing a measure adapted from the Kolmogorov-Smirnov test statistic in Sec.III.To demonstrate our proof of principle, we show numerical results based on toy Monte Carlo data in Sec.IV. Conclusions are given in Sec.V.

II. THEORY A. Notation
To set the stage, we review how sensitivity to the CKM unitarity triangle angle γ is achieved in B ± → DK ± transitions, following the notation of Ref. [9].Consider the CP -conjugate cascade decays A Dalitz plot analysis of K S π − π + , characterizing the intermediate neutral D meson, allows us to fully specify the γ-dependence of the process.Focusing first on the initial two-body A B is real by convention, with δ B the difference between the strong phases of the D 0 K and s D 0 K amplitudes.The amplitudes in Eq. (4) carry a weak phase which agrees with γ in Eq. ( 1) up to O(λ 4 ) corrections.These have been computed to give relative shifts in the determination of γ of ∼ 2 × 10 −3 , which we ignore going forward [5].Due to the color and CKM suppression of the amplitudes in Eq. ( 4), the theoretical expectation is that r B is small (r B ∼ 0.1-0.2), in agreement with the experimental determination [13], The smallness of r B reduces sensitivity to γ in all methods using B ± → DK ± .In the following analysis, we neglect D 0s D 0 mixing, which contributes at second order in the mixing parameters and yields a subleading correction of less than 1 % in the extraction of γ [36].
For the subsequent three-body decay of the intermediate neutral D meson, we define the amplitudes A( s where s ij = (p i + p j ) 2 is the invariant mass squared of the ij system.We define A 12,13 and δ 12,13 to be real functions with A 12,13 ≥ 0 and δ 12,13 ∈ [0, 2π).The simple relationship between the amplitudes in Eqs. ( 7) and ( 8) follows from the CP symmetry of the strong interaction and the fact that the final state K S π − π + has zero spin.(CP violation in D decays is very small and thus is neglected in this discussion.) The D meson is a narrow state that decays weakly, justifying the use of the narrow width approximation and the neglect of any continuum contribution.In the vicinity of the D resonance, we thus have where P D is the neutral D meson propagator.In the narrow width approximation, we can write the B meson partial width for these decays in terms of Eqs. ( 7) and ( 8) as using where in agreement with Ref. [9].Note that Γ is dimensionless.
We can obtain additional information about the magnitudes A 12,13 and strong phases In our proof-of-principle unbinned method for γ extraction (introduced in Sec.II B), we will make two simplifications.First, we will assume that the cumulative distributions of the A 12,13 functions are measured precisely enough to be treated as exactly known.Secondly, we will formulate the method such that information about the strong phases δ 12,13 is never required.The first simplification is made for ease of presentation: in the intermediate theory expressions, we can treat A 12,13 as known functions, while in the final expressions only the cumulative distribution functions will be used (and errors on their determinations will certainly be subleading).We would prefer to relax the second simplification in the future, since we are clearly discarding useful information.

B. γ from reduced partial widths in theory
We now demonstrate that, by considering simple odd and even combinations of the partial widths of Sec.II A, the relative dependence on the parameters of the B ± → DK ± decays take on a particularly simple form.(For an alternative use of symmetry considerations to parameterize approximate flavor symmetries in 3-body decays, see Ref. [38].)With respect to the s 12 = s 13 axis of the Dalitz plane, we first form even (Σ) and odd (∆) quantities made from the widths d Γ± given in Eqs. ( 13) and ( 14): Here, the subscripts correspond to the B ± charge and agree across both sides of the equation.
Further symmetrizing and anti-symmetrizing the quantities in Eq. ( 15) with respect to the The subscripts S and A on the left-hand sides of Eqs. ( 16) and ( 17) stand for symmetric and anti-symmetric and correspond to the choices of + and −, respectively.Inserting the explicit expressions given in Eqs. ( 13) and ( 14 Note that the subtracted quantities above are not directly observed experimentally.However, while the terms subtracted away from Eqs. ( 18) and ( 21) depend on the D decay amplitudes and r B , they do not depend on γ.Moreover, the subtracted terms only depend on r B quadratically, with the effect of subtracting using an incorrectly determined value of r B suppressed.As explained above, for our proof-of-principle demonstration, we treat A 12,13 as a known function of s 12 , s 13 , determined with good enough precision from D decay data, while r B is a parameter which we fit within the method.Note further that the above subtraction is equivalent to replacing the decay widths given in Eqs. ( 13) and ( 14 and then forming combinations of them analogous to Eqs. ( 15)- (17). It allows us direct access to γ up to a four-way degeneracy equivalent to that in Ref. [9].The relations in Eqs. ( 26) and ( 27) are what form the basis of the unbinned method for extracting γ described in the subsequent sections of this work.

C. γ from cumulative reduced partial widths in practice
Our observations consist of individual B decay events and not continuous distributions.
To make connections with the results above, we introduce the cumulative reduced partial decay widths, defined as where, outside the physical region of the Dalitz plot, d Γ± /(ds 12 ds 13 ) = 0.These functions are monotonically increasing, with limiting values R ± (0, 0) = 0 and R ± (s 12 , s 13 > (m D − m π ) 2 ) = Γ(tot) ± , where Γ(tot) is the reduced partial width of B ± → DK ± .Note that Γ(tot) ± is not, in general, normalized to unity, and, thus, resulting objects do not directly correspond to cumulative distribution functions.The functions R ± (s 12 , s 13 ) continue to vary outside of the physical Dalitz region, i.e., in the rectangle around the physical region of the Dalitz plot.
With many observed B ± decay events, Eq. ( 29) will be approached, up to an overall constant, by the counting functions where the index i ± (j ± ) runs over the s 12 (s 13 ) values of all observed decays.N ± (s 12 , s 13 ) are thus the number of observed B ± → DK ± decays with (p K +p π − ) 2 < s 12 and (p K +p π + ) 2 < s 13 .
They are nonzero almost everywhere in the Dalitz plane and can be constructed with no binning and with minimal processing of the data.These functions are approximations of the continuous partial decay widths, such that, in the limit of large N (tot) ± , the total number of B ± → DK ± events, we have We now bring Eq. ( 31) through the procedure laid out in Sec.II B, namely, we symmetrize and anti-symmetrize the function with respect to phase space and initial B meson charge.
which are the cumulative versions of the functions in Eqs. ( 15)- (17).We may also find the cumulative versions of Eqs. ( 22) and ( 23) by subtracting away the appropriate cumulative flavor-tagged D 0 meson decays.To do this, we define where is necessary for the proper normalization of Eqs. ( 36) and (37).The forms of the subtracted parts in Eqs. ( 36) and ( 37) follow from Eqs. ( 24) and ( 25).Here, N (tot) D is the total number of (independently) observed D 0 → K S π − π + events, while N D (s 12 , s 13 ) is defined in exactly the same way as N ± (s 12 , s 13 ) in Eq. ( 31) except with summations now over all D 0 → K S π − π + events.
Note that the normalization factor r ± in Eq. ( 38) is measured in the experiment and is simply given by the ratio of measured D 0 → K S π − π + and (reduced) decay rates.For the purposes of our proof-of-principle demonstration, we set r ± to its expected measured value as predicted by γ, δ B , and r B values and do not take into account experimental errors.Likewise, for the purposes of the calculation of r ± , we use the D 0 → K S π − π + amplitude model of Ref. [39], including the strong phases.However, in the algorithm itself, we use the discrete data N D (s 12 , s 13 ) only and no phase information.The experimental determination of N D (s 12 , s 13 ) can make use of both D 0 and s D 0 decays, since, in the above discussion, we neglect CP violation in D decays.Note also that the discussion of experimental effects -such as backgrounds, efficiencies, and resolutions -is outside of the scope of the present manuscript.
In the next step, we use Eqs.( 36) and (37) in Eqs. ( 33)-( 35) to get the cumulative functions R ΣS | sub and R ∆A | sub .Due to the linearity of the integral operation in Eq. ( 29), the relations in Eqs. ( 26)- (28) all hold if we replace the functions therein with their corresponding cumulative counterparts.That is, we have and Stopping to examine the ratios in Eq. ( 39) closely, we emphasize the following observation: Up to a δ B -and γ-dependent rescaling, the cumulative functions within each R Σ and R ∆ pair are the same.For instance, note that the first equation in Eq. ( 39) tells us that R ΣS | sub = − cot δ B cot γ R ΣA , where the left-hand side depends only on a correct determination of r B (due to the subtraction, see the first line in Eq. ( 22)) and the right-hand side only on δ B and γ.As a result, we recast the problem of measuring γ as one of finding values of γ, δ B , and r B that give the two pairs of functions R Σ and R ∆ , rescaled following Eq.( 39), the highest statistical significance of having been drawn from the same underlying distributions.Note that, below, we make use of Eq. ( 39) rather than Eq. ( 40), since Eq. ( 39) allows for the extraction of all three parameters γ, δ B , and r B , while Eq. ( 40) is sensitive only to γ and r B .

III. EXTRACTION OF γ AS AN OPTIMIZATION PROBLEM
The strategy of Sec.II C for extracting γ, δ B , and r B is condensed in Eq. ( 39).One way to practically employ this equation is to vary these three parameters such that the functions are minimized.The respective minima of D Σ (γ, δ B , r B ) and D ∆ (γ, δ B , r B ) should be reached for the same values of γ, δ B , and r B , within statistical uncertainties.
This procedure is analogous to the minimization of the Kolmogorov-Smirnov (KS) test statistic.In its original, one-dimensional formulation, the KS test takes two empirical cumulative distribution functions (CDFs), F 1 (x) and F 2 (x), and computes, as its test statistic, their maximum difference: Two-dimensional realizations of the KS test, which are most relevant to our functions in Eqs. ( 41) and (42), have been described in Refs.[40][41][42].Note importantly that Eqs.(41) and (42) are not exactly KS test statistics, as, unlike in the KS test, the functions R ΣS | sub , R ΣA , R ∆A | sub , and R ∆S are not CDFs because they are not positive definite.
One complication that arises for two-dimensional cumulative functions is how to deal with the orientation of the integration in Eq. ( 29).That is, including Eq. ( 29), we may define R ± (s 12 , s 13 ) in an infinite number of ways, each one an equally valid alternative to constructing test statistics D R Σ (γ, δ B , r B ) and D R ∆ (γ, δ B , r B ).This freedom in unbinned methods of extracting γ corresponds to the infinite number of possible binnings of phase space in the binned methods.In this present work, we limit our analysis to two additional definitions of cumulative R ± (s 12 , s 13 ) functions, given by We recall that, outside the Dalitz plot, d Γ± /(ds 12 ds 13 ) = 0. Generalizing Eqs. ( 41) and (42) in this fashion, this means that we additionally minimize as well as the analogously-defined s D Σ (γ, δ B , r B ) and s D ∆ (γ, δ B , r B ) test statistics.In total, we therefore consider three out of the infinite different integration orderings, i.e., the orderings specified in Eqs. ( 29), ( 44) and (45).For finite data, the unbinned extraction of γ works most effectively if one takes many more orderings into account and chooses the one that best minimizes the corresponding D Σ (γ, δ B , r B ) and D ∆ (γ, δ B , r B ) functions.
To compute the s R and R versions of The forms of these functions follow trivially from the integrations in Eqs.(44) and (45).
Using the D meson decay data, we may also define s N D (s 12 , s 13 ) and N D (s 12 , s 13 ) in the same way as above for the computation of the s R and R versions of Eqs. ( 36) and (37).
In summary, we extract a measurement of γ as follows.Varying γ, δ B , and r B , we determine the locations of the minima As the minima of D Σ (γ, δ B , r B ) and D ∆ (γ, δ B , r B ) are at the same point for each choice of R, we combine the functions in the above fashion for symmetry reasons.With infinite data, we would expect to achieve for a particular set of parameter values γ, δ B , and r B .With finite data, some integration orderings will achieve more effective minimizations in Eqs. ( 49)-( 51), meaning that deeper minima are attained.We therefore identify the "best" of the integration orderings by the one that gives the deepest minimum, corresponding to the most reliable value of γ, which we quote as our final result.

IV. NUMERICAL RESULTS
As a proof of principle of our unbinned methodology, we generate 1000 sets of toy Monte Carlo Dalitz plots with fixed input values for γ, δ B , and r B given in Table I.To do this, we implement the Dalitz plot amplitude model for D → K S π − π + from Ref. [39]; see Refs.[43,44] for further details. 2 We apply our unbinned procedure to each of these sets of generated data, arriving at 1000 extractions of the three parameters γ, δ B , and r B using Eqs.( 49)-(51).
Each of the Monte Carlo samples has 7000 B + decay events, 7000 B − decay events, and 5500 (independently-measured) D decay events.Note that we use input values for γ, δ B , and r B that are quite far away from the values realized in nature.This is because the main goal of the present Monte Carlo study is to demonstrate that an unbinned extraction of γ is possible.From the outset, it is clear that, in its present form, this model-independent unbinned method is much less sensitive to γ than the optimized binned one is.I.Each of the Monte Carlo samples has 7000 B + decay events, 7000 B − decay events, and 5500 (independently-measured) D decay events.
In order to identify the most effective integration ordering, we calculate the average values of D min , D min , and s D min over these 1000 Dalitz plots and choose the smallest one.For the optimal integration ordering, we histogram the obtained global minima and extract our measurements of γ, δ B , and r B as the averages of these histograms.The respective errors are given by the left and right bounds on the middle 68 % of entries in each histogram, resulting, in general, in asymmetric errors.In future experimental analyses, this statistical treatment can be replaced by a more sophisticated procedure.
In order to find the global minimum of each set of generated data according to Eqs. ( 49)-(51), we vary γ, δ B , and r B simultaneously.We vary γ in steps of 2 • in the interval [91  I. Further, we vary r B in steps of 0.01 in the interval [0.8, 1.0], corresponding to the value of r B = 0.9 in Table I.These choices were made because of the high computational cost of computing the functions in Eqs.(41) and ( 42) for R, s R, and R (we perform simulations on a personal laptop), as well as some degeneracies that appear because Eq. ( 39) only constrains trigonometric functions of γ and δ B .For example, one such degeneracy occurs due to the fact that Additionally, because it is impractical to implement a computation of Eqs. ( 31) and (48) at all points (s 12 , s 13 ), we instead sample the functions at a discrete set of points in the rectangle surrounding the physical region of the Dalitz plot and interpolate.In our implementation, we use grid points with a horizontal and vertical separation of 0.01 GeV 2 .
We show the results of these computations for our considered scenario in Table I.In this case, the optimal ordering turns out to be R.In particular, although the average value of D min is nearly the same as that of D min , the average value of s D min is larger than that of D min and D min by about an order of magnitude.We give the resulting histograms of the 1000 extracted values of γ, δ B , and r B in Fig. 1.From Table I, we see that the output achieved by our unbinned analysis technique agrees with the input.

V. CONCLUSIONS
In this work, we have introduced a new, model-independent, unbinned method designed to extract γ from B ± → DK ± → (K S π − π + ) D K ± decays.This development contributes to the long-term effort towards achieving ultimate precision in the determination of γ, enabling unprecedented tests of the Standard Model.It is presently unclear if our method can provide a superior statistical error as compared to the other two theoretically clean methods.
As mentioned in the introduction, each approach requires a different kind of statistical optimization.For us, the required optimization is in the choice of test statistic, which can be formed from variants of cumulative distribution functions such as Eq. ( 29).
On the other hand, our method does not involve additional optimization of auxiliary variables that specify the analysis (such as shapes of bins or Fourier modes), unlike both the classic BPGGSZ method and the method of Ref. [35].However, we are still in the early stages of developing a competitive alternative.Using toy Monte Carlo data, we have demonstrated as a proof of principle that this method returns values for γ, δ B , and r B that are in agreement with the input values chosen for the generated data.Future work is required to see how one can optimize the test statistic for this particular method.
Our method is not yet optimized for the highest sensitivity to γ since it does not include all the possibly relevant observables.For instance, by forming ratios such as those in Eq. ( 26), one reduces the number of observables sensitive to γ.That not all of the available information is being used is further signaled by the fact that this method requires no information about the phases of the D decay amplitudes, while these phases are an integral part of the more sensitive binned methods.The competitiveness of the unbinned method could thus be greatly enhanced in the future by extending this approach to include data from correlated charm decays.
Just as the effectiveness of binned methods depends on the choice of the binning, the effectiveness of our unbinned method depends on the integration ordering of the cumulative functions.In principle, there are an infinite number of ways to perform this integration over the Dalitz plot.As such, it is not presently clear whether the binned or the unbinned methods will ultimately give the most competitive results.

δ 12 ,
13 from D decay data.In particular, D * + → D 0 π + decays tell us about A 12,13 .Since data on D → K S π − π + decays is abundant, this gives rather precise, direct information on A 12,13 .Measurements of theD → K S π − π + Dalitz plot distributions in coherent ψ(3770) → D s D decays, where at least one of the two D decays is in the K S π − π + channel, give modelindependent information about δ 12,13 (though limited in statistics).Still, at the moment, determinations of D decay parameters are a subdominant source of error, almost an order of magnitude smaller than the statistical error incurred with the binned extraction of γ [14, 15, 37].
is easy to see that the ratios dΣ S ds 12 ds 13 sub and R ∆S , we follow the steps laid out in Sec.II C but use, respectively, the following modified versions of the counting

TABLE I .
Input values for the parameters γ, δ B , and r B used for the generation of the toy Monte Carlo data and the corresponding output of the implementation of our unbinned algorithm.and δ B in steps of 2 • in the interval [1 • , 89 • ].That is, we search for the global minimum in the complete quadrant of the input values given in Table • , 179 • ]