1 Introduction

The B meson decays \(B^\pm \rightarrow D K^\pm \) are mediated through a combination of \(b \rightarrow c\bar{u}s\) and \(b \rightarrow u\bar{c}s\) transitions (and their conjugates). Both receive tree-level contributions with comparable suppression factors from the Cabibbo–Kobayashi–Maskawa (CKM) matrix. It has been recognized for over four decades that the interference of the resulting diagrams allows for efficient access to CP violation in the CKM matrix [1, 2]. Specifically, these decay transitions provide sensitivity to the phase \(\gamma \) (also referred to as \(\phi _3\)) [3, 4] with negligible theoretical error [5], where

$$\begin{aligned} \gamma = \arg \left( -\frac{V_{ud} V_{ub}^*}{V_{cd} V_{cb}^*}\right) . \end{aligned}$$
(1)

The method of Refs. [3, 4] uses two-body D decays to CP eigenstates, which is conceptually simple but suffers from suppressed sensitivity to \(\gamma \) due to a large hierarchy between the two interfering amplitudes. In Refs. [6, 7], this complication was ameliorated through the use of interference between Cabibbo-allowed and doubly-Cabibbo-suppressed flavor eigenstates of the neutral D mesons. However, the single best measurement of \(\gamma \) is currently due to the BPGGSZ method [8,9,10,11,12], where \(D^0\) and \({\bar{D}}^0\) decay into multi-body CP self-conjugate states (such as \(K_S \pi ^- \pi ^+\)) with large interference between the two parton-level weak transitions. All of these methods have by now received many years of experimental effort [10, 13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32], with the resulting error on \(\gamma \) at the several percent level.

In the BPGGSZ method, the D decays are binned in the Dalitz plane in order to convert dependence on the full amplitude and phase into a finite number of bin-averaged values. Symmetry properties of the CP-conjugate decay ensure that the number of independent quantities is smaller than the number of bins, provided the binning is symmetric with respect to CP conjugation. Given the leading role of the BPGGSZ method in the overall uncertainty of the measurement of \(\gamma \), it is perhaps not surprising that the method has received significant experimental and theoretical attention, resulting in several improvements over the years.

Arguably the most significant improvement is the optimized choice of binning, pioneered in Refs. [33, 34]. There, the shape of the bins in the \(D \rightarrow K_S \pi ^- \pi ^+\) Dalitz plot is initialized to the isocontours of the strong phase difference \(\Delta \delta _D = \delta _{13,12} - \delta _{12,13}\) (see Eqs. (8) and (9) for definitions) between CP symmetric points in the \(D^0\) and \({\bar{D}}^0\) Dalitz plots. The bins are then continuously deformed in order to optimize sensitivity to \(\gamma \). This procedure minimizes the washout of sensitivity to \(\gamma \) that occurs when \(\Delta \delta _D\) is bin-averaged. Binning in this manner is extremely powerful and is estimated to lead to only \(\sim {10\,\mathrm{\%}}\) lower statistical sensitivity to \(\gamma \) for the choice of \(2\times 8\) bins than if the unbinned amplitude model were used [34]. (The amplitude model provides the smallest statistical error, but induces difficult-to-quantify systematic errors.)

An alternative approach, eliminating the bin dependence of the BPGGSZ method, was presented in Ref. [35]. In that work, the binning of the D decay Dalitz plot was replaced by a Fourier transform in phase space variables. Truncation of the series gives a setup with a finite number of unknown parameters, such that it is possible to extract \(\gamma \). The choice of bins is thus replaced by the choice of variables in which to perform the Fourier transform, as well as of the order at which to truncate the series. That is, some information from the Dalitz plane (in this case, the higher frequency components in the Fourier expansion) is still removed at the data processing step. Moreover, choosing a coordinate basis for the Fourier transform is driven by the same modeling of the strong phase that is used in the determination of optimized Dalitz plot bins. While the Fourier transform method has not yet been implemented on experimental data, the study of Ref. [35] indicates that it can be highly competitive with the binned BPGGSZ method.

In this manuscript, we present a proof of principle for a novel \(\gamma \) extraction method that requires neither binning nor truncation. Given the highly optimized binning procedure currently used in experimental implementations of the BPGGSZ method, our main goal here is conceptual. Our hope is that, eventually, this new method will be able to improve on the BPGGSZ method and/or the method of Ref. [35]. This is because the existing methods all require removal of a certain amount of information during data processing. While this step can be optimized, the lost information cannot be recovered by any statistical treatment. We replace this with an alternative data processing procedure that, in principle, removes no decay information. (Note, however, that not all relevant information is used in the current implementation.) Our method makes the problem of \(\gamma \) determination equivalent to the question of whether two inequivalent measurements are sampling the same function. This structure allows for the application of a wide range of nonparametric methods to the problem, replacing the optimization of the data processing step with the conceptually different procedure of test statistic optimization.

The explicit analysis we introduce in the main part of the paper indeed falls short in terms of its statistical sensitivity to \(\gamma \) compared to the methods above. Nevertheless, we find that a toy example of a method without binning or truncation is still useful, as it makes clear that such approximations can be replaced with the problem of finding an optimal test statistic (that is, an observable optimally sensitive to \(\gamma \)). We wish to stress that while each of the available methods – BPGGSZ, Ref. [35], and the method presented in this paper – requires some optimization in order to reduce the statistical error on \(\gamma \), these optimizations take on qualitatively different forms. For the BPGGSZ method, this is the number of bins and their shapes. For Ref. [35], it is the choice of Fourier transform variables and the order of truncation. For our method, it is the choice of test statistic. The question of which method can ultimately give the smallest statistical error on measurements of \(\gamma \) depends critically on this optimization step. We hope to improve on our proof-of-principle implementation with a follow-up work in which we perform such an optimization for our method.

The paper is structured as follows. In Sect. 2.1, we review the dependence of the \(B^\pm \rightarrow (K_S\pi ^-\pi ^+)_DK^\pm \) decay rates on the CKM angle \(\gamma \). In Sect. 2.2, we present a carefully chosen combination of reduced differential partial decay widths in order to extract \(\cot ^2\gamma \). In Sect. 2.3, we derive the implications for cumulative reduced partial decay widths and present the fundamental idea of our algorithm. Namely, we extract \(\gamma \) as a parameter that brings two functions of empirical cumulative probability distributions into as much agreement as possible. We implement this idea by employing a measure adapted from the Kolmogorov–Smirnov test statistic in Sect. 3. To demonstrate our proof of principle, we show numerical results based on toy Monte Carlo data in Sect. 4. Conclusions are given in Sect. 5.

2 Theory

2.1 Notation

To set the stage, we review how sensitivity to the CKM unitarity triangle angle \(\gamma \) is achieved in \(B^\pm \rightarrow DK^\pm \) transitions, following the notation of Ref. [9]. Consider the CP-conjugate cascade decays

$$\begin{aligned} B^\pm \rightarrow DK^\pm \rightarrow (K_S \pi ^- \pi ^+)_D K^\pm . \end{aligned}$$
(2)

A Dalitz plot analysis of \(K_S \pi ^- \pi ^+\), characterizing the intermediate neutral D meson, allows us to fully specify the \(\gamma \)-dependence of the process. Focusing first on the initial two-body decay, we define

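$$\begin{aligned} \mathcal {A}(B^- \rightarrow D^0 K^-) = \mathcal {A}(B^+ \rightarrow {\bar{D}}^0 K^+)&\equiv A_B, \end{aligned}$$
(3)
$$\begin{aligned} \mathcal {A}(B^- \rightarrow {\bar{D}}^0 K^-)&\equiv A_B r_B e^{i(\delta _B - \gamma )}, \end{aligned}$$
(4)
$$\begin{aligned} \mathcal {A}(B^+ \rightarrow D^0 K^+)&\equiv A_B r_B e^{i(\delta _B + \gamma )}. \end{aligned}$$
(5)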

\(A_B\) is real by convention, with \(\delta _B\) the difference between the strong phases of the \(D^0 K\) and \({\bar{D}}^0 K\) amplitudes. The amplitudes in Eq. (4) carry a weak phase which agrees with \(\gamma \) in Eq. (1) up to \(O(\lambda ^4)\) corrections. These have been computed to give relative shifts in the determination of \(\gamma \) of \(\sim {2 \times 10^{-3}}\), which we ignore going forward [5]. Due to the color and CKM suppression of the amplitudes in Eq. (4), the theoretical expectation is that \(r_B\) is small \((r_B \sim 0.1\text{--}0.2)\), in agreement with the experimental determination [13],

$$\begin{aligned} r_B = 0.0984^{+0.0027}_{-0.0026}. \end{aligned}$$
(6)

The smallness of \(r_B\) reduces sensitivity to \(\gamma \) in all methods using \(B^\pm \rightarrow DK^\pm \). In the following analysis, we neglect \(D^0\) mixing, which contributes at second order in the mixing parameters and yields a subleading correction of less than 1 % in the extraction of \(\gamma \) [36].

For the subsequent three-body decay of the intermediate neutral D meson,

$$\begin{aligned} D \rightarrow K_S(p_1) \pi ^-(p_2) \pi ^+(p_3), \end{aligned}$$
(7)

we define the amplitudes

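$$\begin{aligned} \mathcal {A}(D^0 \rightarrow K_S(p_1) \pi ^-(p_2) \pi ^+(p_3))&\equiv A_{12,13}\, e^{i\delta _{12,13}} \equiv A(s_{12}, s_{13})\, e^{i\delta (s_{12}, s_{13})}, \end{aligned}$$
(8)
$$\begin{aligned} \mathcal {A}({\bar{D}}^0 \rightarrow K_S(p_1) \pi ^-(p_2) \pi ^+(p_3))&\equiv A_{13,12}\, e^{i\delta _{13,12}} = A(s_{13}, s_{12})\, e^{i\delta (s_{13}, s_{12})}, \end{aligned}$$
(9)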

where \(s_{ij} = (p_i + p_j)^2\) is the invariant mass squared of the ij system. We define \(A_{12,13}\) and \(\delta _{12,13}\) to be real functions with \(A_{12,13} \ge 0\) and \(\delta _{12,13}\in [0, 2\pi )\). The simple relationship between the amplitudes in Eqs. (8) and (9) follows from the CP symmetry of the strong interaction and the fact that the final state \(K_S \pi ^- \pi ^+\) has zero spin. (CP violation in D decays is very small and thus is neglected in this discussion.)

The D meson is a narrow state that decays weakly, justifying the use of the narrow width approximation and the neglect of any continuum contribution. In the vicinity of the D resonance, we thus have

$$\begin{aligned}&\mathcal {A}(B^- \rightarrow (K_S \pi ^- \pi ^+)_D K^-) \nonumber \\&\quad = A_B P_D \left[ A_{12,13}e^{i\delta _{12,13}} + A_{13,12} r_B e^{i(\delta _B - \gamma + \delta _{13,12})}\right] , \end{aligned}$$
(10)
$$\begin{aligned}&\mathcal {A}(B^+ \rightarrow (K_S \pi ^- \pi ^+)_D K^+) \nonumber \\&\quad = A_B P_D \left[ A_{13,12}e^{i\delta _{13,12}} + A_{12,13} r_B e^{i(\delta _B + \gamma + \delta _{12,13})}\right] , \end{aligned}$$
(11)

where \(P_D\) is the neutral D meson propagator. In the narrow width approximation, we can write the B meson partial width for these decays in terms of Eqs. (8) and (9) as

(12)

using

$$\begin{aligned} P_D^2 = \frac{\pi }{m_D \Gamma _D} \delta (s_{123} - m_D^2), \end{aligned}$$
(13)

where \(s_{123} = (p_1 + p_2 + p_3)^2\) denotes the invariant mass squared of the \(K_S \pi ^- \pi ^+\) system. Since the \(\mathcal {A}(D)\)-independent prefactors are the same for all decays, we can ignore them by defining reduced partial decay widths

$$\begin{aligned} \frac{\textrm{d}{\hat{\Gamma }_-}}{\textrm{d}{s_{12}} \textrm{d}{s_{13}}} =&\, A_{12,13}^2 +r_B^2 A_{13,12}^2+ 2r_B A_{12,13}A_{13,12} \nonumber \\&\cos {(\delta _B - \gamma + \delta _{13,12} - \delta _{12,13})}, \end{aligned}$$
(14)
$$\begin{aligned} \frac{\textrm{d}{\hat{\Gamma }_+}}{\textrm{d}{s_{12}} \textrm{d}{s_{13}}} =&\, A_{13,12}^2 + r_B^2 A_{12,13}^2+ 2r_B A_{12,13}A_{13,12} \nonumber \\&\cos {(\delta _B + \gamma + \delta _{12,13} - \delta _{13,12})}, \end{aligned}$$
(15)

in agreement with Ref. [9]. Note that \(\hat{\Gamma }\) is dimensionless.

We can obtain additional information about the magnitudes \(A_{12,13}\) and strong phases \(\delta _{12,13}\) from D decay data. In particular, \(D^{*+} \rightarrow D^0\pi ^+\) decays tell us about \(A_{12,13}\). Since data on \(D \rightarrow K_S\pi ^-\pi ^+\) decays is abundant, this gives rather precise, direct information on \(A_{12,13}\). Measurements of the \(D \rightarrow K_S\pi ^-\pi ^+\) Dalitz plot distributions in coherent decays, where at least one of the two D decays is in the \(K_S\pi ^-\pi ^+\) channel, give model-independent information about \(\delta _{12,13}\) (though limited in statistics). Still, at the moment, determinations of D decay parameters are a subdominant source of error, almost an order of magnitude smaller than the statistical error incurred with the binned extraction of \(\gamma \) [14, 15, 37].

In our proof-of-principle unbinned method for \(\gamma \) extraction (introduced in Sect. 2.2), we will make two simplifications. First, we will assume that the cumulative distributions of the \(A_{12,13}\) functions are measured precisely enough to be treated as exactly known. Secondly, we will formulate the method such that information about the strong phases \(\delta _{12,13}\) is never required. The first simplification is made for ease of presentation: in the intermediate theory expressions, we can treat \(A_{12,13}\) as known functions, while in the final expressions only the cumulative distribution functions will be used (and errors on their determinations will certainly be subleading). We would prefer to relax the second simplification in the future, since we are clearly discarding useful information.

2.2 \(\gamma \) from reduced partial widths in theory

We now demonstrate that, by considering simple odd and even combinations of the partial widths of Sect. 2.1, the relative dependence on the parameters of the \(B^\pm \rightarrow DK^\pm \) decays takes on a particularly simple form. (For an alternative use of symmetry considerations to parameterize approximate flavor symmetries in 3-body decays, see Ref. [38].) With respect to the \(s_{12} = s_{13}\) axis of the Dalitz plane, we first form even \((\Sigma )\) and odd \((\Delta )\) quantities made from the widths \(\textrm{d}{\hat{\Gamma }_{\pm }}\) given in Eqs. (14) and (15):

$$\begin{aligned}&\textrm{d}{\Sigma _{\pm }}(s_{12}, s_{13}),\ \textrm{d}{\Delta _{\pm }}(s_{12}, s_{13})\nonumber \\&\quad \equiv \frac{\textrm{d}{\hat{\Gamma }_{\pm }}(s_{12}, s_{13}) \pm \textrm{d}{\hat{\Gamma }_{\pm }}(s_{13}, s_{12})}{2}. \end{aligned}$$
(16)

Here, the subscripts correspond to the \(B^\pm \) charge and agree across both sides of the equation. Further symmetrizing and anti-symmetrizing the quantities in Eq. (16) with respect to the B meson charge yields

$$\begin{aligned} \textrm{d}{\Sigma _{S,A}}(s_{12}, s_{13})&\equiv \frac{\textrm{d}{\Sigma _+}(s_{12}, s_{13}) \pm \textrm{d}{\Sigma _-}(s_{12}, s_{13})}{2}, \end{aligned}$$
(17)
$$\begin{aligned} \textrm{d}{\Delta _{S,A}}(s_{12}, s_{13})&\equiv \frac{\textrm{d}{\Delta _+}(s_{12}, s_{13}) \pm \textrm{d}{\Delta _-}(s_{12}, s_{13})}{2}. \end{aligned}$$
(18)

The subscripts S and A on the left-hand sides of Eqs. (17) and (18) stand for symmetric and anti-symmetric and correspond to the choices of \(+\) and −, respectively. Inserting the explicit expressions given in Eqs. (14) and (15), we find

$$\begin{aligned} \frac{\textrm{d}{\Sigma _S}}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}&= \frac{1 + r_B^2}{2}(A_{12,13}^2 + A_{13,12}^2)\nonumber \\&\quad + 2r_B A_{12,13}A_{13,12}\cos {(\delta _{13,12} - \delta _{12,13})} \cos {\delta _B}\cos {\gamma }, \end{aligned}$$
(19)
$$\begin{aligned} \frac{\textrm{d}{\Sigma _A}}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}&= -2r_B A_{12,13}A_{13,12}\cos {(\delta _{13,12} - \delta _{12,13})} \sin {\delta _B} \sin {\gamma }, \end{aligned}$$
(20)
$$\begin{aligned} \frac{\textrm{d}{\Delta _S}}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}&= 2 r_B A_{12,13}A_{13,12} \sin {(\delta _{13,12} - \delta _{12,13})}\cos {\delta _B} \sin {\gamma }, \end{aligned}$$
(21)
$$\begin{aligned} \frac{\textrm{d}{\Delta _A}}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}&= \frac{1 - r_B^2}{2}(A_{13,12}^2 - A_{12,13}^2)\nonumber \\&\quad + 2r_B A_{12,13} A_{13,12} \sin {(\delta _{13,12} - \delta _{12,13})} \sin {\delta _B} \cos {\gamma }. \end{aligned}$$
(22)

We finally define the quantities \(\textrm{d}{\Sigma _S} |_{\textrm{sub}}\) and \(\textrm{d}{\Delta _A} |_{\textrm{sub}}\) by subtracting away the first terms in Eqs. (19) and (22), resulting in

$$\begin{aligned} \left. \frac{{\textrm{d}}\Sigma _S}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}\right| _{\textrm{sub}}&\equiv \frac{{\textrm{d}}\Sigma _S}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}} - \frac{1 + r_B^2}{2}(A_{12,13}^2 + A_{13,12}^2)\nonumber \\&= 2r_B A_{12,13}A_{13,12}\cos {(\delta _{13,12} - \delta _{12,13})} \cos {\delta _B}\cos {\gamma }, \end{aligned}$$
(23)
$$\begin{aligned} \left. \frac{{\textrm{d}}\Delta _A}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}\right| _{\textrm{sub}}&\equiv \frac{{\textrm{d}}\Delta _A}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}} -\frac{1 - r_B^2}{2}(A_{13,12}^2 - A_{12,13}^2) \nonumber \\&= 2r_B A_{12,13} A_{13,12} \sin {(\delta _{13,12} - \delta _{12,13})} \sin {\delta _B} \cos {\gamma }. \end{aligned}$$
(24)

Note that the subtracted quantities above are not directly observed experimentally. However, while the terms subtracted away from Eqs. (19) and (22) depend on the D decay amplitudes and \(r_B\), they do not depend on \(\gamma \). Moreover, the subtracted terms depend on \(r_B\) only quadratically, so the effect of performing the subtraction with an imprecisely determined value of \(r_B\) is suppressed. As explained above, for our proof-of-principle demonstration, we treat \(A_{12,13}\) as a known function of \(s_{12}\), \(s_{13}\), determined with good enough precision from D decay data, while \(r_B\) is a parameter which we fit within the method. Note further that the above subtraction is equivalent to replacing the decay widths given in Eqs. (14) and (15) by

$$\begin{aligned} \left. \frac{\textrm{d}{\hat{\Gamma }_-}}{\textrm{d}{s_{12}} \textrm{d}{s_{13}}}\right| _{\textrm{sub}}&\equiv \frac{\textrm{d}{\hat{\Gamma }_-}}{\textrm{d}{s_{12}} \textrm{d}{s_{13}}} - ( A_{12,13}^2 + r_B^2 A_{13,12}^2), \end{aligned}$$
(25)
$$\begin{aligned} \left. \frac{\textrm{d}{\hat{\Gamma }_+}}{\textrm{d}{s_{12}} \textrm{d}{s_{13}}}\right| _{\textrm{sub}}&\equiv \frac{\textrm{d}{\hat{\Gamma }_+}}{\textrm{d}{s_{12}} \textrm{d}{s_{13}}} - (A_{13,12}^2 + r_B^2 A_{12,13}^2), \end{aligned}$$
(26)

and then forming combinations of them analogous to Eqs. (16)–(18).

It is easy to see that the ratios

$$\begin{aligned} \left. \frac{{\textrm{d}}\Sigma _S}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}\right| _{\textrm{sub}} \Bigg / \frac{{\textrm{d}}\Sigma _A}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}&= -\cot {\delta _B}\cot {\gamma }, \end{aligned}$$
(27)
$$\begin{aligned} \left. \frac{{\textrm{d}}\Delta _A}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}\right| _{\textrm{sub}} \Bigg / \frac{{\textrm{d}}\Delta _S}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}&= \tan {\delta _B} \cot {\gamma }, \end{aligned}$$
(28)

take on constant values in the Dalitz plane. Furthermore, the product of these two ratios,

$$\begin{aligned} \left( \left. \frac{{\textrm{d}}\Sigma _S}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}\right| _{\textrm{sub}} \left. \frac{{\textrm{d}}\Delta _A}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}\right| _{\textrm{sub}}\right) \!\Bigg /\! \left( \frac{{\textrm{d}}\Sigma _A}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}} \frac{{\textrm{d}}\Delta _S}{\textrm{d}{s_{12}}\textrm{d}{s_{13}}}\right) = -\cot ^2{\gamma },\nonumber \\ \end{aligned}$$
(29)

allows us direct access to \(\gamma \) up to a four-way degeneracy equivalent to that in Ref. [9]. The relations in Eqs. (27) and (28) are what form the basis of the unbinned method for extracting \(\gamma \) described in the subsequent sections of this work.
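As a quick numerical cross-check of Eqs. (27)–(29), the following minimal Python sketch evaluates the reduced widths of Eqs. (14) and (15) pointwise and forms the combinations of Eqs. (16)–(24). The function `toy_amplitude` is a purely hypothetical stand-in for \(A_{12,13}\) and \(\delta _{12,13}\) (it is not the amplitude model of Ref. [39]), and all function names are illustrative assumptions.

```python
import math

def toy_amplitude(s12, s13):
    """Hypothetical magnitude and strong phase (illustration only, not a physical model)."""
    mag = 1.0 + 0.5 * s12 + 0.2 * s13 ** 2
    phase = (0.7 * s12 - 1.3 * s13 + 0.4 * s12 * s13) % (2 * math.pi)
    return mag, phase

def reduced_width(sign, s12, s13, gamma, delta_b, r_b):
    """Reduced differential width: Eq. (14) for sign=-1, Eq. (15) for sign=+1."""
    a1213, d1213 = toy_amplitude(s12, s13)
    a1312, d1312 = toy_amplitude(s13, s12)
    interference = 2 * r_b * a1213 * a1312
    if sign < 0:
        return (a1213 ** 2 + r_b ** 2 * a1312 ** 2
                + interference * math.cos(delta_b - gamma + d1312 - d1213))
    return (a1312 ** 2 + r_b ** 2 * a1213 ** 2
            + interference * math.cos(delta_b + gamma + d1213 - d1312))

def combos(s12, s13, gamma, delta_b, r_b):
    """Return (Sigma_S|sub, Sigma_A, Delta_S, Delta_A|sub) at one Dalitz point."""
    def width(sign, x, y):
        return reduced_width(sign, x, y, gamma, delta_b, r_b)
    # Eq. (16): even and odd parts under s12 <-> s13, for each B charge
    sig = {s: 0.5 * (width(s, s12, s13) + width(s, s13, s12)) for s in (+1, -1)}
    dlt = {s: 0.5 * (width(s, s12, s13) - width(s, s13, s12)) for s in (+1, -1)}
    # Eqs. (17), (18): symmetric and anti-symmetric in the B charge
    sigma_s = 0.5 * (sig[+1] + sig[-1])
    sigma_a = 0.5 * (sig[+1] - sig[-1])
    delta_s = 0.5 * (dlt[+1] + dlt[-1])
    delta_a = 0.5 * (dlt[+1] - dlt[-1])
    # Eqs. (23), (24): remove the gamma-independent terms
    a1213, _ = toy_amplitude(s12, s13)
    a1312, _ = toy_amplitude(s13, s12)
    sigma_s_sub = sigma_s - 0.5 * (1 + r_b ** 2) * (a1213 ** 2 + a1312 ** 2)
    delta_a_sub = delta_a - 0.5 * (1 - r_b ** 2) * (a1312 ** 2 - a1213 ** 2)
    return sigma_s_sub, sigma_a, delta_s, delta_a_sub

if __name__ == "__main__":
    gamma, delta_b, r_b = 1.2, 2.0, 0.1  # arbitrary illustrative inputs
    for point in [(0.3, 0.8), (0.9, 0.4), (1.7, 0.2)]:
        ss, sa, ds, da = combos(*point, gamma, delta_b, r_b)
        print(point, ss / sa, da / ds, (ss * da) / (sa * ds))
```

For any such toy amplitude, the three printed ratios should be the same at every point with \(s_{12} \ne s_{13}\), reflecting the statement that the ratios are constant in the Dalitz plane. On the line \(s_{12} = s_{13}\) the odd combinations vanish, so the second ratio is undefined there.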

2.3 \(\gamma \) from cumulative reduced partial widths in practice

Our observations consist of individual B decay events and not continuous distributions. To make connections with the results above, we introduce the cumulative reduced partial decay widths, defined as

$$\begin{aligned} R_{\pm }(s_{12}, s_{13}) \equiv \int _0^{s_{12}} \textrm{d}{s_{12}'} \int _0^{s_{13}} \textrm{d}{s_{13}'} \frac{\textrm{d}{\hat{\Gamma }_{\pm }}}{\textrm{d}{s_{12}'}\textrm{d}{s_{13}'}}, \end{aligned}$$
(30)

where, outside the physical region of the Dalitz plot, \(\textrm{d}{\hat{\Gamma }_{\pm }}/(\textrm{d}{s_{12}} \textrm{d}{s_{13}})= 0\). These functions are monotonically increasing, with limiting values \(R_{\pm }(0, 0) = 0\) and \(R_{\pm }(s_{12}, s_{13} > (m_D - m_\pi )^2) = \hat{\Gamma }_{\pm }^{(\text {tot})}\), where

$$\begin{aligned} \hat{\Gamma }_{\pm }^{(\text {tot})} = \int \textrm{d}{\hat{\Gamma }_{\pm }} (s_{12}, s_{13})\, \end{aligned}$$
(31)

is the reduced partial width of \(B^\pm \rightarrow DK^\pm \). Note that \(\hat{\Gamma }_{\pm }^{(\text {tot})}\) is not, in general, normalized to unity, and, thus, resulting objects do not directly correspond to cumulative distribution functions. The functions \(R_{\pm }(s_{12}, s_{13})\) continue to vary outside of the physical Dalitz region, i.e., in the rectangle around the physical region of the Dalitz plot.

With many observed \(B^\pm \) decay events, Eq. (30) will be approached, up to an overall constant, by the counting functions

$$\begin{aligned} N_{\pm }(s_{12}, s_{13}) = \sum _{\begin{array}{c} i_{\pm }< s_{12}\\ j_{\pm } < s_{13} \end{array}} 1, \end{aligned}$$
(32)

where the index \(i_{\pm }\) \((j_{\pm })\) runs over the \(s_{12}\) \((s_{13})\) values of all observed decays. \(N_{\pm }(s_{12}, s_{13})\) are thus the number of observed \(B^\pm \rightarrow DK^\pm \) decays with \((p_{K_S} + p_{\pi ^-})^2 < s_{12}\) and \((p_{K_S} + p_{\pi ^+})^2 < s_{13}\). They are nonzero almost everywhere in the Dalitz plane and can be constructed with no binning and with minimal processing of the data. These functions are approximations of the continuous cumulative reduced partial decay widths, such that, in the limit of large \(N_{\pm }^{\text {(tot)}}\), the total number of \(B^\pm \rightarrow DK^\pm \) events, we have

$$\begin{aligned} \frac{R_{\pm }(s_{12}, s_{13})}{\hat{\Gamma }_{\pm }^{(\text {tot})}} \approx \frac{N_{\pm }(s_{12}, s_{13})}{N_{\pm }^{(\text {tot})}}. \end{aligned}$$
(33)
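The counting function of Eq. (32) can be sketched in a few lines of Python. This is an illustrative sketch only: the representation of each event as an `(s12, s13)` tuple and the function names are assumptions, not part of the analysis described here.

```python
def counting_function(events, s12, s13):
    """N(s12, s13) of Eq. (32): events with s12_i < s12 and s13_i < s13."""
    return sum(1 for (x, y) in events if x < s12 and y < s13)

def cumulative_fraction(events, s12, s13):
    """Right-hand side of Eq. (33): N(s12, s13) / N_tot."""
    return counting_function(events, s12, s13) / len(events)

if __name__ == "__main__":
    # Three made-up events, each given as (s12, s13) in arbitrary units
    events = [(0.5, 1.0), (1.2, 0.7), (2.0, 2.5)]
    print(counting_function(events, 1.5, 1.5))
    print(cumulative_fraction(events, 3.0, 3.0))
```

The inequalities are strict, as in Eq. (32), so an event lying exactly on a threshold is not counted.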

We now bring Eq. (32) through the procedure laid out in Sect. 2.2, namely, we symmetrize and anti-symmetrize the function with respect to phase space and initial B meson charge. This gives

$$\begin{aligned} R_{\Sigma S, \Sigma A}(s_{12}, s_{13})&= \frac{1}{2}(\Sigma _+(s_{12}, s_{13}) \pm \Sigma _-(s_{12}, s_{13})), \end{aligned}$$
(34)
$$\begin{aligned} R_{\Delta S, \Delta A}(s_{12}, s_{13})&= \frac{1}{2}(\Delta _+(s_{12}, s_{13}) \pm \Delta _-(s_{12}, s_{13})), \end{aligned}$$
(35)

with

$$\begin{aligned}&\Sigma _{\pm }(s_{12}, s_{13}),\ \Delta _{\pm }(s_{12}, s_{13})\nonumber \\&\quad = \frac{N_{\pm }(s_{12}, s_{13}) \pm N_{\pm }(s_{13}, s_{12})}{2}, \end{aligned}$$
(36)

which are the cumulative versions of the functions in Eqs. (16)–(18). We may also find the cumulative versions of Eqs. (23) and (24) by subtracting away the appropriate cumulative flavor-tagged \(D^0\) meson decays. To do this, we define

$$\begin{aligned} N_-(s_{12}, s_{13})|_{\textrm{sub}}&= N_-(s_{12}, s_{13}) - r_-\frac{N_-^{(\text {tot})}}{N_D^{(\text {tot})}} (N_D(s_{12}, s_{13}) \nonumber \\&\quad + r_B^2 N_D(s_{13}, s_{12})), \end{aligned}$$
(37)
$$\begin{aligned} N_+(s_{12}, s_{13})|_{\textrm{sub}}&= N_+(s_{12}, s_{13}) - r_+\frac{N_+^{(\text {tot})}}{N_D^{(\text {tot})}} (N_D(s_{13}, s_{12})\nonumber \\&\quad + r_B^2 N_D(s_{12}, s_{13})), \end{aligned}$$
(38)

where

$$\begin{aligned} r_{\pm } = \frac{\int \textrm{d}{s_{12}}\textrm{d}{s_{13}} |\mathcal {A}(D^0 \rightarrow K_S \pi ^- \pi ^+)|^2}{\hat{\Gamma }_{\pm }^{(\text {tot})}}, \end{aligned}$$
(39)

is necessary for the proper normalization of Eqs. (37) and (38). The forms of the subtracted parts in Eqs. (37) and (38) follow from Eqs. (25) and (26). Here, \(N_D^{(\text {tot})}\) is the total number of (independently) observed \(D^0 \rightarrow K_S \pi ^- \pi ^+\) events, while \(N_D(s_{12}, s_{13})\) is defined in exactly the same way as \(N_{\pm }(s_{12}, s_{13})\) in Eq. (32) except with summations now over all \(D^0 \rightarrow K_S \pi ^- \pi ^+\) events.
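The subtraction in Eqs. (37) and (38) can be illustrated with a short Python sketch. It assumes that events are stored as `(s12, s13)` tuples and that the normalization factor \(r_{\pm }\) of Eq. (39) is supplied as an input (`r_norm`). All function and parameter names are hypothetical.

```python
def count_below(events, s12, s13):
    """Counting function as in Eq. (32)."""
    return sum(1 for (x, y) in events if x < s12 and y < s13)

def subtracted_count(b_events, d_events, s12, s13, r_b, r_norm, charge):
    """Subtracted counting function: Eq. (37) for charge=-1, Eq. (38) for charge=+1.

    b_events: B decay events of the given charge
    d_events: independently observed flavor-tagged D0 events
    r_norm:   the factor r_- or r_+ of Eq. (39), taken as given here
    """
    direct = count_below(d_events, s12, s13)    # N_D(s12, s13)
    swapped = count_below(d_events, s13, s12)   # N_D(s13, s12)
    if charge < 0:
        d_term = direct + r_b ** 2 * swapped
    else:
        d_term = swapped + r_b ** 2 * direct
    scale = r_norm * len(b_events) / len(d_events)
    return count_below(b_events, s12, s13) - scale * d_term

if __name__ == "__main__":
    b_minus = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]  # made-up events
    d_zero = [(1.0, 1.0), (2.0, 3.0)]               # made-up events
    print(subtracted_count(b_minus, d_zero, 2.5, 3.5, r_b=0.1, r_norm=1.0, charge=-1))
```

The only difference between the two charges is which of `direct` and `swapped` carries the \(r_B^2\) suppression, mirroring Eqs. (25) and (26).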

Note that the normalization factor \(r_{\pm }\) in Eq. (39) is measured in the experiment and is simply given by the ratio of measured \(D^0 \rightarrow K_S\pi ^-\pi ^+\) and (reduced) \(B^\pm \rightarrow (K_S\pi ^-\pi ^+)_D K^\pm \) decay rates. For the purposes of our proof-of-principle demonstration, we set \(r_{\pm }\) to its expected measured value as predicted by \(\gamma \), \(\delta _B\), and \(r_B\) values and do not take into account experimental errors. Likewise, for the purposes of the calculation of \(r_{\pm }\), we use the \(D^0 \rightarrow K_S\pi ^-\pi ^+\) amplitude model of Ref. [39], including the strong phases. However, in the algorithm itself, we use the discrete data \(N_D(s_{12},s_{13})\) only and no phase information. The experimental determination of \(N_D(s_{12},s_{13})\) can make use of both \(D^0\) and \({\bar{D}}^0\) decays, since, in the above discussion, we neglect CP violation in D decays. Note also that the discussion of experimental effects – such as backgrounds, efficiencies, and resolutions – is outside of the scope of the present manuscript.

In the next step, we use Eqs. (37) and (38) in Eqs. (34)–(36) to get the cumulative functions \(R_{\Sigma S}|_{\textrm{sub}}\) and \(R_{\Delta A}|_{\textrm{sub}}\). Due to the linearity of the integral operation in Eq. (30), the relations in Eqs. (27)–(29) all hold if we replace the functions therein with their corresponding cumulative counterparts. That is, we have

$$\begin{aligned} \frac{R_{\Sigma S}|_{\textrm{sub}}}{R_{\Sigma A}} = -\cot \delta _B \cot \gamma , \quad \frac{R_{\Delta A}|_{\textrm{sub}}}{R_{\Delta S}} = \tan \delta _B \cot \gamma , \end{aligned}$$
(40)

and

$$\begin{aligned} \left( \frac{R_{\Sigma S}|_{\textrm{sub}}}{R_{\Sigma A}}\right) \left( \frac{R_{\Delta A}|_{\textrm{sub}}}{R_{\Delta S}}\right) = -\cot ^2 \gamma . \end{aligned}$$
(41)

Stopping to examine the ratios in Eq. (40) closely, we emphasize the following observation: Up to a \(\delta _B\)- and \(\gamma \)-dependent rescaling, the cumulative functions within each \(R_{\Sigma }\) and \(R_{\Delta }\) pair are the same. For instance, note that the first equation in Eq. (40) tells us that \(R_{\Sigma S}|_\mathrm{{sub}} = -\cot \delta _B \cot \gamma \, R_{\Sigma A}\), where the left-hand side depends only on a correct determination of \(r_B\) (due to the subtraction, see the first line in Eq. (23)) and the right-hand side only on \(\delta _B\) and \(\gamma \). As a result, we recast the problem of measuring \(\gamma \) as one of finding values of \(\gamma \), \(\delta _B\), and \(r_B\) that give the two pairs of functions \(R_{\Sigma }\) and \(R_\Delta \), rescaled following Eq. (40), the highest statistical significance of having been drawn from the same underlying distributions. Note that, below, we make use of Eq. (40) rather than Eq. (41), since Eq. (40) allows for the extraction of all three parameters \(\gamma \), \(\delta _B\), and \(r_B\), while Eq. (41) is sensitive only to \(\gamma \) and \(r_B\).

3 Extraction of \(\gamma \) as an optimization problem

The strategy of Sect. 2.3 for extracting \(\gamma \), \(\delta _B\), and \(r_B\) is condensed in Eq. (40). One way to practically employ this equation is to vary these three parameters such that the functions

$$\begin{aligned} D_{\Sigma }(\gamma , \delta _B, r_B)&\equiv \max _{s_{12},s_{13}} \big |R_{\Sigma S}|_{\textrm{sub}}(s_{12},s_{13})\nonumber \\&\quad - (-\cot \delta _B \cot \gamma ) R_{\Sigma A}(s_{12},s_{13})\big |, \end{aligned}$$
(42)
$$\begin{aligned} D_\Delta (\gamma , \delta _B, r_B)&\equiv \max _{s_{12},s_{13}} \big |R_{\Delta A}|_{\textrm{sub}}(s_{12},s_{13}) \nonumber \\&\quad - (\tan \delta _B \cot \gamma ) R_{\Delta S}(s_{12},s_{13}) \big |, \end{aligned}$$
(43)

are minimized. The respective minima of \(D_{\Sigma }(\gamma , \delta _B, r_B)\) and \(D_\Delta (\gamma , \delta _B, r_B)\) should be reached for the same values of \(\gamma \), \(\delta _B\), and \(r_B\), within statistical uncertainties.
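To make Eqs. (42) and (43) concrete, here is a minimal Python sketch. It assumes that the four cumulative functions have already been evaluated on a common finite list of \((s_{12}, s_{13})\) points, so that the maximum runs over that list. The dependence on \(r_B\) enters only through the subtracted inputs. The function names and the finite-grid evaluation are illustrative assumptions.

```python
import math

def d_sigma(r_sigma_s_sub, r_sigma_a, gamma, delta_b):
    """Eq. (42): largest mismatch between R_SigmaS|sub and the rescaled R_SigmaA."""
    scale = -1.0 / (math.tan(delta_b) * math.tan(gamma))  # -cot(delta_B) cot(gamma)
    return max(abs(s - scale * a) for s, a in zip(r_sigma_s_sub, r_sigma_a))

def d_delta(r_delta_a_sub, r_delta_s, gamma, delta_b):
    """Eq. (43): largest mismatch between R_DeltaA|sub and the rescaled R_DeltaS."""
    scale = math.tan(delta_b) / math.tan(gamma)  # tan(delta_B) cot(gamma)
    return max(abs(a - scale * s) for a, s in zip(r_delta_a_sub, r_delta_s))

if __name__ == "__main__":
    gamma_true, delta_true = 1.2, 2.0  # arbitrary illustrative inputs
    r_a = [1.0, -2.0, 0.5]             # made-up values of R_SigmaA
    true_scale = -1.0 / (math.tan(delta_true) * math.tan(gamma_true))
    r_s_sub = [true_scale * v for v in r_a]  # exact relation of Eq. (40)
    print(d_sigma(r_s_sub, r_a, gamma_true, delta_true))        # essentially zero
    print(d_sigma(r_s_sub, r_a, gamma_true + 0.3, delta_true))  # nonzero
```

With noise-free inputs that satisfy Eq. (40) exactly, the statistic vanishes at the true parameters and grows as the trial values move away from them.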

This procedure is analogous to the minimization of the Kolmogorov–Smirnov (KS) test statistic. In its original, one-dimensional formulation, the KS test takes two empirical cumulative distribution functions (CDFs), \(F_{1}(x)\) and \(F_{2}(x)\), and computes, as its test statistic, their maximum difference:

$$\begin{aligned} D^{\textrm{KS}} \equiv \max _x |F_{1}(x) - F_{2}(x)|. \end{aligned}$$
(44)
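For reference, the one-dimensional two-sample statistic of Eq. (44) can be computed directly from two samples. The sketch below is a plain-Python illustration with made-up inputs. It relies on the fact that empirical CDFs are step functions, so the maximum difference is attained at one of the observed values.

```python
def ks_statistic(sample1, sample2):
    """Two-sample KS statistic of Eq. (44): max |F1(x) - F2(x)| over all x."""
    def ecdf(sample, x):
        # Fraction of the sample at or below x
        return sum(1 for v in sample if v <= x) / len(sample)
    points = sorted(set(sample1) | set(sample2))
    return max(abs(ecdf(sample1, x) - ecdf(sample2, x)) for x in points)

if __name__ == "__main__":
    print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

This direct evaluation costs a number of operations proportional to the product of the sample sizes, which is adequate for illustration but not for large samples.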

Two-dimensional realizations of the KS test, which are most relevant to our functions in Eqs. (42) and (43), have been described in Refs. [40,41,42]. Note importantly that Eqs. (42) and (43) are not exactly KS test statistics, as, unlike in the KS test, the functions \(R_{\Sigma S}|_{\textrm{sub}}\), \(R_{\Sigma A}\), \(R_{\Delta A}|_{\textrm{sub}}\), and \(R_{\Delta S}\) are not CDFs because they are not positive definite.

One complication that arises for two-dimensional cumulative functions is how to deal with the orientation of the integration in Eq. (30). That is, including Eq. (30), we may define \(R_{\pm }(s_{12}, s_{13})\) in an infinite number of ways, each one an equally valid alternative for constructing test statistics \(D_{\Sigma }^R(\gamma , \delta _B, r_B)\) and \(D_\Delta ^R(\gamma , \delta _B, r_B)\). This freedom in unbinned methods of extracting \(\gamma \) corresponds to the infinite number of possible binnings of phase space in the binned methods. In the present work, we limit our analysis to two additional definitions of cumulative \(R_{\pm }(s_{12}, s_{13})\) functions, given by

(45)
$$\begin{aligned} \widetilde{R}_{\pm }(s_{12}, s_{13})&\equiv \int _{s_{12}}^\infty \textrm{d}{s_{12}'} \int _{s_{13}}^\infty \textrm{d}{s_{13}'} \frac{\textrm{d}{\hat{\Gamma }_{\pm }}}{\textrm{d}{s_{12}'}\textrm{d}{s_{13}'}}. \end{aligned}$$
(46)

We recall that, outside the Dalitz plot, \(\textrm{d}{\hat{\Gamma }_{\pm }}/(\textrm{d}{s_{12}}\textrm{d}{s_{13}}) = 0\). Generalizing Eqs. (42) and (43) in this fashion, this means that we additionally minimize

$$\begin{aligned}&\widetilde{D}_{\Sigma }(\gamma , \delta _B, r_B)\nonumber \\&\quad = \max _{s_{12}, s_{13}} \left| {\widetilde{R}_{\Sigma S}|_{\textrm{sub}}(s_{12}, s_{13}) - (-\cot {\delta _B}\cot {\gamma }) \widetilde{R}_{\Sigma A}(s_{12}, s_{13})}\right| , \end{aligned}$$
(47)
$$\begin{aligned}&\widetilde{D}_\Delta (\gamma , \delta _B, r_B)\nonumber \\&\quad =\max _{s_{12}, s_{13}} \left| {\widetilde{R}_{\Delta A}|_{\textrm{sub}}(s_{12}, s_{13}) -(\tan {\delta _B}\cot {\gamma }) \widetilde{R}_{\Delta S}(s_{12}, s_{13}) }\right| , \end{aligned}$$
(48)

as well as the analogously defined test statistics built from the cumulative function of Eq. (45). In total, we therefore consider three of the infinitely many possible integration orderings, i.e., the orderings specified in Eqs. (30), (45) and (46). For finite data, the unbinned extraction of \(\gamma \) works most effectively if one takes many more orderings into account and chooses the one that best minimizes the corresponding \(D_{\Sigma }(\gamma , \delta _B, r_B)\) and \(D_\Delta (\gamma , \delta _B, r_B)\) functions.

To compute the versions of \(R_{\Sigma S}|_{\textrm{sub}}\), \(R_{\Sigma A}\), \(R_{\Delta A}|_{\textrm{sub}}\), and \(R_{\Delta S}\) that correspond to Eqs. (45) and (46), we follow the steps laid out in Sect. 2.3 but use, respectively, the following modified versions of the counting function in Eq. (32):

(49)

The forms of these functions follow trivially from the integrations in Eqs. (45) and (46). Using the D meson decay data, we may also define the corresponding D decay counting functions, such as \(\widetilde{N}_{D}(s_{12}, s_{13})\), in the same way as above for the computation of the Eq. (45) and \(\widetilde{R}\) versions of Eqs. (37) and (38).

In summary, we extract a measurement of \(\gamma \) as follows. Varying \(\gamma \), \(\delta _B\), and \(r_B\), we determine the locations of the minima

$$\begin{aligned} D_{\textrm{min}}&= \min _{\gamma , \delta _B, r_B}\left( D_{\Sigma }^R(\gamma , \delta _B, r_B) + D_\Delta ^R(\gamma , \delta _B, r_B)\right) , \end{aligned}$$
(50)
$$\begin{aligned} \widetilde{D}_{\textrm{min}}&= \min _{\gamma , \delta _B, r_B}\left( D_{\Sigma }^{\widetilde{R}}(\gamma , \delta _B, r_B) + D_\Delta ^{\widetilde{R}}(\gamma , \delta _B, r_B)\right) , \end{aligned}$$
(51)
(52)

As the minima of \(D_{\Sigma }(\gamma , \delta _B, r_B)\) and \(D_\Delta (\gamma , \delta _B, r_B)\) are at the same point for each choice of R, we combine the functions in the above fashion for symmetry reasons. With infinite data, we would expect to achieve

(53)

for a particular set of parameter values \(\gamma \), \(\delta _B\), and \(r_B\). With finite data, some integration orderings will achieve more effective minimizations in Eqs. (50)–(52), meaning that deeper minima are attained. We therefore identify the “best” integration ordering as the one that gives the deepest minimum, which we take to yield the most reliable value of \(\gamma \) and quote as our final result.

4 Numerical results

Fig. 1 Histograms of the global minima for \(\gamma \) (a), \(\delta _B\) (b), and \(r_B\) (c) obtained by applying our method to 1000 sets of toy Monte Carlo data. The input parameters are fixed to the values shown in Table 1. Each of the Monte Carlo samples has 7000 \(B^+\) decay events, 7000 \(B^-\) decay events, and 5500 (independently-measured) D decay events.

As a proof of principle of our unbinned methodology, we generate 1000 sets of toy Monte Carlo Dalitz plots with fixed input values for \(\gamma \), \(\delta _B\), and \(r_B\) given in Table 1. To do this, we implement the Dalitz plot amplitude model for \(D \rightarrow K_S\pi ^-\pi ^+\) from Ref. [39]; see Refs. [43, 44] for further details (footnote 2). We apply our unbinned procedure to each of these sets of generated data, arriving at 1000 extractions of the three parameters \(\gamma \), \(\delta _B\), and \(r_B\) using Eqs. (50)–(52). Each of the Monte Carlo samples has 7000 \(B^+\) decay events, 7000 \(B^-\) decay events, and 5500 (independently-measured) D decay events. Note that we use input values for \(\gamma \), \(\delta _B\), and \(r_B\) that are quite far away from the values realized in nature. This is because the main goal of the present Monte Carlo study is to demonstrate that an unbinned extraction of \(\gamma \) is possible. From the outset, it is clear that, in its present form, this model-independent unbinned method is much less sensitive to \(\gamma \) than the optimized binned one is.

In order to identify the most effective integration ordering, we calculate the average values of \(D_{\textrm{min}}\), \(\widetilde{D}_{\textrm{min}}\), and the third minimum in Eq. (52) over these 1000 Dalitz plots and choose the ordering with the smallest average. For the optimal integration ordering, we histogram the obtained global minima and extract our measurements of \(\gamma \), \(\delta _B\), and \(r_B\) as the averages of these histograms. The respective errors are given by the left and right bounds on the middle \(68\,\%\) of entries in each histogram, resulting, in general, in asymmetric errors. In future experimental analyses, this statistical treatment can be replaced by a more sophisticated procedure.
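The statistical treatment just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ordering labels are hypothetical, and the nearest-rank convention used to locate the bounds of the central \(68\,\%\) is our choice for this sketch, not a prescription of the method.

```python
import statistics


def best_ordering(minima_by_ordering):
    """Pick the ordering whose minima, averaged over all toys, are smallest."""
    averages = {name: statistics.fmean(vals)
                for name, vals in minima_by_ordering.items()}
    return min(averages, key=averages.get), averages


def central_interval(values, fraction=0.68):
    """Bounds of the central `fraction` of entries (nearest-rank convention)."""
    ordered = sorted(values)
    tail = (1.0 - fraction) / 2.0
    last = len(ordered) - 1
    return ordered[round(tail * last)], ordered[round((1.0 - tail) * last)]


def summarize(values, fraction=0.68):
    """Return (mean, lower error, upper error); errors are asymmetric in general."""
    mean = statistics.fmean(values)
    low, high = central_interval(values, fraction)
    return mean, mean - low, high - mean
```

Applied to the extracted values of a single parameter for the chosen ordering, `summarize` returns the central value together with its lower and upper errors.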

In order to find the global minimum of each set of generated data according to Eqs. (50)–(52), we vary \(\gamma \), \(\delta _B\), and \(r_B\) simultaneously. We vary \(\gamma \) in steps of \(2^{\circ }\) in the interval \([91^{\circ }, 179^{\circ }]\) and \(\delta _B\) in steps of \(2^{\circ }\) in the interval \([1^{\circ }, 89^{\circ }]\). That is, we search for the global minimum in the complete quadrant of the input values given in Table 1. Further, we vary \(r_B\) in steps of 0.01 in the interval [0.8, 1.0], which is centered on the input value \(r_B = 0.9\) in Table 1. These choices were made because of the high computational cost of evaluating the functions in Eqs. (42) and (43) for all three orderings (we perform simulations on a personal laptop), as well as some degeneracies that appear because Eq. (40) only constrains trigonometric functions of \(\gamma \) and \(\delta _B\). For example, one such degeneracy occurs due to the fact that

$$\begin{aligned} \cot (\pi - \theta ) = -\cot \theta , \quad \tan (\pi - \theta ) = -\tan \theta . \end{aligned}$$
(54)
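As an illustration of the scan ranges and of the degeneracy implied by Eq. (54), the following Python sketch builds the parameter grid and evaluates the two coefficients that appear in Eqs. (47) and (48). The helper names and the generic `objective` callable are hypothetical; the brute-force loop is one straightforward way to scan the grid and is not meant as a description of our actual code.

```python
import math


def angle_grid(start_deg, stop_deg, step_deg):
    """Inclusive grid of angles in degrees."""
    n = int(round((stop_deg - start_deg) / step_deg)) + 1
    return [start_deg + i * step_deg for i in range(n)]


GAMMA_GRID = angle_grid(91, 179, 2)                        # 45 values
DELTA_GRID = angle_grid(1, 89, 2)                          # 45 values
RB_GRID = [round(0.80 + 0.01 * i, 2) for i in range(21)]   # 0.80 ... 1.00


def coefficients(gamma_deg, delta_deg):
    """Return (-cot(delta_B) cot(gamma), tan(delta_B) cot(gamma))."""
    g, d = math.radians(gamma_deg), math.radians(delta_deg)
    cot_g = 1.0 / math.tan(g)
    return -cot_g / math.tan(d), math.tan(d) * cot_g


def grid_search(objective):
    """Brute-force scan; returns (value, gamma_deg, delta_deg, r_B) at the minimum."""
    best = None
    for g in GAMMA_GRID:
        for d in DELTA_GRID:
            for r in RB_GRID:
                value = objective(g, d, r)
                if best is None or value < best[0]:
                    best = (value, g, d, r)
    return best
```

The grid contains \(45 \times 45 \times 21 = 42525\) points. Because both coefficients are unchanged under the simultaneous replacement \(\gamma \rightarrow \pi - \gamma \), \(\delta _B \rightarrow \pi - \delta _B\), `coefficients(110, 70)` and `coefficients(70, 110)` agree, so a scan over a single quadrant avoids this particular ambiguity.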

Additionally, because it is impractical to implement a computation of Eqs. (32) and (49) at all points \((s_{12}, s_{13})\), we instead sample the functions at a discrete set of points in the rectangle surrounding the physical region of the Dalitz plot and interpolate. In our implementation, we use grid points with a horizontal and vertical separation of 0.01 GeV\(^{2}\).
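A minimal sketch of such a sample-and-interpolate step is given below. The text above does not specify the interpolation scheme, so bilinear interpolation is assumed here purely for illustration, and the function names and rectangle bounds passed by the caller are hypothetical.

```python
import math


def sample_on_grid(func, s12_min, s12_max, s13_min, s13_max, spacing=0.01):
    """Tabulate func(s12, s13) on a regular grid (spacing in GeV^2)."""
    n12 = int(round((s12_max - s12_min) / spacing)) + 1
    n13 = int(round((s13_max - s13_min) / spacing)) + 1
    return [[func(s12_min + i * spacing, s13_min + j * spacing)
             for j in range(n13)]
            for i in range(n12)]


def bilinear(table, s12_min, s13_min, spacing, s12, s13):
    """Bilinear interpolation of the tabulated values at (s12, s13)."""
    x = (s12 - s12_min) / spacing
    y = (s13 - s13_min) / spacing
    # Clamp the cell index so that points on the upper edges stay in range.
    i = min(max(int(math.floor(x)), 0), len(table) - 2)
    j = min(max(int(math.floor(y)), 0), len(table[0]) - 2)
    tx, ty = x - i, y - j
    return ((1 - tx) * (1 - ty) * table[i][j]
            + tx * (1 - ty) * table[i + 1][j]
            + (1 - tx) * ty * table[i][j + 1]
            + tx * ty * table[i + 1][j + 1])
```

Bilinear interpolation is exact for functions that are linear in each variable within a grid cell. A counting function built from finite data is piecewise constant, so the interpolated values are only an approximation whose quality is set by the grid spacing.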

Table 1 Input values for the parameters \(\gamma \), \(\delta _B\), and \(r_B\) used for the generation of the toy Monte Carlo data and the corresponding output of the implementation of our unbinned algorithm

We show the results of these computations for our considered scenario in Table 1. In this case, the optimal ordering turns out to be R. In particular, although the average value of \(\widetilde{D}_{\textrm{min}}\) is nearly the same as that of \(D_{\textrm{min}}\), the average value of the third minimum, Eq. (52), is larger than both by about an order of magnitude. We give the resulting histograms of the 1000 extracted values of \(\gamma \), \(\delta _B\), and \(r_B\) in Fig. 1. From Table 1, we see that the output achieved by our unbinned analysis technique agrees with the input.

5 Conclusions

In this work, we have introduced a new, model-independent, unbinned method designed to extract \(\gamma \) from \(B^\pm \rightarrow DK^\pm \rightarrow (K_S \pi ^- \pi ^+)_D K^\pm \) decays. This development contributes to the long-term effort towards achieving ultimate precision in the determination of \(\gamma \), enabling unprecedented tests of the Standard Model. It is presently unclear whether our method can provide a smaller statistical error than the other two theoretically clean methods. As mentioned in the introduction, each approach requires a different kind of statistical optimization. For us, the required optimization is in the choice of test statistic, which can be formed from variants of cumulative distribution functions such as Eq. (30).

On the other hand, our method does not involve additional optimization of auxiliary variables that specify the analysis (such as shapes of bins or Fourier modes), unlike both the classic BPGGSZ method and the method of Ref. [35]. However, we are still in the early stages of developing a competitive alternative. Using toy Monte Carlo data, we have demonstrated as a proof of principle that this method returns values for \(\gamma \), \(\delta _B\), and \(r_B\) that are in agreement with the input values chosen for the generated data. Future work is required to see how one can optimize the test statistic for this particular method.

Our method is not yet optimized for the highest sensitivity to \(\gamma \) since it does not include all potentially relevant observables. For instance, by forming ratios such as those in Eq. (27), one reduces the number of observables sensitive to \(\gamma \). That not all of the available information is being used is further signaled by the fact that this method requires no information about the phases of the D decay amplitudes, while these phases are an integral part of the more sensitive binned methods. The competitiveness of the unbinned method could thus be greatly enhanced in the future by extending this approach to include data from correlated charm decays.

Just as the effectiveness of binned methods depends on the choice of the binning, the effectiveness of our unbinned method depends on the integration ordering of the cumulative functions. In principle, there are infinitely many ways to perform this integration over the Dalitz plot. As such, it is not presently clear whether the binned or the unbinned methods will ultimately give the more competitive results.