First measurement of the CKM angle $\phi_3$ with $B^{\pm}\to D(K_{\rm S}^0\pi^+\pi^-\pi^0)K^{\pm}$ decays

We present the first model-independent measurement of the CKM unitarity triangle angle $\phi_3$ using $B^{\pm}\to D(K_{\rm S}^0\pi^+\pi^-\pi^0)K^{\pm}$ decays, where $D$ indicates either a $D^{0}$ or $\overline{D}^{0}$ meson. Measurements of the strong-phase difference of the $D \to K_{\rm S}^0\pi^+\pi^-\pi^0$ amplitude obtained from CLEO-c data are used as input. This analysis is based on the full Belle data set of $772\times 10^{6}$ $B\overline{B}$ events collected at the $\Upsilon(4S)$ resonance. We obtain $\phi_3 = (5.7~^{+10.2}_{-8.8} \pm 3.5 \pm 5.7)^{\circ}$ and the suppressed amplitude ratio $r_{B} = 0.323 \pm 0.147 \pm 0.023 \pm 0.051$. Here the first uncertainty is statistical, the second is the experimental systematic, and the third is due to the precision of the strong-phase parameters measured from CLEO-c data. The 95% confidence interval on $\phi_3$ is $(-29.7,~109.5)^{\circ}$, which is consistent with the current world average.


Introduction
The description of CP violation in the standard model (SM) can be tested via measurements of observables related to the Cabibbo-Kobayashi-Maskawa (CKM) matrix [1,2]. One such test is the measurement of the unitarity-triangle angle φ 3 ≡ arg(−V ud V * ub /V cd V * cb ), also denoted as γ. Here, V ij refers to the CKM matrix element. The angle φ 3 is accessible through tree-level amplitudes, and the associated theoretical uncertainty is negligible O(10 −7 ) [3]. A comparison of the direct measurements of φ 3 with the value inferred from other measurements related to the CKM matrix [4], which are more likely to be influenced by beyond-SM physics [5,6], provides a probe for new physics. The current experimental uncertainty on φ 3 [4] limits such tests, motivating more precise measurements of the angle.
The measurement of φ 3 is possible when there is interference between the transitions b → cus and b → ucs. This is the case in the decay B + → DK + , where D is a neutral charm meson decaying to a final state common to both D 0 and D 0 . Here and elsewhere in this paper, inclusion of charge-conjugate final states is implied unless explicitly stated otherwise. Currently, the most precise measurement of φ 3 [7] exploits the self-conjugate final state D → K 0 S π + π − , where the CP asymmetry in different regions of the D meson Dalitz plot is measured to determine φ 3 [8,9]. The analysis requires knowledge of the strong-phase difference between the D 0 and D 0 decay amplitudes, and measured values of the strong-phase differences averaged over Dalitz plot bins are used as input [10]. Given the success of such analyses in obtaining φ 3 [7,11], other self-conjugate final states can be studied in a similar fashion to improve the determination of φ 3 .
In this paper, we present the first measurements of the decay B + → D(K 0 S π + π − π 0 )K + to determine φ 3 using the same formalism as with B + → D(K 0 S π + π − )K + [8,9]. The decay D → K 0 S π + π − π 0 is a suitable addition because it has a branching fraction of 5.2% [12], which is large compared to that of other multibody final states including K 0 S π + π − . The decay occurs through many intermediate resonances, such as K 0 S ω and K * ± ρ ∓ , that lead to variations of the strong-phase difference over the phase space, a requirement for extracting φ 3 from a single final state. However, a significant complication is that the four-body final state requires a binning of the five-dimensional D phase space, as opposed to a twodimensional Dalitz plot for the three-body case. The measurement is performed with an e + e − collision data sample collected by the Belle detector at a centre-of-mass energy corresponding to the Υ(4S) resonance. The sample contains 772 × 10 6 BB events corresponding to an integrated luminosity of 711 fb −1 . As an input to the analysis, we use the strong-phase difference measurements in phase space bins [13] obtained from an analysis of CLEO-c [14][15][16][17] The remainder of this paper is arranged as follows. Section 2 describes the formalism of the method for measuring φ 3 . The inputs derived from CLEO-c data and the Belle data and detector are described in sections 3 and 4, respectively, after which an overview of the analysis strategy is presented, in section 5. The event selection criteria are given in section 6, and the signal yield determination in the flavour-tagged D sample, which is a required input to the analysis, is presented in section 7. The measurement of CP violation in the B sample in bins of the D phase space is explained in section 8 and the related systematic uncertainty estimation is listed in section 9. The extraction of the physics parameter φ 3 and the average of this result with previous Belle measurements are presented in section 10, before conclusions given in section 11.
2 Formalism for measuring φ 3 with B + → D(K 0 S π + π − π 0 )K + decays The determination of φ 3 from B + → DK + decays, where the D meson decays to a multibody self-conjugate final state, can be performed via two methods: model-dependent and -independent. The model-dependent method requires a model of the amplitudes describing 1 Normal activities of the CLEO Collaboration ceased in 2012. However, access to the data was still possible for former CLEO Collaboration members and several results have been published. Any such publication, such as ref. [13] are not official results of the CLEO Collaboration. Hence we refer to results from CLEO-c data rather than from the CLEO Collaboration.
the intermediate resonances and partial waves, assumed to be contributing to the decay, to be fitted to the distribution of events over the D phase space. Model assumptions used in the method lead to a difficult determination of systematic uncertainty and may limit the precision of the φ 3 measurement, to as much as ±8-9 • [18]. On the other hand, the model-independent method requires that measurements of CP -violating asymmetries are made in distinct regions of D meson phase space, which we refer to as bins. The binning reduces the statistical precision compared to the model-dependent method, but the uncertainty related to model assumptions is removed by using independent measurements of the average strong-phase differences, bin-by-bin. The average strong-phase differences can be determined using e + e − collision data at the open-charm threshold, which has been done for D 0 → K 0 S π + π − π 0 [13]. Therefore, given its systematic robustness, we follow the model-independent approach. We introduce the method in the remainder of this section.
The amplitude for the decay B + → DK + , D → f , where f is a common multibody final state from the D 0 and D 0 decay, can be written as where A is the amplitude for D 0 → f at a point in the allowed phase space D, A is the amplitude for D 0 → f at the same point in phase space, r B is the ratio of the absolute values of B + → D 0 K + and B + → D 0 K + decay amplitudes, and δ B is the strong-phase difference between the two B → DK amplitudes. Thus, the probability density for a decay at a point in D is where δ D and δ D are the strong phases for D 0 → f and D 0 → f decays, respectively. With this, eq. 2.2 becomes where P = |A| 2 , P = |A| 2 , x + = r B cos(δ B + φ 3 ), y + = r B sin(δ B + φ 3 ), C = cos ∆δ D and S = sin ∆δ D . For the charge-conjugate mode, B − → DK − , the density is given by the same expression, with A ↔ A and φ 3 → −φ 3 , which leads to the definitions . The partial decay rates for B ± → DK ± within the i th bin of D, which corresponds to a subset of phase space D i , are where K i and K i are the fractions of flavour-tagged D 0 and D 0 events in D i and h is the common normalization factor. A sample of D 0 → K 0 S π + π − π 0 candidates from D * + → D 0 π + decays, where the charge of the pion from the D * + decay tags the flavour of the D meson, are used to determine values of K i and K i . The c i and s i parameters are the amplitude-weighted averages of C and S over the region D i . The c i parameter is defined as and the definition of s i is the same, with C being replaced by S. Therefore, with c i , s i , K i , and K i given as external inputs to the analysis, one can determine x ± , y ± and h from a set of partial decay rate measurements, when D is divided into three or more bins. The loss of statistical precision can be mitigated by increasing the number of bins; with an increased number of bins, however, the uncertainty on the external inputs also increases, limiting the precision of the measurement. The method by which the values of x ± and y ± are used to constrain φ 3 , r B and δ B is described in section 10.

External measurements of c i and s i
The values of c i and s i for the decay D → K 0 S π + π − π 0 have been determined using e + e − collision data collected at a centre-of-mass energy corresponding to the ψ(3770) resonance [13]. The quantum correlations between the neutral D mesons produced in decays of the ψ(3770) are exploited to extract the strong-phase differences in bins of the phase space. This fourbody final state has a five-dimensional phase space, which was divided into nine exclusive bins, selected to contain different intermediate resonances, thus minimizing the strongphase variation within the bin as much as possible. The sensitivity of the binning could be improved upon using an amplitude model of D 0 → K 0 S π + π − π 0 , which is unavailable at present. The binning scheme is listed in table 1. In each successive bin, only events that do not belong to the previous bins are selected (e.g. bin 2 is populated by events with m K 0 S π − and m π + π 0 within the denoted intervals, and m π + π − π 0 not in the denoted interval for bin 1). The bins are thus exclusive.
Certain constraints are imposed in the fit, which arise from the nature of the symmetry between the bins, to extract c i and s i parameters. Bins 1, 6 and 9 are CP self-conjugate, which implies s 1 = 0, s 6 = 0, s 9 = 0.
(3.1) Bin 9 is CP self-conjugate because the region corresponding to the sum of bins 1 to 8 is CP self-conjugate. Bins 2 and 3, 4 and 5, and 7 and 8 are pairwise CP -conjugate, which imposes a relation between their s i values: where i = 2, 4 and 7. The results for c i and s i are summarized in table 2 and are shown in figure 1. In the analysis we use the same binning scheme so that c i and s i can be taken as external inputs in determining x ± and y ± .
Bin no.
Bin region Remainder -- Table 1. Specifications of the nine exclusive bins of D → K 0 S π + π − π 0 phase space. Here m L and m U are the lower and upper limit, respectively, of the invariant mass in each region [13].

Bin no.
c i s i  Table 2. Values of c i and s i reported in ref. [13]. The uncertainties are statistical and systematic, respectively. The s i results marked by * in bins 3, 5 and 8 are derived from those in other bins, according to the constraints of eq. (3.2). The statistical uncertainty on these s i values include contribution from K i values according to the error propagation formalism.

Data samples and the Belle detector
We use an e + e − collision data sample containing 772 × 10 6 BB events collected by the Belle detector at a centre-of-mass energy corresponding to the pole of the Υ(4S) resonance. Monte Carlo (MC) simulated samples are used to optimize the selection, determine selection efficiencies, and identify sources of background. The MC samples of signal and background processes are generated using EvtGen [19] with the GEANT [20] package being subsequently used to model the detector response to the decay products. PHOTOS [21] incorporates effects due to final-state radiation from charged particles.  Figure 1. Values of c i and s i reported in ref. [13]. The black and red error bars represent statistical and systematic uncertainties, respectively.
The Belle detector [22,23] was located at the interaction point of the KEKB asymmetricenergy e + e − collider [24,25]. The detector subsystems relevant for this study are: the silicon vertex detector (SVD) and central drift chamber (CDC), for charged particle tracking and measurement of energy loss due to ionization (dE/dx); the aerogel threshold Cherenkov counters (ACC) and time-of-flight (TOF) scintillation counters, for particle identification (PID); and the electromagnetic calorimeter (ECL) consisting of an array of CsI(Tl) crystals to measure photon energies. These subsystems are situated in a magnetic field of 1.5 T. A more detailed description of the Belle detector can be found in refs. [22,23].

Analysis Overview
The essence of the analysis lies in eqs. (2.5) and (2.6), which describe the partial decay rates in each bin. However, these relations do not account for experimental resolution and acceptance. For example, the invariant mass resolution causes events to be assigned to bins outside of their origin, an effect we shall call "migration". The background contributions are to be considered as well. Here we briefly summarize how these experimental effects are accounted for.

Efficiency
Three different samples are used in this analysis, each with differing selection efficiencies due to the kinematic differences between the final states: the quantum correlated DD sample from ψ(3770) decays, the Belle sample of B + → Dh + , where h = (K, π), and the Belle sample of D * + → Dπ + used to determine K i and K i . The sample of B + → Dπ + is used as a control sample, as it is topologically identical to the signal, but with negligible expected CP violation [26]. The c i and s i results measured with CLEO-c data have been corrected for efficiency. Efficiency variation among bins will not matter if the efficiency profile is the same for both B + → Dh + and flavour-tagged D samples. This is partially achieved by requiring similar kinematic properties for the D meson in both samples. The efficiency profile depends primarily on D momentum, hence we select the flavour-tagged D sample in such a way that the D momentum approximately matches that of the B + → Dh + sample. The matching is not exact, so independent efficiency corrections are applied to the yields in both samples while calculating the parameters of interest.

Momentum resolution
The finite momentum resolution causes events to migrate among the bins. The c i and s i results are obtained after applying corrections for these migration effects. The amount of migration in both B and D * samples is estimated as a migration matrix M ij . The matrix has its diagonal elements close to one, and off-diagonal elements are small. MC samples of signal events are used to obtain the migration matrix. The data yield in each bin, Y i , is modified as Y i = M ij Y j . Any difference between the invariant mass resolution in the data and MC samples must be taken into account. We find the effect of the difference in resolution is only significant in bin 1, which contains the ω resonance. This bin is narrow due to the small natural width of the ω. However, the natural width is the same order as the m π + π − π 0 resolution, so there is significant migration out of this bin that is not compensated by migration into bin 1. Therefore, the M 1j elements of the migration matrix are determined after applying a Gaussian smearing to the value of m π + π − π 0 by a scale factor. The scale factor is obtained from the observed difference in ω mass resolution between data and MC samples. The scale factors are 1.13 ± 0.02 and 1.09 ± 0.02 for the B + → Dh + and D * + → Dπ + samples, respectively.

Signal extraction
It is important to account for the background contributions in the sample while extracting the specified parameters. An extended maximum likelihood fit is performed on the data in each bin of the flavour-tagged D sample to extract the values of K i and K i . The fit to the B sample in the bins of D phase space is performed using an extended likelihood fit that simultaneously fits all bins in the B + → DK + and B + → Dπ + decay modes, so that the values of the parameters x ± and y ± that are common to the expectation for each bin yield, can be extracted, as well as the cross-feed between these samples.

Event selection
We reconstruct the decays B + → DK + and B + → Dπ + , where the neutral D meson decays to the four-body final state of K 0 S π + π − π 0 . In addition, D * + → D 0 π + decays produced via the e + e − → cc continuum process are selected to determine the K i and K i parameters.
For charged particle candidates originating directly from the B and D decays, we require that the track be within 0.5 cm and ±3.0 cm of the interaction point (IP) in the directions perpendicular to (radial) and parallel to the z-axis, respectively; the z-axis is defined to be opposite to the e + beam direction. The charged tracks are classified as pions or kaons based on information from CDC, ACC, and TOF sub-detector systems. The pion (kaon) identification efficiency is 92% (84%) and the probability of misidentification as a kaon (pion) is 15% (8%) [27].
We select K 0 S candidates from two oppositely charged tracks assumed to be pions. The invariant mass of the two tracks is required to be within the range 0.487-0.508 GeV/c 2 corresponding to ±3σ of the known K 0 S mass [12], where σ is the mass resolution. A neural network [28] based selection is applied on the daughter tracks to remove background from random combinations [29]. The input variables to the neural network are the K 0 S momentum in the lab frame, the distance between the two track helices along the z-axis at their point of closest approach, the K 0 S flight length in the radial direction, the angle between the K 0 S momentum and the vector joining the IP to the K 0 S decay vertex, the angle between pion momentum and the boost direction of lab frame in K 0 S rest frame and pion momentum in K 0 S rest frame, the distances of closest approach in the radial direction between IP and the two pion helices, the number of hits in CDC for each pion track, and the presence of hits in the SVD for each pion track. The K 0 S selection efficiency is 87%, which is determined from an MC sample of generic BB events.
The π 0 candidates are reconstructed from pairs of photons detected in the ECL. We select candidates with diphoton invariant mass M π 0 in the range 0.119-0.148 GeV/c 2 , which corresponds to 3σ about the nominal π 0 mass [12]. The photon energy thresholds are optimized separately for π 0 candidates detected in combinations of the barrel, forward endcap (FWD EC) and backward endcap (BWD EC) regions of the ECL as given in table 3 by maximizing the significance S/ √ S + B, where S and B are the number of signal and background events selected from MC samples in the signal region, respectively. (The criteria that define the signal region are described later in this section.) Studies of MC samples indicate that candidates in the other ECL sector combinations make up only 1.5% of the total, and a common energy threshold of 50 MeV is applied on these. All selected combinations of K 0 S π + π − π 0 candidates are retained for further study. In addition, kinematic constraints are applied to the K 0 S , π 0 , and D invariant masses and decay vertices to improve the momentum resolution of the B candidates, as well as the invariant masses used to bin the D phase space.
The D * + → D 0 π + decay uses the charge of the accompanying pion to identify the flavour of the D meson. This pion is referred to as a slow pion because of the limited phase space of the decay that results in it having lower momentum on average than other final-state particles. To improve the momentum resolution of the slow pion, it is required   The D and π + candidates are constrained to come from a common vertex to form the D * + candidate. On average, there are 1.6 D * + candidates in an event. If there is more than one candidate in an event, the candidate with the smallest χ 2 value from the D * + vertex fit is retained. This criterion selects the correct signal candidate in 69% of the events with multiple candidates. The overall selection efficiency is 3.7%, which includes the secondary branching fraction of K 0 S → π + π − . A D candidate is combined with a charged kaon (pion) track to form a B + → DK + (B + → Dπ + ) candidate. The invariant mass of the D candidate is required to be in the range 1.835-1.890 GeV/c 2 . The signal candidates are identified using two kinematic variables, the energy difference ∆E and beam-energy-constrained mass M bc , which are is the energy (momentum) of the B candidate and E beam is the beam energy in the centre-of-mass frame . We select candidates that satisfy the criteria M bc > 5.27 GeV/c 2 and −0.13 < ∆E < 0.30 GeV. The asymmetric ∆E window is chosen to avoid the peaking structure appearing at lower values from partially reconstructed B + → D ( * ) K ( * )+ decays. The signal region used while performing optimization of the selection is |∆E| < 0.05 GeV. The average B candidate multiplicity is 1.3. In events with more than one candidate, we retain the candidate with the smallest value of ( Here, the masses M PDG i are those reported by the Particle Data Group [12] and the resolutions σ M bc , σ M D , and σ M π 0 are obtained from MC simulated samples of signal events. The best candidate selection criterion is 80% efficient in selecting the correctly reconstructed candidate. The background from e + e − → qq (q = u, d, s, c) continuum processes is suppressed by exploiting the difference in event topology compared to e + e − → Υ(4S) → BB events. The continuum events are jet-like in nature, whereas BB events have a spherical topology, due to the low momentum of the B mesons produced via the Υ(4S) resonance. A neuralnetwork-based algorithm [28] is used to discriminate between continuum background and B events. We also use variables related to the displaced vertices of B decays from the IP and the associated leptons/kaons from the non-signal B meson in the event, which give an additional handle to distinguish continuum events.
The eight input variables to the neural network are the likelihood ratio obtained via Fisher discriminants [30] formed from modified Fox-Wolfram moments [31,32], the absolute value of the cosine of the angle between the B candidate and the z axis in the e + e − centreof-mass frame, the absolute value of the cosine of the angle between the thrust axis of the B candidate and that of the rest of the event in the centre-of-mass frame, the vertex separation between the two B candidates [33] along z-axis, the absolute value of the B flavour-dilution factor [34], the difference between the sum of the charges of particles in the hemisphere about the D direction in the centre-of-mass frame and the one in the opposite hemisphere, excluding the particles used for the reconstruction of B, the product of the charge of the B and the sum of the charges of all kaons not used for reconstruction of B, and the cosine of the angle between the D direction and the direction opposite to that of the Υ(4S) in the B rest frame.
Signal and continuum MC samples are used to train the neural network. We require the neural network output, C NN , to be greater than −0.6, which reduces the continuum background by 67% with a loss of only 5% of the signal. The overall selection efficiency is 4.7% and 5.3% for B + → DK + and B + → Dπ + decays, respectively. These efficiencies include the secondary branching fraction of K 0 S → π + π − . The efficiencies in each bin and the migration matrix for the B + → Dh + selection are given in appendix A. 7 Determination of K i and K i from the D * + sample The fractions of D 0 and D 0 events in each D phase space bin, represented as K i and K i , are measured from the selected sample of D * + → Dπ + candidates. The yield of signal events is obtained from a two-dimensional unbinned extended maximum-likelihood fit to the distribution of M D and ∆M for the selected candidates. The fit is performed independently in each bin. In general, there are two types of background: combinatorial, which is due to the random combination of final-state particles to form a D * + candidate, and random-slowpion, in which a correctly reconstructed D meson combines with a π + , which is not from a common D * + decay, to form a candidate. The combinatorial background peaks neither in the M D nor ∆M distributions, whereas the random-slow-pion background peaks only in the M D distribution. The signal component of the M D distribution is described by a probability density function (PDF) that is the sum of a Crystal Ball (CB) [35] function and two Gaussian functions with a common mean. The combinatorial background PDF is parametrized by a first-order polynomial. The signal PDF is also used to model the random-slow-pion background distribution in M D . The ∆M signal PDF is described by the sum of an asymmetric Gaussian and three Gaussian functions with a common mean. The combinatorial background ∆M distribution is parametrized by the sum of a threshold function and two Gaussian PDFs. The threshold function is where m π is the mass of a charged pion [12], and α and β are shape parameters. In the final fit to data, the shape parameters are fixed to the values obtained from MC. The Gaussian functions describe a small peak in the ∆M combinatorial distribution, which is due to misreconstructed π 0 candidates. The parameters of the Gaussian functions and the fraction of candidates in the peak are fixed to the values obtained from a MC sample. The random-slow-pion background PDF is the same as the threshold function used to describe the combinatorial background. The signal M D and ∆M PDFs are correlated such that the width of the ∆M distribution depends upon M D . The width of the core Gaussian in the ∆M signal PDF is parametrized as where a 0 and a 2 are parameters to be determined from data. The correlation between M D and ∆M distributions is found to be negligible in studies of background MC samples. Therefore, the one-dimensional PDFs are multiplied to obtain the total background PDF. The yields, except that describing the peaking component in the combinatorial background ∆M distribution and the shape parameters a 0(2) , as well as the means of the signal in both M D and ∆M are floated in the fit; all other parameters are fixed to the values obtained from fits to the corresponding MC sample. In each bin, the fit is performed simultaneously for D 0 and D 0 categories to obtain the signal yield. Figure 3 shows the fit projections compared to the data within bin 1. These projections are signal-enhanced by The dotted red, blue and magenta curves represent the signal, combinatorial and random-slow-pion backgrounds, respectively. The pull between the fit and the data is shown below the distributions.
considering events in the signal region of the variable that is not plotted; the signal regions are defined as 1.86 < M D < 1.87 GeV/c 2 and 0.144 < ∆M < 0.146 GeV/c 2 . The large statistics of the sample makes it difficult for the model to fit data exactly, resulting in systematic deviations in the pull values from zero in the tails. Studies of MC samples have shown that the signal yield is unbiased and this systematic deviation in the pull values has negligible effect on the measured K i and K i values. The efficiency-and migration-corrected yields are then used to determine the values of K i and K i , which are given in table 4. The values of K i and K i are in reasonable agreement with those reported in ref. [13]; the only deviation larger than 3σ is in bin 9, which contains only 1.2% of the data.
8 Determination of (x ± , y ± ) from the B ± → Dh ± sample We select both B + → DK + and B + → Dπ + decays because they have an identical topology, but the latter is less sensitive to CP -violation measurements because r Dπ B is approximately twenty times smaller than r DK B . However, the B + → Dπ + branching fraction is an order of magnitude larger than that of B + → DK + and hence serves as an excellent calibration sample for the signal determination procedure. Furthermore, there is a significant background from B + → Dπ + decays in the B + → DK + sample from the misidentification of the charged pion as a kaon; a simultaneous fit to both samples allows this cross-feed to be directly determined from data.
The signal yield in each bin is obtained via a simultaneous two-dimensional fit to the nine D phase space bins with the data divided into B + → DK + , B − → DK − , B + → Dπ + and B − → Dπ − candidates, so there are 36 samples in total. The signal extraction is done Bin no.  Table 4. D 0 and D 0 yield in each bin of D phase space along with K i and K i values measured in D * tagged data sample.
by fitting ∆E and C NN . The distribution of C NN cannot be described readily by an analytic PDF. Therefore, we transform C NN as where C NN,low = −0.6 and C NN,high = 0.9985 are the minimum and maximum values of C NN in the sample, respectively. The signal and background distributions of C NN can be described by combinations of Gaussian PDFs. The three background components considered are: • continuum background from e + e − → qq processes, where q = (u, d, s, c) • combinatorial BB background, in which the final state particles could be coming from both B mesons in an event; and • cross-feed peaking background from B + → Dh + , where h = π, K, in which the charged kaon is misidentified as a pion or vice versa.
There is no significant correlation between ∆E and C NN , so the two-dimensional PDF for each of the components is the product of one-dimensional ∆E and C NN PDFs. The sum of a CB function and two Gaussian functions with a common mean is used as the PDF to model the ∆E signal component in both B samples. The sum of a Gaussian and an asymmetric Gaussian with different mean values is used to parametrize the PDF that describes the C NN signal component. The continuum background distribution is modeled with a first-order polynomial in ∆E and by the sum of two Gaussian PDFs with different mean values in C NN . The ∆E distribution of combinatorial BB background in B + → Dπ + is described by an exponential function. There is a small peaking structure due to misreconstructed π 0 events, and this is modeled by a CB function. A first-order polynomial is added to the above two PDFs in the case of B + → DK + decays. The for each of the samples is modeled by an asymmetric Gaussian function. The cross-feed peaking background in ∆E is modeled with the sum of three Gaussian functions, whereas the signal PDF itself is used for the C NN distribution.
All yields are determined from the fit to data. The signal mean value and polynomial parameter for continuum background ∆E distribution are determined from the fit to data, while all other shape parameters are fixed to those obtained from fits to appropriate MC samples. A scaling factor is applied on the ∆E signal resolution, which is a free parameter in the fit. All C NN parameters are fixed to the values obtained from MC. An additional shift is applied on the continuum background mean value as well as a scaling factor to the resolution. Both these parameters are determined from data, which ensures that any possible data-MC difference is taken into account. We do not perform an independent fit in each bin because the event yields become too small to determine all the free parameters. Therefore, common shape parameters are used for each bin except for the combinatorial BB background component in bin 1. A separate exponential parameter is used in bin 1 due to the difference in slope compared to other bins. These exponential parameters are also floated in the fit in addition to those mentioned earlier. The signal-enhanced fit projections for the data in bin 1 are shown in figure 4 and 5, where the signal regions are defined as |∆E| < 0.05 GeV and C NN > 0. The fitted signal yields are summarized in table 5. The total numbers of B ± → Dπ ± and B ± → DK ± signal events are 9981 ± 134 and 815 ± 51, respectively.
The Cartesian parameters x ± and y ± are extracted from the simultaneous fit by ex- Signal-enhanced fit projections of (a) ∆E and (b) C NN for the B ± → DK ± data sample in bin 1. The black points with error bars are the data and the solid blue curves are the total fit. The dotted red, blue, magenta, and green curves represent the signal, continuum, combinatorial BB backgrounds and cross-feed peaking background components, respectively. The pull between the data and the fit is also shown for both the projections. Table 5. Signal yields in each D phase space bin for B ± → Dπ ± and B ± → DK ± data samples obtained from a simultaneous fit to the nine bins.

E (GeV) ∆
pressing the signal yield using eqs. (2.5) and (2.6); the procedure includes corrections for efficiency and migration between bins. The input parameters to the expressions in eqs. (2.5) and (2.6) include the values of K i and K i obtained from the flavour-tagged D sample and the D strong-phase difference parameters c i and s i [13]. The results are summarized in table 6, and the statistical likelihood contours are shown in figure 6. The statistical cor-  Table 6. x ± and y ± parameters from a combined fit to B ± → Dπ ± and B ± → DK ± data samples. The first uncertainty is statistical, the second is systematic, and the third is due to the uncertainty on the c i , s i measurements.  x + y + x − y − x + 1 −0.364 0.314 0.050 Table 7. Statistical correlation matrix for (x + , y + , x − , y − ) measured from the B ± → Dπ ± data sample relation matrices are given in tables 7 and 8. The measured and expected yields for the binned B + and B − data are compared in figures 7 and 8. Table 8. Statistical correlation matrix for (x + , y + , x − , y − ) measured from the B ± → DK ± data sample Bin number 1

Systematic uncertainties
We consider several possible sources of systematic uncertainty, as listed in table 9, along with their contributions. The remainder of this section describes how these uncertainties are estimated.
The limited size of the signal MC sample used for estimating the efficiency and the migration matrix is a source of systematic uncertainty. Efficiencies in B and D * samples are varied by their statistical uncertainty (±1σ) in each bin independently. The resultant negative and positive deviations in (x ± , y ± ) are separately summed in quadrature. Similarly, the migration matrix elements are varied by their statistical uncertainty in B and D * samples, one element at a time. The resultant positive and negative deviations are considered separately.
The systematic uncertainty due to the difference in mass resolution between data and the MC samples is considered by varying the width on the π + π − π 0 invariant mass distribution by the uncertainty on the resolution scale factor obtained in data, when compared to that in MC. The resultant deviations in (x ± , y ± ) are taken as the systematic uncertainty from this source. All the other resonances are wide and the resolution difference is an order of magnitude smaller than the resolution, thus the modelling of resolution does not affect our measurements. The systematic effect of the uncertainty on the K i and K i values is estimated by varying them by their statistical uncertainties independently. The resultant sum of deviations in quadrature is taken as the associated systematic uncertainty.
Modelling the data with PDFs that have parameters fixed to values obtained from MC samples is another source of systematic uncertainty. There are 14 signal and 23 background shape parameters fixed in the B ± → Dh ± simultaneous fit. These are fixed to the values obtained from MC samples. The uncertainty due to PDF modelling is taken into account by repeating the fit by individually varying the fixed parameters by ±1σ, where σ is the uncertainty on these parameters in MC component fits, and taking the difference in quadrature as the uncertainty. Any possible bias in the fit is studied with a set of pseudoexperiments with different input values for (x ± , y ± ). The fit is found to give an unbiased response within the statistical uncertainty from the finite number of pseudo-experiments, and this uncertainty is taken as the systematic uncertainty from this source.
The kaon identification efficiency and pion fake rate used in the fit are also fixed parameters that are determined from control samples of D * + → D 0 π + , D 0 → K − π + . They are varied by ±1σ and the resultant deviations in the nominal (x ± , y ± ) values are assigned as the systematic uncertainty. The uncertainty on the c i , s i inputs reported in ref. [13] are also considered by varying c i , s i by their respective uncertainties and then considering the corresponding deviations in (x ± , y ± ) from the nominal values as the systematic uncertainty. Here, the correlation between c i , s i is taken into account. The effect of the difference in the efficiency variation across the bins for B and D * samples is studied. We find no deviation  Table 9. Systematic uncertainties from various sources in B ± → Dπ ± and B ± → DK ± data samples.
in K i and K i values within their statistical uncertainty when the D * efficiencies are varied by the maximum deviation found between the samples or D momentum range is changed to 1-3 GeV/c.
10 Determination of φ 3 , r B and δ B We use the frequentist treatment, which includes the Feldman-Cousins ordering [37], to obtain the physical parameters from the measured parameters in B ± → DK ± sample; this is the same procedure as was used in ref. [11]. We do not use the B ± → Dπ ± sample to constrain φ 3 , which has been the case in previous Belle analyses [11,38]. We note that the constraints presented by the LHCb Collaboration [39] allow values up to r Dπ B < 0.028 at a 2σ confidence level, which is five times larger than the expectation; if the value of r Dπ B is significantly larger than expected then future analyses  Table 10. (φ 3 , δ B , r B ) obtained from the B ± → DK ± data sample. The first uncertainty is statistical, second is systematic and, the third one is due to the uncertainty on c i , s i measurements.
could include B ± → Dπ ± channel to determine φ 3 . The confidence level is calculated as where p(z|µ) is the probability density to observe the measurements z given the set of physical parameters µ. The integration domain D(µ) is given by the likelihood ratio ordering in the Feldman-Cousins method. The PDF p(z|µ) is a multivariate Gaussian PDF with the uncertainties and correlations between (x ± , y ± ) taken from the measurements. We obtain the parameters µ = (φ 3 , r B , δ B ) from the fit as given in table 10. The systematic uncertainty is estimated by varying the z parameters by their corresponding systematic uncertainties. Figure 9 shows the confidence level contours representing one, two, and three standard deviations in (φ 3 , r B ) and (φ 3 , δ B ) planes. We performed a check of the assumption that the (x ± , y ± ) likelihood can be approximated to be Gaussian when using the Feldman-Cousins method to extract (φ 3 , r B , δ B ). The check used the measured confidence intervals in (φ 3 , r B , δ B ) to generate an ensemble of simulated data sets. Each simulated data set was then fit to form a distribution of (x ± , y ± ),  Figure 10. Distribution of p-value for φ 3 from multibody D final states at Belle, which is shown by the solid blue curve. The results from B → DK ( * ) decays with D → K 0 S π + π − are shown by the solid green curve and the D → K 0 S π + π − π 0 final states are shown by the solid brown curve [11,38].
which was found to be consistent with the (x ± , y ± ) confidence intervals measured. Hence we conclude that the reported confidence intervals for (φ 3 , r B , δ B ) are appropriate.
There is a two-fold ambiguity in φ 3 and δ B results with φ 3 + 180 • and δ B + 180 • . We choose the solution that satisfies 0 • < φ 3 < 180 • . This result includes the current world-average value [36] within two standard deviations. We observe that there is a local minimum of the likelihood around φ 3 = 75 • and δ B = 155 • .
We combine the results presented here with the model-independent B + → D(K 0 S π + π − )K + [11] and B 0 → D 0 (K 0 S π + π − )K * 0 [38] results from Belle. Without our measurement, the combination leads to φ 3 = (78 +14 −15 ) • . Including our measurement, the combination gives φ 3 = (74 +13 −14 ) • . The distributions of p-values for the φ 3 measurements from the individual D final states and the combination are given in figure 10. The separate measurements and the combination likelihood contours in the (φ 3 , r B ) plane are shown in figure 11.

Conclusion
We have performed the first measurement of the unitarity triangle angle φ 3 using a modelindependent analysis of B + → D(K 0 S π + π − π 0 )K + decays using the full data sample collected by the Belle detector, which corresponds to 772×10 6 BB events. The D strong-phase difference measurements for D → K 0 S π + π − π 0 [13] are used as external inputs to the analysis. The result obtained is φ 3 = (5.7 +10.2 −8.8 ±3.5±5.7) • . The first uncertainty is statistical, the second is systematic, and the third is due to the uncertainty on the c i and s i measurements. The ratio of the suppressed and favoured amplitudes is r B = 0.323 ± 0.147 ± 0.023 ± 0.051.
This measurement can be improved upon once a suitable amplitude model for D 0 → K 0 S π + π − π 0 is available to provide guidance in choosing a more sensitive binning. Fur-  Figure 11. Projections of the confidence contours in the φ 3 − r B plane from multibody D final states at Belle, which is shown by the blue contours. The results from B → DK ( * ) decays with D → K 0 S π + π − are shown by the green contours and the D → K 0 S π + π − π 0 final states are shown by the brown contours. The solid and dashed curves correspond to one and two standard deviation contours, respectively [11,38].
thermore, the larger sample of e + e − → ψ(3770) data that has been collected by BESIII will determine c i and s i more precisely, thus reducing the systematic uncertainty. The results presented here, combined with the improvements in binning and the increased sample of B decays that will be available at Belle II, mean that model-independent analysis of B + → D(K 0 S π + π − π 0 )K + is a very promising addition to the suite of modes to be used to determine φ 3 to a precision of 1-2 • [40].   Table 11. Efficiency in each bin of the D phase space for D * ± → Dπ ± , B ± → DK ± , and B ± → Dπ ± decays determined from the corresponding signal MC samples.