Measurements of the branching fractions of the decays $B^{0}_{s} \to D^{\mp}_{s} K^{\pm} $ and $B^{0}_{s} \to D^{-}_{s} \pi^{+}$

The decay mode $B^{0}_{s} \to D^{\mp}_{s} K^{\pm}$ allows for one of the theoretically cleanest measurements of the CKM angle $\gamma$ through the study of time-dependent $CP$ violation. This paper reports a measurement of its branching fraction relative to the Cabibbo-favoured mode $B^{0}_{s} \to D^{-}_{s} \pi^{+}$ based on a data sample of 0.37 fb$^{-1}$ of proton-proton collisions at $\sqrt{s} = 7$ TeV collected in 2011 with the LHCb detector. In addition, the ratio of $B$ meson production fractions $f_s/f_d$, determined from semileptonic decays, together with the known branching fraction of the control channel $B^{0} \to D^{-} \pi^{+}$, is used to perform an absolute measurement of the branching fractions: $\mathcal{B}(B^0_s \to D^-_s \pi^+) = (2.95 \pm 0.05 \pm 0.17^{\,+\,0.18}_{\,-\,0.22}) \times 10^{-3}$, $\mathcal{B}(B^0_s \to D^\mp_s K^\pm) = (1.90 \pm 0.12 \pm 0.13^{\,+\,0.12}_{\,-\,0.14}) \times 10^{-4}$, where the first uncertainty is statistical, the second the experimental systematic uncertainty, and the third the uncertainty due to $f_s/f_d$.


Introduction
Unlike the flavour-specific decay $B^0_s \to D^-_s \pi^+$, the Cabibbo-suppressed decay $B^0_s \to D^\mp_s K^\pm$ proceeds through two different tree-level amplitudes of similar strength: a $\bar{b} \to \bar{c} u \bar{s}$ transition leading to $B^0_s \to D^-_s K^+$ and a $\bar{b} \to \bar{u} c \bar{s}$ transition leading to $B^0_s \to D^+_s K^-$. These two decay amplitudes can have a large $CP$-violating interference via $B^0_s$-$\bar{B}^0_s$ mixing, allowing the determination of the CKM angle $\gamma$ with negligible theoretical uncertainties through the measurement of tagged and untagged time-dependent decay rates to both the $D^-_s K^+$ and $D^+_s K^-$ final states [1]. Although the $B^0_s \to D^\mp_s K^\pm$ decay mode has been observed by the CDF [2] and Belle [3] collaborations, only the LHCb experiment has both the necessary decay time resolution and access to large enough signal yields to perform the time-dependent $CP$ measurement. In this analysis, the $B^0_s \to D^\mp_s K^\pm$ branching fraction is determined relative to $B^0_s \to D^-_s \pi^+$, and the absolute $B^0_s \to D^-_s \pi^+$ branching fraction is determined using the known branching fraction of $B^0 \to D^- \pi^+$ and the production fraction ratio $f_s/f_d$ [4]. The two measurements are then combined to obtain the absolute branching fraction of the decay $B^0_s \to D^\mp_s K^\pm$. Charge conjugate modes are implied throughout. Our notation $B^0 \to D^- \pi^+$, which matches that of Ref. [5], encompasses both the Cabibbo-favoured $B^0 \to D^- \pi^+$ mode and the doubly-Cabibbo-suppressed $B^0 \to D^+ \pi^-$ mode.
The LHCb detector [6] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream. The combined tracking system has a momentum resolution $\Delta p/p$ that varies from 0.4% at 5 GeV/$c$ to 0.6% at 100 GeV/$c$, an impact parameter resolution of 20 $\mu$m for tracks with high transverse momentum, and a decay time resolution of 50 fs. Charged hadrons are identified using two ring-imaging Cherenkov detectors. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and pre-shower detectors, an electromagnetic calorimeter, and a hadronic calorimeter. Muons are identified by a muon system composed of alternating layers of iron and multiwire proportional chambers.
The LHCb trigger consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage which applies a full event reconstruction. Two categories of events are recognised based on the hardware trigger decision. The first category comprises events triggered by tracks from signal decays which have an associated cluster in the calorimeters, and the second comprises events triggered independently of the signal decay particles. Events which do not fall into either of these two categories are not used in the subsequent analysis. The second, software, trigger stage requires a two-, three- or four-track secondary vertex with a large value of the scalar sum of the transverse momenta ($p_{\mathrm{T}}$) of the tracks, and a significant displacement from the primary interaction. At least one of the tracks used to form this vertex is required to have $p_{\mathrm{T}} > 1.7$ GeV/$c$, an impact parameter $\chi^2 > 16$, and a track fit $\chi^2$ per degree of freedom $\chi^2/\mathrm{ndf} < 2$. A multivariate algorithm is used for the identification of the secondary vertices [7]. Each input variable is binned to minimise the effect of systematic differences between the trigger behaviour on data and simulated events.
The samples of simulated events used in this analysis are based on the Pythia 6.4 generator [8], with a choice of parameters specifically configured for LHCb [9]. The EvtGen package [10] describes the decay of the $B$ mesons, and the Geant4 package [11] simulates the detector response. QED radiative corrections are generated with the Photos package [12].
The analysis is based on a sample of $pp$ collisions corresponding to an integrated luminosity of 0.37 fb$^{-1}$, collected at the LHC in 2011 at a centre-of-mass energy $\sqrt{s} = 7$ TeV. The decay modes $B^0_s \to D^-_s \pi^+$ and $B^0_s \to D^\mp_s K^\pm$ are topologically identical and are selected using identical geometric and kinematic criteria, thereby minimising efficiency corrections in the ratio of branching fractions. The decay mode $B^0 \to D^- \pi^+$ has a similar topology to the other two, differing only in the Dalitz plot structure of the $D$ decay and the lifetime of the $D$ meson. These differences are verified, using simulated events, to alter the selection efficiency at the level of a few percent, and are taken into account.
$B^0_s$ ($B^0$) candidates are reconstructed from a $D^-_s$ ($D^-$) candidate and an additional pion or kaon (the "bachelor" particle), with the $D^-_s$ ($D^-$) meson decaying in the $K^+ K^- \pi^-$ ($K^+ \pi^- \pi^-$) mode. All selection criteria will now be specified for the $B^0_s$ decays, and are implied to be identical for the $B^0$ decay unless explicitly stated otherwise. All final-state particles are required to satisfy a track fit $\chi^2/\mathrm{ndf} < 4$ and to have a high transverse momentum and a large impact parameter $\chi^2$ with respect to all primary vertices in the event. In order to remove backgrounds which contain the same final-state particles as the signal decay, and therefore have the same mass lineshape, but do not proceed through the decay of a charmed meson, the flight distance $\chi^2$ of the $D^-_s$ from the $B^0_s$ is required to be larger than 2. Only $D^-_s$ and bachelor candidates forming a vertex with a $\chi^2/\mathrm{ndf} < 9$ are considered as $B^0_s$ candidates. The same vertex quality criterion is applied to the $D^-_s$ candidates. The $B^0_s$ candidate is further required to point to the primary vertex by imposing $\theta_{\mathrm{flight}} < 0.8$ degrees, where $\theta_{\mathrm{flight}}$ is the angle between the candidate momentum vector and the line between the primary vertex and the $B^0_s$ vertex. The $B^0_s$ candidates are also required to have a $\chi^2$ of their impact parameter with respect to the primary vertex less than 16.
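The pointing requirement above can be illustrated with a short sketch. The function names, the coordinate convention and the example candidate are hypothetical and are not taken from the LHCb software; only the definition of the angle and the 0.8 degree threshold come from the text.

```python
import math

def flight_angle_deg(primary_vertex, decay_vertex, momentum):
    """Angle (degrees) between the momentum and the primary-to-decay-vertex line."""
    fx, fy, fz = (d - p for d, p in zip(decay_vertex, primary_vertex))
    px, py, pz = momentum
    # Cross and dot products; atan2 stays accurate for the very small angles of interest.
    cx, cy, cz = fy * pz - fz * py, fz * px - fx * pz, fx * py - fy * px
    cross = math.sqrt(cx * cx + cy * cy + cz * cz)
    dot = fx * px + fy * py + fz * pz
    if cross == 0.0 and dot == 0.0:
        raise ValueError("flight direction or momentum has zero length")
    return math.degrees(math.atan2(cross, dot))

def passes_pointing(primary_vertex, decay_vertex, momentum, max_angle_deg=0.8):
    """Apply the theta_flight requirement quoted in the text."""
    return flight_angle_deg(primary_vertex, decay_vertex, momentum) < max_angle_deg

# Hypothetical candidate: flight along z, momentum slightly tilted in x.
pv, sv, p = (0.0, 0.0, 0.0), (0.0, 0.0, 10.0), (0.5, 0.0, 100.0)
print(flight_angle_deg(pv, sv, p), passes_pointing(pv, sv, p))
```

Using `atan2` of the cross and dot products rather than `acos` of the normalised dot product is a design choice that avoids loss of precision when the two directions are nearly parallel.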
Further suppression of combinatorial backgrounds is achieved using a gradient boosted decision tree technique [13] identical to the decision tree used in the previously published determination of $f_s/f_d$ with the hadronic decays [14]. The optimal working point is evaluated directly from a sub-sample of $B^0_s \to D^-_s \pi^+$ events, corresponding to 10% of the full dataset used, distributed evenly over the data taking period and selected using particle identification and trigger requirements. The chosen figure of merit is the significance of the $B^0_s \to D^\mp_s K^\pm$ signal, scaled according to the Cabibbo suppression relative to the $B^0_s \to D^-_s \pi^+$ signal, with respect to the combinatorial background. The significance exhibits a wide plateau around its maximum, and the optimal working point is chosen at the point in the plateau which maximises the signal yield. Multiple candidates occur in about 2% of the events and in such cases a single candidate is selected at random.

Particle identification
Particle identification (PID) criteria serve two purposes in the selection of the three signal decays. When applied to the decay products of the $D^-_s$ or $D^-$, they suppress misidentified backgrounds which have the same bachelor particle as the signal mode under consideration, henceforth the "cross-feed" backgrounds. When applied to the bachelor particle (pion or kaon) they separate the Cabibbo-favoured from the Cabibbo-suppressed decay modes. All PID criteria are based on the differences in log-likelihood (DLL) between the kaon, proton, or pion hypotheses. Their efficiencies are obtained from calibration samples of $D^{*+} \to (D^0 \to K^- \pi^+)\pi^+$ and $\Lambda \to p \pi^-$ signals, which are themselves selected without any PID requirements. These samples are split according to the magnet polarity, binned in momentum and $p_{\mathrm{T}}$, and then reweighted to have the same momentum and $p_{\mathrm{T}}$ distributions as the signal decays under study.
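The reweighting of a binned calibration efficiency to the signal kinematics can be sketched as follows. This is a minimal illustration assuming the efficiency is known per (momentum, $p_{\mathrm{T}}$) bin and the signal population per bin is available; the bin labels and all numbers are hypothetical.

```python
def reweighted_efficiency(calib_eff, signal_counts):
    """Average a per-bin calibration efficiency over the signal kinematic distribution.

    calib_eff:     dict mapping (p_bin, pt_bin) -> PID efficiency in the calibration sample
    signal_counts: dict mapping (p_bin, pt_bin) -> number of signal candidates in that bin
    """
    total = sum(signal_counts.values())
    if total <= 0:
        raise ValueError("signal sample is empty")
    # Each bin contributes its efficiency weighted by the signal fraction in that bin.
    return sum(calib_eff[b] * n for b, n in signal_counts.items()) / total

# Hypothetical two-by-two binning in momentum and transverse momentum.
calib_eff = {(0, 0): 0.95, (0, 1): 0.90, (1, 0): 0.85, (1, 1): 0.80}
signal_counts = {(0, 0): 10, (0, 1): 30, (1, 0): 40, (1, 1): 20}
print(reweighted_efficiency(calib_eff, signal_counts))
```

In this picture the split by magnet polarity simply means the calculation is carried out once per polarity with its own pair of inputs.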
The selection of a pure $B^0 \to D^- \pi^+$ sample can be accomplished with minimal PID requirements since all cross-feed backgrounds are less abundant than the signal. The $\Lambda$ [...] is suppressed by requiring that both pions produced in the $D^-$ decay satisfy $\mathrm{DLL}_{\pi-p} > -10$, and the $B^0 \to D^- K^+$ background is suppressed by requiring that the bachelor pion satisfies $\mathrm{DLL}_{K-\pi} < 0$.
The selection of a pure [...], whereas the combinatorial background contributes to a lesser extent. The $D^-$ contamination in the $D^-_s$ data sample is reduced by requiring that the kaon which has the same charge as the pion in [...]. In addition, the other kaon is required to satisfy $\mathrm{DLL}_{K-\pi} > 0$. This helps to suppress combinatorial as well as doubly misidentified backgrounds. For the same reason the pion is required to have $\mathrm{DLL}_{K-\pi} < 5$. The contamination of $\Lambda$ [...]. It is suppressed by demanding that the bachelor satisfies the criterion [...], is obtained by requiring that the bachelor satisfies $\mathrm{DLL}_{K-\pi} < 0$. The efficiency and misidentification probabilities for the PID criteria used to select the bachelor, $D^-$, and $D^-_s$ candidates are summarised in Table 1.

Mass fits
Table 1: PID efficiency and misidentification probabilities, separated according to the up (U) and down (D) magnet polarities. The first two lines refer to the bachelor track selection, the third line is the $D^-$ efficiency and the fourth the $D^-_s$ efficiency. Probabilities are obtained from the efficiencies in the $D^{*+}$ calibration sample, binned in momentum and $p_{\mathrm{T}}$. Only bachelor tracks with momentum below 100 GeV/$c$ are considered. The uncertainties shown are the statistical uncertainties due to the finite number of signal events in the PID calibration samples. (Column headings: PID cut, efficiency in %.)
The fits to the invariant mass distributions of the $B^0_s \to D^-_s \pi^+$ and $B^0_s \to D^\mp_s K^\pm$ candidates require knowledge of the signal and background shapes. The signal lineshape is taken from a fit to simulated signal events which had the full trigger, reconstruction, and selection chain applied to them. Various lineshape parameterisations have been examined. The best fit to the simulated event distributions is obtained with the sum of two Crystal Ball functions [15] with a common peak position and width, and opposite-side power-law tails. Mass shifts in the signal peaks relative to world average values [5], arising from an imperfect detector alignment [16], are observed in the data and are accounted for. A constraint on the $D^-_s$ meson mass is used to improve the $B^0_s$ mass resolution. Three kinds of backgrounds need to be considered: fully reconstructed (misidentified) backgrounds, partially reconstructed backgrounds with or without misidentification (e.g. [...]), and combinatorial backgrounds. The three most important fully reconstructed backgrounds are [...]. The mass distribution of the $B^0 \to D^- \pi^+$ events does not suffer from fully reconstructed backgrounds. In the case of the $B^0 \to D^-_s K^+$ decay, which is fully reconstructed under its own mass hypothesis, the signal shape is fixed to be the same as for $B^0_s \to D^\mp_s K^\pm$ and the peak position is varied. The shapes of the misidentified backgrounds $B^0 \to D^- \pi^+$ and $B^0_s \to D^-_s \pi^+$ are taken from data using a reweighting procedure. First, a clean signal sample of $B^0 \to D^- \pi^+$ and $B^0_s \to D^-_s \pi^+$ decays is obtained by applying the PID selection for the bachelor track given in Sect. 2. The invariant mass of these decays under the wrong mass hypothesis depends on the momentum of the misidentified particle. This momentum distribution must therefore be reweighted by taking into account the momentum dependence of the misidentification rate. This dependence is obtained using a dedicated calibration sample of prompt $D^{*+}$ decays. The mass distributions under the wrong mass hypothesis are then reweighted using this momentum distribution to obtain the $B^0 \to D^- \pi^+$ and $B^0_s \to D^-_s \pi^+$ mass shapes under the $B^0_s \to D^-_s \pi^+$ and $B^0_s \to D^\mp_s K^\pm$ mass hypotheses, respectively.
For partially reconstructed backgrounds, the probability density functions (PDFs) of the invariant mass distributions are taken from samples of simulated events generated in specific exclusive modes and are corrected for mass shifts, momentum spectra, and PID efficiencies in data. The use of simulated events is justified by the observed good agreement between data and simulation.
The combinatorial background in the $B^0_s \to D^-_s \pi^+$ and $B^0 \to D^- \pi^+$ fits is modelled by an exponential function where the exponent is allowed to vary in the fit. The resulting shape and normalisation of the combinatorial backgrounds are in agreement within one standard deviation with the distribution of a wrong-sign control sample (where the $D^-_s$ and the bachelor track have the same charges). The shape of the combinatorial background in the $B^0_s \to D^\mp_s K^\pm$ fit cannot be left free because of the partially reconstructed backgrounds which dominate in the mass region below the signal peak. In this case, therefore, the combinatorial slope is fixed to be flat, as measured from the wrong-sign events.
In the [...] $\Lambda^0_b \to D^{*-}_s p$ shapes with equal weight. The signal yields are obtained from unbinned extended maximum likelihood fits to the data. In order to achieve the highest sensitivity, the sample is separated according to the two magnet polarities, allowing for possible differences in PID performance and in running conditions. A simultaneous fit to the two magnet polarities is performed for each decay, with the peak position and width of each signal, as well as the combinatorial background shape, shared between the two.
The fit under the $B^0_s \to D^-_s \pi^+$ hypothesis requires a description of the $B^0 \to D^- \pi^+$ background. A fit to the $B^0 \to D^- \pi^+$ spectrum is first performed to determine the yield of signal $B^0 \to D^- \pi^+$ events, shown in Fig. 1. The expected $B^0 \to D^- \pi^+$ contribution under the $B^0_s \to D^-_s \pi^+$ hypothesis is subsequently constrained with a 10% uncertainty to account for uncertainties on the PID efficiencies. The fits to the $B^0_s \to D^-_s \pi^+$ candidates are shown in Fig. 1 and the fit results for both decay modes are summarised in Table 2. The peak position of the signal shape is varied, as are the yields of the different partially reconstructed backgrounds (except $B^0 \to D^- \pi^+$) and the shape of the combinatorial background. The width of the signal is fixed to the value found in the $B^0 \to D^- \pi^+$ fit (17.2 MeV/$c^2$), scaled by the ratio of widths observed in simulated events between $B^0 \to D^- \pi^+$ and $B^0_s \to D^-_s \pi^+$ decays (0.987). The accuracy of these fixed parameters is evaluated using ensembles of simulated experiments described in Sect. 4. The yield of $B^0 \to D^-_s \pi^+$ is fixed to be 2.9% of the $B^0_s \to D^-_s \pi^+$ signal yield, based on the world average branching fraction of $B^0 \to D^-_s \pi^+$ of $(2.16 \pm 0.26) \times 10^{-5}$, the value of $f_s/f_d$ given in [4], and the value of the branching fraction computed in this paper. The shape used to fit this component is the sum of two Crystal Ball functions obtained from the $B^0_s \to D^-_s \pi^+$ sample with the peak position fixed to the value obtained with the fit of the $B^0 \to D^- \pi^+$ [...], whose yield is allowed to vary, is included in the fit (with the mass shape obtained using the reweighting procedure on simulated events described previously) and results in a negligible contribution, as expected.
Table 2: Results of the mass fits to the $B^0 \to D^- \pi^+$, $B^0_s \to D^-_s \pi^+$, and $B^0_s \to D^\mp_s K^\pm$ candidates separated according to the up (U) and down (D) magnet polarities. In the $B^0_s \to D^\mp_s K^\pm$ case, the number quoted for $B^0_s \to D^-_s \pi^+$ also includes a small number of $B^0 \to D^- \pi^+$ events which have the same mass shape (20 events from the expected misidentification). See Table 3 for the constrained values used in the $B^0_s \to D^\mp_s K^\pm$ decay fit for the partially reconstructed backgrounds and the [...]
The fits for the $B^0_s \to D^\mp_s K^\pm$ candidates are shown in Fig. 2 and the fit results are collected in Table 2. There are numerous reflections which contribute to the mass distribution. The most important reflection is $B^0_s \to D^-_s \pi^+$, whose shape is taken from the earlier $B^0_s \to D^-_s \pi^+$ signal fit, reweighted according to the efficiencies of the applied PID requirements. Furthermore, the yield of the $B^0 \to D^- K^+$ reflection is constrained to the values in Table 3. In addition, there is potential cross-feed from partially reconstructed modes with a misidentified pion such as $B^0_s \to D^-_s \rho^+$, as well as several small contributions from partially reconstructed backgrounds with similar mass shapes. The yields of these modes, whose branching fractions are known or can be estimated (e.g. $B^0_s \to D^-_s \rho^+$, $B^0_s \to D^-_s K^{*+}$), are constrained to the values in Table 3, based on criteria such as relative branching fractions, reconstruction efficiencies and PID probabilities. An important cross-check is performed by comparing the fitted value of the yield of misidentified $B^0_s \to D^-_s \pi^+$ events ($318 \pm 30$) to the yield expected from PID efficiencies ($370 \pm 11$), and the two are found to agree.
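The level of agreement in the cross-check above can be quantified with a simple difference in units of the combined uncertainty. The sketch assumes that the two uncertainties are uncorrelated and Gaussian, which the text does not state; the function name is hypothetical.

```python
import math

def difference_in_sigma(value_a, sigma_a, value_b, sigma_b):
    """Difference between two measurements in units of their combined uncertainty,
    assuming uncorrelated Gaussian uncertainties."""
    return (value_a - value_b) / math.hypot(sigma_a, sigma_b)

# Fitted misidentified yield versus the yield expected from PID efficiencies.
pull = difference_in_sigma(318.0, 30.0, 370.0, 11.0)
print(f"difference = {pull:.2f} standard deviations")
```

Under these assumptions the fitted yield lies about 1.6 standard deviations below the expectation.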

Systematic uncertainties
The major systematic uncertainties on the measurement of the relative branching fraction of $B^0_s \to D^\mp_s K^\pm$ and $B^0_s \to D^-_s \pi^+$ are related to the fit, PID calibration, and trigger and offline selection efficiency corrections. Systematic uncertainties related to the fit are evaluated by generating large sets of simulated experiments using the nominal fit, and then fitting them with a model where certain parameters are varied. To give two examples, the signal width is deliberately fixed to a value different from the width used in the generation, or the combinatorial background slope in the $B^0_s \to D^\mp_s K^\pm$ fit is fixed to the combinatorial background slope found in the $B^0_s \to D^-_s \pi^+$ fit. The deviations of the peak position of the pull distributions from zero are then included in the systematic uncertainty.
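The idea of generating with the nominal model, fitting with a varied one, and reading off the shift of the pull mean can be sketched with a toy. This is not the analysis procedure: the mass fit is replaced here by a simple counting estimator in a fixed window, and the yield, widths, window and number of toys are hypothetical values chosen only for illustration.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's method, adequate for means of a few hundred)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def window_fraction(half_width, sigma):
    """Fraction of a Gaussian of width sigma contained in [-half_width, +half_width]."""
    return math.erf(half_width / (math.sqrt(2.0) * sigma))

def pull_means(n_toys=1000, true_yield=200.0, true_sigma=1.0,
               alt_sigma=1.1, half_width=2.0, seed=1):
    """Return the mean pull for the nominal width and for a deliberately wrong width."""
    rng = random.Random(seed)
    fractions = (window_fraction(half_width, true_sigma),
                 window_fraction(half_width, alt_sigma))
    pulls = ([], [])
    for _toy in range(n_toys):
        # Generate one toy dataset with the nominal model.
        n_generated = poisson(rng, true_yield)
        n_in = sum(1 for _evt in range(n_generated)
                   if abs(rng.gauss(0.0, true_sigma)) < half_width)
        if n_in == 0:
            continue
        # "Fit" the same toy twice: once with the nominal and once with the varied width.
        for fraction, store in zip(fractions, pulls):
            estimate = n_in / fraction
            error = math.sqrt(n_in) / fraction
            store.append((estimate - true_yield) / error)
    return tuple(sum(p) / len(p) for p in pulls)

nominal_mean, varied_mean = pull_means()
print(f"pull mean, nominal width: {nominal_mean:+.3f}")
print(f"pull mean, varied width:  {varied_mean:+.3f}")
```

Fitting the same toy datasets with both models makes the shift between the two pull means largely insensitive to the statistical fluctuations of the toys, so the difference isolates the effect of the varied parameter.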
Table 3: Gaussian constraints on the yields of partially reconstructed and misidentified backgrounds applied in the $B^0_s \to D^\mp_s K^\pm$ fit, separated according to the up (U) and down (D) magnet polarities. (Column heading: background type.)
Table 4: Relative systematic uncertainties on the branching fraction ratios.
In the case of the $B^0_s \to D^\mp_s K^\pm$ fit the presence of constraints for the partially reconstructed backgrounds must be considered. The generic extended likelihood function can be written as [...] where the first factor is the extended Poissonian likelihood in which $N$ is the total number of fitted events, given by the sum of the fitted component yields $N = \sum_k N_k$. The fitted data sample contains $N_{\mathrm{obs}}$ events. The second factor is the product of the $j$ external constraints on the yields, $j < k$, where $G$ stands for a Gaussian PDF, and $N_c \pm \sigma_{N_0}$ is the constraint value. The third factor is a product over all events in the sample, $P$ is the total PDF of the fit, $P(m_i; \lambda) = \sum_k N_k P_k(m_i; \lambda_k)$, and $\lambda$ is the vector of parameters that define the mass shape and are not fixed in the fit. Each simulated dataset is generated by first varying the component yield $N_k$ using a Poissonian PDF, then sampling the resulting number of events from $P_k$, and repeating the procedure for all components. In addition, constraint values $N^j_c$ used when fitting the simulated dataset are generated by drawing from $G(N; N^j_0, \sigma_{N^j_0})$, where $N^j_0$ is the true central value of the constraint, while in the nominal fit to the data $N^j_c = N^j_0$. The sources of systematic uncertainty considered for the fit are signal widths, the slope of the combinatorial backgrounds, and constraints placed on specific backgrounds. The largest deviations are due to the signal widths and the fixed slope of the combinatorial background in the $B^0_s \to D^\mp_s K^\pm$ fit.
The systematic uncertainty related to PID enters in two ways: firstly as an uncertainty on the overall efficiencies and misidentification probabilities, and secondly from the shape for the misidentified backgrounds which relies on correct reweighting of PID efficiency versus momentum. The absolute errors on the individual $K$ and $\pi$ efficiencies, after reweighting of the $D^{*+}$ calibration sample, have been determined for the momentum spectra that are relevant for this analysis, and are found to be 0.5% for $\mathrm{DLL}_{K-\pi} < 0$ and 0.5% for $\mathrm{DLL}_{K-\pi} > 5$.
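A form of the extended likelihood consistent with the three factors described above is sketched here. It is assembled from the description in the text and is an assumption rather than the published expression; in particular the arguments of $G$, with $N^{j}$ denoting the fitted yield of the $j$-th constrained component, are inferred from the surrounding definitions.

```latex
\mathcal{L} =
  \frac{e^{-N}\, N^{N_{\mathrm{obs}}}}{N_{\mathrm{obs}}!}
  \times \prod_{j} G\left(N^{j};\, N^{j}_{c},\, \sigma_{N^{j}_{0}}\right)
  \times \prod_{i=1}^{N_{\mathrm{obs}}} P(m_{i};\, \lambda),
\qquad
N = \sum_{k} N_{k}
```

When the Poisson factor is written out explicitly in this way, the per-event density in the third factor is usually normalised to unity, that is $\sum_k (N_k/N)\, P_k(m_i; \lambda_k)$; the text does not say which convention is intended.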
The observed signal yields are corrected by the difference observed in the (non-PID) selection efficiencies of different modes as measured from simulated events: [...] A systematic uncertainty is assigned on the ratio to account for percent-level differences between the data and the simulation. These are dominated by the simulation of the hardware trigger. All sources of systematic uncertainty are summarised in Table 4.

Determination of the branching fractions
The $B^0_s \to D^\mp_s K^\pm$ branching fraction relative to $B^0_s \to D^-_s \pi^+$ is obtained by correcting the raw signal yields for PID and selection efficiency differences [...] where $\epsilon_X$ is the efficiency to reconstruct decay mode $X$ and $N_X$ is the number of observed events in this decay mode. The PID efficiencies are given in Table 1, and the ratio of the two selection efficiencies is $0.943 \pm 0.013$. The ratio of the branching fractions of $B^0_s \to D^\mp_s K^\pm$ relative to $B^0_s \to D^-_s \pi^+$ is determined separately for the down ($0.0601 \pm 0.0056$) and up ($0.0694 \pm 0.0066$) magnet polarities and the two results are in good agreement. The quoted errors are purely statistical. The combined result is [...] where the first uncertainty is statistical and the second is the total systematic uncertainty from Table 4. The relative yields of $B^0_s \to D^-_s \pi^+$ and $B^0 \to D^- \pi^+$ are used to extract the branching fraction of $B^0_s \to D^-_s \pi^+$ from the following relation [...] where the first uncertainty is statistical, the second is the experimental systematic uncertainty (as listed in Table 4) plus the uncertainty arising from the $B^0 \to D^- \pi^+$ branching fraction, and the third is the uncertainty (statistical and systematic) from the semileptonic $f_s/f_d$ measurement. Both measurements are significantly more precise than the existing world averages [5].
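The arithmetic behind an efficiency-corrected ratio and a comparison of the two polarity results can be sketched as follows. The function names are hypothetical, the ratio function assumes the usual form in which each yield is divided by its efficiency, and the inverse-variance average is a naive statistical-only combination that is not necessarily the procedure used for the combined result.

```python
import math

def efficiency_corrected_ratio(n_num, n_den, eff_num, eff_den):
    """Ratio of branching fractions from yields N_X and efficiencies eps_X:
    (N/eps) of the numerator mode over (N/eps) of the denominator mode."""
    return (n_num / eff_num) / (n_den / eff_den)

def weighted_average(measurements):
    """Inverse-variance weighted mean and its uncertainty for (value, sigma) pairs."""
    weights = [1.0 / sigma ** 2 for _, sigma in measurements]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, measurements)) / total
    return mean, 1.0 / math.sqrt(total)

# Per-polarity ratios quoted in the text (statistical uncertainties only).
down, up = (0.0601, 0.0056), (0.0694, 0.0066)
mean, sigma = weighted_average([down, up])
separation = (up[0] - down[0]) / math.hypot(up[1], down[1])

# Ratio implied by the two absolute branching fractions quoted in the abstract.
ratio_from_abstract = 1.90e-4 / 2.95e-3

print(f"naive average      = {mean:.4f} +/- {sigma:.4f}")
print(f"up/down separation = {separation:.2f} standard deviations")
print(f"ratio of absolutes = {ratio_from_abstract:.4f}")
```

With these inputs the two polarities differ by about 1.1 standard deviations, in line with the good agreement stated above.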
[...] reduced by applying a requirement of $\mathrm{DLL}_{K-p} > 0$ to the candidates that, when reconstructed under the $\Lambda^-_c \to \bar{p} K^+ \pi^-$ mass hypothesis, lie within $\pm 21$ MeV/$c^2$ of the $\Lambda^-_c$ mass. Because of its larger branching fraction, $B^0_s$ [...] an additional complication arises due to backgrounds from $\Lambda^0_b \to D^-_s p$ and $\Lambda^0_b \to D^{*-}_s p$, which fall in the signal region when misreconstructed. To avoid a loss of $B^0_s \to D^\mp_s K^\pm$ signal, no requirement is made on the $\mathrm{DLL}_{K-p}$ of the bachelor particle. Instead, the $\Lambda^0_b \to D^-_s p$ mass shape is obtained from simulated $\Lambda^0_b \to D^-_s p$ decays, which are reweighted in momentum using the efficiency of the $\mathrm{DLL}_{K-\pi} > 5$ requirement on protons. The $\Lambda^0_b \to D^{*-}_s p$ mass shape is obtained by shifting the $\Lambda^0_b \to D^-_s p$ mass shape downwards by 200 MeV/$c^2$. The branching fractions of $\Lambda^0_b \to D^-_s p$ and $\Lambda^0_b \to D^{*-}_s p$ are assumed to be equal, motivated by the fact that the decays $B^0 \to D^- D^+_s$ and $B^0 \to D^- D^{*+}_s$ (dominated by similar tree topologies) have almost equal branching fractions. Therefore the overall mass shape is formed by summing the $\Lambda^0_b \to D^-_s p$ and $\Lambda^0$ [...]

Figure 1: Mass distribution of the $B^0 \to D^- \pi^+$ candidates (top) and $B^0_s \to D^-_s \pi^+$ candidates (bottom). The stacked background shapes follow the same top-to-bottom order in the legend and the plot. For illustration purposes the plot includes events from both magnet polarities, but they are fitted separately as described in the text.

Figure 2: Mass distribution of the $B^0_s \to D^\mp_s K^\pm$ candidates. The stacked background shapes follow the same top-to-bottom order in the legend and the plot. For illustration purposes the plot includes events from both magnet polarities, but they are fitted separately as described in the text.