Measurement of CP asymmetry in $B^0_s \rightarrow D^{\mp}_s K^{\pm}$ decays

We report on measurements of the time-dependent CP violating observables in $B^0_s\rightarrow D^{\mp}_s K^{\pm}$ decays using a dataset corresponding to 1.0 fb$^{-1}$ of pp collisions recorded with the LHCb detector. We find the CP violating observables $C_f=0.53\pm0.25\pm0.04$, $A^{\Delta\Gamma}_f=0.37\pm0.42\pm0.20$, $A^{\Delta\Gamma}_{\bar{f}}=0.20\pm0.41\pm0.20$, $S_f=-1.09\pm0.33\pm0.08$, $S_{\bar{f}}=-0.36\pm0.34\pm0.08$, where the uncertainties are statistical and systematic, respectively. Using these observables together with a recent measurement of the $B^0_s$ mixing phase $-2\beta_s$ leads to the first extraction of the CKM angle $\gamma$ from $B^0_s \rightarrow D^{\mp}_s K^{\pm}$ decays, finding $\gamma$ = (115$_{-43}^{+28}$)$^\circ$ modulo 180$^\circ$ at 68% CL, where the error contains both statistical and systematic uncertainties.


Introduction
Time-dependent analyses of tree-level $B^0_{(s)} \to D^{\mp}_{(s)}\pi^{\pm}, K^{\pm}$ decays are sensitive to the angle $\gamma \equiv \arg(-V_{ud}V^*_{ub}/V_{cd}V^*_{cb})$ of the unitarity triangle of the Cabibbo-Kobayashi-Maskawa (CKM) matrix [1,2] through CP violation in the interference of mixing and decay amplitudes [3,4,5]. The determination of $\gamma$ from such tree-level decays is important because it is not sensitive to potential effects from most models of physics beyond the Standard Model (BSM). The value of $\gamma$ hence provides a reference against which other BSM-sensitive measurements can be compared.
Due to the interference between mixing and decay amplitudes, the physical CP violating observables in these decays are functions of a combination of γ and the relevant mixing phase, namely γ + 2β (β ≡ arg(−V cd V * cb /V td V * tb )) in the B 0 and γ − 2β s (β s ≡ arg(−V ts V * tb /V cs V * cb )) in the B 0 s system. A measurement of these physical observables can therefore be interpreted in terms of γ or β (s) by using an independent measurement of the other parameter as input.
The leading order Feynman diagrams contributing to the interference of decay and mixing in $B^0_s \to D^{\mp}_s K^{\pm}$ are shown in Fig. 1. In contrast to $B^0 \to D^{(*)\mp}\pi^{\pm}$ decays, here both the $B^0_s \to D^-_s K^+$ ($b \to cs\bar{u}$) and $B^0_s \to D^+_s K^-$ ($b \to u\bar{c}s$) amplitudes are of the same order in the sine of the Cabibbo angle $\lambda = 0.2252 \pm 0.0007$ [11,12], $O(\lambda^3)$, and the amplitude ratio of the interfering diagrams is approximately $|V_{ub}V_{cs}/V_{cb}V_{us}| \approx 0.4$. Moreover, the decay-width difference in the $B^0_s$ system, $\Delta\Gamma_s$, is nonzero [13], which allows a determination of $\gamma - 2\beta_s$ from the sinusoidal and hyperbolic terms in the decay-time evolution, up to a two-fold ambiguity.
This paper presents the first measurements of the CP violating observables in $B^0_s \to D^{\mp}_s K^{\pm}$ decays using a dataset corresponding to 1.0 fb$^{-1}$ of pp collisions recorded with the LHCb detector at $\sqrt{s} = 7$ TeV, and the first determination of $\gamma - 2\beta_s$ in these decays. Inclusion of charge conjugate modes is implied throughout, except where explicitly stated.

Decay rate equations and CP violation observables
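The time-dependent decay rates of the initially produced flavour eigenstates $|B^0_s(t=0)\rangle$ and $|\bar{B}^0_s(t=0)\rangle$ are given by
$$\frac{\mathrm{d}\Gamma_{B^0_s \to f}(t)}{\mathrm{d}t} = \frac{1}{2}|A_f|^2 (1+|\lambda_f|^2)\, e^{-\Gamma_s t} \left[\cosh\left(\frac{\Delta\Gamma_s t}{2}\right) + A^{\Delta\Gamma}_f \sinh\left(\frac{\Delta\Gamma_s t}{2}\right) + C_f \cos(\Delta m_s t) - S_f \sin(\Delta m_s t)\right],$$
$$\frac{\mathrm{d}\Gamma_{\bar{B}^0_s \to f}(t)}{\mathrm{d}t} = \frac{1}{2}|A_f|^2 \left|\frac{p}{q}\right|^2 (1+|\lambda_f|^2)\, e^{-\Gamma_s t} \left[\cosh\left(\frac{\Delta\Gamma_s t}{2}\right) + A^{\Delta\Gamma}_f \sinh\left(\frac{\Delta\Gamma_s t}{2}\right) - C_f \cos(\Delta m_s t) + S_f \sin(\Delta m_s t)\right],$$
where $\lambda_f \equiv (q/p)(\bar{A}_f/A_f)$ and $A_f$ ($\bar{A}_f$) is the amplitude for a $B^0_s$ ($\bar{B}^0_s$) to decay to the final state $f$. $\Gamma_s$ is the average $B^0_s$ decay width, $\Delta\Gamma_s$ is the positive [14] decay-width difference between the heavy and light mass eigenstates in the $B^0_s$ system, and $\Delta m_s$ is the $B^0_s$ oscillation frequency. The complex coefficients $p$ and $q$ relate the $B^0_s$ meson mass eigenstates, $|B_{L,H}\rangle$, to the flavour eigenstates, $|B^0_s\rangle$ and $|\bar{B}^0_s\rangle$:
$$|B_L\rangle = p|B^0_s\rangle + q|\bar{B}^0_s\rangle, \qquad |B_H\rangle = p|B^0_s\rangle - q|\bar{B}^0_s\rangle,$$
with $|p|^2 + |q|^2 = 1$. Similar equations can be written for the CP-conjugate decays, replacing $C_f$ by $C_{\bar{f}}$, $S_f$ by $S_{\bar{f}}$, and $A^{\Delta\Gamma}_f$ by $A^{\Delta\Gamma}_{\bar{f}}$. In our convention $f$ is the $D^-_s K^+$ final state and $\bar{f}$ is $D^+_s K^-$. The CP asymmetry observables $C_f$, $S_f$, $A^{\Delta\Gamma}_f$, $C_{\bar{f}}$, $S_{\bar{f}}$ and $A^{\Delta\Gamma}_{\bar{f}}$ are given by
$$C_f = \frac{1-|\lambda_f|^2}{1+|\lambda_f|^2} = -C_{\bar{f}} = -\frac{1-|\lambda_{\bar{f}}|^2}{1+|\lambda_{\bar{f}}|^2},$$
$$S_f = \frac{2\,\mathrm{Im}(\lambda_f)}{1+|\lambda_f|^2}, \qquad A^{\Delta\Gamma}_f = \frac{-2\,\mathrm{Re}(\lambda_f)}{1+|\lambda_f|^2},$$
$$S_{\bar{f}} = \frac{2\,\mathrm{Im}(\lambda_{\bar{f}})}{1+|\lambda_{\bar{f}}|^2}, \qquad A^{\Delta\Gamma}_{\bar{f}} = \frac{-2\,\mathrm{Re}(\lambda_{\bar{f}})}{1+|\lambda_{\bar{f}}|^2}.$$
The equality $C_f = -C_{\bar{f}}$ results from $|q/p| = 1$ and $|\lambda_f| = |1/\lambda_{\bar{f}}|$, i.e. the assumption of no CP violation in either the decay or mixing amplitudes. The CP observables are related to the magnitude of the amplitude ratio $r_{D_sK} \equiv |\lambda_{D_sK}| = |A(\bar{B}^0_s \to D^-_s K^+)/A(B^0_s \to D^-_s K^+)|$, the strong phase difference $\delta$, and the weak phase difference $\gamma - 2\beta_s$ by the following equations:
$$C_f = \frac{1 - r^2_{D_sK}}{1 + r^2_{D_sK}},$$
$$A^{\Delta\Gamma}_f = \frac{-2 r_{D_sK}\cos(\delta - (\gamma - 2\beta_s))}{1 + r^2_{D_sK}}, \qquad A^{\Delta\Gamma}_{\bar{f}} = \frac{-2 r_{D_sK}\cos(\delta + (\gamma - 2\beta_s))}{1 + r^2_{D_sK}},$$
$$S_f = \frac{2 r_{D_sK}\sin(\delta - (\gamma - 2\beta_s))}{1 + r^2_{D_sK}}, \qquad S_{\bar{f}} = \frac{-2 r_{D_sK}\sin(\delta + (\gamma - 2\beta_s))}{1 + r^2_{D_sK}}.$$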
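The mapping from the physical parameters to the five observables can be illustrated numerically. The sketch below assumes the standard parameterisation $C_f = (1-r^2)/(1+r^2)$, $A^{\Delta\Gamma}_{f,\bar{f}} = -2r\cos(\delta \mp (\gamma-2\beta_s))/(1+r^2)$, $S_f = 2r\sin(\delta - (\gamma-2\beta_s))/(1+r^2)$ and $S_{\bar{f}} = -2r\sin(\delta + (\gamma-2\beta_s))/(1+r^2)$; the function name and the input values are illustrative choices, not quantities taken from the analysis.

```python
import math


def cp_observables(r, delta, weak_phase):
    """Return (C_f, A_f, A_fbar, S_f, S_fbar).

    r          : magnitude of the amplitude ratio r_DsK
    delta      : strong phase difference, in radians
    weak_phase : gamma - 2*beta_s, in radians
    The sign conventions are the assumed ones stated in the lead-in.
    """
    norm = 1.0 + r * r
    c_f = (1.0 - r * r) / norm
    a_f = -2.0 * r * math.cos(delta - weak_phase) / norm
    a_fbar = -2.0 * r * math.cos(delta + weak_phase) / norm
    s_f = 2.0 * r * math.sin(delta - weak_phase) / norm
    s_fbar = -2.0 * r * math.sin(delta + weak_phase) / norm
    return c_f, a_f, a_fbar, s_f, s_fbar


# Hypothetical inputs, chosen only to exercise the function.
c_f, a_f, a_fbar, s_f, s_fbar = cp_observables(
    r=0.4, delta=0.0, weak_phase=math.radians(70.0)
)
print(c_f, a_f, a_fbar, s_f, s_fbar)
```

Under these relations each final state satisfies $C^2 + S^2 + (A^{\Delta\Gamma})^2 = 1$, so the five observables depend on only three physical parameters.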

Analysis strategy
To measure the CP violating observables defined in Sec. 1.1, it is necessary to perform a fit to the decay-time distribution of the selected B 0 s → D ∓ s K ± candidates. Before this, however, it is necessary to distinguish the signal and background candidates in the selected sample. This analysis uses three variables to maximise sensitivity when discriminating between signal and background: the B 0 s mass; the D − s mass; and the log-likelihood difference L(K/π) between the pion and kaon hypotheses for the companion particle.
In Sec. 4, the signal and background shapes needed for the analysis are obtained in each of the variables. Section 5 describes how a simultaneous extended maximum likelihood fit (in the following referred to as multivariate fit) to these three variables is used to determine the yields of signal and background components in the samples of B 0 s → D − s π + and B 0 s → D ∓ s K ± candidates. Section 6 describes how to obtain the flavour at production of the B 0 s → D ∓ s K ± candidates using a combination of flavour-tagging algorithms, whose performance is calibrated with data using flavour-specific control modes. The decay-time resolution and acceptance are determined using a mixture of data control modes and simulated signal events, described in Sec. 7.
Finally, Sec. 8 describes the fits to the decay-time distribution of the B 0 s → D ∓ s K ± candidates which extract the CP violating observables. The first fit, henceforth referred to as the sFit, uses the results of the multivariate fit to obtain the so-called sWeights [15] which allow the background components to be statistically subtracted [16]. The sFit to the decay-time distribution is therefore performed using only the probability density function (PDF) of the signal component. The second fit, henceforth referred to as the cFit, uses the various shapes and yields of the multivariate fit result for the different signal and background components. The cFit subsequently performs a six-dimensional fit to these variables, the decay-time distribution and uncertainty, and the probability that the initial B 0 s flavour is correctly determined, in which all contributing signal and background components are described with their appropriate PDFs.

Detector and software
The LHCb detector [17] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region [18], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [19] placed downstream of the magnet. The tracking system provides a measurement of momentum, p, with a relative uncertainty that varies from 0.4% at low momentum to 0.6% at 100 GeV/c. The minimum distance of a track to a primary pp collision vertex, the impact parameter, is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of p transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [20]. The magnet polarity is reversed regularly to control systematic effects.
The trigger [21] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. The software trigger requires a two-, three- or four-track secondary vertex with a large sum of the transverse momentum of the charged particles and a significant displacement from the primary pp interaction vertices (PVs). A multivariate algorithm [22] is used for the identification of secondary vertices consistent with the decay of a b hadron.
In the simulation, pp collisions are generated using Pythia [23] with a specific LHCb configuration [24]. Decays of hadronic particles are described by EvtGen [25], in which final state radiation is generated using Photos [26]. The interaction of the generated particles with the detector and its response are implemented using the Geant4 toolkit [27] as described in Ref. [28].

Event selection
The event selection begins by building D − s → K − K + π − , D − s → K − π + π − , and D − s → π − π + π − candidates from reconstructed charged particles. These D − s candidates are subsequently combined with a fourth particle, referred to as the "companion", to form B 0 s → D ∓ s K ± and B 0 s → D − s π + candidates. The flavour-specific Cabibbo-favoured decay mode B 0 s → D − s π + is used as a control channel in the analysis, and is selected identically to B 0 s → D ∓ s K ± except for the PID criteria on the companion particle. The decay-time and B 0 s mass resolutions are improved by performing a kinematic fit [29] in which the B 0 s candidate is constrained to originate from its associated proton-proton interaction, i.e. the one with the smallest IP with respect to the B 0 s candidate, and the B 0 s mass is computed with a constraint on the D − s mass. The B 0 s → D − s π + mode is used for the optimisation of the selection and for studying and constraining physics backgrounds to the B 0 s → D ∓ s K ± decay. The B 0 s → D ∓ s K ± and B 0 s → D − s π + candidates are required to be matched to the secondary vertex candidates found in the software trigger. Subsequently, a preselection is applied to the B 0 s → D ∓ s K ± and B 0 s → D − s π + candidates using a similar multivariate displaced vertex algorithm to the trigger selection, but with offline-quality reconstruction.
A selection using the gradient boosted decision tree (BDTG) [30] implementation in the TMVA software package [31] further suppresses combinatorial backgrounds. The BDTG is trained on data using the $B^0_s \to D^-_s\pi^+$, $D^-_s \to K^-K^+\pi^-$ decay sample, which is purified with respect to the previous preselection by exploiting PID information from the Cherenkov detectors. Since all channels in this analysis are kinematically similar, and since no PID information is used as input to the BDTG, the resulting BDTG performs equally well on the other $D^-_s$ decay modes. The optimal working point is chosen to maximise the expected sensitivity to the CP violating observables in $B^0_s \to D^{\mp}_s K^{\pm}$ decays. In addition, the $B^0_s$ and $D^-_s$ candidates are required to be within $m(B^0_s) \in [5300, 5800]$ MeV/$c^2$ and $m(D^-_s) \in [1930, 2015]$ MeV/$c^2$, respectively.
Finally, the different final states are distinguished by using PID information. This selection also strongly suppresses cross-feed and peaking backgrounds from other misidentified decays of b-hadrons to c-hadrons. We will refer to such backgrounds as "fully reconstructed" if no particles are missed in the reconstruction, and "partially reconstructed" otherwise. This part of the selection is necessarily different for each $D^-_s$ decay mode, as described below.
• For D − s → π − π + π − none of the possible misidentified backgrounds fall inside the D − s mass window. Loose PID requirements are nevertheless used to identify the D − s decay products as pions in order to suppress combinatorial background.
• For $D^-_s \to K^-\pi^+\pi^-$, the relevant peaking backgrounds are $\bar{\Lambda}^-_c \to \bar{p}\pi^+\pi^-$, in which the antiproton is misidentified, and $D^- \to K^+\pi^-\pi^-$, in which both the kaon and a pion are misidentified. As this is the $D^-_s$ decay mode with the smallest branching fraction used, and hence the one most affected by background, all $D^-_s$ decay products are required to pass tight PID requirements.
• The $D^-_s \to K^-K^+\pi^-$ mode is split into three submodes. We distinguish between the resonant $D^-_s \to \phi\pi^-$ and $D^-_s \to K^{*0}K^-$ decays, and the remaining decays. Candidates in which the $K^+K^-$ pair falls within 20 MeV/$c^2$ of the $\phi$ mass are identified as a $D^-_s \to \phi\pi^-$ decay. This requirement suppresses most of the cross-feed and combinatorial background, and only loose PID requirements are needed. Candidates within a 50 MeV/$c^2$ window around the $K^{*0}$ mass are identified as a $D^-_s \to K^{*0}K^-$ decay; it is kinematically impossible for a candidate to satisfy both this and the $\phi$ requirement. In this case there is non-negligible background from misidentified $D^- \to K^+\pi^-\pi^-$ and $\bar{\Lambda}^-_c \to \bar{p}\pi^-K^+$ decays, which are suppressed through tight PID requirements on the $D^-_s$ kaon with the same charge as the $D^-_s$ pion. The remaining candidates, referred to as nonresonant decays, are subject to tight PID requirements on all decay products to suppress cross-feed backgrounds.
Figure 2 shows the relevant mass distributions for candidates passing and failing this PID selection. Finally, a loose PID requirement is made on the companion track. After all selection requirements, fewer than 2% of retained events contain more than one signal candidate. All candidates are used in the subsequent analysis.
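The assignment of $D^-_s \to K^-K^+\pi^-$ candidates to the three submodes can be sketched as a simple mass-window classification. The resonance masses below are assumed PDG-like values, the windows are interpreted as half-widths around those masses, and the function and label names are hypothetical; none of these details is specified in the text.

```python
# Assumed resonance masses in MeV/c^2 (PDG-like values, not from the text).
PHI_MASS = 1019.46
KSTAR0_MASS = 895.55


def classify_kkpi(m_kk, m_kpi, phi_window=20.0, kstar_window=50.0):
    """Classify a Ds- -> K- K+ pi- candidate into one of three submodes.

    m_kk  : K+K- invariant mass in MeV/c^2
    m_kpi : K+pi- invariant mass in MeV/c^2
    The windows are treated as half-widths, which is an assumption.
    """
    if abs(m_kk - PHI_MASS) < phi_window:
        return "phipi"
    if abs(m_kpi - KSTAR0_MASS) < kstar_window:
        return "kstk"
    return "nonresonant"


print(classify_kkpi(1020.0, 1200.0))
print(classify_kkpi(1100.0, 900.0))
print(classify_kkpi(1100.0, 1200.0))
```

Because a candidate cannot satisfy both resonance requirements, the order of the two tests does not change the outcome; each category would then receive its own PID requirements as described in the list above.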

Signal and background shapes
The signal and background shapes are obtained using a mixture of data-driven approaches and simulation. The simulated events need to be corrected for kinematic differences between simulation and data, as well as for the kinematics-dependent efficiency of the PID selection requirements. In order to obtain kinematic distributions in data for this weighting, we use the decay mode B 0 → D − π + , which can be selected with very high purity without the use of any PID requirements and is kinematically very similar to the B 0 s signals. The PID efficiencies are measured as a function of particle momentum and event occupancy using prompt D * + → D 0 (K − π + )π + decays which provide pure samples of pions and kaons [32], henceforth called D * + calibration sample.

B 0 s candidate mass shapes
In order to model radiative and reconstruction effects, the signal shape in the $B^0_s$ mass is the sum of two Crystal Ball [33] functions with a common mean and oppositely oriented tails. The signal shapes are determined separately for $B^0_s \to D^{\mp}_s K^{\pm}$ and $B^0_s \to D^-_s\pi^+$ from simulated candidates. The shapes are subsequently fixed in the multivariate fit, except for the common mean of the Crystal Ball functions, which is left free to float.
The functional form of the combinatorial background is taken from the upper $B^0_s$ sideband, with its parameters left free to vary in the subsequent multivariate fit. Each $D^-_s$ mode is considered independently and parameterised by either an exponential function or by a combination of an exponential and a constant function.
The shapes of the fully or partially reconstructed backgrounds are kernel templates [34] fixed from simulated events. The exceptions are backgrounds whose shapes are obtained from data. In these cases the relevant background is reconstructed with the "wrong" mass hypothesis but without the PID requirements that would suppress it. The resulting shape is then weighted by the efficiency of the PID requirements, taken from the $D^{*+}$ calibration samples, and a kernel template is extracted for use in the multivariate fit.

D − s candidate mass shapes
The signal shape in the $D^-_s$ mass is again a sum of two Crystal Ball functions with a common mean and oppositely oriented tails. The signal shapes are extracted separately for each $D^-_s$ decay mode from simulated events that have the full selection chain applied to them. The shapes are subsequently fixed in the multivariate fit except for the common mean of the Crystal Ball functions, which floats independently for each $D^-_s$ decay mode.
The combinatorial background consists of both random combinations of tracks which do not peak in the $D^-_s$ mass and, in some $D^-_s$ decay modes, backgrounds that contain a true $D^-_s$ and a random companion track. It is parameterised separately for each $D^-_s$ decay mode either by an exponential function or by a combination of an exponential function and the signal $D^-_s$ shape. The fully and partially reconstructed backgrounds which contain a correctly reconstructed $D^-_s$ candidate are assumed to have the same mass distribution as the signal. For other backgrounds, the shapes are kernel templates taken from simulated events, as in the $B^0_s$ mass.

Companion L(K/π) shapes
We obtain the PDFs describing the L(K/π) distributions of pions and kaons from dedicated D * + calibration samples. We obtain the PDF describing the protons using a calibration sample of Λ + c → pK − π + decays. These samples are weighted to match the signal kinematic and event occupancy distributions in the same way as the simulated events. The weighting is done separately for each signal and background component, as well as for each magnet polarity. The shapes for each magnet polarity are subsequently combined according to the integrated luminosity in each sample.
The signal companion L(K/π) shape is obtained separately for each D − s decay mode to account for small kinematic differences between them. The combinatorial background companion L(K/π) shape is taken to consist of a mixture of pions, protons, and kaons, and its normalisation is left floating in the multivariate fit. The companion L(K/π) shape for fully or partially reconstructed backgrounds is obtained by weighting the PID calibration samples to match the event distributions of simulated events, for each background type.
The total PDF for the multivariate fit is built from the product of the signal and background PDFs, since correlations between the fitting variables are measured to be small in simulation. These product PDFs are then added for each $D^-_s$ decay mode, and almost all background yields are left free to float. The only exceptions are those backgrounds whose yield is below 2% of the signal yield; these yields are fixed from known branching fractions and relative efficiencies measured using simulated events. The multivariate fit results in a signal yield of $28\,260 \pm 180$ $B^0_s \to D^-_s\pi^+$ and $1770 \pm 50$ $B^0_s \to D^{\mp}_s K^{\pm}$ decays, with an effective purity of 85% for $B^0_s \to D^-_s\pi^+$ and 74% for $B^0_s \to D^{\mp}_s K^{\pm}$. The multivariate fit is checked for biases using large samples of data-like pseudoexperiments, and none is found. The results of the multivariate fit are shown in Fig. 3 for both the $B^0_s \to D^-_s\pi^+$ and $B^0_s \to D^{\mp}_s K^{\pm}$ samples, summed over all $D^-_s$ decay modes.

Flavour Tagging
The identification of the $B^0_s$ initial flavour is performed by means of two flavour-tagging algorithms. The opposite side (OS) tagger determines the flavour of the non-signal b-hadron produced in the proton-proton collision using the charge of the lepton ($\mu$, $e$) produced in semileptonic B decays, or that of the kaon from the $b \to c \to s$ decay chain, or the charge of the inclusive secondary vertex reconstructed from b-decay products. The same side kaon (SSK) tagger searches for an additional charged kaon accompanying the fragmentation of the signal $B^0_s$ or $\bar{B}^0_s$. Each of these algorithms has an intrinsic mistag rate $\omega$ = (wrong tags)/(all tags). Wrong tags can be due to tracks from the underlying event, particle misidentifications, or flavour oscillations of neutral B mesons. For each $B^0_s$ candidate the tagging algorithms predict the mistag probability, $\eta$, using a neural network whose inputs are the kinematic, geometric, and PID properties of the tagging particle(s). The neural network is trained on simulated events.
The estimated $\eta$ is treated as a per-candidate variable, thus adding an observable to the fit. Due to variations in the properties of tagging tracks for different channels, the predicted mistag probability $\eta$ is usually not exactly the (true) mistag rate $\omega$, which requires $\eta$ to be calibrated using flavour-specific, and therefore self-tagging, decays. The statistical uncertainty on $C_f$, $S_f$, and $S_{\bar{f}}$ scales with $1/\sqrt{\varepsilon_{\rm eff}}$, where the effective tagging power is defined as $\varepsilon_{\rm eff} = \varepsilon_{\rm tag}(1 - 2\omega)^2$ and $\varepsilon_{\rm tag}$ is the efficiency to tag an event. Therefore, the tagging algorithms are tuned for maximum effective tagging power.

Tagging calibration
The calibration for the OS tagger is performed using several control channels: $B^+ \to J/\psi K^+$, $B^+ \to \bar{D}^0\pi^+$, $B^0 \to D^{*-}\mu^+\nu_\mu$, $B^0 \to J/\psi K^{*0}$ and $B^0_s \to D^-_s\pi^+$. This calibration of $\eta$ is done for each control channel using the linear function
$$\omega = p_0 + p_1(\eta - \langle\eta\rangle), \tag{7}$$
where the values of $p_0$ and $p_1$ are called calibration parameters, and $\langle\eta\rangle$ is the mean of the $\eta$ distribution predicted by a tagger in a specific control channel. Systematic uncertainties are assigned to account for possible dependences of the calibration parameters on the final state considered, on the kinematics of the $B^0_s$ candidate and on the event properties. The corresponding values of the calibration parameters are summarised in Table 1. For each control channel the relevant calibration parameters are reported with their statistical and systematic uncertainties. These are averaged to give the reference values, including a systematic uncertainty accounting for kinematic differences between different channels. The resulting calibration parameters for the $B^0_s \to D^{\mp}_s K^{\pm}$ fit are $p_0 = 0.3834 \pm 0.0014 \pm 0.0040$ and $p_1 = 0.972 \pm 0.012 \pm 0.035$, where the $p_0$ for each control channel needs to be translated to the $\langle\eta\rangle$ of $B^0_s \to D^-_s\pi^+$, the channel which is most similar to the signal channel $B^0_s \to D^{\mp}_s K^{\pm}$. This is achieved by the transformation $p_0 \to p_0 + p_1(\langle\eta\rangle - 0.3813)$ in each control channel.
Table 1: Calibration parameters of the combined OS tagger extracted from different control channels. In each entry the first uncertainty is statistical and the second systematic.
Control channel                        <eta>     p0 - <eta>                    p1
B+ -> J/psi K+                         0.3919    0.0008 ± 0.0014 ± 0.0015      0.982 ± 0.017 ± 0.005
B+ -> anti-D0 pi+                      0.3836    0.0018 ± 0.0016 ± 0.0015      0.972 ± 0.017 ± 0.005
B0 -> J/psi K*0                        0.390     0.0090 ± 0.0030 ± 0.0060      0.882 ± 0.043 ± 0.039
B0 -> D*- mu+ nu_mu                    0.3872    0.0081 ± 0.0019 ± 0.0069      0.946 ± 0.019 ± 0.061
B0s -> Ds- pi+                         0.3813    0.0159 ± 0.0097 ± 0.0071      1.000 ± 0.116 ± 0.047
Average                                0.3813    0.0021 ± 0.0014 ± 0.0040      0.972 ± 0.012 ± 0.035
The SSK algorithm uses a neural network to select fragmentation particles, giving improved flavour tagging power [35] with respect to earlier cut-based [36] algorithms. It is calibrated using the $B^0_s \to D^-_s\pi^+$ channel, resulting in $\langle\eta\rangle = 0.4097$, $p_0 = 0.4244 \pm 0.0086 \pm 0.0071$ and $p_1 = 1.255 \pm 0.140 \pm 0.104$, where the first uncertainty is statistical and the second systematic. The systematic uncertainties include the uncertainty on the decay-time resolution, the $B^0_s \to D^-_s\pi^+$ fit model, and the backgrounds in the $B^0_s \to D^-_s\pi^+$ fit. Figure 4 shows the measured mistag probability as a function of the mean predicted mistag probability in $B^0_s \to D^-_s\pi^+$ decays for the OS and SSK taggers. The data points show a linear correlation corresponding to the functional form in Eq. 7. We additionally validate that the obtained tagging calibration parameters can be used in $B^0_s \to D^{\mp}_s K^{\pm}$ decays by comparing them for $B^0_s \to D^{\mp}_s K^{\pm}$ and $B^0_s \to D^-_s\pi^+$ in simulated events; we find excellent agreement between the two.
We also evaluate possible tagging asymmetries between $B$ and $\bar{B}$ mesons for the OS and SSK taggers by performing the calibrations split by B meson flavour. The OS tagging asymmetries are measured using $B^+ \to J/\psi K^+$ decays, while the SSK tagging asymmetries are measured using prompt $D^{\pm}_s$ mesons whose $p_T$ distribution has been weighted to match the $B^0_s \to D^-_s\pi^+$ signal. The resulting initial flavour asymmetries for $p_0$, $p_1$ and $\varepsilon_{\rm tag}$ are taken into account in the decay-time fit.
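The calibration and the tagging power can be illustrated with a few lines of code. The sketch below assumes the linear form $\omega = p_0 + p_1(\eta - \langle\eta\rangle)$ with the averaged OS central values quoted above, and computes $\varepsilon_{\rm eff}$ as $\varepsilon_{\rm tag}$ times the mean of $(1-2\omega)^2$ over tagged candidates. The clamping of $\omega$ to $[0, 0.5]$, the function names, the list of predicted mistags and the tagging efficiency of 0.4 are illustrative assumptions.

```python
def calibrated_mistag(eta, p0, p1, eta_mean):
    """Map a predicted mistag eta to a calibrated mistag omega.

    Uses omega = p0 + p1 * (eta - eta_mean); clamping the result to
    [0, 0.5] is an assumption made for this sketch.
    """
    omega = p0 + p1 * (eta - eta_mean)
    return min(max(omega, 0.0), 0.5)


def effective_tagging_power(eps_tag, omegas):
    """Return eps_tag * <(1 - 2*omega)^2> over the tagged candidates."""
    if not omegas:
        return 0.0
    mean_dilution_sq = sum((1.0 - 2.0 * w) ** 2 for w in omegas) / len(omegas)
    return eps_tag * mean_dilution_sq


# Averaged OS central values quoted in the text (uncertainties ignored here).
OS_P0, OS_P1, OS_ETA_MEAN = 0.3834, 0.972, 0.3813

# Hypothetical per-candidate predictions and tagging efficiency.
etas = [0.25, 0.38, 0.45]
omegas = [calibrated_mistag(e, OS_P0, OS_P1, OS_ETA_MEAN) for e in etas]
power = effective_tagging_power(0.4, omegas)
print(omegas, power)
```

A candidate with $\omega = 0.5$ carries no flavour information and contributes nothing to the tagging power, which is why the per-candidate mistag is used as a fit observable rather than a single average value.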

Combination of OS and SSK taggers
Since the SSK and OS taggers rely on different physical processes they are largely independent, with a correlation measured as negligible. The tagged candidates are therefore split into three different samples depending on the tagging decision: events only tagged by the OS tagger (OS-only), those only tagged by the SSK tagger (SSK-only), and those tagged by both the OS and SSK taggers (OS-SSK). For the candidates that have decisions from both taggers a combination is performed using the calibrated mistag probabilities. The combined tagging decision and calibrated mistag rate are used in the final time-dependent fit, where the calibration parameters are constrained using the combination of their associated statistical and systematic uncertainties. The tagging performances, as well as the effective tagging power, for the three sub-samples and their combination as measured using B 0 s → D − s π + events are reported in Table 2.
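One common way to combine two independent tagging decisions is a probabilistic product of the per-tagger flavour probabilities. The sketch below assumes that approach, equal prior probabilities for the two flavours, and calibrated mistags strictly between 0 and 1; it is not necessarily the exact combination used in the analysis, and the function name is hypothetical.

```python
def combine_taggers(d1, w1, d2, w2):
    """Combine two independent tag decisions.

    d1, d2 : tag decisions, +1 or -1
    w1, w2 : calibrated mistag probabilities of the two taggers
    Returns (combined decision, combined mistag probability).
    Assumes independent taggers and equal flavour priors.
    """
    # Likelihood of the observed decisions if the true flavour is +1.
    like_plus = (1.0 - w1 if d1 > 0 else w1) * (1.0 - w2 if d2 > 0 else w2)
    # Likelihood of the observed decisions if the true flavour is -1.
    like_minus = (w1 if d1 > 0 else 1.0 - w1) * (w2 if d2 > 0 else 1.0 - w2)
    prob_plus = like_plus / (like_plus + like_minus)
    decision = 1 if prob_plus >= 0.5 else -1
    omega = 1.0 - max(prob_plus, 1.0 - prob_plus)
    return decision, omega


# Hypothetical mistag values for an OS and an SSK decision.
print(combine_taggers(+1, 0.3, +1, 0.4))  # taggers agree
print(combine_taggers(+1, 0.3, -1, 0.4))  # taggers disagree
```

When the two taggers agree the combined mistag is smaller than either input, and when they disagree the decision of the more reliable tagger is kept with a correspondingly larger mistag.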

Mistag distributions
Because the fit uses the per-candidate mistag prediction, it is necessary to model the distribution of this observable for each event category (SSK-only, OS-only, OS-SSK) for the signal and each background category. The mistag probability distributions for all $B^0_s$ decay modes, whether signal or background, are obtained using sWeighted $B^0_s \to D^-_s\pi^+$ events. The mistag probability distributions for combinatorial background events are obtained from the upper $B^0_s$ mass sideband in $B^0_s \to D^-_s\pi^+$ decays. For $B^0$ and $\Lambda^0_b$ backgrounds the mistag distributions are obtained from sWeighted $B^0 \to D^-\pi^+$ events. For the SSK tagger this is justified by the fact that these backgrounds differ by only one spectator quark and should therefore have similar properties with respect to the fragmentation of the $s\bar{s}$ pair. For the OS tagger, the predicted mistag distributions mainly depend on the kinematic properties of the B candidate, which are similar for $B^0$ and $\Lambda^0_b$ backgrounds.

Decay-time resolution and acceptance
The decay-time resolution of the detector must be accounted for because of the fast $B^0_s$-$\bar{B}^0_s$ oscillations. Any mismodelling of the resolution function also potentially biases and affects the precision of the time-dependent CP violation observables. The signal decay-time PDF is convolved with a resolution function that has a different width for each candidate, making use of the per-candidate decay-time uncertainty estimated by the decay-time kinematic fit. This approach requires the per-candidate decay-time uncertainty to be calibrated. The calibration is performed using prompt $D^-_s$ mesons combined with a random track and kinematically weighted to give a sample of "fake $B^0_s$" candidates, which have a true lifetime of zero. From the spread of the observed decay times, a scale factor to the estimated decay-time resolution is found to be $1.37 \pm 0.10$. Here the uncertainty is dominated by the systematic uncertainty on the similarity between the kinematically weighted "fake $B^0_s$" candidates and the signal.
As with the per-candidate mistag, the distribution of per-candidate decay-time uncertainties is modelled for the signal and each type of background. For the signal these distributions are taken from sWeighted data, while for the combinatorial background they are taken from the $B^0_s$ mass sidebands. For other backgrounds, the decay-time error distributions are obtained from simulated events, which are weighted for the data-simulation differences found in $B^0_s \to D^-_s\pi^+$ signal events.
In the case of background candidates which are either partially reconstructed or in which a particle is misidentified, the decay time is incorrectly estimated because either the measured mass of the background candidate, the measured momentum, or both, are systematically misreconstructed. For example, in the case of $B^0_s \to D^-_s\pi^+$ as a background to $B^0_s \to D^{\mp}_s K^{\pm}$, the momentum measurement is unbiased, while the reconstructed mass is systematically above the true mass, leading to a systematic increase in the reconstructed decay time. This effect causes an additional non-Gaussian smearing of the decay-time distribution, which is accounted for in the decay-time resolution by nonparametric PDFs obtained from simulated events, referred to as k-factor templates.
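The importance of the resolution calibration can be illustrated with the standard result that a Gaussian resolution of width $\sigma_t$ damps an oscillation of frequency $\Delta m_s$ by a factor $\exp(-\Delta m_s^2\sigma_t^2/2)$. The sketch below assumes that single-Gaussian model, uses the scale factor 1.37 and $\Delta m_s = 17.768$ ps$^{-1}$ quoted in the text, and takes a hypothetical per-candidate uncertainty estimate; the function name is illustrative.

```python
import math

DELTA_MS = 17.768    # ps^-1, central value quoted in the text
SCALE_FACTOR = 1.37  # resolution scale factor quoted in the text


def resolution_dilution(sigma_t_est, scale=SCALE_FACTOR, delta_m=DELTA_MS):
    """Damping of the oscillation amplitude for one candidate.

    sigma_t_est : estimated decay-time uncertainty in ps (uncalibrated)
    Assumes a single Gaussian resolution of width scale * sigma_t_est.
    """
    sigma = scale * sigma_t_est
    return math.exp(-0.5 * (delta_m * sigma) ** 2)


# Hypothetical uncalibrated per-candidate uncertainties, in ps.
for sigma_est in (0.020, 0.035, 0.050):
    print(sigma_est, resolution_dilution(sigma_est))
```

Because the dilution depends on the square of the scaled width, an uncertainty on the scale factor feeds directly into the amplitude of the oscillating terms, and hence into $C_f$, $S_f$ and $S_{\bar{f}}$.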
The decay-time acceptance of $B^0_s \to D^{\mp}_s K^{\pm}$ candidates cannot be floated because its shape is heavily correlated with the CP observables. In particular the upper decay-time acceptance is correlated with $A^{\Delta\Gamma}_f$ and $A^{\Delta\Gamma}_{\bar{f}}$. However, in the case of $B^0_s \to D^-_s\pi^+$, the acceptance can be measured by fixing $\Gamma_s$ and floating the acceptance parameters. The decay-time acceptance in the $B^0_s \to D^{\mp}_s K^{\pm}$ fit is fixed to that found in the $B^0_s \to D^-_s\pi^+$ data fit, corrected by the acceptance ratio in the two channels in simulated signal events. These simulated events have been weighted in the manner described in Sec. 4. In all cases, the acceptance is described using segments of smooth polynomial functions ("splines"), which can be implemented in an analytic way in the decay-time fit [37]. The spline boundaries ("knots") were chosen in an ad hoc fashion to model reliably the features of the acceptance shape, and placed at 0.5, 1.0, 1. with the published LHCb measurement of $\Delta m_s = 17.768 \pm 0.023 \pm 0.006$ ps$^{-1}$ [38]. The time fit to the $B^0_s \to D^-_s\pi^+$ data together with the measured decay-time acceptance is shown in Fig. 5.

Decay-time fit to $B^0_s \to D^{\mp}_s K^{\pm}$
As described previously, two decay-time fitters are used: in one all signal and background time distributions are described (cFit), and in a second the background is statistically subtracted using the sPlot technique [15] where only the signal time distributions are described (sFit). In both cases an unbinned maximum likelihood fit is performed to the CP observables defined in Eq. 5, and the signal decay-time PDF is identical in the two fitters. Both the signal and background PDFs are described in the remainder of this section, but it is important to bear in mind that none of the information about the background PDFs or fixed background parameters is relevant for the sFit.
When performing the fits to the decay-time distribution, the following parameters are fixed from independent measurements [12,13,39]: $\Gamma_s$, $\Delta\Gamma_s$, $\rho(\Gamma_s, \Delta\Gamma_s)$, $\Gamma_{\Lambda^0_b}$, $\Gamma_d$ and $\Delta m_s$. Here $\rho(\Gamma_s, \Delta\Gamma_s)$ is the correlation between the $\Gamma_s$ and $\Delta\Gamma_s$ measurements, $\Gamma_{\Lambda^0_b}$ is the decay width of the $\Lambda^0_b$ baryon, $\Gamma_d$ is the $B^0$ decay width, and $\Delta m_s$ is the $B^0_s$ oscillation frequency. The signal production asymmetry is fixed to zero because the fast $B^0_s$ oscillations wash out any initial asymmetry and make its effect on the CP observables negligible. The signal detection asymmetry is fixed to $(1.0 \pm 0.5)\%$, with the sign convention in which positive detection asymmetries correspond to a higher efficiency to reconstruct positive kaons [40,41]. The background production and detection asymmetries are floated within constraints of $\pm 1\%$ for $B^0_s$ and $B^0$ decays, and $\pm 3\%$ for $\Lambda^0_b$ decays. The signal and background mistag and decay-time uncertainty distributions, including k-factors, are modelled by kernel templates as described in Sec. 6 and 7. The tagging calibration parameters are constrained to the values obtained from the control channels for all $B^0_s$ decay modes, and fixed to $p_0 = 0.5$, $p_1 = 0$ for the $B^0$ and $\Lambda^0_b$ decays. All modes use the same spline-based decay-time acceptance function described in Sec. 7.
The backgrounds from $B^0_s$ decay modes are all flavour-specific, and are modelled by the decay-time PDF used for $B^0_s \to D^-_s\pi^+$ decays convolved with the appropriate decay-time resolution and k-factor models for the given background. The backgrounds from $\Lambda^0_b$ decay modes are all described by a single exponential convolved with the appropriate decay-time resolution and k-factor models. The $B^0 \to D^-K^+$ background is flavour-specific and is described with the same PDF as $B^0_s \to D^-_s\pi^+$, except with $\Delta m_d$ instead of $\Delta m_s$ in the oscillating terms, $\Gamma_d$ instead of $\Gamma_s$, and the appropriate decay-time resolution and k-factor kernel templates. The $B^0 \to D^-\pi^+$ background, on the other hand, is not a flavour-specific decay, and is itself sensitive to CP violation as discussed in Sec. 1. Its decay-time PDF therefore includes nonzero $S_f$ and $S_{\bar{f}}$ terms, which are constrained to their world-average values [12]. The decay-time PDF of the combinatorial background used in the cFit is a double exponential function split by the tagging category of the event, whose parameters are measured using events in the $B^0_s$ mass sidebands.
All decay-time PDFs include the effects of flavour tagging, are convolved with a single Gaussian representing the per-candidate decay-time resolution, and are multiplied by the decay-time acceptance described in Sec. 7. Once the decay-time PDFs are constructed, the sFit proceeds by fitting the signal PDF to the sWeighted $B^0_s \to D^{\mp}_s K^{\pm}$ candidates. The cFit, on the other hand, performs a six-dimensional fit to the decay time, decay-time error, predicted mistag, and the three variables used in the multivariate fit. The $B^0_s$ mass range is restricted to $m(B^0_s) \in [5320, 5420]$ MeV/$c^2$, and the yields of the different signal and background components are fixed to those found in this fit range in the multivariate fit. The decay-time range of the fit is $\tau(B^0_s) \in [0.4, 15.0]$ ps in both cases.
The results of the cFit and sFit for the CP violating observables are given in Table 3, and their correlations in Table 4. The fits to the decay-time distribution are shown in Fig. 6 alongside the folded asymmetry plots for D + s K − and D − s K + final states. The folded asymmetry plots show the difference in the rates of $B^0_s$ and $\bar{B}^0_s$ tagged D + s K − and D − s K + candidates, plotted in slices of 2π/∆m s , where the sWeights obtained with the multivariate fit have been used to subtract background events. The plotted asymmetry function is drawn using the sFit central values of the CP observables, and is normalised using the expected dilution due to mistag and time resolution.
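The folding used for the asymmetry plots can be illustrated with a minimal sketch. It assumes that each candidate carries a decay time, a tag decision (+1 or −1), and an sWeight; the function name `folded_asymmetry`, the binning, and the sign convention of the tag are hypothetical choices made for illustration and are not taken from the analysis code.

```python
import math

def folded_asymmetry(times, tags, weights, delta_m, n_bins=8):
    """Fold decay times into one oscillation period and return the
    weighted tag asymmetry in each bin of the folded time.

    times   : decay times (same time unit as 1/delta_m)
    tags    : +1 or -1 tag decision per candidate (0 = untagged, ignored)
    weights : per-candidate sWeights used to subtract background
    delta_m : oscillation frequency, so the period is 2*pi/delta_m
    """
    period = 2.0 * math.pi / delta_m
    plus = [0.0] * n_bins
    minus = [0.0] * n_bins
    for t, q, w in zip(times, tags, weights):
        phase = (t % period) / period            # folded time in [0, 1)
        b = min(int(phase * n_bins), n_bins - 1)
        if q > 0:
            plus[b] += w
        elif q < 0:
            minus[b] += w
    # Asymmetry per bin; None where the bin holds no tagged weight.
    return [
        (p - m) / (p + m) if (p + m) != 0.0 else None
        for p, m in zip(plus, minus)
    ]
```

In this sketch the raw asymmetry is not corrected for the dilution from mistag and decay-time resolution, which the plotted curve described above does account for.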

Systematic uncertainties
Systematic uncertainties arise from the fixed parameters ∆m s , Γ s , and ∆Γ s , and from the limited knowledge of the decay time resolution and acceptance. These uncertainties are estimated using large sets of simulated pseudoexperiments, in which the relevant parameters are varied. The pseudoexperiments are generated with the average of the cFit and sFit central values reported in Sec. 8. They are subsequently processed by the full data fitting procedure: first the multivariate fit to obtain the sWeights, and then the decay time fits. The fitted values of the observables are compared between the nominal fit, where all fixed parameters are kept at their nominal values, and the systematic fit, where each parameter is varied according to its systematic uncertainty. A distribution is formed by normalising the resulting differences to the uncertainties measured in the nominal fit, and the mean and width of this distribution are added in quadrature and conservatively assigned as the systematic uncertainty. The systematic uncertainty on the acceptance is strongly anti-correlated with that due to the fixed value of Γ s . This is because the acceptance parameters are determined from the fit to B 0 s → D − s π + data, where Γ s determines the expected exponential slope, so that the acceptance parameterises any difference between the observed and the expected slope. The systematic pseudoexperiments are also used to compute the systematic covariance matrix due to each source of uncertainty.
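As an illustration of how a single systematic uncertainty could be extracted from the pseudoexperiments, the following sketch forms the normalised differences between the systematic and nominal fits and combines the mean and width of their distribution in quadrature. The function name and inputs are hypothetical, and the width is taken here as the sample standard deviation of the normalised differences, which is an assumption; the analysis may instead use a fitted Gaussian width.

```python
import math
import statistics

def systematic_from_pseudoexperiments(nominal, varied, nominal_errors):
    """Systematic uncertainty in units of the statistical uncertainty.

    nominal        : fitted values from the nominal fit, one per pseudoexperiment
    varied         : fitted values from the systematic fit, same ordering
    nominal_errors : uncertainties reported by the nominal fit
    """
    # Difference between systematic and nominal fit, normalised to the
    # uncertainty measured in the nominal fit.
    pulls = [(v - n) / e for n, v, e in zip(nominal, varied, nominal_errors)]
    mean = statistics.fmean(pulls)
    width = statistics.pstdev(pulls)  # assumption: population standard deviation
    # Mean and width added in quadrature.
    return math.hypot(mean, width)
```

The result is expressed relative to the statistical uncertainty, matching the way the systematic uncertainties are quoted in the tables.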
The total systematic covariance matrix is obtained by adding the individual covariance matrices. The resulting systematic uncertainties are shown in Tables 5 and 6 relative to the corresponding statistical uncertainties. The contributions from Γ s and ∆Γ s are listed separately to indicate their relative importance. For this comparison, Γ s and ∆Γ s are treated as uncorrelated systematic effects. When computing the total, however, the correlations between these two, as well as between them and the acceptance parameters, are accounted for, and the full systematic uncertainty which enters into the total is listed as "acceptance, Γ s , ∆Γ s ". The cFit contains fixed parameters describing the decay time of the combinatorial background. These parameters are found to be correlated with the CP parameters, and a systematic uncertainty is assigned.
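The combination of sources can be sketched as an element-wise sum of covariance matrices, with the total uncertainty on each observable read off the diagonal. The helper names are hypothetical and the numbers used in the usage note are purely illustrative, not values from this analysis.

```python
import math

def add_covariances(matrices):
    """Element-wise sum of square covariance matrices of equal size."""
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) for j in range(n)]
        for i in range(n)
    ]

def total_uncertainties(covariance):
    """Square roots of the diagonal elements of a covariance matrix."""
    return [math.sqrt(covariance[i][i]) for i in range(len(covariance))]
```

For example, two hypothetical sources with covariances `[[0.04, -0.01], [-0.01, 0.09]]` and `[[0.05, 0.02], [0.02, 0.07]]` give total uncertainties of about 0.3 and 0.4 on the two observables. Summing the matrices keeps the off-diagonal terms, so correlations such as the one between the acceptance and Γ s are carried into the total rather than lost in a quadrature sum of diagonal terms.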
The result is cross-checked by splitting the sample into two subsets in three different ways: according to the magnet polarity, the hardware trigger decision, and the BDTG response. There is good agreement between the cFit and the sFit in each subsample. However, when the sample is split by BDTG response, the weighted averages of the subsamples show a small discrepancy with the nominal fit for $C_f$, $S_f$, and $S_{\bar{f}}$, and a corresponding systematic uncertainty is assigned. In addition, fully simulated signal and background events are fitted in order to check for systematic effects due to neglecting correlations between the different variables in the signal and background PDFs. No bias is found. A potential source of systematic uncertainty is the imperfect knowledge of the tagging parameters p 0 and p 1 . Their uncertainties are propagated into the nominal fits by means of Gaussian constraints, and are therefore included in the statistical error. A number of other possible systematic effects were studied, but found to be negligible. These include possible production and detection asymmetries, and missing or imperfectly modelled backgrounds. Potential systematic effects due to fixed background yields are evaluated by generating pseudoexperiments with the nominal value for these yields, and fitting back with the yields fixed to twice their nominal value. No significant bias is observed and no systematic uncertainty is assigned. No systematic uncertainty is attributed to the imperfect knowledge of the momentum and longitudinal scale of the detector since both effects are taken into account by the systematic uncertainty in ∆m s .
Both the cFit and sFit are found to be unbiased through studies of large ensembles of pseudoexperiments generated at the best-fit point in data. In addition, differences between the cFit and sFit are evaluated from the distributions of the per-pseudoexperiment differences of the fitted values. Both fitters return compatible results. Indeed, an important result of this analysis is that the sFit technique has been successfully used in an environment with such a large number of variables, parameters and categories. The sFit technique was able to perform an accurate subtraction of a variety of time-dependent backgrounds in a multidimensional fit, including different oscillation frequencies, different tagging behaviours, and backgrounds with modified decay-time distributions due to misreconstructed particles.

Interpretation
The measurement of the CP -sensitive parameters is interpreted in terms of γ − 2β s and subsequently γ. For this purpose we have arbitrarily chosen the cFit as the nominal fit result. The strategy is to maximise the following likelihood

$$\mathcal{L}(\vec{\alpha}) = \exp\left(-\frac{1}{2}\,\bigl(\vec{A}(\vec{\alpha}) - \vec{A}_{\rm obs}\bigr)^{T}\, V^{-1}\, \bigl(\vec{A}(\vec{\alpha}) - \vec{A}_{\rm obs}\bigr)\right), \qquad (8)$$

where $\vec{\alpha} = (\gamma, \phi_s, r_{D_sK}, \delta)$ is the vector of the physics parameters, $\vec{A}$ is the vector of observables expressed through Eqs. 6, $\vec{A}_{\rm obs}$ is the vector of the measured CP violating observables and $V$ is the experimental (statistical and systematic) covariance matrix. Confidence intervals are computed by evaluating the test statistic $\Delta\chi^2 \equiv \chi^2(\vec{\alpha}'_{\min}) - \chi^2(\vec{\alpha}_{\min})$, where $\chi^2(\vec{\alpha}) = -2\ln\mathcal{L}(\vec{\alpha})$, in a frequentist way following Ref. [42]. Here, $\vec{\alpha}_{\min}$ denotes the global maximum of Eq. 8, and $\vec{\alpha}'_{\min}$ is the conditional maximum when the parameter of interest is fixed to the tested value. The value of β s is constrained using the LHCb measurement of φ s from B 0 s → J/ψ K + K − and B 0 s → J/ψ π + π − decays, φ s = 0.01 ± 0.07 (stat) ± 0.01 (syst) rad [13]. Neglecting penguin pollution and assuming no BSM contribution in these decays, φ s = −2β s . The resulting confidence intervals are, at 68% CL,
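The χ 2 entering the test statistic can be sketched as follows. Eqs. 6 are not reproduced in this section, so the function `observables` below assumes the commonly used parametrisation of the five CP observables in terms of the amplitude ratio r, the strong phase δ, and the weak phase γ − 2β s = γ + φ s ; it should be read as an assumption rather than as the definition used in the analysis. The helper names and the small linear solver are hypothetical, and the minimisation over the remaining parameters needed for ∆χ 2 is left out.

```python
import math

def observables(gamma, phi_s, r, delta):
    """Assumed form of (C_f, A_f, A_fbar, S_f, S_fbar); angles in radians."""
    weak = gamma + phi_s            # gamma - 2*beta_s, using phi_s = -2*beta_s
    norm = 1.0 + r * r
    return [
        (1.0 - r * r) / norm,
        -2.0 * r * math.cos(delta - weak) / norm,
        -2.0 * r * math.cos(delta + weak) / norm,
        2.0 * r * math.sin(delta - weak) / norm,
        -2.0 * r * math.sin(delta + weak) / norm,
    ]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        s = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - s) / aug[row][row]
    return x

def chi2(alpha, observed, covariance):
    """chi2 = (A(alpha) - A_obs)^T V^-1 (A(alpha) - A_obs) = -2 ln L."""
    residual = [a - o for a, o in zip(observables(*alpha), observed)]
    weighted = solve(covariance, residual)   # V^-1 * residual
    return sum(d * w for d, w in zip(residual, weighted))
```

For a tested value of γ, ∆χ 2 would then be the minimum of `chi2` over the other parameters with γ held fixed, minus the global minimum; any general-purpose minimiser could be used for that step.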

Conclusion
The CP violation sensitive parameters which describe the B 0 s → D ∓ s K ± decay rates have been measured using a dataset of 1.0 fb −1 of pp collision data. Their values are found to be