First observation and measurement of the branching fraction for the decay $B^0_s \to D_s^{*\mp} K^{\pm}$

The first observation of the $B^0_s \to D_s^{*\mp} K^{\pm}$ decay is reported using 3.0 fb$^{-1}$ of proton-proton collision data collected by the LHCb experiment. The $D_s^{*\mp}$ mesons are reconstructed through the decay chain $D_s^{*\mp} \to \gamma D_s^{\mp}(K^{\mp}K^{\pm}\pi^{\mp})$. The branching fraction relative to that for $B^0_s \to D_s^{*-} \pi^{+}$ decays is measured to be $0.068 \pm 0.005 ^{+0.003}_{-0.002}$, where the first uncertainty is statistical and the second is systematic. Using a recent measurement of $\mathcal{B}(B^0_s \to D_s^{*-} \pi^{+})$, the absolute branching fraction of $B^0_s \to D_s^{*\mp} K^{\pm}$ is measured as $(16.3 \pm 1.2\,\mathrm{(stat)}\,^{+0.7}_{-0.5}\,\mathrm{(syst)} \pm 4.8\,\mathrm{(norm)}) \times 10^{-5}$, where the third uncertainty is due to the uncertainty on the branching fraction of the normalisation channel.

Introduction
The weak phase $\gamma$ is one of the least well-determined CKM parameters. It can be measured using time-independent decay rates, such as those of $B^+ \to D^0 K^+$, or by time-dependent studies of $B^0_s \to D_s^{(*)\mp} K^{\pm}$ decays. The $B^0_s \to D_s^{\mp} K^{\pm}$ decay mode has already been used by LHCb to determine $\gamma$ with a statistical precision of about $30^\circ$ [2], in an analysis based on data corresponding to an integrated luminosity of 1 fb$^{-1}$. An attractive feature of $B^0_s \to D_s^{*\mp} K^{\pm}$ decays is that the theoretical formalism that relates the measured CP asymmetries to $\gamma$ is the same as for $B^0_s \to D_s^{\mp} K^{\pm}$ decays, provided that the angular momentum of the final state is taken into account in the time evolution of the $B^0_s$-$\bar{B}^0_s$ decay asymmetries. The observables of the decay $B^0_s \to D_s^{(*)\mp} K^{\pm}$ can be related to those of $B^0 \to D^{(*)-} \pi^{+}$ through the U-spin symmetry of strong interactions, as described in Ref. [1]. This opens the possibility of a combined extraction of $\gamma$. In addition, the sensitivity to $\gamma$ is higher in $B^0_s \to D_s^{(*)\mp} K^{\pm}$ decays than in $B^0 \to D^{(*)-} \pi^{+}$ decays, owing to the larger interference between the $b \to u$ and $b \to c$ amplitudes in the former.
The ratio $R \equiv \mathcal{B}(B^0_s \to D_s^{\mp} K^{\pm})/\mathcal{B}(B^0_s \to D_s^{-} \pi^{+})$ has recently been measured by LHCb [3] to be $R = 0.0762 \pm 0.0015 \pm 0.0020$, where the first uncertainty is statistical and the second systematic. This is compatible with the predicted value of $R = 0.086^{+0.009}_{-0.007}$ from Ref. [1], which is based on SU(3) flavour symmetry and measurements from the B factories. Under the same theoretical assumptions, the ratio $R^* \equiv \mathcal{B}(B^0_s \to D_s^{*\mp} K^{\pm})/\mathcal{B}(B^0_s \to D_s^{*-} \pi^{+})$ is predicted to be $R^* = 0.099^{+0.030}_{-0.036}$ [1], and it is therefore interesting to test this prediction for vector decays.
The $B^0_s \to D_s^{*-} \pi^{+}$ and $B^0_s \to D_s^{*\mp} K^{\pm}$ decays are experimentally challenging for detectors operating at hadron colliders because they require the reconstruction of a soft photon from the $D_s^{*-} \to D_s^{-} \gamma$ decay. This paper describes the reconstruction of the $B^0_s \to D_s^{*-} \pi^{+}$ decay, previously observed by Belle [4], as well as the first observation of the $B^0_s \to D_s^{*\mp} K^{\pm}$ decay and the measurement of $R^*$. This is the first step towards a measurement of the time-dependent CP asymmetry in these decays.
The $pp$ collision data used in this analysis correspond to an integrated luminosity of 3.0 fb$^{-1}$, of which 1.0 fb$^{-1}$ were collected by LHCb in 2011 at a centre-of-mass energy of $\sqrt{s} = 7$ TeV, and the remaining 2.0 fb$^{-1}$ in 2012 at $\sqrt{s} = 8$ TeV.
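The ratio of branching fractions for the decays $B^0_s \to D_s^{*\mp} K^{\pm}$ and $B^0_s \to D_s^{*-} \pi^{+}$ is evaluated as
$$R^* \equiv \frac{\mathcal{B}(B^0_s \to D_s^{*\mp} K^{\pm})}{\mathcal{B}(B^0_s \to D_s^{*-} \pi^{+})} = \frac{N_K}{N_\pi}\,\frac{\varepsilon_\pi}{\varepsilon_K},$$
where $\varepsilon_X$ and $N_X$ are the overall reconstruction efficiency and the observed yield, respectively, of the decay mode, and $X$ represents either a kaon or a pion (the "bachelor" hadron) that accompanies the $D_s^{*-}$ in the final state.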

LHCb detector
The LHCb detector [5,6] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of momentum, $p$, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance of a track to a primary vertex, the impact parameter, is measured with a resolution of $(15 + 29/p_T)$ µm, where $p_T$ is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.
The online event selection is performed by a trigger which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. At the hardware trigger stage, events are required to have a muon with high $p_T$ or a hadron, photon or electron with high transverse energy in the calorimeters. For hadrons, the transverse energy threshold is 3.5 GeV. The software trigger requires a two-, three- or four-track secondary vertex with a significant displacement from the primary $pp$ interaction vertices (PVs). At least one charged particle must have a transverse momentum $p_T > 1.7$ GeV/c and be inconsistent with originating from a PV. A multivariate algorithm [7] is used for the identification of secondary vertices consistent with the decay of a $b$ hadron. The $p_T$ of the photon from the $D_s^{*-}$ decay is too low to contribute to the trigger decision.
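As a small worked illustration of the impact-parameter resolution quoted above, the sketch below evaluates $(15 + 29/p_T)$ µm at a few transverse momenta. The function name and the chosen $p_T$ values are illustrative assumptions, not part of the analysis software.

```python
def ip_resolution_um(pt_gev: float) -> float:
    """Impact-parameter resolution in micrometres for a track with
    transverse momentum pt_gev (in GeV/c), using (15 + 29/pT) um."""
    if pt_gev <= 0.0:
        raise ValueError("pT must be positive")
    return 15.0 + 29.0 / pt_gev


# Illustrative pT values only; 1.7 GeV/c is the software-trigger threshold
# quoted in the text, the others are arbitrary.
for pt in (0.5, 1.0, 1.7, 5.0):
    print(f"pT = {pt:4.1f} GeV/c -> sigma_IP = {ip_resolution_um(pt):5.1f} um")
```

The resolution degrades quickly at low $p_T$, which is one reason why soft tracks are harder to separate from the primary vertex.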
In the simulation, $pp$ collisions are generated using Pythia [8] with a specific LHCb configuration [9]. Decays of hadronic particles are described by EvtGen [10], in which final-state radiation is generated using Photos [11]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [12] as described in Ref. [13].

Event selection
Candidate $B^0_s$ mesons are reconstructed by combining a $D_s^{*-}$ candidate with an additional pion or kaon of opposite charge. The $D_s^{*-}$ and $D_s^{-}$ candidates are reconstructed in the $D_s^{-} \gamma$ and $K^{-} K^{+} \pi^{-}$ decay modes, respectively. Each of the three $D_s^{-}$ daughter tracks is required to have a good track quality, momentum $p > 1000$ MeV/c, transverse momentum $p_T > 100$ MeV/c and a large impact parameter with respect to any PV. More stringent requirements are imposed on bachelor tracks, namely $p > 5000$ MeV/c and $p_T > 500$ MeV/c. A good-quality secondary vertex is required for the resulting $D_s^{-}$-bachelor combination. Photons are identified using energy deposits in the electromagnetic calorimeter that are not associated with any track in the tracking system. A cut on a photon confidence level variable is used to suppress background from hadrons, electrons and "merged" $\pi^0$ decays [6]. This confidence level variable takes into account the expected absence of matching between the calorimeter cluster and any track, the energy recorded in the preshower detector and the topology of the energy deposit in the electromagnetic and hadronic calorimeters. Due to the small difference between the masses of the $D_s^{*-}$ and $D_s^{-}$ mesons, called $\Delta M$ in the following, the photons from the $D_s^{*-}$ decay have an average transverse energy of a few hundred MeV.
Additional preselection requirements are applied to cope with a large background mainly due to genuine photons that are not $D_s^{*-}$ decay products, or hadrons that are misidentified as photons. The reconstructed mass of the $D_s^{-}$ candidate and the reconstructed $\Delta M$ value are required to be in a $\pm 20$ MeV/c$^2$ window around their known values [14]. The $B^0_{(s)} \to D_s^{-} K^{+} (\pi^{+})$ decays are vetoed by a cut on the invariant mass of the $D_s^{-} K^{+} (\pi^{+})$ system. Particle identification (PID) [15] requirements are applied to all final-state hadrons. Finally, the distance in the $\eta$-$\phi$ plane between the $D_s^{-}$ and the photon is required to satisfy $\sqrt{\Delta\eta^2 + \Delta\phi^2} < 1$, where $\Delta\eta$ ($\Delta\phi$) is the pseudorapidity (azimuthal angle) distance between the corresponding candidates.
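The $\eta$-$\phi$ distance requirement can be sketched as follows. This is a minimal illustration, not the analysis code: the function names are hypothetical, and wrapping $\Delta\phi$ into $[-\pi, \pi)$ is an assumption (the usual convention, not stated in the text).

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into [-pi, pi) (assumed convention)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Distance in the eta-phi plane: sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def passes_cone_cut(eta_ds, phi_ds, eta_gamma, phi_gamma, max_dr=1.0):
    """True if the photon lies within max_dr of the D_s candidate."""
    return delta_r(eta_ds, phi_ds, eta_gamma, phi_gamma) < max_dr


# Hypothetical candidates, for illustration only.
print(passes_cone_cut(3.0, 0.2, 3.3, -0.2))  # d_eta = 0.3, d_phi = 0.4
print(passes_cone_cut(3.0, 0.0, 4.5, 0.0))   # d_eta = 1.5
```

Because the threshold is 1, cutting on the distance or on its square selects the same candidates.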
To further reduce the combinatorial background while preserving a high signal efficiency, a multivariate approach is used. This follows closely the selection based on a boosted decision tree (BDT) [16,17] used in the measurement of the ratio $R$ of the $B^0_s \to D_s^{\mp} K^{\pm}$ and $B^0_s \to D_s^{-} \pi^{+}$ branching fractions [3]. The algorithm is trained with simulated $B^0_s \to D_s^{*-} \pi^{+}$ events as signal, and candidates in data with an invariant mass greater than 5500 MeV/c$^2$ as background. The five variables with the highest discriminating power are found to be the $B^0_s$ transverse flight distance, the photon transverse momentum, the $\chi^2_{\mathrm{IP}}$ of the $B^0_s$ candidate (where $\chi^2_{\mathrm{IP}}$ is defined as the difference in $\chi^2$ of the associated PV, reconstructed with and without the considered particle), the angle between the $B^0_s$ momentum vector and the vector connecting its production and decay vertices, and the transverse momentum of the bachelor particle. Eight additional variables, among them the transverse momenta of the remaining final-state particles, are also used. The trained algorithm is then applied to both decays. The preselection and selection for the two decays analysed for the measurement of $R^*$ differ only in the PID requirements imposed on the bachelor tracks.
The $M(K^{-} K^{+} \pi^{-})$ and $\Delta M$ invariant mass distributions, as obtained from the decay mode $B^0_s \to D_s^{*-} \pi^{+}$, are shown in Fig. 2. These distributions have been obtained with all of the analysis requirements applied except that on the plotted variable. In both cases the $B^0_s$ invariant mass is restricted to a $\pm 70$ MeV/c$^2$ region around the known mass. A prominent peaking structure is observed in the $\Delta M$ distribution around 145 MeV/c$^2$, due to the radiative $D_s^{*-} \to D_s^{-} \gamma$ decay.

Signal yields
The signal yields are obtained using unbinned maximum likelihood fits to the candidate invariant mass distributions, performed separately for $B^0_s \to D_s^{*-} \pi^{+}$ and $B^0_s \to D_s^{*\mp} K^{\pm}$ decays. In data and simulation the signal shapes are parametrised by a double-sided Crystal Ball (CB) function [18], which consists of a central Gaussian part, whose mean and width are free parameters, and power-law tails on both lower and upper sides, to account for energy loss due to final-state radiation and detector resolution effects. When fitting the $D_s^{*-} \pi^{+}$ and $D_s^{*\mp} K^{\pm}$ mass distributions, both widths of the CB are set to those obtained from the corresponding signal simulation, scaled by a variable parameter in the fit to allow for differences in the mass resolution between data and simulation.
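For readers unfamiliar with the shape, a common form of the double-sided Crystal Ball function can be sketched as below. This is an unnormalised, single-width variant written under stated assumptions: the exact parametrisation used in the analysis (which refers to two widths) is not given in the text, and all parameter values in the example are illustrative.

```python
import math


def double_sided_cb(x, mean, sigma, alpha_l, n_l, alpha_r, n_r):
    """Unnormalised double-sided Crystal Ball shape with peak value 1.

    Gaussian core for -alpha_l <= t <= alpha_r, with t = (x - mean) / sigma,
    joined continuously to power-law tails of order n_l (low side) and
    n_r (high side). alpha_l and alpha_r must be positive.
    """
    t = (x - mean) / sigma
    if t < -alpha_l:
        a = (n_l / alpha_l) ** n_l * math.exp(-0.5 * alpha_l ** 2)
        b = n_l / alpha_l - alpha_l
        return a * (b - t) ** (-n_l)
    if t > alpha_r:
        a = (n_r / alpha_r) ** n_r * math.exp(-0.5 * alpha_r ** 2)
        b = n_r / alpha_r - alpha_r
        return a * (b + t) ** (-n_r)
    return math.exp(-0.5 * t * t)


# Illustrative parameters only (mass in MeV/c^2), not fit results.
params = dict(mean=5367.0, sigma=15.0, alpha_l=1.5, n_l=2.0, alpha_r=2.0, n_r=3.0)
for mass in (5300.0, 5350.0, 5367.0, 5400.0, 5450.0):
    print(mass, round(double_sided_cb(mass, **params), 4))
```

The tails fall off more slowly than a Gaussian, which is what allows the function to absorb radiative and resolution effects on either side of the peak.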
Three background categories are identified. Partially reconstructed background decays are due to $B^0_s$ decay modes that are similar to the signal but contain at least one additional photon, as for example in the case of $B^0_s \to D_s^{*\mp} \rho^{\pm}$ decays with $\rho^{\pm} \to \pi^0 (\to \gamma\gamma) \pi^{\pm}$. Fully reconstructed background events are due to $B^0$ decays to the same final states as the $B^0_s$ signal, $D_s^{*-} \pi^{+}$ and $D_s^{*\mp} K^{\pm}$. In addition, the $B^0_s \to D_s^{*-} \pi^{+}$ decay mode gives rise to a peak in the $D_s^{*\mp} K^{\pm}$ sample when the $\pi^{+}$ is misidentified as a $K^{+}$. The final category is the combinatorial background, where a genuine $D_s^{-}$ meson is combined with a random (or fake) photon and a random bachelor track.
The number of partially and fully reconstructed background components is different for each of the two final states. The invariant mass shapes for these backgrounds are obtained from simulation and are represented in the fit as non-parametric probability density functions (PDFs). The yields of these background components are free parameters in the fit, with the exception of the $D_s^{*-} \pi^{+}$, $D_s^{-} \rho^{+}$ and $D_s^{*-} \rho^{+}$ contributions in the $D_s^{*\mp} K^{\pm}$ fit. The size of the $D_s^{*-} \pi^{+}$ cross feed is calculated from the $D_s^{*-} \pi^{+}$ yield and the $\pi$ to $K$ misidentification probability. The $D_s^{-} \rho^{+}$ and $D_s^{*-} \rho^{+}$ contributions are determined in a similar manner, summed and fixed in the fit.
To model the combinatorial background a non-parametric PDF is used. This is obtained from the events of the $\Delta M$ sideband in the interval [185, 205] MeV/c$^2$, with all other cuts unchanged.
The results of the fitting procedure applied to the two considered decay modes are shown in Fig. 3. The fitted yields are $16\,513 \pm 227$ and $1025 \pm 71$ for the $B^0_s \to D_s^{*-} \pi^{+}$ and $B^0_s \to D_s^{*\mp} K^{\pm}$ cases, respectively. A $\chi^2$ test is used to gauge the quality of the fits: the latter fit has a $\chi^2$ value of 88.5 for 100 bins and 7 free parameters, and the quality of the former fit is equally good.
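The quoted fit quality can be put in more familiar terms with a short calculation. The sketch assumes the usual convention that the number of degrees of freedom is the number of bins minus the number of free parameters, which the text does not state explicitly.

```python
import math

# Numbers quoted in the text for the D_s* K fit.
chi2 = 88.5
n_bins = 100
n_free_parameters = 7

# Assumed convention: ndf = bins - free parameters.
ndf = n_bins - n_free_parameters
reduced_chi2 = chi2 / ndf

# A chi-square distribution has mean ndf and variance 2 * ndf, so this
# gives a rough measure of how far the observed value is from expectation.
z = (chi2 - ndf) / math.sqrt(2.0 * ndf)

print(f"ndf = {ndf}, chi2/ndf = {reduced_chi2:.2f}, (chi2 - ndf)/sqrt(2 ndf) = {z:.2f}")
```

Under this assumption the reduced $\chi^2$ is close to unity and the observed value lies well within one standard deviation of its expectation.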
One of the distinctive features of the present analysis is the reconstruction of the decay mode $D_s^{*-} \to D_s^{-} \gamma$ at a hadron collider. The background-subtracted distributions of the $\eta$ and $p_T$ of the photons have been obtained using the invariant mass fit results described above and the sPlot [19] method. These measured distributions are compared to the predictions of the simulation in Fig. 4. Most of the measured photons are very soft, with the average $p_T$ well below 1 GeV/c.

Systematic uncertainties
Potential systematic uncertainties on $R^*$ are those due to the background modelling and the analysis selections, including the BDT and the PID cuts. Their effects are shown in Table 1 as relative variations of the final result, with their sum in quadrature assigned as the overall systematic uncertainty. The order in which the systematic uncertainties are described in the following text corresponds to successive rows in Table 1.
Combinatorial background modelling uncertainties are studied by varying the default $\Delta M$ range used for the combinatorial background determination, [185, 205] MeV/c$^2$, to [205, 225] and [225, 245] MeV/c$^2$. An alternative modelling of this background, using a parametric shape obtained from the $D_s^{-}$ mass sidebands, is also tested. Finally, the statistical uncertainty due to the number of events in the range [185, 205] MeV/c$^2$ is evaluated using the bootstrap technique [20,21]. The corresponding uncertainty is taken to be the largest spread among the four different checks.
The uncertainty due to the finite size of the simulated samples used to study the partially reconstructed backgrounds is studied using the bootstrap technique.
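The bootstrap technique used in the two studies above can be sketched generically as follows: the sample is resampled with replacement many times, the quantity of interest is recomputed on each resample, and the spread of the results is taken as the uncertainty. In the analysis the recomputed quantity would be the fit result obtained with a resampled template; here a simple sample mean stands in for it, and the toy data, function name and number of resamples are all illustrative assumptions.

```python
import random
import statistics


def bootstrap_spread(sample, estimator, n_resamples=500, seed=1):
    """Standard deviation of `estimator` over bootstrap resamples of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw n entries with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)


# Toy stand-in for a finite sample: 400 Gaussian values (arbitrary units).
toy_rng = random.Random(42)
toy_sample = [toy_rng.gauss(195.0, 5.0) for _ in range(400)]

spread = bootstrap_spread(toy_sample, statistics.mean)
print(f"bootstrap uncertainty on the mean: {spread:.3f}")
```

For this toy the result should be close to the analytic expectation $5/\sqrt{400} = 0.25$, which provides a simple cross-check of the procedure.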
The uncertainties due to the $D_s^{*-} \pi^{+}$ cross feed and the $D_s^{-} \rho^{+}$ and $D_s^{*-} \rho^{+}$ contributions to the $D_s^{*\mp} K^{\pm}$ fit are estimated by varying their expected yields. For the $D_s^{*-} \pi^{+}$ cross feed the $\pm 1\sigma$ variation is obtained using the $D_s^{*-} \pi^{+}$ fit results. In the $D_s^{-} \rho^{+}$ and $D_s^{*-} \rho^{+}$ cases the branching ratio uncertainties and photon kinematic distributions are different from the $D_s^{*-} \pi^{+}$ ones, so the uncertainties on the yields are large. These yields are conservatively varied by $\pm 50\%$. The observed differences in the final result are assigned as the systematic uncertainties associated with these sources.
The systematic uncertainty associated with the BDT is studied by reweighting the simulation to improve the agreement with data [3].
The $\pi$ and $K$ PID efficiencies used for the bachelor track have been extracted from a $D^{*+} \to D^0 \pi^{+}$ calibration sample and parametrised as a function of several kinematic quantities of these tracks. The uncertainties in this procedure, propagated to the final result, lead to the PID systematic uncertainty.
The systematic uncertainty from the hardware trigger efficiency arises from differences in the pion and kaon trigger efficiencies which are not reproduced in the simulation. The uncertainty is scaled with the fraction of events where a signal track was responsible for triggering.

Results
The ratio of branching fractions, measured in this analysis for the first time, is $R^* = 0.068 \pm 0.005$ (stat) $^{+0.003}_{-0.002}$ (syst), where the overall systematic uncertainty is mainly due to the uncertainty on the combinatorial background estimate. The result for $R^*$ differs from the uncorrected ratio of $B^0_s \to D_s^{*\mp} K^{\pm}$ to $B^0_s \to D_s^{*-} \pi^{+}$ yields by a factor that depends on the simulation and the PID efficiencies. This factor is determined to be $1.095 \pm 0.016$ and is dominated by the $K$ to $\pi$ PID efficiency ratio.
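The central value and statistical uncertainty can be cross-checked from the fitted yields and the correction factor quoted in the text. The sketch below is a back-of-the-envelope check, not the procedure used in the analysis: it assumes the two yield uncertainties are uncorrelated and leaves the uncertainty on the correction factor out of the statistical term.

```python
import math

# Fitted yields quoted in the text.
n_k, err_n_k = 1025.0, 71.0        # B0s -> Ds* K
n_pi, err_n_pi = 16513.0, 227.0    # B0s -> Ds* pi

# Efficiency correction factor quoted in the text.
correction = 1.095

raw_ratio = n_k / n_pi
r_star = raw_ratio * correction

# Assumption: uncorrelated yield uncertainties, combined in quadrature.
relative_stat = math.hypot(err_n_k / n_k, err_n_pi / n_pi)
stat = r_star * relative_stat

print(f"raw yield ratio = {raw_ratio:.4f}")
print(f"R* = {r_star:.3f} +/- {stat:.3f} (stat)")
```

The statistical uncertainty is dominated by the $D_s^{*\mp} K^{\pm}$ yield, whose relative uncertainty is about five times that of the normalisation mode.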
The measured value of $R^*$ is consistent with the theoretical prediction of $R^* = 0.099^{+0.030}_{-0.036}$ [1], within the very large uncertainty of the latter. The theory is found to provide a good description of the measurements for both $R^*$ and $R$