Updated search for $B_c^+$ decays to two charm mesons

A data set corresponding to an integrated luminosity of $9 \text{fb}^{-1}$ of proton-proton collisions collected by the LHCb experiment has been analysed to search for $B_c^+ \to D^{(*)+}_{(s)} \overset{\scriptstyle (-)}{D}{}^{(*)0}$ decays. The decays are fully or partially reconstructed, where one or two missing neutral pions or photons from the decay of an excited charm meson are allowed. Upper limits for the branching fractions, normalised to $B^+$ decays to final states with similar topologies, are obtained for fourteen $B_c^+$ decay modes. For the decay $B_c^+ \to D_s^+ {\overline{D}}^0$, an excess with a significance of 3.4 standard deviations is found.


Introduction
Heavy-flavour states with b quarks are characterised by a relatively long lifetime and a large number of decay channels, and allow for highly sensitive studies of charge and parity (CP ) symmetry violation and quantum-loop induced amplitudes.In the B + c meson, a b quark is accompanied by a charm quark, c, forming a system where decays of both the beauty and the charm quark, as well as weak annihilation processes, contribute to the decay amplitude [1].
Transition amplitudes between up-type quarks and down-type quarks are described by the Cabibbo-Kobayashi-Maskawa (CKM) quark-mixing matrix [2,3].Figure 1 illustrates the CKM-favoured, but colour-suppressed B + c → D + s D 0 decay (unless specified otherwise, charge conjugation is implied throughout this article) and the CKM-suppressed, but colour-favoured B + c → D + s D 0 decay, which are expected to have similar amplitudes.This may result in a large, O(1), CP asymmetry for final states that are common between D 0 and D 0 decays.Consequently decays of B + c mesons to two charm mesons, D 0 , have been proposed to measure the angle γ ≡ arg(−V ud V * ub /V cd V * cb ) [4][5][6][7], one of the key parameters of the CKM matrix.Presently, the most precise determinations of γ come from measurements of the CP asymmetry in B + → ( ) [8,9].Predicted branching fractions of B + c decays to two charm mesons [10][11][12][13][14] are listed in Table 1.Final-state interactions may result in an enhancement of B + c → D + ( ) D 0 decay rates [15].Moreover, contributions from physics beyond the Standard Model could potentially affect fully hadronic B decays [16][17][18].
This article describes a search for sixteen D ( * )0 decay channels, using proton-proton (pp) collision data collected by the LHCb experiment, corresponding to an integrated luminosity of 9 fb −1 , of which 1 fb −1 was recorded at a centre-of-mass energy √ s = 7 TeV, 2 fb −1 at √ s = 8 TeV and 6 fb −1 at √ s = 13 TeV.The data taken at 7 and 8 TeV are referred to as Run 1, and the data taken at 13 TeV as Run 2. The Run 1 data has previously been analysed and no evidence of B + c → D ( * )+ (s) , (1) where f c /f u is the ratio of the B + c to B + fragmentation fraction, N denotes the measured B + (c) yields, and ε represents the detection efficiencies.The value of f c /(f u + f d ) • B(B + c → J/ψµ + ν µ ) has been measured at centre-of-mass energies of 7 and 13 TeV [20].Under the assumption of equal production from hadronisation of B + and B 0 , f u = f d , the value of f c /f u is found to be 0.73% at √ s = 7 TeV and 0.76% at √ s = 13 TeV with relative uncertainties of approximately 25%, dominated by the uncertainty on the predicted value of B(B + c → J/ψµ + ν µ ), for which no measurements are available.Earlier measurements of f c /f u at 7 and 8 TeV using fully reconstructed B + c decays found compatible values [21,22].
The invariant-mass distributions of partially reconstructed D * 0 decays overlap.Their branching fractions are measured separately by treating the contribution as arising entirely from each decay: where X 0 represents a neutral pion or a photon.Decays of B + c → D * + ( ) D * 0 with a fully reconstructed D * + → D 0 π + decay, and one missing neutral pion or photon from the ( ) D * 0 meson decay, results in measurements of The D * 0 and B + c → D * + ( ) D * 0 decays can also be observed when both excited charm mesons decay with either a photon or a neutral pion and neither of the two neutral particles are reconstructed.In such cases, the ratio R is measured: In total twenty ratios are measured, corresponding to sixteen B + c branching fractions, since B + c decays with a D * + in the final state are searched for both in fully reconstructed D * + → D 0 π + and in partially reconstructed D * + → D + X 0 decays.

Detector and simulation
The LHCb detector [23,24] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks.The detector includes a high-precision tracking system consisting of a siliconstrip vertex detector surrounding the pp interaction region [25], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [26,27] placed downstream of the magnet.The tracking system provides a measurement of the momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c.The minimum distance of a track to a primary pp collision vertex (PV), the impact parameter, is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c.Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [28].Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter.Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [29].The online event selection is performed by a trigger [30], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.
At the hardware trigger stage, events are required to have a muon with high p T or a hadron, photon or electron with high transverse energy in the calorimeters.For hadrons, the transverse energy threshold is 3.5 GeV.The software trigger requires a two-, threeor four-track secondary vertex with a significant displacement from any PV.At least one track should have p T > 1.7 GeV/c and χ 2 IP with respect to any PV greater than 16, where χ 2 IP is defined as the difference in the vertex-fit χ 2 of a given PV reconstructed with and without the considered particle.A multivariate algorithm [31,32] is used for the identification of secondary vertices consistent with the decay of a b hadron.
Simulation is used to model the effects of the detector acceptance and the imposed selection requirements, as well as for the training of the multivariate selection of the B + c signals, and for establishing the shape of the mass distributions of the signals.The Pythia [33] package, with a specific LHCb configuration [34], is used to simulate pp collisions with B + production.For B + c production, the Bcvegpy [35] generator is used, interfaced with the Pythia parton shower and hadronisation model.Decays of unstable particles are described by EvtGen [36], in which final-state radiation is generated using Photos [37].The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [38] as described in Ref. [39].The simulated B + production is corrected to match the observed spectrum of B + → D + s D 0 decays in data, using a gradient boosted reweighter (GBR) [40] technique.The weights w(p T , y) are determined separately for Run 1 and Run 2. Simulated B + c events are corrected to match the measured linear dependence of f c /(f u + f d ) on p T and y [20].In addition, corrections using control samples are applied to the simulated events to improve the agreement with data regarding particle identification (PID) variables, the momentum scale and the momentum resolution.

Candidate selection
Charm-meson candidates are formed by combining two, three or four tracks that are incompatible with originating from any reconstructed PV.The tracks are required to form a high-quality vertex and the scalar sum of their p T must exceed 1.8 GeV/c.To reduce background from misidentified particles, the pion and kaon candidates must also satisfy loose criteria on DLL Kπ , the ratio of the likelihood between the kaon and pion PID hypotheses.
The reconstructed mass of D 0 , D + s and D + candidates is required to be within ±25 MeV/c 2 of their known values [41].For channels with a fully reconstructed D * + → D 0 π + meson, the mass difference ∆m between the D * + and the D 0 candidates is required to be within ±10 MeV/c 2 of the known value [41].If more than one charm-meson candidate is formed from the same track combination, only the best according to PID information is selected.
candidate with a ( ) D 0 candidate if the combination has a p T greater than 4.0 GeV/c, forms a good-quality vertex and originates from a PV.The reconstructed decay time of the charm meson candidates with respect to the B + (c) vertex divided by its uncertainty, t/σ t , is required to exceed −3 for D + s and D 0 mesons.This requirement is increased to +3 for the longer-lived D + meson to eliminate background from B + → D 0 π + π − π + decays where the negatively charged pion is misidentified as a kaon.Candidate B + c decays that are compatible with the combination of a fully reconstructed B 0 (s) → D ( * )− (s) π + (π − π + ) decay and a charged track are rejected.To eliminate duplicate tracks, the opening angles between any pair of final-state particles are required to be at least 0.5 mrad.The invariant-mass resolution of B + (c) decays is significantly improved by applying a kinematic fit [42] where the invariant masses of the D 0 and the D ( * )+ (s) candidates are constrained to their known values [41], all particles from the D ( * )+ (s) , D 0 , and B + (c) decay are constrained to originate from their corresponding decay vertex and the B + (c) candidate is constrained to originate from the PV with which it has the smallest χ 2 IP .To reduce the combinatorial background, while maintaining high efficiency for signal, a multivariate selection based on a boosted decision tree (BDT) [43,44]   The data are divided in increasing order of signal purity into three samples having low, medium and high BDT output.Most of the sensitivity in this search comes from the data in the high BDT sample, but including data with lower signal purity increases the signal efficiency and constrains the shape of the combinatorial background.A small fraction of the events (≈ 1%) have more than one B + (c) candidate that satisfies the minimum BDT requirement.In such cases, one randomly selected candidate is retained per event.
Figure 2 shows the invariant mass distributions of selected B + (c) candidates in the highest BDT sample, summed over all D 0 final states.

Model of the B + (c) mass distributions
To measure the signal yields, a model of the B + (c) candidate mass distribution is fitted to the data in the range 5230 ≤ m(D The model consists of the following components, constrained to positive yields: the signals for fully reconstructed B + and B + c decays; the signal for B + c decays with one missing π 0 or photon; the signal for B + c decays with two missing π 0 or photons; the background from B + → D 0 K + K − π + decays; and the combinatorial background. Fully reconstructed B + and B + c signals are described by the sum of a Gaussian function and a Crystal Ball (CB) [46] function, extended to have power-law tails on both the low-mass and the high-mass sides.The CB and Gaussian components share a common peak position.The tail parameters of the CB and the ratio of the CB and Gaussian widths and integrals are determined from simulation, accounting for a dependence of both widths on the BDT output.The ratio of the B + c and B + widths is determined from simulation, while the overall width of the B + is a free parameter in the fit to data, and is found to be consistent with the simulation.The peak position of the B + signal is a free parameter in  decays are implemented as templates, obtained from kernel fits [47] to reconstructed mass distributions in simulation.Both longitudinal and transverse polarisations of D * 0 decays contribute according to free polarisation fractions with different templates.
The Cabibbo-favoured B + → D 0 K + K − π + decay is a background to the B + → D + s D 0 channel, though its yield is strongly reduced by the D + s mass requirement.This background is modelled by a single Gaussian function, with the width determined from a sample of simulated decays and the normalisation determined from the peak in the B + mass distribution in the D + s invariant mass sideband.The combinatorial background is described by the sum of an exponential function and a constant, where the parameters are allowed to differ between different D 0 decay modes, but are taken to be the same for all BDT samples of a given B + c and D 0 decay channel.Studies of the charm-meson invariant-mass sidebands support these assumptions.
An unbinned extended maximum-likelihood fit is used to simultaneously describe the mass distributions of candidates in different BDT samples and different D 0 decay modes.In these fits the background and B + yields vary independently, but the branching fraction ratios R, R (+,0) and R , defined in Eqs.1-5, are constrained to be identical between the BDT samples and D 0 decay modes.

Systematic uncertainties
Systematic uncertainties that can be expressed as a relative uncertainty on the branching fraction ratio are evaluated separately for Run 1 and Run 2, and for each B + c decay, D 0 channel and BDT sample.Their effective contributions in the fit, calculated as a weighted average over BDT samples and D 0 decay modes, are listed in Table 2.Where no uncertainty is given, this corresponds to either the absence of decays with two missing neutral particles in the D * + ( ) D 0 channel or the absence of the effect associated with an uncertainty in a given data-taking period or channel.
The uncertainty on the B + c signal shape is evaluated by changing the B + c signal shape to the sum of two Gaussian functions, and evaluating the median fractional change of the measured yield in pseudoexperiments performed with a background-only model.Uncertainties related to the B + c production spectrum are evaluated by changing the slope parameters from Ref. [20] by their quoted uncertainties.The uncertainty on the p Tand y-dependent weights used to correct the B + production spectrum in simulation is estimated by changing the settings of the GBR algorithm.Hit resolution parameterisation in the silicon vertex detector affects the χ 2 IP distribution.The uncertainty associated with the parameterisation is therefore evaluated with simulation by varying the minimal value of the χ 2 IP applied to the final-state tracks.The limited size of the simulation samples results in uncertainties that are uncorrelated between the BDT samples and D 0 decay channels on the efficiency ratios ε(B + c )/ε(B + ).All other systematic uncertainties are treated as fully correlated.A small uncertainty on the reconstruction efficiency results from that on the B + c lifetime [41].Uncertainties on the PID efficiencies cancel to first order in the ratio ε(B + c )/ε(B + ) because of the identical particle content of the final state, and the difference in relative efficiencies with and without PID corrections is used to estimate the uncertainty from the PID correction procedure.The requirement to select at most one B + (c) candidate per event introduces  an efficiency that may not be well reproduced by simulation.Therefore, the fraction of candidates removed by the requirement of at most one B + (c) candidate per event is attributed as a systematic uncertainty.Residual differences appear in the comparison of the distributions of the BDT output between background-subtracted B + signal from data and simulation.The effect on the relative efficiency is evaluated by correcting the simulation to match the distributions in data.The background from B + → D 0 K + K − π + decays to the B + → D + s D 0 signal is assigned an uncertainty of 100% of its yield, resulting in a fractional uncertainty of less than 1%.The measurements of the branching fraction ratios according to Eqs. 2 and 5 involve the value of B(D * + → D + X 0 ), the uncertainty of which [41] is taken into account.
Other uncertainties, listed below, are instead taken into account by varying the fit model.Unless specified otherwise, these uncertainties are taken into account by replacing fixed values of the model parameters by their Gaussian constraints.
The uncertainty on the combinatorial background shape is evaluated by considering a single exponential function as an alternative to the exponential plus constant model, implemented using the discrete profiling method [48].The B + shape uncertainty has a negligible effect on the B + yield but, because of its long tails, results in an uncertainty on the background shape.The effect is evaluated by assigning an uncertainty on the tail parameters determined from a fit to simulated events.The uncertainty on the α parameters of the CB function is increased by adding in quadrature the largest observed difference between data and simulation of this parameter in B + → D + s D 0 decays.The fractional uncertainty on the yield of DCS crossfeed is the sum in quadrature of the fractional uncertainty on the normalisation yield and on the branching fractions B(D 0 → K + π − (π + π − )) [41].The uncertainty on the difference between the B + c and B + peak positions, 0.5 MeV/c 2 , arises due to uncertainty on the measured masses [41,49] and the momentum scale uncertainty [50].The uncertainty on the ratio of the B + c to B + invariant-mass resolution is determined from the statistical uncertainties of the fits to simulated decays.Statistical uncertainties in the templates of D 0 and D * 0 signals have contributions from both transverse and longitudinal polarisations, which have differently shaped distributions of the reconstructed mass.This is accounted for by evaluating upper limits both for fully longitudinal and fully transverse polarisations, and reporting the least stringent upper limit.Including all model uncertainties results in an increase of the upper limits on the branching fraction ratios, discussed in Sec.6, of 7% on average.

Results and conclusions
To determine the B + c branching fraction ratios R, R (+,0) and R , fits to data are performed separately for the six B + c final states and for Run 1 and Run 2, while different D 0 decay modes and BDT samples are fit simultaneously.The results of the fits are shown in Fig. 2, where the data of the highest BDT samples and the corresponding fit results are summed over the D 0 decay channels and over data-taking periods.Detailed views of the D + The significance of the B + c signals are calculated using Wilks' theorem [51] as S = √ 2∆ log L, where ∆ log L is the difference in the logarithm of the likelihood between the signal plus background and background-only hypotheses.Systematic uncertainties are included in the calculation of the significance through nuisance parameters in a minimised profile likelihood.
Evidence is found only for the decay B + c → D + s D 0 in Run 2 data, with a significance of 3.7 standard deviations, and the measured branching fraction ratio is R(D + s D 0 ) = (3.6 +1.5+0.3 −1.2−0.2 ) × 10 −4 , where the first uncertainty is statistical and the second is systematic.The quoted significance for this channel is compatible with estimates from simulated pseudoexperiments.
The values of R, R (+,0) and R from Run 1 and Run 2 cannot be directly combined since the value of f c /f u depends on the pp centre-of-mass energy.Therefore, a combined fit of both the Run 1 and Run 2 data sets is made to the absolute B + c branching fractions, using external input for B(B + → D + s D 0 ), B(B + → D + D 0 ), B(B + → D * + D 0 ) [41], and f c /f u [20], which depends on the theory prediction of B(B + c → J/ψµ + ν µ ).Corresponding uncertainties are included.In this combined fit the excess for B + c → D + s D 0 has a significance of 3.4 standard deviations, or 2.5 standard deviations when considering the probability of the excess to appear in any of the sixteen final states considered.The corresponding value of the branching fraction is , where the first uncertainty is statistical, the second systematic and the third due to external input.
Upper limits are reported on the ratio of branching fractions for all decays, calculated at 90% and 95% confidence level (C.L.) with the frequentist CL s method [52,53], separately for Run 1 and Run 2. These limits are listed in Table 3. Limits for Run 1 are tighter than in Ref. [19] in particular for R (+,0) and R , mainly because of better constraints on the shape of the combinatorial background.
Upper limits on the absolute B + c branching fractions are based on the Run 2 dataset alone, which has nearly four times the sensitivity of the Run 1 dataset.The upper limits at 90(95%) C.L. are The reported upper limits on B + c decays with a D * + meson in the final state are based on the analyses of fully reconstructed D * + → D 0 π + decays, which have a higher sensitivity D 0 decays, covering sixteen B + c decay channels, which include partially reconstructed decays where one or two neutral pions or photons from the decay of an excited charm meson are not reconstructed.The results, based on pp collision data corresponding to 9 fb −1 of integrated luminosity, supersede an earlier LHCb measurement [19] on Run 1 data only.No signal is observed in any of the channels investigated, consistent with the Standard Model expectation.An excess with a significance of 3.4 standard deviations is found for the decay B + c → D + s D 0 , which is in tension with the theoretical expectation [11][12][13][14].
D + ( ) D 0 and D * + ( ) D 0 final states, separately for the D 0 → K − π + and D 0 → K − π + π − π + decay channels, and separately for the Run 1 and Run 2 data samples.For a given D 0 final state, the same classifier is used for both B + c → D ( * )+ (s) D 0 and B + c → D ( * )+ (s) D 0 decays.For signal decays, the BDT classifier is trained using simulated B + c events, while for background, data in the range 5350 < m(D ( * )+ (s) ( ) D 0 ) < 6200 MeV/c 2 are used.For the background sample, the charm-meson mass windows are increased from ±25 MeV/c 2 to ±75 MeV/c 2 , to increase the size of the training sample.The k-fold cross-training technique [45] with k = 5 is used to avoid biases in the calculation of the BDT output.

Figure 2 :
Figure 2: Invariant-mass distributions for the selected B + (c) candidates in the highest BDT samples for (top left) D + s D 0 , (top right) D + s D 0 , (center left) D + D 0 , (center right) D + D 0 , (bottom left) D * + D 0 and (bottom right) D * + D 0 , final states.The overlaid curves correspond to the sum of the corresponding fit results.

Figure 3 :
Figure 3: Invariant-mass distributions for the selected B + candidates in the highest BDT samples, in the region near the B + mass, for (left) D + s D 0 and (right) D + s D 0 final states.The overlaid curves correspond to the sum of the corresponding fit results.

D
* 0 decays with one missing neutral pion or photon are accounted for by allowing a small contribution from the other template.The B + c → D * +

D 0
final states near the B + mass are shown in Fig.3which validate the model of the large B + → D + s D 0 signal and its crossfeed to the D + s D 0 final state.The integrals of the fit results in a ±40 MeV/c 2 window around the B + mass differ less then 0.2% from the candidate counts.
is employed.The BDT classifier exploits kinematic and PID properties of selected candidates, namely: the fit quality of the B + (c) candidate and both charm-meson candidate vertices; the value of χ 2 IP of the B + (c) candidate; the values of t/σ t of the B + (c) and both charm-meson candidates; the reconstructed masses of the charm-meson candidates; and the reconstructed masses of the pairs of opposite-charge tracks from the D + (s) candidate.In addition, for each charm-meson candidate, the smallest value of p T and the smallest value of χ 2 IP among the decay products, and the smallest (largest) value of DLL Kπ among all kaon (pion) candidates, are included as input variables for the BDT classifier.The BDT training is performed separately for the D +

Table 2 :
Effective contributions of the systematic uncertainties which are expressed as a relative uncertainty on the branching fraction ratio, combined over all BDT samples and D 0 decay modes, given in percent.