A precise measurement of the $B^0$ meson oscillation frequency

The oscillation frequency, $\Delta m_d$, of $B^0$ mesons is measured using semileptonic decays with a $D^-$ or $D^{*-}$ meson in the final state, in a data sample of $pp$ collisions collected by the LHCb detector corresponding to an integrated luminosity of $3.0\,\mbox{fb}^{-1}$. A combination of the two decay modes gives $\Delta m_d = (505.0 \pm 2.1 \pm 1.0)\,\mathrm{ns}^{-1}$, where the first uncertainty is statistical and the second is systematic. This is the most precise single measurement of this parameter. It is compatible with the current world average and has similar precision.


Introduction
Flavour oscillation, or mixing, of neutral meson systems gives mass eigenstates that are different from flavour eigenstates. In the $B^0$-$\overline{B}{}^0$ system, the mass difference between mass eigenstates, $\Delta m_d$, is directly related to the square of the product of the CKM matrix elements $V_{tb}$ and $V_{td}^*$, and is therefore sensitive to fundamental parameters of the Standard Model, as well as to non-perturbative strong-interaction effects and the square of the top quark mass [1]. Measurements of mixing of neutral $B$ mesons were published for the first time by UA1 [2] and ARGUS [3]. Measurements of $B^0$-$\overline{B}{}^0$ mixing have been performed by CLEO [4], experiments at LEP and SLC [5], experiments at the Tevatron [6, 7], the $B$ Factories experiments [8, 9] and, most recently, at LHCb [10-12]. The combined world average value for the mass difference, $\Delta m_d = (510 \pm 3)\,\mathrm{ns}^{-1}$, has a relative precision of 0.6% [13]. This paper reports a measurement of $\Delta m_d$ based on $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$ decays, where $X$ indicates any additional particles that are not reconstructed. The data sample used for this measurement was collected at LHCb during LHC Run 1 at $\sqrt{s} = 7$ (8) TeV in 2011 (2012), corresponding to integrated luminosities of 1.0 (2.0) $\mbox{fb}^{-1}$.
The relatively high branching fraction for semileptonic decays of $B^0$ mesons, along with the highly efficient lepton identification and flavour tagging capabilities at LHCb, results in abundant samples of $B^0 \to D^{(*)-} \mu^+ \nu_\mu X$ decays, where the flavour of the $B^0$ meson at the time of production and decay can be inferred. In addition, the decay time $t$ of $B^0$ mesons can be determined with adequate resolution, even though the decay is not fully reconstructed because of the potential presence of undetected particles. It is therefore possible to precisely measure $\Delta m_d$ as the frequency of matter-antimatter oscillations in a time-dependent analysis of the decay rates of unmixed and mixed events,

$$N^{\rm unmix}(t) \propto e^{-\Gamma_d t}\,[1 + \cos(\Delta m_d t)], \qquad N^{\rm mix}(t) \propto e^{-\Gamma_d t}\,[1 - \cos(\Delta m_d t)], \qquad (1)$$

where the state assignment is based on the flavours of the $B^0$ meson at production and decay, which may be the same (unmixed) or opposite (mixed). In Eq. 1, $\Gamma_d = 1/\tau_{B^0}$ is the decay width of the $B^0$ meson, $\tau_{B^0}$ being its lifetime. Also, in Eq. 1 the difference in the decay widths of the mass eigenstates, $\Delta\Gamma_d$, and CP violation in mixing are neglected, due to their negligible impact on the results. The flavour asymmetry between unmixed and mixed events is

$$A(t) = \frac{N^{\rm unmix}(t) - N^{\rm mix}(t)}{N^{\rm unmix}(t) + N^{\rm mix}(t)} = \cos(\Delta m_d t). \qquad (2)$$

A description of the LHCb detector and the datasets used in this measurement is given in Sec. 2. Section 3 presents the selection criteria, the flavour tagging algorithms, and the method chosen to reconstruct the $B^0$ decay time. The fitting strategy and results are described in Sec. 4. A summary of the systematic uncertainties is given in Sec. 5, and conclusions are reported in Sec. 6.
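As an illustration of the time-dependent decay rates of unmixed and mixed events described above, the following minimal sketch evaluates rates proportional to $e^{-\Gamma_d t}[1 \pm \cos(\Delta m_d t)]$ and the resulting flavour asymmetry. It assumes that $\Delta\Gamma_d$ and CP violation in mixing are neglected, as stated in the text, and ignores mistag, resolution and acceptance effects. The function names and the numerical lifetime are illustrative assumptions, not part of the analysis.

```python
import math

DELTA_M_D = 0.505  # oscillation frequency in ps^-1 (i.e. 505 ns^-1), illustrative
TAU_B0 = 1.52      # B0 lifetime in ps, approximate value assumed for illustration


def decay_rates(t_ps, delta_m=DELTA_M_D, tau=TAU_B0):
    """Return (unmixed, mixed) decay rates at time t, up to a common normalisation."""
    envelope = math.exp(-t_ps / tau)
    oscillation = math.cos(delta_m * t_ps)
    return envelope * (1.0 + oscillation), envelope * (1.0 - oscillation)


def flavour_asymmetry(t_ps, delta_m=DELTA_M_D, tau=TAU_B0):
    """Asymmetry (unmixed - mixed) / (unmixed + mixed); the exponential cancels."""
    unmixed, mixed = decay_rates(t_ps, delta_m, tau)
    return (unmixed - mixed) / (unmixed + mixed)
```

In this idealised picture the asymmetry depends only on $\Delta m_d t$, which is why the oscillation frequency can be extracted from the time dependence alone.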

Detector and simulation
The LHCb detector [14, 15] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of momentum, $p$, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/$c$. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of $(15 + 29/p_{\rm T})\,\mu$m, where $p_{\rm T}$ is the component of the momentum transverse to the beam, in GeV/$c$. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov (RICH) detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.
The online event selection is performed by a trigger [16], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. Candidate events are first required to pass the hardware trigger, which selects muons with a transverse momentum $p_{\rm T} > 1.48$ GeV/$c$ in the 7 TeV data or $p_{\rm T} > 1.76$ GeV/$c$ in the 8 TeV data. The software trigger requires a two-, three- or four-track secondary vertex, where one of the tracks is identified as a muon, with a significant displacement from the primary $pp$ interaction vertices. At least one charged particle must have a transverse momentum $p_{\rm T} > 1.7$ GeV/$c$ and be inconsistent with originating from a PV. As explained later, the software trigger selection introduces a bias on the $\Delta m_d$ measurement, which is corrected for. A multivariate algorithm [17] is used for the identification of secondary vertices consistent with the decay of a $b$ hadron.
The method chosen to reconstruct the $B^0$ decay time relies on Monte Carlo simulation. Simulation is also used to estimate the main background sources and to verify the fit model. In the simulation, $pp$ collisions are generated using Pythia [18] with a specific LHCb configuration [19]. Decays of hadronic particles are described by EvtGen [20], in which final-state radiation is generated using Photos [21]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [22] as described in Ref. [23]. Large samples of mixtures of semileptonic decays resulting in a $D^-$ or a $D^{*-}$ meson in the final state were simulated, and the assumptions used to build these samples are assessed in the evaluation of systematic uncertainties.

Event selection
For charged particles used to reconstruct signal candidates, requirements are imposed on track quality, momentum, transverse momentum, and impact parameter with respect to any PV. Tracks are required to be identified as muons, kaons or pions. The charm mesons are reconstructed through the $D^- \to K^+ \pi^- \pi^-$ decay, or through the $D^{*-} \to \overline{D}{}^0 \pi^-$, $\overline{D}{}^0 \to K^+ \pi^-$ decay chain. The masses of the reconstructed $D^-$ and $\overline{D}{}^0$ mesons should be within 70 MeV/$c^2$ and 40 MeV/$c^2$ of their known values [13], respectively, while the mass difference between the reconstructed $D^{*-}$ and $\overline{D}{}^0$ mesons should lie between 140 MeV/$c^2$ and 155 MeV/$c^2$. For $D^-$ and $\overline{D}{}^0$ candidates, the scalar sum of the $p_{\rm T}$ of the daughter tracks should be above 1800 MeV/$c$. A good quality vertex fit is required for the $D^-$, $\overline{D}{}^0$, and $D^{*-}$ candidates, and for the $D^{(*)-} \mu^+$ combinations. When more than one combination is found in an event, the one with the smallest vertex $\chi^2$ (hereafter referred to as the $B$ candidate) is chosen. The reconstructed vertices of $D^-$, $\overline{D}{}^0$, and $B$ candidates are required to be significantly displaced from their associated PV, where the associated PV is that which has the smallest $\chi^2$ increase when adding the candidate. For $D^-$ and $\overline{D}{}^0$ candidates, a large IP with respect to the associated PV is required in order to suppress charm mesons promptly produced in $pp$ collisions. The momentum of the $B$ candidate, and its flight direction measured using the PV and the $B$ vertex positions, are required to be aligned. These selection criteria reduce the contribution of $D^{(*)-}$ decays where the charmed meson originates from the PV to the per-mille level or lower. The invariant mass of the $B$ candidate is required to be in the range [3.0, 5.2] GeV/$c^2$.
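The mass requirements listed above can be sketched as two simple predicates, one per decay mode. This is only an illustration of the mass windows: the vertex-quality, displacement, IP and alignment requirements are not modelled, the function names are hypothetical, and the charm-meson masses are approximate values assumed here in place of the known values of Ref. [13].

```python
M_DMINUS = 1869.6  # MeV/c^2, approximate D- mass (assumed value)
M_D0 = 1864.8      # MeV/c^2, approximate D0 mass (assumed value)
B_MASS_RANGE = (3000.0, 5200.0)  # MeV/c^2, allowed B candidate invariant mass


def b_mass_ok(m_b):
    """B candidate invariant mass within [3.0, 5.2] GeV/c^2."""
    return B_MASS_RANGE[0] <= m_b <= B_MASS_RANGE[1]


def passes_dminus_mode(m_kpipi, m_b):
    """D- mode: K pi pi mass within 70 MeV/c^2 of the D- mass."""
    return abs(m_kpipi - M_DMINUS) < 70.0 and b_mass_ok(m_b)


def passes_dstar_mode(m_kpi, delta_m, m_b):
    """D*- mode: K pi mass within 40 MeV/c^2 of the D0 mass and
    D*-D0 mass difference between 140 and 155 MeV/c^2."""
    return (abs(m_kpi - M_D0) < 40.0
            and 140.0 < delta_m < 155.0
            and b_mass_ok(m_b))
```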
Backgrounds from $B \to J/\psi X$ decays, where one of the muons from the $J/\psi \to \mu^+ \mu^-$ decay is correctly identified and the other misidentified as a pion and used to reconstruct a $D^{(*)-}$, are suppressed by applying a veto around the $J/\psi$ mass. Similarly, a veto around the $\Lambda_c^+$ mass is applied to suppress semileptonic decays of the $\Lambda_b^0$ baryon, in which the proton of the subsequent $\Lambda_c^+$ decay into $pK^-\pi^+$ is misidentified as a pion. The dominant background is due to $B^+ \to D^{(*)-} \mu^+ \nu_\mu X$ decays, where additional particles coming from the decay of higher charm resonances, or from multi-body decays of $B^+$ mesons, are neglected. The fractions of $B^+$ decays in the $D^-$ and $D^{*-}$ samples are expected to be 13% and 10%, based on the branching fractions of signal and background, with uncertainties at the 10% level. This background is reduced by using a multivariate discriminant based on a boosted decision tree (BDT) algorithm [24, 25], which exploits information on the $B$ candidate, kinematics of the higher charm resonances and isolation criteria for tracks and composite candidates in the $B$ decay chain. Training of the BDT classifier is carried out using simulation samples of $B^0 \to D^{*-} \mu^+ \nu_\mu X$ signal and $B^+ \to D^{*-} \mu^+ \nu_\mu X$ background. The variables used as input for the BDT classifier are described in the Appendix. Only candidates with BDT output larger than $-0.12$ ($-0.16$) are selected in the 2011 (2012) data sample for the $B^0 \to D^- \mu^+ \nu_\mu X$ mode. The BDT output is required to be larger than $-0.3$ in both 2011 and 2012 data samples for the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ mode. The impact of this requirement on signal efficiency and background retention can be seen in Fig. 3. The background from $B^+$ decays is reduced by 70% in both modes. Combinatorial background is evaluated by using reconstructed candidates in the $D^{(*)-}$ signal mass sidebands. Backgrounds due to decays of $B^0_s$ and $\Lambda_b^0$ into similar final states to those of the signal are studied through simulations.
The decay time of the $B^0$ meson is calculated as $t = (M_{B^0} \cdot L)/(p_{\rm rec} \cdot c/k)$, where $M_{B^0}$ is the mass of the $B^0$, taken from Ref. [13], $L$ is the measured decay length and $p_{\rm rec}$ is the magnitude of the visible momentum, measured from the $D^{(*)-}$ meson and the muon. The correction factor $k$ is determined from simulation by dividing the visible $B^0$ momentum by its true value and taking the average, $k = \langle p_{\rm rec}/p_{\rm true} \rangle$. This correction represents the dominant source of uncertainty in the determination of the decay time of the $B^0$ meson for $t > 1.5$ ps. Since the $k$-factor depends strongly on the decay kinematics, it is parametrised by a fourth-order polynomial as a function of the visible mass of the $B^0$ candidate, as explained in the Appendix.
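The decay-time formula above can be written out as a short sketch. The unit choices (mm, GeV, ps), the function names and the $B^0$ mass value are assumptions made for this illustration. The polynomial coefficients of the $k$-factor parametrisation are not given in this section, so they are left as an input that the caller must supply.

```python
C_MM_PER_PS = 0.299792458  # speed of light in mm/ps
M_B0_GEV = 5.2796          # B0 mass in GeV/c^2, approximate value assumed here


def k_factor_from_polynomial(m_vis_gev, coeffs):
    """Evaluate k(m) = c0 + c1*m + c2*m^2 + c3*m^3 + c4*m^4 (Horner scheme).

    `coeffs` = (c0, ..., c4) are hypothetical inputs; real values would come
    from simulation.
    """
    if len(coeffs) != 5:
        raise ValueError("a fourth-order polynomial needs five coefficients")
    result = 0.0
    for c in reversed(coeffs):
        result = result * m_vis_gev + c
    return result


def corrected_decay_time_ps(decay_length_mm, p_rec_gev, k_factor):
    """t = M * L / (p_rec * c / k), with M in GeV/c^2, L in mm, p_rec in GeV/c."""
    if p_rec_gev <= 0 or k_factor <= 0:
        raise ValueError("momentum and k-factor must be positive")
    return M_B0_GEV * decay_length_mm * k_factor / (p_rec_gev * C_MM_PER_PS)
```

Because the visible momentum underestimates the true momentum, $k < 1$ on average and the correction shortens the decay time that would be obtained from $p_{\rm rec}$ alone.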
The $B^0$ flavour at production is determined by using information from the other $b$ hadron present in the event. The decision of flavour tagging algorithms [26], based on the charge of leptons and kaons and on that of an inclusively reconstructed detached vertex, is used for the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ channel. In the $B^0 \to D^- \mu^+ \nu_\mu X$ channel, which is subject to a larger $B^+$ background contamination, the decision of the tagging algorithm based on the detached vertex is excluded in order to avoid spurious background asymmetries. The statistical uncertainty on $\Delta m_d$ decreases as $T^{-1/2}$, where the tagging power is defined as $T = \varepsilon_{\rm tag}(1 - 2\omega)^2$, with $\varepsilon_{\rm tag}$ the tagging efficiency and $\omega$ the mistag rate. To increase the statistical precision, the events are grouped into four tagging categories of increasing predicted mistag probability $\eta$. The mistag probability $\eta$ is evaluated for each $B$ candidate from event and tagger properties and was calibrated on data using control samples [26]. The average mistag rates for signal and background are taken as free parameters when fitting for $\Delta m_d$. The combined tagging power [26] for the $B^0 \to D^- \mu^+ \nu_\mu X$ mode is $(2.38 \pm 0.05)\%$ and $(2.46 \pm 0.04)\%$ in 2011 and 2012. For the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ mode, the tagging power in 2011 and 2012 is $(2.55 \pm 0.07)\%$ and $(2.32 \pm 0.04)\%$.
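The definition of the tagging power and its effect on the statistical precision can be illustrated with a few lines of code. This is a sketch under the scaling stated above (uncertainty proportional to $T^{-1/2}$ at fixed sample size). The function names and the efficiency and mistag values used in the checks are hypothetical.

```python
import math


def tagging_power(eff_tag, mistag):
    """T = eff_tag * (1 - 2*mistag)^2; vanishes for a random tag (mistag = 0.5)."""
    return eff_tag * (1.0 - 2.0 * mistag) ** 2


def uncertainty_ratio(power, reference_power):
    """Statistical uncertainty relative to a reference sample of equal size,
    assuming the uncertainty scales as T^(-1/2)."""
    return math.sqrt(reference_power / power)
```

For example, a hypothetical tagger with 30% efficiency and a 35% mistag rate has a tagging power of about 2.7%, the same order as the values quoted above.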

Fit strategy and results
The fit proceeds as follows. First, $D^{(*)-}$ mesons originating from semileptonic $B^0$ or $B^+$ decays are separated from the background coming from combinations of tracks not associated with a charm meson decay, by a fit to the invariant mass distributions of the selected candidates. This fit assigns to each event a covariance-weighted quantity, the sWeight, which is used in the subsequent fits to statistically subtract the contribution of the background by means of the sPlot procedure [27]. Then, the contribution of $D^{(*)-}$ from $B^+$ decays is determined in a fit to the distributions of the BDT classifier output weighted by signal sWeights. Next, a cut is applied on the BDT output in order to suppress the $B^+$ background, the mass distributions are fitted again, and new sWeights are determined. Finally, the oscillation frequency $\Delta m_d$ is determined by a fit to the decay time distribution of unmixed and mixed candidates, weighted by the signal sWeights determined in the previous step.
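The first step, in which a mass fit assigns an sWeight to each event, can be illustrated with a toy example. The sketch below is not the analysis fit: it assumes only two species (a Gaussian signal peak and a flat background, whereas the analysis uses more elaborate shapes), keeps the shapes fixed, works unbinned, and fits only the yields with simple expectation-maximisation iterations. All names and numerical values are hypothetical.

```python
import math
import random


def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def fit_yields(f_sig, f_bkg, n_iter=500):
    """Maximum-likelihood yields for fixed shapes, found by EM iterations."""
    n_events = len(f_sig)
    n_sig = n_bkg = 0.5 * n_events
    for _ in range(n_iter):
        # Expected number of signal events given the current yields
        n_sig = sum(n_sig * fs / (n_sig * fs + n_bkg * fb)
                    for fs, fb in zip(f_sig, f_bkg))
        n_bkg = n_events - n_sig
    return n_sig, n_bkg


def signal_sweights(f_sig, f_bkg, n_sig, n_bkg):
    """Signal sWeights: w = (V_ss*f_s + V_sb*f_b) / (N_s*f_s + N_b*f_b)."""
    denom = [n_sig * fs + n_bkg * fb for fs, fb in zip(f_sig, f_bkg)]
    # Inverse of the yield covariance matrix
    a_ss = sum(fs * fs / d ** 2 for fs, d in zip(f_sig, denom))
    a_sb = sum(fs * fb / d ** 2 for fs, fb, d in zip(f_sig, f_bkg, denom))
    a_bb = sum(fb * fb / d ** 2 for fb, d in zip(f_bkg, denom))
    det = a_ss * a_bb - a_sb ** 2
    v_ss, v_sb = a_bb / det, -a_sb / det  # covariance matrix elements
    return [(v_ss * fs + v_sb * fb) / d for fs, fb, d in zip(f_sig, f_bkg, denom)]


# Toy sample: 300 signal and 200 background candidates in a 140 MeV/c^2 window
random.seed(1)
masses = ([random.gauss(1870.0, 8.0) for _ in range(300)]
          + [random.uniform(1800.0, 1940.0) for _ in range(200)])
f_sig = [gauss_pdf(m, 1870.0, 8.0) for m in masses]
f_bkg = [1.0 / 140.0 for _ in masses]
n_sig, n_bkg = fit_yields(f_sig, f_bkg)
weights = signal_sweights(f_sig, f_bkg, n_sig, n_bkg)
```

At the fitted yields the signal sWeights sum to the signal yield, which is the property that allows a weighted distribution of another variable, such as the decay time, to represent the signal component alone.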
An extended binned maximum likelihood fit to the data distributions is performed for each stage, simultaneously for the four tagging categories defined above. Data samples collected in 2011 and 2012 are treated separately. Figure 1 shows the results of the fits to the $D^-$ candidate mass distributions. In these fits, the distributions of $D^-$ from $B^0$ and $B^+$ decays are summed as they are described by the same probability density function (PDF): the sum of two Gaussian functions and a Crystal Ball function [28]. The yields corresponding to the $D^-$ peak are $(5.30 \pm 0.02) \times 10^5$ and $(1.393 \pm 0.003) \times 10^6$ in 2011 and 2012 data, respectively. The combinatorial background, which contributes typically 6% under the $D^-$ peak, is modelled with an exponential distribution. For the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ samples, a simultaneous fit to the distributions of the $K^+\pi^-$ invariant mass, $m_{K^+\pi^-}$, and the invariant mass difference of $K^+\pi^-\pi^-$ and $K^+\pi^-$ combinations, $\delta m = m_{K^+\pi^-\pi^-} - m_{K^+\pi^-}$, is performed. Three different components are considered: the signal $D^*$ from $B^0$ or $B^+$ decays and two background sources. The PDF for the mass distributions of $D^*$ from $B$ decays is defined by the sum of two Gaussian functions and a Crystal Ball function in the $m_{K^+\pi^-}$ mass projection and by two Gaussian functions and a Johnson function [29] in the $\delta m$ mass projection. Background candidates containing a $D^0$ originating from a $b$ hadron decay without an intermediate $D^*$ resonance, which contribute about 15% in the full $\delta m$ mass range, are described by the same distribution as that of the signal for $m_{K^+\pi^-}$, and by an empirical function based on a phase-space distribution for $\delta m$. A combinatorial background component, which contributes typically 0.8% under the $D^*$ peak, is modelled with an exponential distribution for $m_{K^+\pi^-}$ and the same empirical distribution for $\delta m$ as used for the $D^0$ background.
All parameters that describe signal and background shapes are allowed to vary freely in the invariant mass fits. The results of the 2011 and 2012 fits for these parameters are compatible within the statistical uncertainties. Figure 2 shows the results of the fit to the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ samples, projected onto the two mass observables. The yields corresponding to the $D^*$ peak are $(2.514 \pm 0.006) \times 10^5$ and $(5.776 \pm 0.009) \times 10^5$ in 2011 and 2012 data.
The fraction of $B^+$ background in data, $\alpha_{B^+}$, is determined with good precision by fitting the distribution of the BDT classifier, where templates for signal and $B^+$ background are obtained from simulation. Fits are performed separately in tagging categories for 2011 and 2012 data, giving fractions of $B^+$ of 6% and 3% on average for the $B^0 \to D^- \mu^+ \nu_\mu X$ and the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ modes, with relative variation of the order of 10% between samples. The results of the fits to 2012 data for both modes are given in Fig. 3. Limited knowledge of the exclusive decays used to build the simulation templates leads to systematic uncertainties of 0.5% and 0.4% on the $B^+$ fractions for $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$. In the decay time fit, the $B^+$ fractions are kept fixed. The statistical and systematic uncertainties on $\alpha_{B^+}$ lead to a systematic uncertainty on $\Delta m_d$, which is reported in Sec. 5.
The oscillation frequency $\Delta m_d$ is determined from a binned maximum likelihood fit to the distribution of the $B^0$ decay time $t$ of candidates classified as mixed ($q = -1$) or unmixed ($q = 1$) according to the flavour of the $B^0$ meson at production and decay time.
The total PDF for the fit is given by

$$\mathcal{P}(t, q) = \mathcal{S}(t, q) + \alpha_{B^+}\,\mathcal{B}^+(t, q), \qquad (3)$$

where the time distributions for signal and background are given by

$$\mathcal{S}(t, q) = \mathcal{N}\, e^{-\Gamma_d t}\,[1 + q(1 - 2\omega_{\rm sig})\cos(\Delta m_d t)], \qquad \mathcal{B}^+(t, q) = \mathcal{N}_{B^+}\, e^{-\Gamma_u t}\left[\frac{1 + q}{2} - q\,\omega_{B^+}\right]. \qquad (4)$$

Here $\mathcal{N}$ and $\mathcal{N}_{B^+}$ are normalisation factors, and $\Gamma_d$ and $\Gamma_u$ are fixed in the fit to their world average values [13], where $\Gamma_u = 1/\tau_{B^+}$, with $\tau_{B^+}$ being the lifetime of the $B^+$ meson. The mistag fractions for signal and $B^+$ components, $\omega_{\rm sig}$ and $\omega_{B^+}$, vary freely in the fit. To account for the time resolution, both distributions in Eq. 4 are convolved with a resolution model that takes into account uncertainties on both the decay length and the momentum. The distributions used in the fit are therefore obtained by a double convolution. The contribution accounting for the decay length resolution is described by a triple Gaussian function with an effective width corresponding to a time resolution of 75 fs, as determined from simulation. The contribution accounting for the uncertainty on the momentum is described by the distribution of $p_{\rm rec}/(k \cdot p_{\rm true})$, obtained from the simulation. This second convolution is dominant above 1.5 ps. Finally, the function $\mathcal{P}$ is multiplied by an acceptance function $a(t)$ to account for the effect of the trigger and offline selection and reconstruction. The acceptance is described by a sum of cubic spline polynomials [30], which may be different for signal and $B^+$ background. The ratios between spline coefficients of the $B^+$ background acceptance and those of the signal acceptance are fixed to the values predicted by simulation. The spline coefficients for signal are then determined for each tagging category directly from the tagged time-dependent fit to data.

The fitting strategy is validated with simulation. A bias is observed in the $\Delta m_d$ value, due to a correlation between the decay time and its resolution, which is not taken into account when parameterizing the signal shape.
Simulation shows that this correlation is introduced by the requirements of the software trigger and offline selection on the impact parameters of $D^-$ and $D^0$ with respect to the PV. Values for this bias, of up to $4\,\mathrm{ns}^{-1}$ with a 10% uncertainty, are determined for each mode and for each year by fitting the true and corrected time distributions and taking the differences between the resulting values of $\Delta m_d$. The uncertainty on the bias is treated as a systematic uncertainty on $\Delta m_d$.
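The role of the mistag fractions and of the non-oscillating $B^+$ component in the decay time fit can be sketched as follows. The code assumes an oscillating signal term proportional to $e^{-\Gamma_d t}[1 + q(1 - 2\omega_{\rm sig})\cos(\Delta m_d t)]$ and a $B^+$ term proportional to $e^{-\Gamma_u t}[(1 + q)/2 - q\,\omega_{B^+}]$, built from the quantities defined above. The overall normalisation, the resolution convolutions and the acceptance are omitted, so `alpha_bplus` acts only as a relative weight. The lifetimes and function names are illustrative assumptions.

```python
import math

GAMMA_D = 1.0 / 1.52  # ps^-1, approximate B0 decay width (assumed value)
GAMMA_U = 1.0 / 1.64  # ps^-1, approximate B+ decay width (assumed value)


def signal_time_pdf(t, q, delta_m, omega_sig, gamma_d=GAMMA_D):
    """Oscillating signal term; q = +1 (unmixed) or -1 (mixed)."""
    dilution = 1.0 - 2.0 * omega_sig
    return 0.5 * math.exp(-gamma_d * t) * (1.0 + q * dilution * math.cos(delta_m * t))


def bplus_time_pdf(t, q, omega_bplus, gamma_u=GAMMA_U):
    """Non-oscillating B+ term: tagged as mixed only through mistag."""
    return math.exp(-gamma_u * t) * (0.5 * (1 + q) - q * omega_bplus)


def total_time_pdf(t, q, delta_m, omega_sig, omega_bplus, alpha_bplus):
    return (signal_time_pdf(t, q, delta_m, omega_sig)
            + alpha_bplus * bplus_time_pdf(t, q, omega_bplus))


def observed_asymmetry(t, delta_m, omega_sig, omega_bplus, alpha_bplus):
    """(unmixed - mixed) / (unmixed + mixed) for the summed distribution."""
    unmixed = total_time_pdf(t, +1, delta_m, omega_sig, omega_bplus, alpha_bplus)
    mixed = total_time_pdf(t, -1, delta_m, omega_sig, omega_bplus, alpha_bplus)
    return (unmixed - mixed) / (unmixed + mixed)
```

In this simplified form the mistag dilutes the oscillation amplitude by $(1 - 2\omega_{\rm sig})$ without changing its frequency, while the $B^+$ component adds a time-dependent offset to the asymmetry.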
The values of $\Delta m_d$, obtained from the time-dependent fit and corrected for the fit bias, are reported in Table 1. Systematic uncertainties are discussed below.

Table 2: Sources of systematic uncertainties on $\Delta m_d$, separated into those that are correlated and uncorrelated between the two decay channels $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$.

Systematic uncertainties
The contribution of each source of systematic uncertainty is evaluated by using a large number of parameterized simulations. The difference between the default $\Delta m_d$ value and the result obtained when repeating the fits, after having adjusted the inputs to those corresponding to the systematic variation under test, is taken as a systematic uncertainty. Systematic uncertainties are summarized in Table 2.

Background from $B^+$
The fraction of $B^+$ background is estimated from data with a very small statistical uncertainty. A variation, within their uncertainties, of the branching fractions of semileptonic $B^0$ decays resulting in a $D^{*-}$ or $D^-$ in the final state gives systematic uncertainties on the $B^+$ fractions of 0.5% and 0.4% for $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$. The resulting uncertainty on $\Delta m_d$ is $0.1\,\mathrm{ns}^{-1}$ in $B^0 \to D^- \mu^+ \nu_\mu X$ and is negligible for $B^0 \to D^{*-} \mu^+ \nu_\mu X$.
In the default fit, the decay time acceptance ratio of the $B^0$ and the $B^+$ components is taken from simulation. The time acceptance is to a large extent due to the cut on the $D^0$ impact parameter. A possible systematic effect due to an incorrect determination of the acceptance ratio from simulation is estimated by fitting events, generated with the default signal and background acceptances, with an acceptance ratio determined by using a tighter $D^0$ IP cut than the default. This gives an uncertainty of $0.4\,\mathrm{ns}^{-1}$ for both decay modes. The above systematic uncertainties are considered as uncorrelated between the two channels. The uncertainty on $\Delta m_d$ from the resolution on the $B^+$ decay length is $0.1\,\mathrm{ns}^{-1}$ in the $B^0 \to D^- \mu^+ \nu_\mu X$ channel and is negligible in the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ channel.

Other backgrounds
The impact of the knowledge of backgrounds due to semileptonic $B^0_s$ decays with $D^{(*)-}$ in the final state is estimated by varying their contributions within the uncertainties on their branching fractions. This effect has a negligible impact on $\Delta m_d$ for both channels. For the $B^0 \to D^- \mu^+ \nu_\mu X$ channel, there is an additional contribution from $B^0_s \to D_s^- \mu^+ \nu_\mu$ decays, where a kaon in the $D_s^- \to K^- K^+ \pi^-$ decay is misidentified as a pion, which gives an 8% contribution due to $D_s^-$ peaking under the $D^-$ mass. A difference in $\Delta m_d$ of $0.5\,\mathrm{ns}^{-1}$ is observed.
The $\Lambda_b^0 \to n D^{*-} \mu^+ \nu_\mu$ decay has not been observed. However, because of the similar final state, it can be mistaken for $B^+$ background, since neither of them exhibits oscillatory behaviour. Dedicated simulated samples are generated by assuming colour suppression with respect to signal, and are used to estimate a signal contamination of 0.2% from $\Lambda_b^0$ decays, with 100% uncertainty, which gives a negligible effect on $\Delta m_d$.
Small contributions from $B \to D^{(*)-} D_s^+ X$ decays, with the $D_s^+$ decaying semileptonically, give an uncertainty of $0.2\,\mathrm{ns}^{-1}$ on $\Delta m_d$ in the $B^0 \to D^- \mu^+ \nu_\mu X$ mode, and a negligible effect for the $B^0 \to D^{*-} \mu^+ \nu_\mu X$ mode.

The $k$-factor
Two main sources of systematic uncertainty are related to the $k$-factor. The first, due to possible differences in the $B$ momentum spectrum between simulation and data, is studied by comparing the $B$ momentum in $B^+ \to J/\psi K^+$ decays in data and simulation, and reweighting signal simulation to estimate the effect on the $k$-factor distribution and therefore on $\Delta m_d$. The systematic uncertainties on $\Delta m_d$ from this effect for $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$ are $0.3\,\mathrm{ns}^{-1}$ and $0.5\,\mathrm{ns}^{-1}$. The second source, related to the uncertainties on the measurements of the branching fractions for the exclusive modes which are used to build the simulated samples, is evaluated by varying the branching fractions of exclusive decays one at a time by one standard deviation, and reweighting the corresponding $k$-factor distribution. An uncertainty of $0.4\,\mathrm{ns}^{-1}$ is obtained for both $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$ channels. The systematic uncertainties from the $k$-factor correction are taken to be correlated between the two channels.
The systematic uncertainties on $\Delta m_d$ from the finite number of events in the simulation sample used to compute the $k$-factor corrections are 0.3 and $0.4\,\mathrm{ns}^{-1}$ ($B^0 \to D^- \mu^+ \nu_\mu X$) and 0.2 and $0.3\,\mathrm{ns}^{-1}$ ($B^0 \to D^{*-} \mu^+ \nu_\mu X$) for the 2011 and 2012 samples, respectively.

Other systematic uncertainties
Possible differences between data and simulation in the resolution on the $B^0$ flight distance are evaluated by using the results of a study reported in Ref. [31], and scaling the widths of the triple Gaussian function by a factor 1.5 with respect to the default. Uncertainties of $0.3\,\mathrm{ns}^{-1}$ and $0.5\,\mathrm{ns}^{-1}$ on $\Delta m_d$ are obtained for $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$. Both channels are affected by the same discrepancy between data and simulation; thus these systematic uncertainties are taken as correlated.
Since all parameters are allowed to vary freely in the invariant mass fits, the uncertainties from the invariant mass model are small. As a cross-check, when the fits are repeated using the sWeights determined without splitting the mass fits in tagging categories, negligible variation in $\Delta m_d$ is found. Signal and background mistag probabilities are free parameters in the fit, and therefore no systematic uncertainty is associated with them.
Asymmetries in the production of neutral and charged $B$ mesons, in tagging efficiency and mistag probabilities, and in the reconstruction of the final state are neglected in the $\Delta m_d$ fits. Also, the $B^0$ semileptonic CP asymmetry $a^d_{\rm sl}$ is assumed to be zero. The systematic uncertainty on $\Delta m_d$ arising from these assumptions is studied using parameterized simulations with the asymmetries set to zero, to their measured values, and to random variations from their central values within the uncertainties [32]. The resulting uncertainty on $\Delta m_d$ is found to be negligible.
The bias in $\Delta m_d$ from the correlation between the decay time and its resolution is determined using the simulation. The dependence of $\Delta m_d$ on possible differences between data and simulation has already been considered above by varying the composition of the simulation sample used to construct the $k$-factor distribution. Since the bias is related to the cut on the $D$ meson IP with respect to the PV, the fits are repeated with a $k$-factor distribution obtained with a tighter cut on the IP, and the difference with respect to the default is taken as the systematic uncertainty. The systematic uncertainties (0.5 and $0.3\,\mathrm{ns}^{-1}$ for $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$, respectively) related to the bias are considered as uncorrelated between the channels, as they are determined from different simulation samples and the time-biasing cuts, responsible for the systematic uncertainty on the bias, are different for the two channels.
The knowledge of the length scale of the LHCb experiment is limited by the uncertainties from the metrology measurements of the silicon-strip vertex detector. This was evaluated in the context of the $\Delta m_s$ measurement and found to be 0.022% [31]. This translates into an uncertainty on $\Delta m_d$ of $0.1\,\mathrm{ns}^{-1}$. The uncertainty on the knowledge of the momentum scale is determined by reconstructing the masses of various particles and is found to be 0.03% [33]. This uncertainty results in a $0.2\,\mathrm{ns}^{-1}$ uncertainty in $\Delta m_d$ in both modes. Both uncertainties are considered correlated across the two channels.
Effects due to the choice of the binning scheme and fitting ranges are found to be negligible.

The resulting $\Delta m_d$ values of each mode (Table 1) are then averaged taking account of statistical and uncorrelated systematic uncertainties. The correlated systematic uncertainty is added in quadrature to the resulting uncertainty. The combined result is shown in the last row of Table 1.
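The averaging of the two modes can be sketched as an inverse-variance weighted mean. This sketch assumes that the weights are built from the statistical and uncorrelated systematic uncertainties only, and that the correlated systematic uncertainty is a single value common to both modes, added in quadrature afterwards. The function name and all inputs are hypothetical, and the numbers in Table 1 are not reproduced here.

```python
import math


def combine_modes(values, stat, syst_uncorr, syst_corr):
    """Weighted average of per-mode results.

    values, stat, syst_uncorr: one entry per decay mode.
    syst_corr: single correlated systematic uncertainty (assumed common).
    Returns (mean, combined statistical, combined systematic).
    """
    weights = [1.0 / (s ** 2 + u ** 2) for s, u in zip(stat, syst_uncorr)]
    total = sum(weights)
    fractions = [w / total for w in weights]
    mean = sum(f * v for f, v in zip(fractions, values))
    stat_comb = math.sqrt(sum((f * s) ** 2 for f, s in zip(fractions, stat)))
    uncorr_comb = math.sqrt(sum((f * u) ** 2 for f, u in zip(fractions, syst_uncorr)))
    # Correlated part is added in quadrature after the average
    syst_comb = math.sqrt(uncorr_comb ** 2 + syst_corr ** 2)
    return mean, stat_comb, syst_comb
```

Keeping the correlated uncertainty out of the weights prevents an uncertainty common to both channels from being reduced by the averaging.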

Summary and conclusion
In conclusion, the oscillation frequency, $\Delta m_d$, in the $B^0$-$\overline{B}{}^0$ system is measured in semileptonic $B^0$ decays using data collected in 2011 and 2012 at LHCb. The decays $B^0 \to D^- \mu^+ \nu_\mu X$ and $B^0 \to D^{*-} \mu^+ \nu_\mu X$ are used, where the $D$ mesons are reconstructed in Cabibbo-favoured decays $D^- \to K^+ \pi^- \pi^-$ and $D^{*-} \to \overline{D}{}^0 \pi^-$, with $\overline{D}{}^0 \to K^+ \pi^-$. A combined $\Delta m_d$ measurement is obtained, $\Delta m_d = (505.0 \pm 2.1\,({\rm stat}) \pm 1.0\,({\rm syst}))\,\mathrm{ns}^{-1}$, which is compatible with previous LHCb results and the world average [13]. This is the most precise single measurement of this quantity, with a total uncertainty similar to the current world average.

A.1 BDT classifier
The variables used as input for the BDT classifier are the following:

A.2 Distributions of the k-factor

[28] T. Skwarnicki, A study of the radiative cascade transitions between the Upsilon-prime and Upsilon resonances, PhD thesis, Institute of Nuclear Physics, Krakow, 1986, DESY-F31-86-02.