1 Introduction

\(B_{s}^{0}\) and B 0 mesons propagate as superpositions of particle and antiparticle flavour states. For a flavour-specific decay processFootnote 1 such as B 0D μ + ν, particle-antiparticle mixing lends a sinusoidal component to the decay rates [1, 2]. To measure mixing, the flavour state of the B meson must be observed to change, which requires knowledge of the state from at least two points in time. The experimentally accessible times to determine the flavour are at production and decay. Neglecting CP-violation in mixing, the decay rate N at a proper decay time t simplifies to

$$ N_\pm(t)= N(0) \frac{e^{-\Gamma t}}{2} \bigl[\cosh{(\Delta\Gamma t/2)} \pm \cos{(\Delta m t)} \bigr] , $$

where ΔΓ and Δm are the width and mass differencesFootnote 2 of the two mass eigenstates, and Γ is the average decay width [2]. The positive sign applies when the B meson decays with the same flavour as its production and the negative sign when the particle decays with opposite flavour to its production, later referred to as “even” and “odd”. In this study, a sample of semileptonic decays obtained with the LHCb detector is used to measure the mixing frequencies Δm s and Δm d for the \(B^{0}_{s}\) and B 0 systems. These quantities have previously been measured to high precision, usually in the combination of several channels, relying heavily on hadronic decay modes (see for example Refs. [3, 4] and our recent results, Refs. [57]). To date no observation of \(B_{s}^{0}\) mixing has been made using only semileptonic decay channels.

2 Experimental setup

The LHCb detector [8] is a single-arm forward spectrometer covering the pseudorapidity range 2<η<5, designed for the study of particles containing b or c quarks. The detector consists of several dedicated subsystems, organized successively further from the interaction region. A silicon-strip vertex detector surrounds the pp interaction region and approaches to within 8 mm of the proton beams. The first of two ring-imaging Cherenkov (RICH) detectors comes next, followed by the remainder of the tracking system, which comprises, in order: a large-area silicon-strip detector; a dipole magnet with a bending power of about 4 Tm; and three multilayer tracking stations, each with central silicon-strip detectors and peripheral straw drift tubes. After this comes the second RICH detector, the calorimeter and the muon stations.

The combined high-precision tracking system provides a momentum measurement with relative uncertainty that varies from 0.4 % at 5 GeV c −1 to 0.6 % at 100 GeV c −1, and impact parameterFootnote 3 resolution of 20 μm for tracks with high transverse momentum. By combining information from the two RICH detectors [9] charged hadrons can be identified across a wide range in momentum, around 2 to 150 GeV c −1. The calorimeter system consists of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter, allowing identification of photon, electron and hadron candidates. Muons that pass through the calorimeters are detected using a system of alternating layers of iron and multiwire proportional chambers [10]. Triggering of events is performed in two stages [11]: a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which performs full event reconstruction.

3 Data selection and reconstruction

The LHCb dataset used in this analysis corresponds to an integrated luminosity of 1.0 fb−1 collected in pp collisions at a centre of mass energy of 7 TeV during the 2011 physics run at the LHC. Where simulation is required, Pythia 6.4 [12] is used, with a specific LHCb configuration [13]. Decays of hadronic particles are described by EvtGen [14], in which final-state radiation is generated using Photos [15]. The interaction of the generated particles with the detector and the detector response are implemented using the Geant4 toolkit [16, 17] as described in Ref. [18]. Input to EvtGen is taken from the best knowledge of branching fractions (\(\mathcal{B}\)) and form factors at the time of the simulation [1]. The same reconstruction and selection is applied on simulated and detector data.

A sample of events is selected in which a \(D_{(s)}^{+} \to K^{+}K^{-}\pi ^{+}\) candidate forms a vertex with a muon candidate. A cut-based selection is applied to enhance the fraction of real \(D^{+}_{(s)}\) mesons in this sample that arise from \(B^{0}_{(s)}\) semileptonic decays. Vertex and track reconstruction qualities, momenta, invariant masses, flight distances and particle identification (PID) variables are used. The selection was initially optimized on simulated data to maximize the signal significance, \(S/\sqrt{(S+B)}\), where S (B) denotes the number of selected signal (background) candidates. The most important cuts for this analysis are those on the PID and invariant masses. Combined information from the RICH detectors, muon stations, calorimeters and tracking allows us to place stringent requirements on a log-likelihood based PID parameter for each final-state particle separately, ensuring at least 99 % purity in the muon sample, and suppressing peaking backgrounds such as D +K π + π + decays, where a pion has been misidentified as a kaon. To allow a simultaneous measurement of Δm s and Δm d , a broad mass window for the K + K π + system is used to cover both the D + and \(D_{s}^{+}\) masses, \(-0.2 < M(K^{+}K^{-}\pi^{+}) - M_{0}(D^{+}_{s}) < 0.1~\mbox{GeV}\,c^{-2}\), where \(M_{0}(D^{+}_{s})\) is the known mass of the \(D^{+}_{s}\) meson [1]. Decays of the type D (2010)+D 0 π + are additionally suppressed by requiring that the invariant mass of the two kaons M(K + K )<1.84 GeV c −2, and combinatorial background with slow collinear pions is similarly removed with the mass requirement M(K + K π +)−M(K + K )−M 0(π +)>15 MeV c −2.

Simulation studies indicate that the selected sample is dominated by \(B_{s}^{0} \to D_{s}^{-} \mu^{+} (\nu,\pi^{0}, \gamma)\), B 0D μ +(ν,π 0,γ) and B +D μ +(ν,π +,γ) decays, where no specific intermediate states are required other than those mentioned, and where at least one neutrino will occur together with any number of the other particles in the parentheses. These additional particles are ignored and so a clear B mass peak cannot be reconstructed. For simplicity, to quantify the measured mass, M(), within its possible range, we define a “normalized mass”, n, relative to the known masses (M 0) of the B, D, and μ:

$$ n = \frac{M(D\mu)-M_0(D)-M_0(\mu)}{M_0(B)-M_0(D)-M_0(\mu)}. $$

We require 0.24<n<1.0, where the lower cut mainly removes low-mass combinatorial background candidates. The K + K π + invariant mass distribution and the normalized mass distribution (n) of the selected candidates are shown in Fig. 1, in which the \(D^{+}_{s}\) and D + peaks can clearly be seen over the combinatorial background.

Fig. 1
figure 1

Mass distributions for all selected signal candidates. Left, the K + K π + invariant mass, where the known mass of the \(D^{+}_{s}\) has been subtracted. Right, the normalized mass as defined in Eq. (2). Neutral candidates are those of the form D μ ±, while double-charged candidates are those of the form D ± μ ±. The double-charged candidates arise from several background sources, most of which are also present in the neutral sample. In the left plot, the neutral sample exhibits much larger D mass peaks, indicative of the large B signal component

Determination of the initial-state flavour is performed using the standard LHCb flavour-tagging algorithms, which are described in detail elsewhere [5, 6, 19]. These algorithms rely on the reconstruction of particles that were produced in association with, and are flavour-correlated with, the signal B-meson. The correlations arise either from fragmentation, which often produces a kaon or pion of specific charge correlated with the signal, or from “opposite-side” decays, where the decay products of the partner b quark are reconstructed (e.g. a muon). A neural network combines tagging decisions for the best tagging power [6].

A hypothesis is required for the nature of the reconstructed candidate, either \(B^{0}_{s}\) or B 0, in order to choose the tagging algorithms to be applied and to select the appropriate mass with which to calculate n. A split around the midpoint between the \(D^{+}_{s}\) and D + peaks is used. For the \(B^{0}_{s}\) hypothesis all available tags are used. For the B 0 hypothesis only opposite-side tags are used, to reduce the difference between B + and B 0 tagging performance and thus better constrain the B + background (see Sects. 5 and 6). The flavour-tagged dataset comprises 594,845 selected candidates.

Two techniques are employed to measure the mixing frequencies: (a) multidimensional log-likelihood maximization, simultaneously fitting Δm s and Δm d ; (b) model-independent Fourier analysis, used as a cross-check, which determines Δm s with good precision, but Δm d with a very poor precision. Both methods use a common determination of the proper decay time and so share a portion of the corresponding systematic effects.

4 Proper decay-time distributions

To obtain the B-meson decay times, a correction is applied for the momentum lost due to missing particles, using a k-factor method as employed in many previous measurements (see, for example, Refs. [20] and [21]). The k-factor [22] is a simulation-based statistical correction, where the average missing momentum in a simulated sample is used to correct the reconstructed momentum as a function of the reconstructed mass (as shown in Fig. 2). In this study we use a fourth-order polynomial to parameterize k as a function of the normalized mass (n from Eq. (2)), which allows us to use the same correction for \(B^{0}_{s}\) and B 0. With this approach, both Δm s and Δm d exhibit residual biases of around 1 %; these biases are known to good precision from the full simulation and are corrected in the final results.

Fig. 2
figure 2

Input to obtain the k-factor correction from the fully-simulated \(B^{0}_{s}\) sample. For each event the ratio of reconstructed to generated momentum, p rec/p sim is plotted against the normalized mass (n in Eq. (2)). The curve shows a fourth-order polynomial resulting from a fit to the mean of the distribution (in bins of n)

The experimental resolution of the proper decay time (t) reduces the visibility of the oscillations, smearing Eq. (1) with a resolution function R(t,t′−t), where t is the true decay time and t′ is the measured value. The limited performance of the tagging also reduces the visibility of the oscillations. Our selection requirements include variables that are correlated with the decay time, leading to a time-dependent efficiency function, ε(t′). Thus Eq. (1) becomes

$$\begin{aligned} N_{\pm}\bigl(t'\bigr) &= N(0) \eta\frac{e^{-\Gamma t}}{2} \bigl[ \cosh {(\Delta\Gamma t/2)} \\ &\quad \pm(1-2\omega)\cos{(\Delta m t)} \bigr] \otimes R\bigl(t,t'-t \bigr) \times\varepsilon\bigl(t'\bigr), \end{aligned}$$

where η is the tagging efficiency and ω is the mistag probability (the fraction of tags that assign the wrong flavour). We parameterize the time-dependent efficiency with an empirical “acceptance” function. Specifically Gaussian functions are used as motivated by data and full simulation studies [22], ε(t′)=1−fG(t′;μ 0,σ 1)−(1−f)G(t′;μ 0,σ 2), where G is the Gaussian function and the parameters are determined from fits to the data (typical values are σ 1,2<1 ps and μ 0≈0.01 ps).

The k-factor is a relative correction for the average missing momentum at a given value of n; as shown in Fig. 2, the range of missing momenta is broad and varies from about 70 % at n=0.2 to zero at n=1. This large relative uncertainty on the corrected momentum (p′) dominates the decay time resolution, i.e. σ(t′)/t′≈σ(p′)/p′. Hence σ(t′) is approximately proportional to t′ (as seen in Fig. 3) and the decay time resolution worsens as decay time increases. This dependence is determined and parameterized from the full simulation. We may choose between a parameterization in terms of either the generated (“true”) decay time, using a numerical convolution, or in terms of the measured decay time, using analytical methods; the latter is the default approach. The resolution dependence is well-fitted with second or third order polynomials.

Fig. 3
figure 3

Illustration of the decay time resolution obtained from a fully simulated B 0 signal sample. The left plots demonstrate the Gaussian fits (solid lines) using the full LHCb simulated data (filled), to determine the decay time resolution. Each measured (reconstructed and corrected) time, t′, is compared to the corresponding simulated decay time, t. The results are shown for several bins of t′. The dependence on decay time of the mean (bias, μ) and width (standard deviation, σ) can be fitted with a quadratic or cubic function of either t or t′. The right hand plot shows a quadratic fit to the widths

5 Multivariate fits to the data

A binned, multidimensional, log-likelihood fit to the data is made, using the Root and embedded RooFit fitting frameworks [23, 24]. In order to improve the resolution on the fitted value of Δm s , the sample is divided into two subsamples about normalized mass n=0.56 (with this value determined using fast-simulation “pseudo-experiment” studies), and the two subsamples are fitted simultaneously as described below. There are 101,000 bins over the K + K π + mass, the measured decay time (t′), the normalized mass (n<0.56 and n>0.56), and the tagging result (even and odd). Seven categories of signal and background are assigned component probability density functions (PDFs) whose fractions and shape parameters are left free in the fits to the data. The backgrounds are categorized in terms of their shapes in the mass and decay-time observables. Using the M(K + K π +) distribution we separate out peaking \(D_{(s)}^{+}\) components from combinatorial background components. Each of these categories can be further divided into two based on their decay-time shape. We use the term “prompt” to describe fake candidates containing particles exclusively produced in the primary pp interaction, and the term “detached” for candidates that contain at least one daughter of a secondary decay and which therefore tend to exhibit a significantly larger lifetime. Candidates for the signal B-decays of interest must be both detached and peaking. The signal-like decays are usually grouped together in the fit; however, we separate the specific background contribution of B + within the D + peak and fit that directly. These components are shown in together in Fig. 4 and separately in different M(K + K π +) regions in Figs. 5 and 6. Each mass PDF is a Gaussian function or a Chebychev polynomial (Fig. 4), and each background decay-time PDF is a simple exponential with an appropriate acceptance function as previously described (Fig. 6). For the signal decay-time shape we use the model described in Eq. (3), with one instance for each peak. The majority of our sensitivity arises from the mixing asymmetry, whose time-dependent fit in the signal regions is shown in Fig. 7. Any odd/even asymmetry is assumed to be constant as a function of time for prompt backgrounds and for backgrounds that are known not to mix (B +,Λ b , etc.). Generic detached backgrounds are allowed to have a time-dependent asymmetry varying as an arbitrary quadratic polynomial.

Fig. 4
figure 4

Distribution of measured K + K π + mass, where the known mass of the \(D^{+}_{s}\) has been subtracted. Black points show the data, and the various lines overlay the result of the fit. The small step at −50 MeV c −2 is the result of differences in tagging efficiency for the \(B^{0}_{s}\) and B 0 hypotheses

Fig. 5
figure 5

Measured B decay-time distribution, overlaid with projections of the fit, for background-only regions. Top left: a region between the two signal peaks, −80 to −20 MeV c −2 (with respect to the known mass of the \(D_{s}^{+}\)), showing only low decay times. Top right: a region to the right of the signal peaks 20 to 100 MeV c −2, showing only low decay times. Bottom row: the same on an extended decay-time scale and logarithmic. The legend is the same as in Fig. 4

Fig. 6
figure 6

Measured B decay-time distribution, overlaid with projections of the fit, for signal regions. Top left: for odd-tags, high-n and a region of ±20 MeV c −2 around the \(D^{+}_{s}\) mass peak, showing only low decay times, where \(B^{0}_{s}\) oscillations can be clearly seen. Top right: for odd-tags and all n for a region of ±20 MeV c −2 around the D + mass peak, showing only low decay times. Bottom row: for both tags and all n for regions of ±20 MeV c −2 around the \(D^{+}_{s}\) (left) and D + (right) mass peaks. The legend is the same as in Fig. 4

Fig. 7
figure 7

Tagged (mixing) asymmetry, (N +N )/(N ++N ), as a function of B decay time. The left plot shows the asymmetry for events for a region of ±20 MeV c −2 around the \(D^{+}_{s}\) mass peak, and the right plot shows the corresponding asymmetry around the D + mass peak. The black points show the data and the curves are projections of the fitted PDF. On the left plot the fast oscillations of \(B^{0}_{s}\) are gradually washed out by the increasingly poor decay-time resolution

The proportion of B +D μ +(ν,π +,γ) with respect to B 0D μ +(ν,π 0,γ) is fixed to 11 % with a ±2 % uncertainty, using the ratio of known fragmentation functions and branching fractions [1]. Based on the full LHCb simulation, this ratio is corrected by 25 % to account for differences in the reconstruction and tagging efficiencies, with the full value of this correction taken as a systematic uncertainty. We fix ΔΓ s using the result of a recent LHCb analysis [25], and ΔΓ d is fixed to zero.

Only the signal mass shapes and the parameters of interest, Δm s and Δm d , are shared between the two subsamples in n, which are fitted simultaneously. The goodness of the fit is verified with a local density method [26], which finds a p-value of 19.6 %.

6 Fit results and systematic uncertainties

Table 1 gives the fitted values for some important quantities. In principle the signal lifetimes are also measured, but these have very large systematic uncertainties and so no results are quoted. The systematic uncertainties on Δm s and Δm d are first discussed before the final results are given.

Table 1 A selection of fitted parameter values, for which statistical uncertainties only are given. The \(B^{0}_{s}\) signal fraction includes contributions from any detached \(D^{+}_{s}\) production. When the omitted fractions (of combinatorial background components) are included, the total fraction sums to unity within each n region separately

Several sources of systematic uncertainty on the main measured quantities, Δm s and Δm d , are considered, as summarized in Table 2. The majority of the systematic uncertainties are obtained from the data.

  • The k-factor: the k-factor correction is a simulation-based method, and so differences between the simulation and reality that modify the visible and invisible momenta potentially invalidate the correction. Such differences could for example be in D ∗∗ branching fractions or form factors. Large-scale pseudo-experiment studies are combined with full simulations to vary these underlying distributions within their uncertainties and examine biases produced on the fitted Δm values. Small relative uncertainties are found, 0.3 % for Δm s and 1.0 % for Δm d , representing the ultimate limit of this technique without further knowledge of the various sub-decays.

  • Detector alignment: momentum scale, decay-length scale, and track position uncertainties arise from known alignment uncertainties and result in variations in reconstructed masses and lifetimes as functions of decay opening angle. These uncertainties have been studied using detector survey data and various control modes; they are well determined and small in comparison to the statistical uncertainties [27].

  • Values of ΔΓ: The quantities ΔΓ d and ΔΓ s are nominally constant in our fits. When they are varied, within ±5 % for ΔΓ d (chosen to well-cover the experimental range given the lack of information on its sign [1]) and within the known uncertainty on ΔΓ s  [25], our result is only marginally affected.

  • Model bias: a correction has been made for the 1 % residual frequency bias seen in full simulation studies, as discussed in Sect. 4. This is taken directly from simulation and half of the correction is assigned a systematic uncertainty.

  • Signal proper-time model: the fit is repeated with two different time-resolution models. (a) When the resolution is parameterized as a function of true rather than measured decay time, using full numerical convolution, a (0.09, 0.002) ps−1 variation is seen in (Δm s , Δm d ). (b) When a time-independent (average) resolution is used, a 0.001 ps−1 variation is seen in Δm d (this method is not applicable to the measurement of Δm s due to many factors; crucially, within the time frame of any single \(B^{0}_{s}\) oscillation the decay time resolution worsens by an appreciable fraction of the oscillation period, seen in Figs. 3 and 7). With other modifications to the signal model (resolutions and acceptances) a larger variation in Δm d of 0.007 ps−1 is found.

  • Other models and binning: the order of the Chebychev polynomial is varied, Crystal Ball functions are used for the mass peak shapes, and the background parameterizations and the binning schemes are varied. Out of these modifications, the binning scheme has the largest effect. Resulting uncertanties of 0.05 ps−1 and 0.001 ps−1 are assigned to Δm s and Δm d , respectively.

  • Assumptions on B + decays: The Δm d measurement is sensitive to χ d , the integrated mixing probability, which in turn is sensitive to the non-mixing B +-background. We hold constant several B +-background parameters in the baseline fit, determined from the full simulation. Many features of the B + background fit are varied to evaluate systematic variations, including the fraction, the lifetime, and the corrections for relative tagging performance. The largest uncertainty arises from tagging performance corrections and for this a 0.008 ps−1 uncertainty is assigned to Δm d . It is possible to leave one or more of these parameters free during the fit, but the loss in statistical precision is prohibitive.

Table 2 Sources of systematic uncertainty on Δm s and Δm d . “Simulation” implies a combination of full LHCb simulation and pseudo-experiment studies

For cross-checks the data are split by LHCb magnet polarity and LHCb trigger strategies; no variations beyond the expected statistical fluctuations are observed.

We obtain

$$\begin{aligned} &\Delta m_s = \bigl( 17.93 \pm0.22 \textrm{(stat)} \pm0.15 \textrm{(syst)}\bigr)~\textrm{ps}^{-1} , \\ &\Delta m_d = \bigl( 0.503 \pm0.011 \textrm{(stat)} \pm0.013 \textrm{(syst)}\bigr)~\textrm{ps}^{-1}. \end{aligned}$$

To obtain a measure for the significance of the observed oscillations, the global likelihood minimum for the full fit is compared with the likelihood of the hypotheses corresponding to the edges of our search window (Δm=0 or Δm≥50 ps−1). Both would result in almost flat asymmetry curves (cf. Fig. 7) corresponding to no observed oscillations. We reject the null hypothesis of no oscillations by the equivalent of 5.8 standard deviations for \(B^{0}_{s}\) oscillations and 13.0 standard deviations for B 0 oscillations.

7 Fourier analysis

The full fit as described above was performed in the time domain, but measurement of the mixing frequency can also be made directly in the frequency domain as a cross-check, using well-established Fourier transform techniques [2830]. The cosine term in Eq. 3 has a different sign for the odd and even samples, where the lifetime, acceptance, and other features are shared; this simplifies the analysis in the frequency domain. Any Fourier components not arising from mixing are suppressed by subtracting the odd Fourier spectrum from the even spectrum and no parameterizations of the background shapes, signal shapes, or decay-time resolution are required, allowing a model-independent measurement of the mixing frequencies. We search for the Δm s peak in the subtracted Fourier spectrum, shown in Fig. 8. Extensive fast simulation pseudo-experiments have shown that the value of Δm s is obtained reliably and with a reasonable precision using this method; however Δm d is heavily biased and has a large uncertainty, and so a result is not quoted. Since residual components of the Fourier spectrum are of much lower frequency than the Δm s component, and several complete oscillation periods of Δm s are observable, the search for a spectral peak is relatively free from complications. For Δm d , however, the relatively low frequency is similar to that of many other features of the data, and only a single oscillation period is observed; therefore the determination of Δm d is difficult with this simple model-independent approach.

Fig. 8
figure 8

Result of using Fourier transforms to search for the Δm s -peak. The image on the left is constructed from bins of the K + K π + mass which are 25 MeV c −2 in width, analysed in steps of 5 MeV c −2 such that a smooth image is produced. The colour scale (blue–green–yellow–red) is an arbitrary linear representation of the signal intensity; dark blue is used for zero and below. The vertical dashed line is drawn at 18.0 ps−1. The apparent double-peak structure is an artifact of this image. On the right a slice around the \(D^{+}_{s}\) mass region shows only the peak as used to measure the central value and rms width

Taking the spectrum for events in a 25 MeV c −2 bin around the \(D^{+}_{s}\) mass, we find a clear and separated peak (Fig. 8, right). The rms width of the peak is 0.4 ps−1, around a peak value of 17.95 ps−1; the rms can be used as a model-independent proxy for the statistical uncertainty. To further evaluate the expected statistical fluctuation in the peak value, we perform a large set of fast simulation pseudo-experiments taking the result of the multivariate fit as a model for signal and background. The uncertainty found from the simulation studies is 0.32 ps−1, slightly smaller than given by the rms. We report Δm s =(17.95±0.40(rms)±0.11(syst)) ps−1, in order to be model-independent. Systematic uncertainties arise from the detector alignment and the k-factor correction method, common to both measurement techniques, as quantified previously in Sect. 6.

8 Conclusion

The mixing frequencies for neutral B mesons have been measured using flavour-specific semileptonic decays. To correct for the momentum lost to missing particles, a simulation-based kinematic correction, known as the k-factor, was adopted. Two techniques were used to measure the mixing frequencies: a multidimensional simultaneous fit to the K + K π + mass distribution, the decay-time distribution, and tagging information; and a simple Fourier analysis. The results of the two methods were consistent, with the first method being more precise. We obtain

$$\begin{aligned} &\Delta m_s = \bigl( 17.93 \pm0.22 \textrm{(stat)} \pm0.15 \textrm {(syst)}\bigr)~\textrm{ps}^{-1} , \\ &\Delta m_d = \bigl( 0.503 \pm0.011 \textrm{(stat)} \pm0.013 \textrm {(syst)}\bigr)~\textrm{ps}^{-1}. \end{aligned}$$

We reject the hypothesis of no oscillations by 5.8 standard deviations for \(B^{0}_{s}\) and 13.0 standard deviations for B 0. This is the first observation of \(B^{0}_{s}\)\(\overline{B}{}^{0}_{s} \) mixing to be made using only semileptonic decays.