Observation of the $B^0_s\rightarrow \chi_{c1}(3872)\pi^+\pi^-$ decay

The first observation of the $B^0_s \rightarrow \left( \chi_{c1}(3872) \rightarrow J/\psi\pi^+\pi^-\right) \pi^+ \pi^-$ decay is reported using proton-proton collision data, corresponding to integrated luminosities of 1, 2 and 6fb$^{-1}$, collected by the LHCb experiment at centre-of-mass energies of 7, 8 and 13TeV, respectively. The ratio of branching fractions relative to the $B^0_s \rightarrow \left( \psi(2S) \rightarrow J\psi\pi^+\pi^- \right) \pi^+ \pi^-$ decay is measured to be $$ \frac{ \mathcal{B} \left( B^0_s \rightarrow \chi_{c1}(3872) \pi^+\pi^-\right) \times \mathcal{B} \left( \chi_{c1}(3872) \rightarrow J\psi\pi^+\pi^-\right)} { \mathcal{B} \left( B^0_s \rightarrow \psi(2S) \pi^+ \pi^- \right) \times \mathcal{B} \left( \psi(2S) \rightarrow J\psi\pi^+\pi^-\right) } = \left( 6.8 \pm 1.1 \pm 0.2 \right) \times 10^{-2} , $$ where the first uncertainty is statistical and the second systematic. The mass spectrum of the $\pi^+\pi^-$ system recoiling against the $\chi_{c1}(3872)$ meson exhibits a large contribution from $B^0_s \rightarrow \chi_{c1}(3872) \left( f_0(980) \rightarrow \pi^+ \pi^-\right)$ decays.


Detector and simulation
The LHCb detector [29,30] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks.The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region [31], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [32,33] placed downstream of the magnet.
The tracking system provides a measurement of the momentum of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c.The momentum scale is calibrated using samples of J/ψ → µ + µ − and B + → J/ψK + decays collected concurrently with the data sample used for this analysis [34,35].The relative accuracy of this procedure is estimated to be 3 × 10 −4 using samples of other fully reconstructed b hadrons, Υ and K 0 S mesons.The minimum distance of a track to a primary pp-collision vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c.Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors (RICH) [36].Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter [37].Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [38].
The online event selection is performed by a trigger [39], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.The hardware trigger selects muon candidates with large transverse momentum or dimuon candidates with a large value of the product of the p T of the muons.In the software trigger, two oppositely charged muons are required to form a good-quality vertex that is significantly displaced from every PV, with a dimuon mass exceeding 2.7 GeV/c 2 .
Simulated events are used to describe signal shapes and to compute the efficiencies needed to determine the branching fraction ratio.In the simulation, pp collisions are generated using Pythia [40] with a specific LHCb configuration [41].Decays of unstable particles are described by the EvtGen package [42], in which final-state radiation is generated using Photos [43].The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [44] as described in Ref. [45].The decays B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π + are simulated using a phase-space decay model that is adjusted to match the mass distributions of the two-pion systems recoiling against the χ c1 (3872) and ψ(2S) mesons in data.In the simulation χ c1 (3872) → J/ψπ + π − decays proceed via an S-wave J/ψρ 0 intermediate state [46][47][48].The model described in Refs.[49][50][51][52][53][54] is used for the ψ(2S) → J/ψπ + π − decays.The simulation is corrected to reproduce the transverse momentum and rapidity distributions of the B 0 s mesons observed in data.To account for imperfections in the simulation of charged-particle reconstruction, the track reconstruction efficiency determined from simulation is corrected using data-driven techniques [55].

Event selection
Candidate B 0 s → J/ψπ + π + π − π − decays are reconstructed using the J/ψ → µ + µ − decay mode.As explained in detail below, an initial selection criteria similar to those used in Refs.[16,56,57] are used to reduce the background.Subsequently, a multivariate estimator, in the following referred as the MLP classifier, is applied.It is based on an artificial neural network algorithm [58,59] configured with a cross-entropy cost estimator [60].
Muon and hadron candidates are identified using combined information from the RICH, calorimeter and muon detectors [61].The candidates are required to have a transverse momentum greater than 550 MeV/c and 200 MeV/c for muons and pions, respectively.To ensure that the particles can be efficiently separated by the RICH detectors, pions are required to have a momentum between 3.2 and 150 GeV/c.To reduce the combinatorial background due to particles produced promptly in the pp interaction, only tracks that are inconsistent with originating from a PV are used.Pairs of oppositely charged muons consistent with originating from a common vertex are combined to form J/ψ candidates.The mass of the dimuon candidate is required to be between 3.05 and 3.15 GeV/c 2 , corresponding to a range of approximately three times the µ + µ − mass resolution, around the known mass of the J/ψ meson [25].Selected J/ψ meson candidates are combined with two pairs of oppositely charged pions to form the B 0 s → J/ψπ + π + π − π − candidates and a requirement on the quality of the common six-prong vertex is imposed.To improve the mass and decay time resolution, a kinematic fit [62] is used in which the momentum direction of the B 0 s candidate is constrained to be collinear to the direction from the PV to the B 0 s decay vertex, and a mass constraint on the J/ψ state is applied.A requirement on the χ 2 of this fit, χ 2 fit , is imposed to reduce the background.The mass of selected J/ψπ + π + π − π − combinations, m J/ψπ + π + π − π − , is required to be between 5.30 and 5.48 GeV/c 2 .The proper decay time of the B 0 s candidates is required to be between 0.2 and 2.0 mm/c.The lower limit is used to reduce background from particles coming from the PV while the upper limit suppresses poorly reconstructed candidates.A possible feed down from Λ 0 b → J/ψpπ + π − π − decays, with the proton misidentified as a pion, is suppressed by rejecting the B 0 s candidates whose mass, recalculated using the proton hypothesis for one of the pion candidates, is consistent with the known mass of the Λ 0 b baryon [25].The final selection of candidates using the MLP classifier is based on the p T and pseudorapidity of J/ψ candidate, the χ 2 of the six-prong vertex, the value of χ 2 fit , the proper decay time of the B 0 s candidates, the transverse momenta of the pions, the dipion mass from the χ c1 (3872) → J/ψπ + π − decays and the angle between the momenta of the J/ψ and B 0 s mesons in the χ c1 (3872) rest frame.The MLP classifier is trained on a sample of simulated B 0 s → (χ c1 (3872) → J/ψπ + π − ) π + π − decays and a background sample of J/ψπ + π + π − π − combinations from the high mass sideband of the B 0 s signal peak, 5.42 < m J/ψπ + π + π − π − < 5.50 GeV/c 2 .The low-mass sideband of the B 0 s signal peak contains a large contribution from partially reconstructed b-hadron decays and therefore is not representative of the background under the B 0 s signal peak.For the background sample, J/ψπ + π − combinations with the J/ψπ + π − mass consistent with the known masses of the χ c1 (3872) state [20,63], 3.86 < m J/ψπ + π − < 3.88 GeV/c 2 , or ψ(2S) meson [25], 3.68 < m J/ψπ + π − < 3.69 GeV/c 2 , are excluded.To avoid introducing a bias in the MLP evaluation, a k-fold cross-validation technique [64] with k = 7 is used.The requirement on the MLP classifier is chosen to maximize the Punzi figure of merit , where ε is the efficiency for the B 0 s → χ c1 (3872)π + π − signal, α = 5 is the target signal significance and B is the expected background yield.The efficiency ε is estimated using simulation.The expected background yield within the narrow mass window centred around the known masses of the B 0 s and χ c1 (3872) mesons [25] is determined from fits to data.The selected B 0 s candidates with J/ψπ + π − mass within the range 3.85 < m J/ψπ + π − < 3.90 GeV/c 2 are considered as the B 0 s → χ c1 (3872)π + π − candidates.Similarly, the B 0 s candidates with the J/ψπ + π − mass within the range 3.67 The signal yields for the B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π − decays are determined using a two-dimensional extended unbinned maximum-likelihood fit to the J/ψπ + π + π − π − and J/ψπ + π − mass distributions.Following Refs.[16,20,66], to improve the resolution on the J/ψπ + π − mass and to eliminate a small correlation between the J/ψπ + π − and J/ψπ + π + π − π − masses, the J/ψπ + π − mass is computed constraining the mass of the B 0 s candidate [62] to its known mass [25].The fit is performed simultaneously in two separate regions of the J/ψπ + π − mass, defined as the χ c1 (3872) region with 3.85 < m J/ψπ + π − < 3.90 GeV/c 2 , and the ψ(2S) region with 3.67 < m J/ψπ + π − < 3.70 GeV/c 2 , which correspond to the B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π − decays, respectively.In the χ c1 (3872) region, the two-dimensional fit model is defined as the sum of four components: 1.A signal component, corresponding to the B 0 s → (χ c1 (3872) → J/ψπ + π − ) π + π − decay, described by the product of the B 0 s and χ c1 (3872) signal templates, discussed in the next paragraph.
3. A component corresponding to random combinations of the χ c1 (3872) state with a π + π − pair, parameterised as a product of the signal χ c1 (3872) template and a second-order positive-definite polynomial function [68].
4. A component corresponding to random J/ψπ + π + π − π − combinations, parameterized as a two-dimensional non-factorizable positive polynomial function P bkg , that is linear in each variable for a fixed value of the other variable.
In the ψ(2S) region, the two-dimensional fit model is defined in an equivalent way with replacement of the signal χ c1 (3872) template with the signal ψ(2S) template.In this approach we neglect possible interference effects between the B 0 s → χ c1 (3872)π + π − and non-resonant B 0 s → J/ψπ + π + π − π − contributions.The B 0 s signal shape is modelled with a modified Gaussian function with power-law tails on both sides of the distribution [69,70].The tail parameters are fixed from simulation, while the mass parameter of the B 0 s meson is allowed to vary.The detector resolution taken from simulation is corrected by a scale factor, s B 0 s , that accounts for a small discrepancy between data and simulation [16,20] and is allowed to vary in the fit.
The χ c1 (3872) and ψ(2S) signal shapes are modelled as relativistic S-wave Breit-Wigner functions convolved with the detector resolution.Due to the proximity of the χ c1 (3872) state to the D 0 D * 0 threshold, modelling this component with a Breit-Wigner function may not be adequate [23,[71][72][73][74].However, the analyses in Refs.[20,63] demonstrate that a good description of data is obtained with a Breit-Wigner line shape when the detector resolution is included.Mass parameters for the χ c1 (3872) and ψ(2S) signals are allowed to vary in the fit, while the mass difference is Gaussian constrained to the known value [20].The width parameters of the χ c1 (3872) and ψ(2S) states are fixed to known values [20,25] using a Gaussian constraint.The detector resolution functions are described by a symmetric modified Gaussian function with power-law tails on both sides , where x * ≡ (x − x min )/(x max − x min ), and x min , x max denote the minimal and maximal values of x, respectively [67].  of the distribution [69,70], with all parameters determined from simulation.The resolutions are further corrected by a common scale factor, s J/ψπ + π − , that accounts for a small discrepancy between data and simulation and is allowed to vary in the fit.
The fit is performed using the B 0 s mass and the resolution scale factors, s B 0 s and s J/ψπ + π − , as shared parameters between the χ c1 (3872)π + π − and ψ(2S)π + π − final states.The J/ψπ + π + π − π − and J/ψπ + π − mass distributions together with projections of the fit are shown in Fig. 1.The parameters of interest, namely the B 0 s yields, masses of the B 0 s meson, χ c1 (3872) and ψ(2S) states, and the resolution scale factors are listed in Table 1.The statistical significance for the B 0 s → χ c1 (3872)π + π − signal is estimated using Wilks' theorem [75] to be 7.3 standard deviations.
The ratio of branching fractions R, defined in Eq. ( 1) is calculated as s → ψ(2S)π + π − decays, respectively.The efficiencies are defined as the product of the detector geometric acceptance and the reconstruction, selection, particle identification and trigger efficiencies.All of the efficiency contributions, except the particle identification efficiency, are determined using simulated samples.The efficiencies of the hadron identification are obtained as a function of particle momentum, pseudorapidity and number of charged tracks in the event using dedicated calibration samples of D * + → (D 0 → K − π + ) π + and K 0 S → π + π − decays selected in data [36,76].The efficiency ratio is found to be where the uncertainty is due to the limited size of the simulated samples.The efficiency ratio differs from unity due to the different p T spectra of pions in the ψ(2S) → J/ψπ + π − and the χ c1 (3872) → J/ψπ + π − decays.The resulting value of R is where the uncertainty is statistical.

Dipion mass spectrum
The mass spectra for the dipion system recoiling against the χ c1 (3872) and ψ(2S) states are obtained using the sPlot technique [77], based on results of the two-dimensional fit described in previous section.The spectra shown in Fig. 2 exhibit significant deviations from the phase-space distribution and are similar to the π + π − mass spectrum observed in the B 0 s → J/ψπ + π − decay [78][79][80] and the S-wave π + π − component in D + s → π + π + π − decays [81,82].
In Ref. [78] it has been found that the π + π − mass spectrum from B 0 s → J/ψπ + π − decays can be described by a coherent sum of the f 0 (980) amplitude and a contribution from a high-mass scalar meson, initially identified as the f 0 (1370) state.References [83,84] suggest a better interpretation of the high-mass state as the f 0 (1500) resonance.A dedicated amplitude analysis from Ref. [80] followed this suggestion and confirmed the dominant f 0 (980) and f 0 (1500) contributions in B 0 s → J/ψπ + π − decays.Assuming a similarity with the B 0 s → J/ψπ + π − decays, the mass spectra of the dipion system recoiling against the χ c1 (3872) and ψ(2S) states are described with a function consisting of two components: • A component corresponding to a coherent sum of the scalar f 0 (980) and f 0 (1500) amplitudes and parameterised as where m is the π + π − mass, q is the momentum of the π + meson in the π + π − rest frame, p is the momentum of the π + π − system in the B 0 s rest frame, A f 0 (980) and A f 0 (1500) are the f 0 (980) and f 0 (1500) amplitudes, φ is a relative phase and the real coefficient f characterises the relative contributions of the f 0 (980) and f 0 (1500) components.The amplitude A f 0 (1500) is parameterised as a relativistic S-wave Breit-Wigner function, while the modified Flatté-Bugg amplitude [85,86] (see Eq. ( 18) in Ref. [80]) is used for the f 0 (980) state.
The fit is performed simultaneously for the dipion mass spectra from the B 0 s → χ c1 (3872)π + π + and B 0 s → ψ(2S)π + π + decays.The shape parameters of the f 0 (980) and f 0 (1500) states are shared in the fit.To stabilise the fit, Gaussian constraints are applied to the parameters of the f 0 (980) state and the mass and width of the f 0 (1500) state, according to Solution I from Ref. [80].
The fit results are shown in Fig. 2.This simplistic model qualitatively describes the major contributions to the dipion mass spectrum from B 0 s → ψ(2S)π + π − decays and supports the hypothesis of the dominant contribution of two S-wave resonances.The fit indicates the necessity of a dedicated analysis to properly account for the sub-leading contributions.The same model, consisting of two coherent contributions from the f 0 (980) and f 0 (1500) states describes well the dipion mass spectrum from the B 0 s → χ c1 (3872)π + π − decay.The statistical significance for the B 0 s → χ c1 (3872)f 0 (980) decay is estimated using Wilks' theorem and found to be 9.1 standard deviations.

Systematic uncertainties
Due to the similar decay topologies, systematic uncertainties largely cancel in the ratio R. The remaining contributions to systematic uncertainties are summarized in Table 2 and discussed below.
The systematic uncertainties arising from imperfect knowledge of the signal and background shapes used for determination of the B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π − signal yields are estimated with alternative models.For the B 0 s signal shape the bifurcated Student's t-distribution [87], Apollonios and Hypatia distributions [88] are tested as alternative models.The J/ψπ + π − mass resolution functions are also modelled with Student's t-distribution and symmetric Apollonios functions.The degree of the polynomial functions used for the parameterisation of the background and the non-resonant J/ψπ + π − components is varied by one unit.Exponential functions are used as components in alternative background models.In addition, the detector resolution scale factors s B 0 s and s J/ψπ + π − are constrained to values of s B 0 s = 1.04 ± 0.02, s J/ψπ + π − = 1.06 ± 0.02 from The background-subtracted mass spectra for the dipion system recoiling against the χ c1 (3872) or ψ(2S) states for (left) B 0 s → ψ(2S)π + π − and (right) B 0 s → χ c1 (3872)π + π − decays.The results of the fit, described in the text, are overlaid.
Ref. [16] and s B 0 s = 1.052 ± 0.003, s J/ψπ + π − = 1.048 ± 0.004 from Ref. [20] using Gaussian constraints.For each alternative model the ratio of the B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π − signal yields is recalculated.The maximum relative deviation with respect to the baseline value is found to be 2.5% which is assigned as a relative systematic uncertainty for the ratio R. The fit procedure itself is tested using a large sample of pseudoexperiments, generated using the default model with parameters extracted from data and the results are found to be unbiased.The B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π − decays are simulated as phase-space decays and corrected to reproduce the dipion mass spectra observed in data.A weighting procedure, based on a gradient boosted tree algorithm [89], is used for corrections of simulated samples.The systematic uncertainty related to the correction method is estimated by varying the hyper-parameters of the regression trees ensemble.The maximum deviation from the baseline value, 0.9%, is taken as the uncertainty associated with the unknown B 0 s decay models.An additional systematic uncertainty on the ratio R arises due to differences between data and simulation.In particular, there are differences in the reconstruction efficiency of charged-particle tracks that do not cancel completely in the ratio due to the different kinematic distributions of the final-state particles.The track-finding efficiencies obtained from simulated samples are corrected using data-driven techniques [55].The uncertainties related to the efficiency correction factors, together with the uncertainty on the hadron-identification efficiency due to the finite size of the calibration samples [36,76], are propagated to the ratio of the total efficiencies using pseudoexperiments and amount to 0.1%.A systematic uncertainty on the ratio related to the knowledge of the trigger efficiencies is estimated by comparing the ratios of trigger efficiencies in data and simulation for large samples of B + → J/ψK + and B + → ψ(2S)K + decays [90] and is taken to be 1.1%.
Remaining data-simulation differences, that are not previously discussed, are investigated by varying the selection criteria.The resulting variations in the ratios of the efficiency-corrected yields do not exceed 2%, which is taken as the corresponding systematic uncertainty.The final systematic uncertainty considered on the ratios of branching fractions is due to the knowledge of the ratios of efficiencies in Eq. ( 2), limited by the size of simulated samples.It is determined to be 0.5%.
No systematic uncertainty is included for the admixture of the CP -odd and CPeven B 0 s eigenstates in the B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π − decays [91], which is assumed to be the same for both channels.Analysis of the dipion spectrum from the B 0 s → ψ(2S)π + π − decays in Sec. 5 indicates that the final state is predominantly CP -odd, with the effective lifetime corresponding to the heavy-mass long-lifetime B 0 s eigenstate [92][93][94].The dipion spectrum from the B 0 s → χ c1 (3872)π + π − decays is also consistent with being predominantly CP -odd.In the extreme case that the B 0 s → χ c1 (3872)π + π − decay is an equal mixture of the short-lifetime and long-lifetime eigenstates, the corresponding ratio of branching fractions would change by 2.4 %.
The statistical significance for the B 0 s → χ c1 (3872)π + π − decay is recalculated using Wilks' theorem for each alternative fit model, and the smallest value of 7.3 standard deviations is taken as the significance that includes the systematic uncertainty.
Alternative models are used also for the fit to the dipion mass spectra from the B 0 s → χ c1 (3872)π + π − and B 0 s → ψ(2S)π + π − channels.In particular, Solution II from Ref. [80] is investigated as an alternative external constraint for the f 0 (980) and f 0 (1500) parameters.The model with the f 0 (1370) resonance instead of the f 0 (1500) state is used as alternative model with a constraint to the known parameters of the f 0 (1370) state [25].As an alternative model for the noncoherent nonresonant component, a product of the Φ 2,3 phase-space function and the positive second-order polynomial function [68] has been probed.Also, a coherent nonresonant contribution, parameterised with a complex-valued constant function, is added to the function from Eq. ( 4).The smallest significance for the B 0 s → χ c1 (3872)f 0 (980) decay is found to be 7.4 standard deviations, which is taken as the significance including systematic uncertainty.

Table 2 :
Relative systematic uncertainties (in %) for the ratio of branching fractions.The sources are described in the text.