Observation of the $B_c^+ \to J/\psi \pi^+ \pi^0$ decay

The first observation of the $B_c^+ \to J/\psi \pi^+ \pi^0$ decay is reported with high significance using proton-proton collision data, corresponding to an integrated luminosity of 9fb$^{-1}$, collected with the LHCb detector at centre-of-mass energies of 7, 8, and 13 TeV. The ratio of its branching fraction relative to the $B_c^+ \to J/\psi \pi^+$ channel is measured to be $$ \frac{ {\cal{B}}( B_c^+ \to J/\psi \pi^+\pi^0 ) } { {\cal{B}}( B_c^+ \to J/\psi \pi^+ ) } = 2.80 \pm 0.15 \pm 0.11 \pm 0.16 \,, $$ where the first uncertainty is statistical, the second systematic and the third related to imprecise knowledge of the branching fractions for $B^+ \to J/\psi K^{*+}$ and $B^+ \to J/\psi K^+$ decays, which are used to determine the $\pi^0$ detection efficiency. The $\pi^+\pi^0$ mass spectrum is found to be consistent with the dominance of an intermediate $\rho^+$ contribution in accordance with a model based on QCD factorisation.


Event selection
The signal B + c → J/ψπ + π 0 and control B + → J/ψ (K * + → K + π 0 ) candidates are reconstructed using J/ψ → µ + µ − decays.The same dimuon final state of the J/ψ meson is used to reconstruct the normalisation B + c → J/ψπ + candidates.A loose preselection similar to that used in Refs.[14,39,104] is applied, followed by a multivariate classifier to select a higher purity subset of candidates.
Muons, pions and kaons are identified by combining information from the RICH, calorimeter and muon detectors, and they are required to have transverse momenta larger than 500 MeV/c.Pions and kaons are further required to have a momentum between 3.2 and 150 GeV/c to ensure good performance of the particle identification in the RICH detectors [76,105].To reduce the combinatorial background due to charged particles produced in pp interactions, only tracks that are inconsistent with originating from any PV are used.Photons are reconstructed from clusters in the electromagnetic calorimeter with transverse energy above 300 MeV.The clusters must not be associated with reconstructed tracks [106,107].
Pairs of oppositely charged muons consistent with originating from a common vertex are combined to form J/ψ → µ + µ − candidates.The reconstructed mass of the µ + µ − pair is required to be within the range 3.0 < m µ + µ − < 3.2 GeV/c 2 .The position of the reconstructed dimuon vertex is required to be separated from any reconstructed PV.The π 0 candidates are reconstructed as diphoton pairs with mass within 30 MeV/c 2 of the known mass of the π 0 meson [40].
The selected J/ψπ + π 0 , J/ψπ + and J/ψK + π 0 combinations are used to form the B + c and B + candidates.The π + π 0 mass m π + π 0 is required to satisfy 620 < m π + π 0 < 920 MeV/c 2 and the K + π 0 mass must be within 50 MeV/c 2 of the known mass of the K * + meson [40].To improve the mass resolution for the B mesons, a kinematic fit [108] constrains the masses of the J/ψ and π 0 candidates to their known values [40] and requires the B candidates to originate from their associated PV. 2 The decay time t B of the B + c (B + ) candidates is required to satisfy 0.1 < t B < 1.0 mm/c (0.2 < t B < 2.0 mm/c) to suppress the large combinatorial background from tracks produced at the PV and from misreconstructed B candidates.The B + c → J/ψπ + π 0 candidates with a J/ψπ + mass within 45 MeV/c 2 of the known mass of the B + c meson are vetoed to avoid contamination from B + c → J/ψπ + decays with a random π 0 added.Similarly, candidates with the J/ψπ + mass between 5.180 and 5.305 GeV/c 2 are also vetoed to remove the background from B + → J/ψπ + and B + → J/ψK + decays with a random π 0 added.
Further selection of the B + c and B + candidates is based on a multivariate estimator known as a multi-layer perceptron (MLP) classifier.The MLP classifier is based on an artificial neural network algorithm [109,110] configured with a cross-entropy cost estimator [111].It reduces the combinatorial background to a low level while retaining a high signal efficiency.Three MLP classifiers are trained separately for the signal, normalisation and control candidates.The variables used in the classifiers include those related to the reconstruction quality, kinematics and decay time of the B + c and B + candidates, kinematics of the final-state hadrons and photons, as well as a variable that characterises the identification of charged pions and kaons.The classifiers are trained using simulated signal samples, while the B + c and B + candidates from data with mass outside the regions 6.00 < m J/ψπ + π 0 < 6.50 GeV/c 2 , 6.20 < m J/ψπ + < 6.34 GeV/c 2 and 5.10 < m J/ψK + π 0 < 5.50 GeV/c 2 are used to represent the background for the signal, normalisation and control modes, respectively.The k-fold cross-validation technique [112] with k = 11 is used to avoid introducing a bias in the MLP evaluation.The requirement on each of the MLP classifiers is chosen to maximise the figure-of-merit defined as S/ √ S + κB, where S represents the expected signal yield, B is the background yield and κ is the fraction of the total background that affects the signal determination.The background yield B is calculated from fits to data, while S = εS 0 , where ε is the efficiency of the requirement on the response of the MLP classifier determined from simulation and S 0 is the signal yield obtained from the fit to the data with a loose requirement applied.The factor κ is determined from pseudoexperiments.The mass distributions for the selected B + c → J/ψπ + , B + → J/ψK * + and B + c → J/ψπ + π 0 candidates are shown in Figs. 2, 3 and 4, respectively.

Determination of signal yields
The yields for the normalisation B + c → J/ψπ + , control B + → J/ψK * + and signal B + c → J/ψπ + π 0 decays are determined using extended unbinned maximum-likelihood fits to the J/ψπ + , J/ψK + π 0 and J/ψπ + π 0 mass spectra shown in Figs. 2, 3 and 4. Each signal component is parameterised as the sum of a modified Gaussian function with power law tails on both sides of the distribution [113,114] and a standard Gaussian function with common location parameter. 3The background components are parameterised by positive decreasing second-order polynomials for the fits to the J/ψπ + and J/ψK + π 0 mass spectra and with a positive convex decreasing fourth-order polynomial for the fit to the J/ψπ + π 0 mass spectrum [115].In addition, the fit model for the J/ψπ + mass spectrum includes a small component corresponding to the contribution from Cabibbo-suppressed B + c → J/ψK + decays [11,17] where the charged kaon is reconstructed as a pion.The shape of this component is taken from simulation.The fit model for the J/ψK + π 0 mass spectrum includes an additional component corresponding to contributions from B + → J/ψK + π 0 π 0 and B 0 → J/ψK + π 0 π − decays with missing π 0 or π − particles.This component is parameterised using a Gaussian function.The signal shape parameters, except for the location parameters, are determined from simulation and their uncertainties are propagated to the fits using multivariate Gaussian constraints.For each fit, the location and resolution  parameters are expressed as where m B is the known mass of the corresponding beauty meson [40], σ MC is the resolution parameter obtained from simulation, the term δm B corrects for a possible mass bias, and the scale factor s takes into account any difference in the mass resolution between simulation and data.The parameters δm B and s are allowed to vary in the J/ψπ + and J/ψK + π 0 mass spectra fits, while the J/ψπ + π 0 fit restricts them to the values obtained for the B + → J/ψK + π 0 mode with their corresponding uncertainties propagated using Gaussian constraints.The results of the fits are shown in Figs. 2, 3 and 4 with the parameters summarised in Table 1.The statistical significance for the B + c → J/ψπ + π 0 signal is estimated using Wilks' theorem [116] and is found to exceed 20 standard deviations.To validate the observed B + c → J/ψπ + π 0 signal, several cross-checks are performed.The data are split into data-taking periods with different polarity of the dipole magnet and into B + c and B − c samples.Alternative multivariate estimators, such as decision trees with gradient boosting [117] and a projective likelihood estimator [118], are used.The results are found to be consistent among all samples and analysis techniques.To validate the predictions of the BLL model, the π + π 0 mass spectrum from B + c → J/ψπ + π 0 decays is studied by extending the π + π 0 mass interval to the region m π + π 0 < 1.6 GeV/c 2 .The background-subtracted π + π 0 mass spectrum is shown in Fig. 5, which uses the sPlot technique [119] based on the fit to the J/ψπ + π 0 mass spectrum with the wider π + π 0 mass interval.There is a clear peak corresponding to the decay ρ + → π + π 0 , as expected by the BLL model.To quantify the possible deviations Table 1: Yields N , mass biases δm B and resolution scale factors s from the fits to the J/ψπ + , J/ψK + π 0 and J/ψπ + π 0 mass spectra.The uncertainties for δm B include those from the known masses of the B + c and B + mesons.The mass bias and resolution scale parameters for the B + c → J/ψπ + π 0 channel are constrained by those from the B + → J/ψK + π 0 control channel.
Channel   from the predictions, a fit to the π + π 0 mass spectrum is performed using a function consisting of two components.The signal component corresponds to the coherent sum of the B + c → J/ψρ + and B + c → J/ψρ(1450) + contributions [93], with the shape obtained from the BLL model.The second component describes contributions differing from the BLL model, generically parameterised as where q is the pion momentum in the dipion rest-frame, p is the π + π 0 momentum in the B + c rest-frame [120], and P + 1 is a positive first-order polynomial function [115].This parameterisation accounts for the phase-space terms and the suppression of the S-wave contribution.The unknown decay dynamics are absorbed by the polynomial function P + 1 .The result of the fit to the background-subtracted π + π 0 mass spectrum is superimposed in Fig. 5.The contribution of the second component vanishes in the fit, demonstrating good agreement between the observed π + π 0 mass spectrum and the expectations from the BLL model [48][49][50][51]92].

Branching fraction ratio computation
The ratio of branching fractions for the B + c → J/ψπ + π 0 and B + c → J/ψπ + decays is calculated as where the yields N B + c →J/ψπ + π 0 and N B + c →J/ψπ + are taken from Table 1, and ε denotes the corresponding efficiency, which is defined as the product of the detector acceptance and the reconstruction, selection and trigger efficiencies, where each subsequent efficiency is defined with respect to the previous one.Each of the partial efficiencies is calculated using the simulation samples described in Sec. 2. The ratio of branching fractions is found to be R = 2.80 ± 0.15 , where the uncertainty is statistical only.

Systematic uncertainties
Many sources of systematic uncertainties cancel in the ratio of branching fractions R due to the similarity between the signal and normalisation decay channels.The remaining contributions to the systematic uncertainty for R are summarised in Table 2 and discussed below.
An important source of systematic uncertainty is the imperfect knowledge of the shapes of the signal and background components used in the fits.To estimate this effect, several alternative models are tested.The B + c → J/ψπ + π 0 signal shape is parameterised as a sum of a Gaussian and three alternative functions: 1. a bifurcated generalised Student's t-distribution [121,122]; 2. a Johnson's S U -distribution [123,124]; 3. a generalised hyperbolic distribution [125,126].
Similarly, the B + c → J/ψπ + signal component is modelled as the sum of a Gaussian function with three alternative shapes: 1. a bifurcated Student's t-distribution; 2. a generalised hyperbolic distribution; 3. a Pearson's type IV distribution [127].
Five alternative background functions are tested for the fit to the J/ψπ + π 0 mass spectrum: Lastly, for the background component of the J/ψπ + mass spectrum four alternative fit models are tested: 1. a positive decreasing third-order polynomial function; 2. a product of an exponential function and a positive first-order polynomial function; 3. a positive third-order polynomial function; 4. a positive fourth-order polynomial function.
For each alternative signal or background function, a large ensemble of pseudoexperiments is produced and fit with the baseline model.The absolute value of the mean over an ensemble for the relative difference between the fitted B + c → J/ψπ + π 0 or B + c → J/ψπ + signal yields and their baseline results is calculated.Its maximal value is found to be 1.0% (2.1%) for the alternative signal (background) shapes and is taken as the systematic uncertainty related to the fit model.For each model test, the signal significance of the newly observed B + c → J/ψπ + π 0 decay is estimated using Wilks' theorem [116] and is always found to exceed 20 standard deviations.
An uncertainty may originate from possible differences in the B + c production kinematics between data and simulation.The transverse momentum and rapidity spectra of the B + c mesons in the simulation are weighted to match those observed in a high-yield, low-background sample of B + c → J/ψπ + decays.The systematic uncertainty of the efficiency ratio caused by the finite size of this sample is estimated by varying the B + c -meson production kinematic spectra within their uncertainties in the weighting procedure.The induced variation for the ratio R is found to be smaller than 0.1%.Simulated B + c → J/ψπ + π 0 events are corrected for polarisation of the dipion system in the B + c rest frame to reproduce the corresponding angular distribution observed in data.Variations of this correction within statistical uncertainties causes a relative variation of 2.2% for the efficiency for B + c → J/ψπ + π 0 decays, which is taken as the systematic uncertainty related to the B + c decay model.There are residual differences in the reconstruction efficiency of charged-particle tracks that do not cancel completely in the ratio due to slightly different kinematic distributions of the final-state particles.The track-finding efficiencies obtained from simulated samples are corrected using calibration channels [98].The uncertainties related to the efficiency correction factors are propagated to the ratios of the total efficiencies using pseudoexperiments and found to be 0.1% for the ratio R.
The difference of the photon reconstruction efficiency between data and simulation is studied using large samples of B + → J/ψ(K * + → K + (π 0 → γγ)) and B + → J/ψK + decays [99][100][101][102][103].The uncertainty for the photon efficiency corrections has three components: statistical, systematic and those related to the imprecise knowledge of the ratio of branching fractions for B + → J/ψK * + and B + → J/ψK + decays.The statistical and systematic uncertainties are propagated to the ratio of the total efficiencies using pseudoexperiments and the resulting 0.9% is taken as the systematic uncertainty related to the photon reconstruction.The uncertainty due to imprecise knowledge of external branching fractions of 5.9% [40] is treated separately.
The simulated detector response used for identification of pions is resampled from the control channels [76,97].The systematic uncertainty obtained through this procedure arises from the kernel shape used in the estimation of the probability density distributions.An alternative combined response is estimated using a modified kernel shape and the efficiency models are regenerated [130,131].The difference between the two efficiency ratios of 0.6% is taken as the systematic uncertainty related to hadron identification.
The systematic uncertainty related to the trigger efficiency has been previously studied by comparing the ratios of the trigger efficiencies in data and simulation using large samples of B + → J/ψK + and B + → ψ(2S)K + decays [132].Based on this comparison, a conservative estimate of 1.1% for the relative difference between data and simulation is taken as the corresponding systematic uncertainty.
The imperfect data description by simulation due to remaining effects not described above is studied by varying the MLP selection criteria for the normalisation decay B + c → J/ψπ + .The observed maximal difference between the efficiency estimated using data and simulation does not exceed 2.0%.This value is taken as the corresponding systematic uncertainty for the ratio R. The last systematic uncertainty considered is due to the finite size of the simulated samples, which amounts to 0.5%.The total systematic uncertainty is calculated as the quadratic sum of the individual contributions.

Figure 2 :
Figure 2: Mass distribution for selected B + c → J/ψπ + candidates with the result of the fit described in the text.

Figure 3 :
Figure 3: Mass distribution for selected B + → J/ψK * + candidates with the result of the fit described in the text.

Figure 4 :
Figure 4: Mass distribution for selected B + c → J/ψπ + π 0 candidates with the result of the fit, described in the text.

Figure 5 :
Figure 5: Background-subtracted π + π 0 mass distribution from the B + c → J/ψπ + π 0 decays.The orange line shows the result of the two-component fit described in the text.The blue dashed line corresponds to the component parameterised by Eq. (3), which vanishes in the fit.

Table 2 :
Relative systematic uncertainties for the ratio R. The total systematic uncertainty is the quadratic sum of individual contributions.The systematic uncertainty of the π 0 reconstruction efficiency due to imprecise knowledge of the branching fraction of the B + → J/ψK * + and B + → J/ψK + decays is treated separately.positive decreasing convex third-order polynomial function[115]; 2. a positive decreasing fourth-order polynomial function; 3. a product of an exponential function and a positive third-order polynomial function; 4. a sum of an exponential function and a positive third-order polynomial function; 5. a sum of a generalised Pareto distribution[128,129] and a positive first-order polynomial function.