Searches for $B^{0}_{(s)} \rightarrow J/{\psi} p \overline{p}$ and $B^{+} \to J/{\psi} p \overline{p} {\pi}^{+}$ decays

The results of searches for $B^{0}_{(s)} \rightarrow J/{\psi} p \overline{p}$ and $B^{+} \to J/{\psi} p \overline{p} {\pi}^{+}$ decays are reported. The analysis is based on a data sample, corresponding to an integrated luminosity of 1.0 fb$^{-1}$ of $pp$ collisions, collected with the LHCb detector. An excess with 2.8$\,\sigma$ significance is seen for the decay $B^{0}_{s} \to J/{\psi} p \overline{p}$ and an upper limit on the branching fraction is set at the 90 % confidence level: ${\cal B}(B^{0}_{s} \to J/{\psi} p \overline{p})<4.8 \times 10^{-6}$, which is the first such limit. No significant signals are seen for $B^{0} \to J/{\psi} p \overline{p}$ and $B^{+} \to J/{\psi} p \overline{p} {\pi}^{+}$ decays, for which the corresponding limits are set: ${\cal{B}}(B^{0} \to J/{\psi} p \overline{p})<5.2 \times 10^{-7}$, which significantly improves the existing limit; and ${\cal{B}}(B^{+} \to J/{\psi} p \overline{p} {\pi}^{+})<5.0 \times 10^{-7}$, which is the first limit on this branching fraction.

In this paper, the results of a search for B 0 → J/ψ pp and B 0 s → J/ψ pp decays are presented. No prediction or experimental limit exists for the branching fraction B(B 0 s → J/ψ pp), but it is of interest to measure the suppression relative to B 0 s → J/ψ π + π − [16]. In addition, a search for the decay B + → J/ψ ppπ + is performed, for which no published measurement exists. All branching fractions are measured relative to that of the decay B 0 s → J/ψ π + π − , which is well suited for this purpose due to its similar topology to the signal decays. Additionally, the lower background level and its more precisely measured branching fraction make it a more suitable normalisation channel than the companion B 0 mode.

Detector and dataset
The LHCb detector [17] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream. The combined tracking system provides momentum measurement with relative uncertainty that varies from 0.4% at 5 GeV/c to 0.6% at 100 GeV/c, and impact parameter (IP) resolution of 20 µm for tracks with high transverse momentum (p T ). Charged hadrons are identified using two ring-imaging Cherenkov detectors [18]. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [19]. The trigger [20] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.
The analysis uses a data sample, corresponding to an integrated luminosity of 1.0 fb −1 of pp collision data at a centre-of-mass energy of 7 TeV, collected with the LHCb detector during 2011. Samples of simulated events are also used to determine the signal selection efficiency, to model signal event distributions and to investigate possible background contributions. In the simulation, pp collisions are generated using Pythia 6.4 [21] with a specific LHCb configuration [22]. Decays of hadronic particles are described by EvtGen [23], in which final state radiation is generated using Photos [24]. The interaction of the generated particles with the detector and its response are implemented using the Geant4 toolkit [25] as described in Ref. [26].

Trigger and selection requirements
The trigger requirements for this analysis exploit the signature of the J/ψ → µ + µ − decay, and hence are the same for the signal and the B 0 s → J/ψ π + π − control channel. At the hardware stage either one or two identified muon candidates are required. In the case of single muon triggers, the transverse momentum of the candidate is required to be larger than 1.5 GeV/c. For dimuon candidates a requirement on the product of the p T of the muon candidates is applied, √ p T 1 p T 2 > 1.3 GeV/c. In the subsequent software trigger, at least one of the final state muons is required to have both p T > 1.0 GeV/c and IP > 100 µm. Finally, the muon tracks are required to form a vertex that is significantly displaced from the primary vertices (PVs) and to have invariant mass within 120 MeV/c 2 of the known J/ψ mass, m J/ψ [12].
The selection uses a multivariate algorithm (hereafter referred to as MVA) to reject background. A neural network is trained on data using the B 0 s → J/ψ π + π − control channel as a proxy for the signal decays. Preselection criteria are applied in order to obtain a clean sample of the control channel decays. The muons from the J/ψ decay must be well identified and have p T > 500 MeV/c. They should also form a vertex with χ 2 vtx < 12 and have invariant mass within the range −48 < m µ + µ − − m J/ψ < 43 MeV/c 2 . The separation of the J/ψ vertex from all PVs must be greater than 3 mm. The pion candidates must be inconsistent with the muon hypothesis, have p T > 200 MeV/c and have minimum χ 2 IP with respect to any of the PVs greater than 9, where the χ 2 IP is defined as the difference in χ 2 of a given PV reconstructed with and without the considered track. In addition, the scalar sum of their transverse momenta must be greater than 600 MeV/c. The B candidate formed from the J/ψ and two oppositely charged hadron candidates should have χ 2 vtx < 20 and a minimum χ 2 IP with respect to any of the PVs less than 30. In addition, the cosine of the angle between the B candidate momentum vector and the line joining the associated PV and the B decay vertex (B pointing angle) should be greater than 0.99994.
The mass distribution of candidate B 0 (s) → J/ψ π + π − decays remaining after the preselection is then fitted in order to obtain signal and background distributions of the variables that enter the MVA training, using the sPlot technique [27]. The fit model is described in Sec. 4. The variables that enter the MVA training are chosen to minimise any difference in the selection between the signal and control channels. Different selection algorithms are trained for the B 0 (s) → J/ψ pp mode and for the B + → J/ψ ppπ + mode, with slightly different sets of variables. The variables in common between the selections are the minimum χ 2 IP of the B candidate; the cosine of the B pointing angle; the χ 2 of the B and J/ψ candidate vertex fits; the χ 2 per degree of freedom of the track fit of the charged hadrons; and the minimum IP of the muon candidates. For the B 0 (s) → J/ψ pp selection the following additional variables are included: the p T of the charged hadron and J/ψ candidates; the p T of the B candidate; and the flight distance and flight distance significance squared of the B candidate from its associated PV. For the B + → J/ψ ppπ + selection only the momentum and p T of the muon candidates are included as additional variables.
The MVAs are trained using the NeuroBayes package [28]. Two different figures of merit are considered to find the optimal MVA requirement. The first is that suggested in Ref. [29] Q where a = 3 and quantifies the target level of significance, MVA is the efficiency of the selection of the signal candidates, which is determined from simulated signal samples, and B MVA is the expected number of background events in the signal region; which is estimated by performing a fit to the invariant mass distribution of the data sidebands. The second figure of merit is an estimate of the expected 90 % confidence level upper limit on the branching fraction in the case that no signal is observed where σ N sig is the expected uncertainty on the signal yield, which is estimated from pseudo-experiments generated with the background-only hypothesis. The maximum of the first and the minimum of the second figure of merit are found to occur at very similar values. For the B 0 (s) → J/ψ pp (B + → J/ψ ppπ + ) decay, requirements are chosen such that approximately 50 % (99 %) of the signal is retained while reducing the background to 20 % (70 %) of its level prior to the cut. The background level for the B + → J/ψ ppπ + decay is very low due to its proximity to threshold, and only a loose MVA requirement is necessary.
The particle identification (PID) selection for the signal modes is optimised in a similar way using Eq. (1). It is found that, for the signal channels, placing a tight requirement on the proton with a higher value for the logarithm of the likelihood ratio of the proton and pion hypotheses [18] and a looser requirement on the other proton results in much better performance than applying the same requirement on both protons. No PID requirements are made on the pion track in the B + → J/ψ ppπ + mode.
The acceptance and selection efficiencies are determined from simulated signal samples, except for those of the PID requirements, which are determined from data control samples to avoid biases due to known discrepancies between data and simulation. High-purity control samples of Λ → pπ − (D 0 → K − π + ) decays with no PID selection requirements applied are used to tabulate efficiencies for protons (pions) as a function of their momentum and p T . The kinematics of the simulated signal events are then used to determine an average efficiency. Possible variations of the efficiencies over the multibody phase space are considered. The efficiencies are determined in bins of the Dalitz plot, m 2 J/ψ h + vs. m 2 h + h − , where h = π, p; the J/ψ decay angle (defined as the angle between the µ + and the pp system in the J/ψ rest frame); and the angle between the decay planes of the J/ψ and the h + h − system. The variation with the Dalitz plot variables is the most significant. For the B 0 s → J/ψ π + π − control sample, the distribution of the signal in the phase space variables is determined using the sPlot technique and these distributions are used to find a weighted average efficiency.
A number of possible background modes, such as cross-feed from B 0 (s) → J/ψ h + h − final states (where h ( ) = π, K), have been studied using simulation. None of these are found to give a significant peaking contribution to the B candidate invariant mass distribution once all the selection criteria had been applied. Therefore, all backgrounds in the fits to the mass distributions of B 0 (s) → J/ψ pp and B + → J/ψ ppπ + candidates are considered as being combinatorial in nature. For the fits to the B 0 s → J/ψ π + π − control channel, some particular backgrounds are taken into account, as described in the following section.

Fit model and results
Signal and background event yields are estimated by performing unbinned extended maximum likelihood fits to the invariant mass distributions of the B candidates. The signal probability density functions (PDFs) are parametrised as the sum of two Crystal Ball (CB) functions [30], where the power law tails are on opposite sides of the peak. This form is appropriate to describe the asymmetric tails that result from a combination of the effects of final state radiation and stochastic tracking imperfections. The two CB functions are constrained to have the same peak position, equal to the value fitted in the simulation. The resolution parameters are allowed to vary within a Gaussian constraint, with the central value taken from the simulation and scaled by the ratio of the values found in the control channel data and corresponding simulation. The proximity to threshold of the signal decays provides a mass resolution of 1-3 MeV/c 2 , whereas for the normalisation channel it is 6-9 MeV/c 2 . The tail parameters and the relative normalisation of the two CB functions are taken from the simulated distributions and fixed for the fits to data.
A second-order polynomial function is used to describe the combinatorial background component in the B 0 (s) → J/ψ pp spectrum while an exponential function is used for the same component in the B + → J/ψ ppπ + and B 0 (s) → J/ψ π + π − channels. The parameters of these functions are allowed to vary in the fits. There are several specific backgrounds that contribute to the B 0 (s) → J/ψ π + π − invariant mass spectrum [14], which need to be explicitly modelled. In particular, the decay B 0 → J/ψ K + π − , where a kaon is misidentified as a pion, is modelled by an exponential function. The yield of this contribution is allowed to vary in order to enable a better modelling of the background in the low mass region. Two additional sources of peaking background are considered: partially reconstructed decays, such as B 0 s → J/ψ η (ργ); and decays where an additional low momentum pion is included from the rest of the event, such as B + → J/ψ K + . Both distributions are fitted with a non-parametric kernel estimation, with shapes fixed from simulation. The yields of these components are also fixed to values estimated from the known branching fractions and selection efficiencies evaluated from simulation.
In order to validate the stability of the fit, a series of pseudo-experiments have been generated using the PDFs described above. The experiments are conducted for a wide range of generated signal yields. No significant bias is observed in any of the simulation ensembles; any residual bias being accounted for as a source of systematic uncertainty.
The statistical significances of the signal yields are computed from the change in the fit likelihood when omitting the corresponding component, according to 2 ln(L sig /L 0 ), where L sig and L 0 are the likelihoods from the nominal fit and from the fit omitting the signal component, respectively. The statistical significances are found to be 1.2 σ, 3.0 σ and 0.2 σ for the decays B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + , respectively. The statistical likelihood curve is convolved with a Gaussian function of width given by the systematic uncertainty. The resulting negative log likelihood profiles are shown in Fig. 3. The total significances of each signal are found to be 1.0 σ, 2.8 σ and 0.2 σ for the modes B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + , respectively.

Systematic uncertainties
Many potential sources of systematic uncertainty are reduced by the choice of the normalisation channel. Nonetheless, some factors remain that could still affect the measurements of the branching fractions. The sources and their values are summarised in Table 1. Precise knowledge of the selection efficiencies for the modes is limited both by the simulation sample size and by the variation of the efficiency over the multi-body phase space, combined with the unknown distribution of the signal over the phase space. The simulation sample size contributes an uncertainty of approximately 1 % in each of the channels, and the effect of efficiency variation across the phase space, determined from the spread of values obtained in bins of the relevant variables, is evaluated to be 17 %, 14 % and 23 % for B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + decays, respectively. The large systematic uncertainties reflect the unknown distribution of signal events across the phase space. In contrast, the uncertainty for the B 0 s → J/ψ π + π − normalisation channel is estimated by varying the binning scheme in the phase space variables and is found to be only 1% for both the B 0 (s) → J/ψ pp and B + → J/ψ ppπ + MVA selections. Possible biases due to training the MVA using the control channel were investigated and found to be negligible.
The proton PID efficiency is measured using a high-purity data sample of Λ → pπ − decays. By repeating the method with a simulated control sample, and considering the difference with the simulated signal sample, the associated systematic uncertainties are found to be 3 %, 3 % and 2 % for the modes B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + , respectively. Furthermore, the limited sample sizes give an additional 1 % uncertainty. In the B + → J/ψ ppπ + channel there is an additional source of uncertainty due to the different reconstruction efficiencies for the extra pion track in data and simulation, which is determined to be less than 2 %.
The effect of approximations made in the fit model is investigated by considering alternative functional forms for the various signal and background PDFs. The nominal signal shapes are replaced with a bifurcated Gaussian function with asymmetric exponential tails. The background is modelled with an exponential function for B 0 (s) → J/ψ pp decays, whereas a second-order polynomial function is used for B + → J/ψ ppπ + and the normalisation channel. Combined in quadrature, these sources change the fitted yields by 2.5, 2.6 and 0.7 events, which correspond to 42 %, 12 % and 92 % for the B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + modes, respectively. The bias on the determination of the fitted yield is studied with pseudo-experiments. No significant bias is found, and the associated systematic uncertainty is 0.2, 0.3 and 0.2 events (4 %, 1 % and 26 %) for the B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + modes, respectively. Since a B 0 s meson decay is used for the normalisation, the results for B(B 0 → J/ψ pp) and B(B + → J/ψ ppπ + ) rely on the knowledge of the ratio of the fragmentation fractions, measured to be f s /f d = 0.256 ± 0.020 [31], introducing a relative uncertainty of 8 %. It is assumed that f u = f d . The uncertainty on the measurement of the B 0 s → J/ψ π + π − branching fraction includes a contribution from this source. Hence, to avoid double

Results and conclusions
The relative branching fractions are determined according to Table 1: Systematic uncertainties on the branching fraction ratios of the decays B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + measured relative to B 0 s → J/ψ π + π − . The total is obtained from the sum in quadrature of all contributions.

Source
Uncertainty on the branching fraction ratio (%) where sel is the selection efficiency, PID is the particle identification efficiency, and N is the signal yield. The results obtained are where the third uncertainty originates from the control channel branching fraction measurement. The dominant uncertainties are statistical, while the most significant systematic come from the fit model and from the variation of the efficiency over the phase space.
Since the significances of the signals are below 3 σ, upper limits at both 90 % and 95 % confidence levels (CL) are determined using a Bayesian approach, with a prior that is uniform in the region with positive branching fraction. Integrating the likelihood (including all systematic uncertainties), the upper limits are found to be B(B 0 → J/ψ pp) B(B 0 s → J/ψ π + π − ) In summary, using the data sample collected in 2011 by the LHCb experiment corresponding to an integrated luminosity of 1.0 fb −1 of pp collisions at √ s = 7 TeV, searches for the decay modes B 0 → J/ψ pp, B 0 s → J/ψ pp and B + → J/ψ ppπ + are performed. No significant signals are seen, and upper limits on the branching fractions are set. A significant improvement in the existing limit for B 0 → J/ψ pp decays is achieved and first limits on the branching fractions of B 0 s → J/ψ pp and B + → J/ψ ppπ + decays are established. The limit on the B 0 → J/ψ pp branching fraction is in tension with the theoretical prediction [15]. The significance of the B 0 s → J/ψ pp signal is 2.8 σ, which motivates new theoretical calculations of this process as well as improved experimental searches using larger datasets.