Search for the $B^{0}_{s} \to \eta^{\prime}\phi$ decay

A search for the charmless $B^{0}_{s} \to \eta^{\prime}\phi$ decay is performed using $pp$ collision data collected by the LHCb experiment at centre-of-mass energies of $7$ and $8$ TeV, corresponding to an integrated luminosity of 3 fb$^{-1}$. No signal is observed and upper limits on the $B^{0}_{s} \to \eta^{\prime}\phi$ branching fraction are set to $0.82\times 10^{-6}$ at $90\%$ and $1.01\times 10^{-6}$ at $95\%$ confidence level.


Introduction
Charmless hadronic decays of beauty hadrons proceed predominantly through tree-level $b \to u$ and loop-level (penguin) $b \to s$ weak transitions. In the Standard Model the amplitudes of these processes, suppressed compared to the dominant tree $b \to c$ transition governing charmed decays, usually have similar magnitudes and give rise to possibly large violation of the charge-parity ($CP$) symmetry. Therefore, charmless decays of $B$ mesons should be sensitive to additional amplitudes from new, heavy particles contributing to the loop-level transitions [1].
Charmless hadronic $B^{+}$ and $B^{0}$ decays have been the subject of extensive studies, both experimentally, at hadron and $e^{+}e^{-}$ colliders, and theoretically. The phenomenological understanding that has emerged allows predictions to be made for charmless $B^{0}_{s}$ decays, as illustrated below. In the ongoing effort to test these predictions experimentally, the LHCb experiment has recently observed the decay $B^{0}_{s} \to \eta^{\prime}\eta^{\prime}$. The relatively large measured branching fraction $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\eta^{\prime}) = (33.1 \pm 7.1) \times 10^{-6}$ is consistent with Standard Model expectations [2]. However, knowledge of charmless hadronic $B^{0}_{s}$ decays into light pseudoscalar (P) and vector (V) mesons is still limited. Further measurements will help to better constrain phenomenological models, whose uncertainties often make a major contribution to the theoretical uncertainties in searches for physics beyond the Standard Model.
The decay $B^{0}_{s} \to \eta^{\prime}\phi$ proceeds predominantly through $b \to s\bar{s}s$ transitions, as illustrated in Fig. 1. It is of particular interest in constraining phenomenological models, as predictions for its branching fraction cover a wide range, typically from $0.1 \times 10^{-6}$ to $20 \times 10^{-6}$, with large uncertainties that reflect the limited knowledge of form factors, penguin contributions, the $\omega$-$\phi$ mixing angle, or the $s$-quark mass. The decay $B^{0}_{s} \to \eta^{\prime}\phi$ has been studied in the framework of QCD factorisation [3,4], perturbative QCD [5,6], soft-collinear effective theory (SCET) [7], SU(3) flavour symmetry [8], and the factorisation-assisted topological (FAT) amplitude approach [9]. Table 1 presents the available predictions for $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi)$. In QCD factorisation, predictions for $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi)$ are generally small because the spectator quark can become part of either the $\eta^{\prime}$ or the $\phi$ meson (see Fig. 1), leading to a strong cancellation between the PV and VP amplitudes contributing to the $\eta^{\prime}\phi$ final state [3]. Such a cancellation does not occur in the symmetric $B^{0}_{s} \to \eta^{\prime}\eta^{\prime}$ (PP) and $B^{0}_{s} \to \phi\phi$ (VV) decays. However, other values of the form factor for the $B^{0}_{s}$ to $\phi$ transition can enhance the branching fraction by more than an order of magnitude [4]. The measurement of $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi)$ is therefore important to improve the knowledge of the $B^{0}_{s}$ to $\phi$ form factor and the accuracy of model predictions.
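The comparison of QCD factorisation [3,4], perturbative QCD [5], and SCET [7] calculations shows that the hierarchy of branching fractions in $B^{0}_{s} \to \eta^{\prime}\phi$ and $B^{0}_{s} \to \eta\phi$ decays is sensitive to the size of the colour-suppressed QCD penguin loop, which is estimated to be large in perturbative QCD [5], and to "gluonic charming penguins", which play an important role in SCET calculations [7]. Future measurements of both decay modes will provide useful information on these loop contributions.

Figure 1: Lowest-order diagrams for the $B^{0}_{s} \to \eta^{\prime}\phi$ decay. The spectator quark can become part of either the $\eta^{\prime}$ or the $\phi$ meson, forming two different amplitudes (called PV and VP in the text).

Table 1: Predictions for $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi)$. Recoverable entry: $13.0 \pm 1.6$ [9].

This paper presents a search for the $B^{0}_{s} \to \eta^{\prime}\phi$ decay using $pp$ collision data collected by the LHCb experiment at centre-of-mass energies of 7 and 8 TeV, corresponding to a total integrated luminosity of 3 fb$^{-1}$. The signal $B^{0}_{s} \to \eta^{\prime}\phi$ and normalisation $B^{+} \to \eta^{\prime}K^{+}$ candidates are reconstructed through the decays $\eta^{\prime} \to \pi^{+}\pi^{-}\gamma$ and $\phi \to K^{+}K^{-}$. The $B^{0}_{s} \to \eta^{\prime}\phi$ branching fraction is determined with respect to the $B^{+} \to \eta^{\prime}K^{+}$ mode according to

$$\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi) = \frac{N(B^{0}_{s} \to \eta^{\prime}\phi)}{N(B^{+} \to \eta^{\prime}K^{+})} \times \frac{\varepsilon(B^{+} \to \eta^{\prime}K^{+})}{\varepsilon(B^{0}_{s} \to \eta^{\prime}\phi)} \times \frac{f_{u}}{f_{s}} \times \frac{\mathcal{B}(B^{+} \to \eta^{\prime}K^{+})}{\mathcal{B}(\phi \to K^{+}K^{-})}, \qquad (1)$$

where $\mathcal{B}(B^{+} \to \eta^{\prime}K^{+}) = (70.6 \pm 2.5) \times 10^{-6}$ [10], $\mathcal{B}(\phi \to K^{+}K^{-}) = 0.489 \pm 0.005$ [10], $f_{u}/f_{s}$ is the $B^{+}/B^{0}_{s}$ production ratio, assumed to be equal to the $B^{0}/B^{0}_{s}$ production ratio $f_{d}/f_{s} = 1/(0.259 \pm 0.015)$ [11], and $\varepsilon(B^{0}_{s} \to \eta^{\prime}\phi)$ and $\varepsilon(B^{+} \to \eta^{\prime}K^{+})$ are the total efficiencies of the signal and normalisation modes, respectively. The ratio of the observed yields $N(B^{0}_{s} \to \eta^{\prime}\phi)/N(B^{+} \to \eta^{\prime}K^{+})$ is obtained from a two-dimensional fit to the invariant mass distributions of the $\eta^{\prime}$ and the $B$ candidates, performed simultaneously on the signal and normalisation modes.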
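The normalisation just described can be illustrated with a short numerical sketch. It assumes the branching fraction is the product of the yield ratio, the efficiency ratio $\varepsilon(B^{+} \to \eta^{\prime}K^{+})/\varepsilon(B^{0}_{s} \to \eta^{\prime}\phi)$, $f_{u}/f_{s}$ and $\mathcal{B}(B^{+} \to \eta^{\prime}K^{+})/\mathcal{B}(\phi \to K^{+}K^{-})$, with the external inputs quoted in the text as defaults. The function name and the example arguments are hypothetical and are not values from the analysis.

```python
def branching_fraction(yield_ratio, eff_ratio,
                       bf_norm=70.6e-6, bf_phi_kk=0.489, fs_over_fd=0.259):
    """Illustrative B(Bs -> eta' phi) from a yield ratio and an efficiency ratio.

    yield_ratio : N(Bs -> eta' phi) / N(B+ -> eta' K+)
    eff_ratio   : eps(B+ -> eta' K+) / eps(Bs -> eta' phi)
    The eta' -> pi pi gamma branching fraction cancels in the ratio.
    """
    # Assumption stated in the text: f_u/f_s is taken equal to f_d/f_s.
    fu_over_fs = 1.0 / fs_over_fd
    return yield_ratio * eff_ratio * fu_over_fs * bf_norm / bf_phi_kk


# Hypothetical inputs, chosen only to show the call.
example = branching_fraction(yield_ratio=1.0e-3, eff_ratio=1.0)
```

The result scales linearly with both ratios, so any relative uncertainty on the efficiency ratio or on $f_{s}/f_{d}$ propagates directly to the branching fraction.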

Detector and simulation
The LHCb detector [12,13] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of the momentum of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/$c$. The minimum distance of a track to a $pp$-collision point (primary vertex), the impact parameter, is measured with a resolution of $(15 + 29/p_{\mathrm{T}})$ µm, where $p_{\mathrm{T}}$ is the component of the momentum transverse to the beam, in GeV/$c$. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad (SPD) and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.
The trigger [14] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. The $B$ decays of interest are triggered at the hardware stage, either by one of the decay products depositing a transverse energy greater than 3.5 GeV in the hadron calorimeter, or by other high-$p_{\mathrm{T}}$ particles produced in the $pp$ collision. The software trigger requires a two-, three- or four-track secondary vertex with a significant displacement from the primary vertices. At least one charged particle must have a transverse momentum $p_{\mathrm{T}} > 1.7$ GeV/$c$ and be inconsistent with originating from a primary vertex. A multivariate algorithm [15] is used for the identification of secondary vertices consistent with the decay of a $b$ hadron.
Simulated decays are used to optimise the event selection and to evaluate the selection efficiencies. In the simulation, pp collisions are generated using Pythia 8 [16] with a specific LHCb configuration [17]. Decays of hadronic particles are described by EvtGen [18], in which final-state radiation is generated using Photos [19]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [20] as described in Ref. [21].

Event selection
The selection of the signal $B^{0}_{s} \to \eta^{\prime}\phi$ and normalisation $B^{+} \to \eta^{\prime}K^{+}$ candidates, generically referred to as $B$ candidates, is optimised for the signal. Wherever possible, the same selection criteria are applied for the normalisation channel.
Only good-quality tracks identified as pions or kaons [13] and inconsistent with originating from any primary vertex are used. Tracks used to reconstruct an $\eta^{\prime}$ or $\phi$ candidate are each required to be consistent with coming from a common secondary vertex and to have $p_{\mathrm{T}} > 0.4$ GeV/$c$. The $\pi^{+}\pi^{-}$ invariant mass in the $\eta^{\prime}$ decay must be larger than 0.52 GeV/$c^{2}$ to reject $K^{0}_{\mathrm{S}} \to \pi^{+}\pi^{-}$ decays. Photon candidates must be of good quality [13] and have $p_{\mathrm{T}} > 0.3$ GeV/$c$. The invariant masses of the $\eta^{\prime}$ and $\phi$ candidates must satisfy $0.88 < m_{\pi\pi\gamma} < 1.04$ GeV/$c^{2}$ and $1.005 < m_{KK} < 1.035$ GeV/$c^{2}$. An $\eta^{\prime}$ candidate is combined with a candidate $\phi$ meson (or a charged kaon with $p_{\mathrm{T}} > 1$ GeV/$c$) to make a $B^{0}_{s}$ (or $B^{+}$) candidate. Each $B$ candidate is required to have a good-quality vertex, by imposing a loose requirement on the $\chi^{2}$ of the vertex fit ($\chi^{2} < 6$), and $p_{\mathrm{T}} > 1.5$ GeV/$c$. The invariant masses of the $B^{0}_{s}$ and $B^{+}$ candidates, computed after constraining the $\pi^{+}\pi^{-}\gamma$ mass to the nominal $\eta^{\prime}$ mass [10], are required to satisfy $5.0 < m_{\eta^{\prime}KK} < 5.6$ GeV/$c^{2}$ and $5.0 < m_{\eta^{\prime}K} < 5.5$ GeV/$c^{2}$, respectively.
To further separate signal from background, boosted decision trees (BDTs) based on the AdaBoost algorithm [22,23] are used. Different BDTs are used for the signal and normalisation channels. Each BDT is trained, tested and optimised on fully simulated signal decays and background taken from data. The background consists of events in the mass range $5.0 < m_{\eta^{\prime}KK} < 5.6$ GeV/$c^{2}$ ($5.0 < m_{\eta^{\prime}K} < 5.5$ GeV/$c^{2}$), excluding the signal region defined below.
To minimise statistical and systematic uncertainties, the BDT algorithm uses input variables that provide significant background rejection, are well modelled in simulation, and are defined for both the signal and normalisation channels. Nine variables are used as input to each BDT. Two variables are related to the kinematics of the final-state particles: the transverse momenta of the photon and the $\eta^{\prime}$ meson. Three variables describe the topology of the $B$ candidate: the $B$-candidate flight distance, the cosine of the angle between the reconstructed $B$ momentum and the vector pointing from the associated primary vertex to the $B$ decay vertex, and the impact parameter of the $B$ candidate with respect to its associated primary vertex. The associated primary vertex is the primary vertex with respect to which the $B$ candidate has the smallest $\chi^{2}_{\mathrm{IP}}$, where $\chi^{2}_{\mathrm{IP}}$ is defined as the difference in the vertex-fit $\chi^{2}$ of the selected primary vertex reconstructed with or without the considered particle. Three variables are related to the $B$-candidate vertex: the vertex-fit quality, characterised by its $\chi^{2}$, and two vertex isolation variables defined as the smallest vertex-fit $\chi^{2}$ values obtained when adding to the vertex in turn either all single tracks or all pairs of tracks from the set of tracks that are not assigned to the $B$ candidate. The last variable is the sum of the $\chi^{2}_{\mathrm{IP}}$ of the charged particles used to form the $B$ candidate, calculated with respect to the associated primary vertex. The photon $p_{\mathrm{T}}$ and the $B$-candidate impact parameter provide the best background discrimination.

The BDT is trained for the full data set, irrespective of the $pp$ collision energy. To minimise biases in the final selection, both the data and simulated samples are randomly divided into two subsamples and two BDTs are defined. Each BDT is trained, tested and optimised on one subsample, and then applied to the other subsample for the candidate selection [24]. The selected candidates from both subsamples are then merged into a single sample for the next stage of the analysis.
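The two-subsample procedure can be sketched generically as follows. This is a minimal illustration and not the analysis code: `two_fold_apply` and `toy_train` are hypothetical names, and the toy "classifier" simply subtracts the training-sample mean, standing in for a real BDT training. The point is only that every candidate is scored by a model that never saw it during training.

```python
import random


def two_fold_apply(candidates, train, seed=0):
    """Score each candidate with a model trained on the other random half."""
    rng = random.Random(seed)
    indices = list(range(len(candidates)))
    rng.shuffle(indices)
    half = len(indices) // 2
    folds = [indices[:half], indices[half:]]

    scores = [None] * len(candidates)
    for k in (0, 1):
        train_idx, apply_idx = folds[k], folds[1 - k]
        # Train on one subsample ...
        model = train([candidates[i] for i in train_idx])
        # ... and apply the resulting model to the other subsample only.
        for i in apply_idx:
            scores[i] = model(candidates[i])
    # The scores list is already the merged sample of both subsamples.
    return scores, folds


def toy_train(sample):
    """Hypothetical stand-in for a BDT training: returns x -> x - mean(sample)."""
    mean = sum(sample) / len(sample)
    return lambda x: x - mean


data = [float(i) for i in range(10)]
scores, folds = two_fold_apply(data, toy_train, seed=1)
```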
The requirement on the BDT output is chosen to maximise the figure of merit $\varepsilon/(a/2 + \sqrt{N_{B}})$ [25], where $\varepsilon$ is the signal efficiency, $a = 5$ is the target signal significance, and $N_{B}$ is the number of background events in the signal region estimated from the $B^{0}_{s}$ mass sidebands. The signal region is defined as the $B^{0}_{s}$ mass range 5.287 to 5.446 GeV/$c^{2}$, corresponding approximately to 7 times the mass resolution. The optimised BDT requirement has an efficiency of 59% for $B^{0}_{s} \to \eta^{\prime}\phi$ decays, while rejecting 93% of the combinatorial background in the signal region. As a check, an alternative optimisation is performed: for various values of the $B^{0}_{s} \to \eta^{\prime}\phi$ branching fraction, pseudoexperiments are generated with a model containing only signal and combinatorial background, and are then analysed with a simple two-dimensional maximum likelihood fit to the $B^{0}_{s}$ and $\eta^{\prime}$ masses. The signal significance, determined using Wilks' theorem [26], is found to reach its maximum for a BDT requirement in agreement with that obtained using the method of Ref. [25].
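A threshold scan of this kind can be sketched as below, assuming the figure of merit of Ref. [25] takes the form $\varepsilon/(a/2 + \sqrt{N_{B}})$. The helper names and the toy score lists are hypothetical; in particular, the background count here is simply the number of background-sample scores above the cut, whereas the analysis estimates $N_{B}$ in the signal region from the mass sidebands.

```python
import math


def punzi_fom(efficiency, n_background, a=5.0):
    """Figure of merit eps / (a/2 + sqrt(N_B)) for a target significance a."""
    return efficiency / (a / 2.0 + math.sqrt(n_background))


def best_cut(signal_scores, background_scores, cuts, a=5.0):
    """Return (cut, fom) maximising the figure of merit over the given cuts."""
    best = None
    for cut in cuts:
        # Signal efficiency: fraction of signal scores passing the cut.
        eff = sum(s > cut for s in signal_scores) / len(signal_scores)
        # Background surviving the cut (toy estimate, see lead-in).
        n_bkg = sum(b > cut for b in background_scores)
        fom = punzi_fom(eff, n_bkg, a)
        if best is None or fom > best[1]:
            best = (cut, fom)
    return best


# Toy scores, for illustration only.
toy_signal = [0.9, 0.8, 0.7, 0.6]
toy_background = [0.1, 0.2, 0.3, 0.65]
cut, fom = best_cut(toy_signal, toy_background, cuts=[0.0, 0.5, 0.68])
```

Because the figure of merit depends on the signal only through its efficiency, the optimisation does not require an assumption on the unknown signal branching fraction.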
In events containing multiple candidates ( 3%), the candidate with the best identified photon is kept. The full selection described above retains 430 $B^{0}_{s} \to \eta^{\prime}\phi$ candidates and 22 681 $B^{+} \to \eta^{\prime}K^{+}$ candidates for further analysis.
Selection efficiencies are evaluated with simulated data, except those of the particle identification (PID) requirements and the hardware trigger, for which calibration data are used. Systematic uncertainties on the efficiency ratio $\varepsilon(B^{+} \to \eta^{\prime}K^{+})/\varepsilon(B^{0}_{s} \to \eta^{\prime}\phi)$ are summarised in Table 2. The BDT algorithms are validated using the normalisation channel as a proxy for the signal, by comparing the distributions, obtained with the sPlot technique [27], of the nine input variables and the BDT output variable. The difference between the efficiencies in data and simulation of the BDT requirement for the normalisation channel is used as a measure of the systematic uncertainty on the BDT efficiency. The correlation evaluated in simulation between the BDT variables for $B^{0}_{s} \to \eta^{\prime}\phi$ and $B^{+} \to \eta^{\prime}K^{+}$ is then used to determine the systematic uncertainty on the ratio of the BDT efficiencies.
Another systematic effect on the determination of the efficiency ratio is the uncertainty on the PID efficiency, which is determined as a function of kinematic parameters using a clean high-statistics sample of kaons and pions from $D^{*+} \to D^{0}(\to K^{-}\pi^{+})\pi^{+}$ decays [28]. The uncertainty on the trigger efficiency, which is mostly due to the computation of the hardware-stage trigger efficiency, is estimated with simulated data by varying the value of the minimum transverse energy requirement used in the trigger decision. An uncertainty is assigned on the efficiency ratio to take into account the mismodelling of the hit multiplicity in the SPD, which is used as a discriminant variable at the hardware stage of the trigger. This uncertainty is evaluated in simulation by varying the requirement on the SPD hit multiplicity. Corrections determined from control channels are applied to the tracking and photon reconstruction efficiencies to account for mismodelling effects in the simulation. The uncertainties on these corrections are quoted as systematic uncertainties. Since the correction to the tracking efficiency is obtained using muons, an additional uncertainty is needed to account for hadronic interactions in the detector material [29]. Finally, the limited size of the simulated samples used in the evaluation of the efficiencies is added as a source of uncertainty. Combining all uncertainties in quadrature, the ratio of the selection efficiencies is […]

The selection requirements efficiently reject physics backgrounds such as $B^{0} \to \phi K^{*0}$ and modes with resonances decaying strongly to $K^{+}\pi^{-}\pi^{0}$, but not $B^{0}_{s} \to \phi\phi$ decays with one of the two $\phi$ resonances decaying to $\pi^{+}\pi^{-}\pi^{0}$ and one of the photons from the $\pi^{0} \to \gamma\gamma$ decay not being reconstructed. From simulation studies and known branching fractions [10], the number of $B^{0}_{s} \to \phi\phi$ decays passing the selection is expected to be $104 \pm 34$. Hence this background is included as a specific component in the mass fit described below.

Mass fit
The $B^{0}_{s} \to \eta^{\prime}\phi$ signal yield is determined from a two-dimensional extended unbinned maximum likelihood fit, where the signal is fitted simultaneously with the normalisation channel $B^{+} \to \eta^{\prime}K^{+}$. The observables used in the fit are the invariant masses $m_{\pi\pi\gamma}$ and $m_{\eta^{\prime}KK}$ ($m_{\eta^{\prime}K}$) for the sample of $B^{0}_{s} \to \eta^{\prime}\phi$ ($B^{+} \to \eta^{\prime}K^{+}$) candidates. The sample of $B^{0}_{s} \to \eta^{\prime}\phi$ candidates is described with a four-component model: the signal, the two combinatorial backgrounds with and without a true $\eta^{\prime}$ resonance, and the $B^{0}_{s} \to \phi\phi$ physics background, where one of the two $\phi$ resonances decays to the $\pi^{+}\pi^{-}\pi^{0}$ final state. The sample of $B^{+} \to \eta^{\prime}K^{+}$ candidates is modelled using three components: the signal and the two combinatorial backgrounds with and without a true $\eta^{\prime}$ resonance. The yields of all components are free to vary in the fit. The peaking components in the $B^{0}_{s}$, $B^{+}$ and $\eta^{\prime}$ mass spectra are described using Gaussian functions modified with an exponential tail on each side. While all the tail parameters are fixed from simulation, the means and the widths of the Gaussian functions are free to vary in the fit, but the ratio of the widths of the peaking components in $m_{\eta^{\prime}KK}$ and $m_{\eta^{\prime}K}$ is fixed to the value obtained in simulation and the difference between the $B^{0}_{s}$ and $B^{+}$ masses is constrained to the known value [10]. The $\eta^{\prime}$ resonances in the two samples are modelled using a common function, with mean and width free in the fit. The combinatorial components are described with linear functions, with the exception of the random combinations in $m_{\eta^{\prime}K}$, where a parabolic function is used. To account for correlations between $m_{\eta^{\prime}KK}$ and $m_{\pi\pi\gamma}$, the $B^{0}_{s} \to \phi\phi$ component is described with a superposition of two-dimensional Gaussian kernel functions [30] determined from simulation. For all other components, in particular the signal, the correlation is negligible due to the $\eta^{\prime}$ mass constraint applied in the computation of the $B$-candidate mass.
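The text does not give the explicit form of the Gaussian with exponential tails. As an assumption, the sketch below uses one common parametrisation in which a Gaussian core is joined to an exponential on each side at `k_low` and `k_high` standard deviations, with the value and the slope both continuous at the joining points. The function name and parameter names are hypothetical, and the shape is left unnormalised.

```python
import math


def gauss_exp_tails(x, mean, sigma, k_low, k_high):
    """Unnormalised Gaussian core with an exponential tail on each side.

    The tails start k_low (k_high) standard deviations below (above) the mean.
    This parametrisation is an assumption, not the form used in the analysis.
    """
    t = (x - mean) / sigma
    if t < -k_low:
        # Low-side tail, matched to the Gaussian at t = -k_low.
        return math.exp(0.5 * k_low ** 2 + k_low * t)
    if t > k_high:
        # High-side tail, matched to the Gaussian at t = +k_high.
        return math.exp(0.5 * k_high ** 2 - k_high * t)
    return math.exp(-0.5 * t * t)


# Hypothetical parameter values in GeV/c^2, for illustration only.
peak = gauss_exp_tails(5.367, mean=5.367, sigma=0.020, k_low=1.5, k_high=1.5)
```

In a fit of the kind described above, `k_low` and `k_high` would play the role of the tail parameters fixed from simulation, while `mean` and `sigma` would float.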
The fit procedure is validated on simulated samples containing the expected proportion of background and signal events, according to various assumptions on $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi)$. In particular, for $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi) = 4 \times 10^{-6}$, a statistical significance corresponding to more than 5 standard deviations is observed in 74% of the pseudoexperiments.

Sets of pseudoexperiments are used to evaluate possible fit biases. Fits on samples generated from the probability density function (PDF) with parameters obtained from the data are found to be unbiased. The procedure is then repeated using simulated $B^{0}_{s} \to \phi\phi$ events instead of generating the corresponding background component from the PDF. Biases of $-1.3 \pm 0.3$ on the signal yield and of $(-1.16 \pm 0.33) \times 10^{-4}$ on the ratio of yields are observed. The results obtained with data are corrected for these biases, and systematic uncertainties are assigned, computed as the quadratic sum of the statistical uncertainty on the bias and half of the bias value.
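As a worked illustration of the last step, the quadratic sum of the statistical uncertainty on the bias and half of the bias value can be evaluated for the two biases quoted above. The function name is hypothetical, and the printed numbers are simple arithmetic on the quoted biases, not values taken from the paper's systematic table.

```python
import math


def bias_systematic(bias, bias_stat_unc):
    """Quadratic sum of the bias statistical uncertainty and half the bias."""
    return math.hypot(bias_stat_unc, 0.5 * bias)


# Bias on the signal yield: -1.3 +- 0.3 events.
syst_yield = bias_systematic(-1.3, 0.3)

# Bias on the yield ratio: (-1.16 +- 0.33), in units of 1e-4.
syst_ratio = bias_systematic(-1.16, 0.33)

print(round(syst_yield, 2), round(syst_ratio, 2))
```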
Additional systematic uncertainties affect the signal yield and the yield ratio. The mass fit is repeated with different combinatorial background PDFs: linear functions are replaced with exponential functions, and the parabolic function is replaced with a third-order polynomial. The quadratic sum of the differences between the values obtained in these alternative fits and the nominal result is assigned as a systematic uncertainty. The limited size of the simulated $B^{0}_{s} \to \phi\phi$ sample leads to an uncertainty on the determination of the nonparametric PDF for the physics background, which is propagated as a systematic uncertainty. The effect of fixing some of the model parameters in the fit is studied by performing a large number of fits on the data, with the fixed parameters sampled […]

[…] where the first (second) quoted uncertainty is statistical (systematic). Bayesian upper limits $x_{U}$ are determined assuming a uniform prior in the observable $x$ (yield, yield ratio, or $\mathcal{B}$) as

$$\frac{\int_{0}^{x_{U}} \mathcal{L}(x)\,\mathrm{d}x}{\int_{0}^{\infty} \mathcal{L}(x)\,\mathrm{d}x} = \alpha,$$

where $\mathcal{L}(x)$ is the likelihood function convolved with the systematic uncertainties, and $\alpha$ is the confidence level (CL). The obtained upper limits are $N(B^{0}_{s} \to \eta^{\prime}\phi) < 8.9\ (10.9)$ at 90% (95%) CL and $N(B^{0}_{s} \to \eta^{\prime}\phi)/N(B^{+} \to \eta^{\prime}K^{+}) < 8.0\ (9.9) \times 10^{-4}$ at 90% (95%) CL.
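The likelihood-integration limit can be sketched numerically as follows, assuming a uniform prior on $x \geq 0$ so that $x_{U}$ is the point below which a fraction $\alpha$ of the integrated likelihood lies. The function name, the trapezoidal integration, the finite cut-off `x_max` replacing the infinite upper bound, and the toy Gaussian likelihood are all assumptions made for illustration; the analysis uses the fitted likelihood convolved with the systematic uncertainties.

```python
import math


def bayesian_upper_limit(likelihood, alpha, x_max, n_steps=20000):
    """Smallest grid point x_U with int_0^x_U L / int_0^x_max L >= alpha.

    x_max must be large enough that the likelihood is negligible beyond it.
    """
    dx = x_max / n_steps
    values = [likelihood(i * dx) for i in range(n_steps + 1)]
    # Cumulative trapezoidal integral of the likelihood from 0.
    cumulative = [0.0]
    for i in range(n_steps):
        cumulative.append(cumulative[-1] + 0.5 * (values[i] + values[i + 1]) * dx)
    total = cumulative[-1]
    for i, c in enumerate(cumulative):
        if c >= alpha * total:
            return i * dx
    return x_max


# Toy likelihood: unit-width Gaussian centred at zero (illustration only).
toy_likelihood = lambda x: math.exp(-0.5 * x * x)
limit_90 = bayesian_upper_limit(toy_likelihood, 0.90, x_max=10.0)
limit_95 = bayesian_upper_limit(toy_likelihood, 0.95, x_max=10.0)
```

Restricting the integration to $x \geq 0$ is what keeps the limit physical even when the fitted central value is negative.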

Result and conclusion
A search has been performed for the $B^{0}_{s} \to \eta^{\prime}\phi$ decay. No signal is found. The branching fraction $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi) = (-0.18\,^{+0.47}_{-0.36}\,(\mathrm{stat}) \pm 0.10\,(\mathrm{syst})) \times 10^{-6}$ is computed from Eqs. (1), (2) and (4) using the known value of $\mathcal{B}(B^{+} \to \eta^{\prime}K^{+})$ [10] and the LHCb measurement of $f_{s}/f_{d}$ [11], which leads to $\mathcal{B}(B^{0}_{s} \to \eta^{\prime}\phi) < 0.82\ (1.01) \times 10^{-6}$ at 90% (95%) CL using the likelihood integration method described above. This is the first upper limit set on the $B^{0}_{s} \to \eta^{\prime}\phi$ branching fraction. This result favours the lower end of the range of predictions for this branching fraction, pointing to form factors consistent with the light-cone sum-rule calculation used in Ref. [4], or with the hypotheses used in Refs. [3,5]. Although large theoretical uncertainties make most predictions compatible with the result of this analysis, the central values of the predictions in Refs. [6-9] are significantly larger than the upper limit. These discrepancies should help in constraining the theoretical models used in the prediction of branching fractions and $CP$ asymmetries for charmless hadronic $B$-meson decays.