Measurement of the $B^0 \rightarrow K^{*0}e^+e^-$ branching fraction at low dilepton mass

The branching fraction of the rare decay $B^0 \rightarrow K^{*0}e^+e^-$ in the dilepton mass region from 30 to 1000 MeV$/c^2$ has been measured by the LHCb experiment, using $pp$ collision data, corresponding to an integrated luminosity of 1.0 fb$^{-1}$, at a centre-of-mass energy of 7 TeV. The decay mode $B^0 \rightarrow J/\psi(e^+e^-) K^{*0}$ is utilized as a normalization channel. The branching fraction of $B^0 \rightarrow K^{*0}e^+e^-$ is measured to be $$ B(B^0 \rightarrow K^{*0}e^+e^-)^{30-1000\,\mathrm{MeV}/c^2} = (3.1\,^{+0.9\,+0.2}_{-0.8\,-0.3} \pm 0.2)\times 10^{-7}, $$ where the first error is statistical, the second is systematic, and the third comes from the uncertainties on the $B^0 \rightarrow J/\psi K^{*0}$ and $J/\psi \rightarrow e^+e^-$ branching fractions.


Introduction
The b → sγ transition proceeds through flavour-changing neutral currents, and thus is sensitive to the effects of physics beyond the Standard Model (BSM). Although the branching fraction of the B 0 → K * 0 γ decay has been measured [1-3] to be consistent with the Standard Model (SM) prediction [4], BSM effects could still be present and detectable through more detailed studies of the decay process. In particular, in the SM the photon helicity is predominantly left-handed, with a small right-handed current arising from long-distance effects and from the non-zero value of the ratio of the s-quark mass to the b-quark mass. Information on the photon polarisation can be obtained with an angular analysis of the B 0 → K * 0 ℓ + ℓ − decay (ℓ = e, µ) in the low dilepton invariant mass squared (q 2 ) region where the photon contribution dominates. The inclusion of charge-conjugate modes is implied throughout the paper. The low q 2 region also has the benefit of reduced theoretical uncertainties due to long-distance contributions compared to the full q 2 region [5]. The more precise SM prediction allows for increased sensitivity to contributions from BSM. In the low q 2 interval there is a contribution from B 0 → K * 0 V (V → ℓ + ℓ − ) where V is one of the vector resonances ρ, ω or φ; however this contribution has been calculated to be at most 1% [6]. The diagrams contributing to the B 0 → K * 0 e + e − decay are shown in Fig. 1.
With the LHCb detector, the B 0 → K * 0 ℓ + ℓ − analysis can be carried out using either muons [7] or electrons. Experimentally, the decay with muons in the final state produces a much higher yield per unit integrated luminosity than electrons, primarily due to the clean trigger signature. In addition, the much smaller bremsstrahlung radiation leads to better momentum resolution, allowing a more efficient selection. On the other hand, the B 0 → K * 0 e + e − decay probes lower dilepton invariant masses, thus providing greater sensitivity to the photon polarisation [5]. Furthermore, the formalism is greatly simplified due to the negligible lepton mass [8]. It is therefore interesting to carry out an angular analysis of the decay B 0 → K * 0 e + e − in the region where the dilepton mass is less than 1000 MeV/c 2 . The lower limit is set to 30 MeV/c 2 since below this value the sensitivity for the angular analysis decreases because of a degradation in the precision of the orientation of the e + e − decay plane due to multiple scattering. Furthermore, the contamination from the B 0 → K * 0 γ decay, with the photon converting into an e + e − pair in the detector material, increases significantly as q 2 → 0.
The first step towards performing the angular analysis is to measure the branching fraction in this very low dilepton invariant mass region. Indeed, even if there is no doubt about the existence of this decay, no clear B 0 → K * 0 e + e − signal has been observed in this region and therefore the partial branching fraction is unknown. The only experiments to have observed B 0 → K * 0 e + e − to date are BaBar [9] and Belle [10], which have collected about 30 B 0 → K * 0 ℓ + ℓ − events each in the region q 2 < 2 GeV 2 /c 4 , summing over electron and muon final states.

The LHCb detector, dataset and analysis strategy
The study reported here is based on pp collision data, corresponding to an integrated luminosity of 1.0 fb −1 , collected at the Large Hadron Collider (LHC) with the LHCb detector [11] at a centre-of-mass energy of 7 TeV during 2011. The LHCb detector is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. It includes a high precision tracking system consisting of a silicon-strip vertex detector (VELO) surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream. The combined tracking system has momentum resolution (∆p/p) that varies from 0.4% at 5 GeV/c to 0.6% at 100 GeV/c, and impact parameter (IP) resolution of 20 µm for tracks with high transverse momentum (p T ). Charged hadrons are identified using two ring-imaging Cherenkov detectors. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad (SPD) and preshower (PS) detectors, an electromagnetic calorimeter (ECAL) and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The trigger [12] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage which applies a full event reconstruction.
For signal candidates to be considered in this analysis, at least one of the electrons from the B 0 → K * 0 e + e − decay must pass the hardware electron trigger, or the hardware trigger must be satisfied independently of any of the daughters of the signal B 0 candidate (usually triggering on the other b-hadron in the event). The hardware electron trigger requires the presence of an ECAL cluster with a transverse energy greater than 2.5 GeV. An energy deposit is also required in one of the PS cells in front of the ECAL cluster, where the threshold corresponds to the energy that would be deposited by the passage of five minimum ionising particles. Finally, at least one SPD hit is required among the SPD cells in front of the cluster. The software trigger requires a two-, three- or four-track secondary vertex with a high sum of the p T of the tracks and a significant displacement from the primary pp interaction vertices (PVs). At least one track should have p T > 1.7 GeV/c and IP χ 2 with respect to the primary interaction greater than 16. The IP χ 2 is defined as the difference between the χ 2 of the PV reconstructed with and without the considered track. A multivariate algorithm is used for the identification of secondary vertices consistent with the decay of a b-hadron.
The strategy of the analysis is to measure a ratio of branching fractions in which most of the potentially large systematic uncertainties cancel. The decay B 0 → J/ψ (e + e − )K * 0 is used as normalization mode, since it has the same final state as the B 0 → K * 0 e + e − decay and has a well measured branching fraction [13,14], approximately 300 times larger than B(B 0 → K * 0 e + e − ) in the e + e − invariant mass range 30 to 1000 MeV/c 2 . Selection efficiencies are determined using data whenever possible, otherwise simulation is used, with the events weighted to match the relevant distributions in data. The pp collisions are generated using Pythia 6.4 [15] with a specific LHCb configuration [16]. Hadron decays are described by EvtGen [17] in which final state radiation is generated using Photos [18]. The interaction of the generated particles with the detector and its response are implemented using the Geant4 toolkit [19] as described in Ref. [20].

Selection and backgrounds
The candidate selection is divided into three steps: a loose selection, a multivariate algorithm to suppress the combinatorial background, and additional selection criteria to remove specific backgrounds.
Candidate K * 0 mesons are reconstructed in the K * 0 → K + π − mode. The p T of the charged K (π) mesons must be larger than 400 (300) MeV/c. Particle identification (PID) information is used to distinguish charged pions from kaons [21]. The difference between the logarithms of the likelihoods of the kaon and pion hypotheses is required to be larger than 0 for kaons and smaller than 5 for pions; the combined efficiency of these cuts is 88%. Candidates with a K + π − invariant mass within 130 MeV/c 2 of the nominal K * 0 mass and a good quality vertex fit are retained for further analysis. To remove background from B 0 s → J/ψ (e + e − )φ and B 0 s → φe + e − decays, where one of the kaons is misidentified as a pion, the mass computed under the K + K − hypothesis is required to be larger than 1040 MeV/c 2 .
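The K + K − mass requirement above amounts to recomputing the pair mass with the kaon mass assigned to the pion candidate. A minimal sketch is given below; the function names and the representation of tracks as three-momentum tuples in MeV/c are illustrative assumptions, not the analysis software, and the particle masses are the standard PDG values.

```python
import math

KAON_MASS = 493.677  # MeV/c^2, charged kaon mass (PDG value)
PION_MASS = 139.570  # MeV/c^2, charged pion mass (PDG value)


def energy(p, mass):
    """Energy of a particle with three-momentum p (MeV/c) and the given mass."""
    px, py, pz = p
    return math.sqrt(px * px + py * py + pz * pz + mass * mass)


def invariant_mass(p1, mass1, p2, mass2):
    """Invariant mass of a two-track pair for the assigned mass hypotheses."""
    e_tot = energy(p1, mass1) + energy(p2, mass2)
    px, py, pz = (a + b for a, b in zip(p1, p2))
    m_squared = e_tot * e_tot - (px * px + py * py + pz * pz)
    return math.sqrt(max(m_squared, 0.0))


def passes_phi_veto(p_kaon, p_pion, threshold=1040.0):
    """Keep the candidate if the pair mass under the K+K- hypothesis exceeds
    the threshold (1040 MeV/c^2 in the text)."""
    return invariant_mass(p_kaon, KAON_MASS, p_pion, KAON_MASS) > threshold
```

Because the kaon is heavier than the pion, the mass under the K + K − hypothesis is always larger than the K + π − mass for the same pair of tracks, so genuine φ decays cluster just above the K + K − threshold and are removed by the requirement.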
Bremsstrahlung radiation, if not accounted for, would worsen the B 0 mass resolution. If the radiation occurs downstream of the dipole magnet the momentum of the electron is correctly measured and the photon energy is deposited in the same calorimeter cell as the electron. In contrast, if photons are emitted upstream of the magnet, the measured electron momentum will be that after photon emission, and the measured B 0 mass will be degraded. In general, these bremsstrahlung photons will deposit their energy in different calorimeter cells than the electron. In both cases, the ratio of the energy detected in the ECAL to the momentum measured by the tracking system, an important variable in identifying electrons, is unbiased. To improve the momentum reconstruction, a dedicated bremsstrahlung recovery procedure is used, correcting the measured electron momentum by the bremsstrahlung photon energy. As there is little material within the magnet, the bremsstrahlung photons are searched for among neutral clusters with an energy larger than 75 MeV in a well defined position given by the electron track extrapolation from before the magnet. Oppositely-charged electron pairs with an electron p T larger than 350 MeV/c and a good quality vertex are used to form B 0 → K * 0 e + e − and B 0 → J/ψ (e + e − )K * 0 candidates. The e + e − invariant mass is required to be in the range 30 -1000 MeV/c 2 or 2400 -3400 MeV/c 2 for the two decay modes, respectively. Candidate K * 0 mesons and e + e − pairs are combined to form B 0 candidates which are required to have a good-quality vertex. For each B 0 candidate, the production vertex is assigned to be that with the smallest IP χ 2 . The B 0 candidate is also required to have a direction that is consistent with coming from the PV as well as a reconstructed decay point that is significantly separated from the PV.
In order to maximize the signal efficiency while still reducing the high level of combinatorial background, a multivariate analysis, based on a Boosted Decision Tree (BDT) [22] with the AdaBoost algorithm [23], is used. The signal training sample is B 0 → K * 0 e + e − simulated data. The background training sample is taken from the upper sideband (m B 0 > 5600 MeV/c 2 ) from half of the data sample. The variables used in the BDT are the p T , the IP and track χ 2 of the final state particles; the K * 0 candidate invariant mass, the vertex χ 2 and flight distance χ 2 (from the PV) of the K * 0 and e + e − candidates; the B 0 p T , its vertex χ 2 , flight distance χ 2 and IP χ 2 , and the angle between the B 0 momentum direction and its direction of flight from the PV. A comparison of the BDT output for the data and the simulation for B 0 → J/ψ (e + e − )K * 0 decays is shown in Fig. 2. The candidates for this test are reconstructed using a J/ψ mass constraint and the background is statistically subtracted using the sPlot technique [24] based on a fit to the B 0 invariant mass spectrum. The agreement between data and simulation confirms a proper modelling of the relevant variables. The optimal cut value on the BDT response is chosen by considering the combinatorial background yield (b) on the B 0 → K * 0 e + e − invariant mass distribution outside the signal region and evaluating the signal yield (s) using the B 0 → K * 0 e + e − simulation assuming a visible B 0 → K * 0 e + e − branching fraction of 2.7 × 10 −7 . The quantity s/√(s + b) serves as an optimisation metric, for which the optimal BDT cut is 0.96. The signal efficiency of this cut is about 93% while the background is reduced by two orders of magnitude.
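The choice of the BDT cut described above can be illustrated with a simple scan of the figure of merit s/√(s + b). The sketch below is only a schematic of that optimisation: the function names, the lists of classifier scores and the expected yields before the cut are hypothetical inputs, not quantities taken from the analysis.

```python
import math


def significance(s, b):
    """Figure of merit s / sqrt(s + b); zero when no events survive."""
    total = s + b
    return s / math.sqrt(total) if total > 0 else 0.0


def best_cut(signal_scores, background_scores, n_sig_expected, n_bkg_expected, cuts):
    """Return (cut, figure of merit) maximising s / sqrt(s + b) over the candidate cuts.

    The efficiencies are estimated as the fraction of scores above each cut and
    are used to scale the expected yields before the cut (assumed inputs).
    """
    best = None
    for cut in cuts:
        eff_s = sum(x > cut for x in signal_scores) / len(signal_scores)
        eff_b = sum(x > cut for x in background_scores) / len(background_scores)
        fom = significance(n_sig_expected * eff_s, n_bkg_expected * eff_b)
        if best is None or fom > best[1]:
            best = (cut, fom)
    return best
```

In practice the signal scores would come from simulation and the background scores from a mass sideband, as in the text, with the expected signal yield fixed by the assumed visible branching fraction.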
After applying the BDT selection, specific backgrounds from decays that have the same visible final state particles as the B 0 → K * 0 e + e − signal remain. Since some of these backgrounds have larger branching fractions, additional requirements are applied to the B 0 → K * 0 e + e − and B 0 → J/ψ (e + e − )K * 0 candidates. A large non-peaking background comes from the B 0 → D − e + ν decay, with D − → e − νK * 0 . The branching fraction for this channel is about five orders of magnitude larger than that of the signal. When the neutrinos have low energies, the signal selections are ineffective at rejecting this background. Therefore, the K * 0 e − invariant mass is required to be larger than 1900 MeV/c 2 , which is 97% efficient on signal decays. Another important source of background comes from the B 0 → K * 0 γ decay, where the photon converts into an e + e − pair. In LHCb, approximately 40% of the photons convert before the calorimeter, and although only about 10% are reconstructed as an e + e − pair, the resulting mass of the B 0 candidate peaks in the signal region. This background is suppressed by a factor 23 after the selection cuts (including the 30 MeV/c 2 minimum requirement on the e + e − invariant mass). The fact that signal e + e − pairs are produced at the B 0 decay point, whereas conversion electrons are produced in the VELO detector material, is exploited to further suppress this background. The difference in the z coordinates, ∆z, between the first VELO hit and the expected position of the first hit, assuming the electron was produced at the K * 0 vertex, should satisfy |∆z| < 30 mm. In addition, we require that the calculated uncertainty on the z-position of the e + e − vertex be less than 30 mm, since a large uncertainty makes it difficult to determine if the e + e − pair originates from the same vertex as the K * 0 meson, or from a point inside the detector material. 
These two additional requirements reject about 2/3 of the remaining B 0 → K * 0 γ background, while retaining about 90% of the B 0 → K * 0 e + e − signal. After applying these cuts, the B 0 → K * 0 γ contamination under the B 0 → K * 0 e + e − signal peak is estimated to be (10 ± 3)% of the expected signal yield.
Other specific backgrounds have been studied using either simulated data or analytical calculations and include the decays B → K * η, K * η′, K * π 0 and Λ 0 b → Λ * γ, where Λ * represents a high mass resonance decaying into a proton and a charged kaon. The main source of background is found to be the B → K * η mode, followed by a Dalitz decay (η → γe + e − ). These events form an almost flat background in the mass range 4300 − 5250 MeV/c 2 . None of these backgrounds contribute significantly in the B 0 mass region, and therefore are not specifically modelled in the mass fits described later.
More generally, partially reconstructed backgrounds arise from B decays with one or more decay products in addition to a K * 0 meson and an e + e − pair. In the case of the B 0 → J/ψ (e + e − )K * 0 decay, there are two sources for these partially reconstructed events: those from the hadronic part, such as events with higher K * resonances (partially reconstructed hadronic background), and those from the J/ψ part (partially reconstructed J/ψ background), such as events coming from ψ(2S) decays. For the B 0 → K * 0 e + e − decay mode, only the partially reconstructed hadronic background has to be considered.

Fitting procedure
Since the signal resolution, type and rate of backgrounds depend on whether the hardware trigger was caused by a signal electron or by other activity in the event, the data sample is divided into two mutually exclusive categories: events triggered by an extra particle (e, γ, h, µ) excluding the four final state particles (called HWTIS, since they are triggered independently of the signal) and events for which one of the electrons from the B 0 decay satisfies the hardware electron trigger (HWElectron). Events satisfying both requirements (20%) are assigned to the HWTIS category. The numbers of reconstructed signal candidates are determined from unbinned maximum likelihood fits to their mass distributions separately for each trigger category. The mass distribution of each category is fitted to a sum of probability density functions (PDFs) modelling the different components.
1. The signal is described by the sum of two Crystal Ball functions [25] (CB) sharing all their parameters but with different widths.
2. The combinatorial background is described by an exponential function.
3. The shapes of the partially reconstructed hadronic and J/ψ backgrounds are described by non-parametric PDFs [26] determined from fully simulated events.
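The signal and combinatorial components listed above can be sketched as follows. This is an illustrative, unnormalised implementation of the standard Crystal Ball shape with a low-mass power-law tail; the parameter names and the relative fraction `frac` between the two Crystal Ball functions are assumptions introduced here, and a real fit would normalise each PDF over the fit range.

```python
import math


def crystal_ball(x, mean, sigma, alpha, n):
    """Unnormalised Crystal Ball shape: Gaussian core with a power-law tail
    on the low-mass side (alpha > 0, n > 0)."""
    t = (x - mean) / sigma
    if t > -alpha:
        return math.exp(-0.5 * t * t)
    # Coefficients chosen so that the function is continuous at t = -alpha.
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)


def signal_shape(x, mean, sigma1, sigma2, alpha, n, frac):
    """Sum of two Crystal Ball functions sharing mean, alpha and n but with
    different widths; frac is the (assumed) weight of the first one."""
    return (frac * crystal_ball(x, mean, sigma1, alpha, n)
            + (1.0 - frac) * crystal_ball(x, mean, sigma2, alpha, n))


def combinatorial_shape(x, slope):
    """Unnormalised exponential describing the combinatorial background."""
    return math.exp(slope * x)
```

The power-law tail accounts for candidates whose reconstructed mass is lowered by unrecovered bremsstrahlung, which is why a pure Gaussian would underestimate the low-mass side of the signal peak.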
The signal shape parameters are fixed to the values obtained from simulation, unless otherwise specified. There are seven free parameters for the B 0 → J/ψ (e + e − )K * 0 fit for each trigger category. These include the peak value of the B 0 candidate mass, a scaling factor applied to the widths of the CB functions to take into account small differences between simulation and data, and the exponent of the combinatorial background. The remaining four free parameters are the yields for each fit component. The invariant mass distributions together with the PDFs resulting from the fit are shown in Fig. 3. The number of signal events in each category is summarized in Table 1.
A fit to the B 0 → K * 0 e + e − candidates is then performed, with several parameters fixed to the values found from the B 0 → J/ψ (e + e − )K * 0 fit. These fixed parameters are the scaling factor applied to the widths of the CB functions, the peak value of the B 0 candidate mass and the ratio of the partially reconstructed hadronic background to the signal yield. The B 0 → K * 0 γ yield is fixed in the B 0 → K * 0 e + e − mass fit using the fitted B 0 → J/ψ (e + e − )K * 0 signal yield, the ratio of efficiencies of the B 0 → K * 0 γ and B 0 → J/ψ (e + e − )K * 0 modes, and the ratio of branching fractions B(B 0 → K * 0 γ)/B(B 0 → J/ψ (e + e − )K * 0 ). Hence there are three free parameters for the B 0 → K * 0 e + e − fit for each trigger category: the exponent and yield of the combinatorial background and the signal yield. The invariant mass distributions together with the PDFs resulting from the fit are shown in Fig. 4. The signal yield in each trigger category is summarized in Table 1.
Figure 3: Invariant mass distributions of the B 0 → J/ψ (e + e − )K * 0 candidates for the (left) HWElectron and (right) HWTIS trigger categories. The dashed line is the signal PDF, the light grey area corresponds to the combinatorial background, the medium grey area is the partially reconstructed hadronic background and the dark grey area is the partially reconstructed J/ψ background component.
Figure 4: Invariant mass distributions of the B 0 → K * 0 e + e − candidates for the (left) HWElectron and (right) HWTIS trigger categories. The dashed line is the signal PDF, the light grey area corresponds to the combinatorial background, the medium grey area is the partially reconstructed hadronic background and the black area is the B 0 → K * 0 γ component.

Results
The B 0 → K * 0 e + e − branching fraction is calculated in each trigger category using the measured signal yields and the ratio of efficiencies between the normalization and signal modes. This ratio is sub-divided into the contributions arising from the selection requirements (including acceptance effects, but excluding PID), r sel , the PID requirements, r PID , and the trigger requirements, r HW . The values of r sel are determined using simulated data, while r PID and r HW are obtained directly from calibration data samples: J/ψ → e + e − and D 0 → K − π + from D * + decays for r PID and B 0 → J/ψ (e + e − )K * 0 decays for r HW . The values are summarized in Table 2. The only ratio that is inconsistent with unity is the hardware trigger efficiency ratio, due to the different mean electron p T for the B 0 → K * 0 e + e − and B 0 → J/ψ (e + e − )K * 0 decays. The branching fraction for the B 0 → J/ψ K * 0 decay mode is taken from Ref. [14] and a correction factor of 1.02 has been applied to take into account the difference in the Kπ invariant mass range used, and therefore the different S-wave contributions.
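For orientation, the calculation described in the preceding paragraph can be written in the following form. This is a reconstruction from the quantities named in the text, not a quotation; in particular it assumes that each r is defined as the efficiency of the normalization mode divided by that of the signal mode, and the yields N are the fitted signal yields of the two modes. $$ B(B^0 \rightarrow K^{*0}e^+e^-)^{30-1000\,\mathrm{MeV}/c^2} = \frac{N(B^0 \rightarrow K^{*0}e^+e^-)}{N(B^0 \rightarrow J/\psi(e^+e^-)K^{*0})} \times r_{\rm sel} \times r_{\rm PID} \times r_{\rm HW} \times B(B^0 \rightarrow J/\psi K^{*0}) \times B(J/\psi \rightarrow e^+e^-) $$ With this convention, a ratio r larger than unity means that the signal mode is selected less efficiently than the normalization mode, so the observed yield ratio is scaled up accordingly.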
The B 0 → K * 0 e + e − branching fraction, for each trigger category, is measured to be where the uncertainties are statistical only.

Systematic uncertainties
Several sources of systematic uncertainty are considered, affecting either the determination of the number of signal events or the computation of the efficiencies. They are summarized in Table 3.
The ratio of trigger efficiencies is determined using a B 0 → J/ψ (e + e − )K * 0 calibration sample from data, which is reweighted using the p T of the triggering electron in order to model properly the kinematical properties of the two decays. The uncertainties due to the limited size of the calibration samples are propagated to get the related systematic uncertainty shown in Table 2.
The PID calibration introduces a systematic uncertainty on the calculated PID efficiencies as given in Table 2. For the kaon and pion candidates this systematic uncertainty is estimated by comparing, in simulated events, the results obtained using a D * + calibration sample to the true simulated PID performance. For the e + e − candidates, the systematic uncertainty is assessed ignoring the p T dependence of the electron identification. The resulting effect is limited by the fact that the kinematic differences between the B 0 → J/ψ (e + e − )K * 0 and the B 0 → K * 0 e + e − decays are small once the full selection chain is applied.
The fit procedure is validated with pseudo-experiments. Samples are generated with different fractions or shapes for the partially reconstructed hadronic background, or different values for the fixed signal parameters and are then fitted with the standard PDFs. The corresponding systematic uncertainty is estimated from the bias in the results obtained by performing the fits described above. The resulting deviations from zero of each variation are added in quadrature to get the total systematic uncertainty due to the fitting procedure. The parameters of the signal shape are varied within their statistical uncertainties as obtained from the B 0 → J/ψ (e + e − )K * 0 fit. An alternate signal shape, obtained by studying B 0 → J/ψ (e + e − )K * 0 signal decays in data both with and without a J/ψ mass constraint is also tried; the difference in the yields from that obtained using the nominal signal shape is taken as an additional source of uncertainty. The ratio of the partially reconstructed hadronic background to the signal yield is assumed to be identical to that determined from the B 0 → J/ψ (e + e − )K * 0 fit. The systematic uncertainty linked to this hypothesis is evaluated by varying the ratio by ±50%. The fraction of partially reconstructed hadronic background thus determined is in agreement within errors with the one found in B 0 → K * 0 γ decays [27]. The shape of the partially reconstructed background used in the B 0 → J/ψ (e + e − )K * 0 and the B 0 → K * 0 e + e − fits are the same. The related systematic uncertainty has been evaluated using an alternative shape obtained from charmless b-hadron decays. The B 0 → K * 0 γ contamination in the B 0 → K * 0 e + e − signal sample is 1.2 ± 0.4 and 1.5 ± 0.5 events for the HWElectron and HWTIS signal samples, respectively. 
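The quadrature combination of the individual deviations mentioned above can be sketched as below. Treating positive and negative shifts separately, which yields an asymmetric total, is an assumption made here for illustration; the text does not specify how the asymmetric systematic uncertainty was built, and the function names are hypothetical.

```python
import math


def add_in_quadrature(deviations):
    """Square root of the sum of squared deviations."""
    return math.sqrt(sum(d * d for d in deviations))


def asymmetric_quadrature(deviations):
    """Combine upward and downward shifts separately (assumed convention).

    Returns (up, down) as non-negative magnitudes.
    """
    up = add_in_quadrature([d for d in deviations if d > 0])
    down = add_in_quadrature([d for d in deviations if d < 0])
    return up, down
```

Adding in quadrature presumes that the individual sources are uncorrelated with one another; correlated sources would have to be summed linearly instead.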
Combining the systematic uncertainties in quadrature, the branching fractions are found to be where the first error is statistical, the second systematic, and the third comes from the uncertainties on the B 0 → J/ψ K * 0 and J/ψ → e + e − branching fractions [13,14]. The branching ratios are combined assuming all the systematic uncertainties to be fully correlated between the two trigger categories except those related to the size of the simulation samples. The combined branching ratio is found to be $$ B(B^0 \rightarrow K^{*0}e^+e^-)^{30-1000\,\mathrm{MeV}/c^2} = (3.1\,^{+0.9\,+0.2}_{-0.8\,-0.3} \pm 0.2)\times 10^{-7}. $$

Summary
Using pp collision data corresponding to an integrated luminosity of 1.0 fb −1 , collected by the LHCb experiment in 2011 at a centre-of-mass energy of 7 TeV, a sample of approximately 30 B 0 → K * 0 e + e − events, in the dilepton mass range 30 to 1000 MeV/c 2 , has been observed. The probability of the background to fluctuate upward to form the signal corresponds to 4.6 standard deviations including systematic uncertainties. The B 0 → J/ψ (e + e − )K * 0 decay mode is utilized as a normalization channel, and the branching fraction B(B 0 → K * 0 e + e − ) is measured to be $$ B(B^0 \rightarrow K^{*0}e^+e^-)^{30-1000\,\mathrm{MeV}/c^2} = (3.1\,^{+0.9\,+0.2}_{-0.8\,-0.3} \pm 0.2)\times 10^{-7}. $$ This result can be compared to theoretical predictions. A simplified formula suggested in Ref. [5] takes into account only the photon diagrams of Fig. 1. When evaluated in the 30 to 1000 MeV/c 2 e + e − invariant mass interval using B(B 0 → K * 0 γ) [1-3], it predicts a B 0 → K * 0 e + e − branching fraction of 2.35 × 10 −7 . A full calculation has been recently performed [28] and the numerical result for the e + e − invariant mass interval of interest is (2.43 +0.66 −0.47 ) × 10 −7 . The consistency between the two values reflects the photon pole dominance. The result presented here is in good agreement with both predictions.
Using the full LHCb data sample obtained in 2011-2012 it will be possible to do an angular analysis. The measurement of the A T (2) parameter [8] thus obtained is sensitive to the existence of right-handed currents in the virtual loops in diagrams similar to those of Fig. 1. For this purpose, the analysis of the B 0 → K * 0 e + e − decay is complementary to that of the B 0 → K * 0 µ + µ − mode. Indeed, it is predominantly sensitive to a modification of C 7 (the so-called C′ 7 terms) while, because of the higher q 2 in the decay, the B 0 → K * 0 µ + µ − A T (2) parameter has a larger possible contribution from the C 9 terms [29].