Search for B_s to mu^+ mu^- and B^0 to mu^+ mu^- decays

A search for the rare decays B_s to mu^+ mu^- and B^0 to mu^+ mu^- is performed in pp collisions at sqrt(s) = 7 TeV, with a data sample corresponding to an integrated luminosity of 5 inverse femtobarns collected by the CMS experiment at the LHC. In both decays, the number of events observed after all selection requirements is consistent with the expectation from background plus standard model signal predictions. The resulting upper limits on the branching fractions are Br(B_s to mu^+ mu^-) < 7.7E-9 and Br(B^0 to mu^+ mu^-) < 1.8E-9 at 95% confidence level.


Introduction
The decays B 0 s (B 0 ) → µ + µ − are highly suppressed in the standard model (SM) of particle physics, which predicts the branching fractions to be B(B 0 s → µ + µ − ) = (3.2 ± 0.2) × 10 −9 and B(B 0 → µ + µ − ) = (1.0 ± 0.1) × 10 −10 [1]. This suppression is due to the flavor-changing neutral current transitions b → s(d), which are forbidden at tree level and can only proceed via higher-order processes that are described by electroweak penguin and box diagrams at the one-loop level. Additionally, the decays are helicity suppressed by a factor of m 2 µ /m 2 B , where m µ and m B are the masses of the muon and B meson, respectively (the symbol B is used to denote B 0 or B 0 s mesons). Furthermore, these decays also require an internal quark annihilation within the B meson that reduces the decay rate by an additional factor of f 2 B /m 2 B , where f B is the decay constant of the B meson. The leading theoretical uncertainty is due to incomplete knowledge of f B , which is constrained by measurements of the mixing mass difference ∆m s (∆m d ) for B 0 s (B 0 ) mesons.
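To give a feeling for the size of the helicity suppression mentioned above, the following minimal sketch evaluates m 2 µ /m 2 B numerically. The masses are approximate world-average values that are not quoted in this paper, and the function name is purely illustrative.

```python
# Approximate masses in GeV. These are assumed inputs (world-average values),
# not numbers taken from the text of this paper.
M_MUON = 0.10566
M_BS = 5.3669
M_B0 = 5.2797


def helicity_suppression(m_lepton, m_meson):
    """Return the helicity-suppression factor (m_lepton / m_meson)**2."""
    return (m_lepton / m_meson) ** 2


if __name__ == "__main__":
    for name, mass in (("B0_s", M_BS), ("B0", M_B0)):
        factor = helicity_suppression(M_MUON, mass)
        print(f"{name}: m_mu^2 / m_B^2 = {factor:.2e}")
```

With these inputs the factor is of order 10 −4 for both mesons, which is one of several ingredients that make the predicted branching fractions so small.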
Several extensions of the SM predict enhancements to the branching fractions for these rare decays. In supersymmetric models with non-universal Higgs masses [2] and in specific models containing leptoquarks [3], for example, the B 0 s → µ + µ − and B 0 → µ + µ − branching fractions can be enhanced. In the minimal supersymmetric extension of the SM, the rates are strongly enhanced at large values of tan β, which is the ratio of the two vacuum expectation values of the two Higgs boson doublets [4,5] . However, in most models of new physics, the decay rates can also be suppressed for specific choices of model parameters [6].
At the Tevatron, the D0 experiment has published an upper limit of B(B 0 s → µ + µ − ) < 5.1 × 10 −8 [7] at 95% confidence level (CL). The CDF experiment has set a limit of B(B 0 s → µ + µ − ) < 4.0 × 10 −8 and B(B 0 → µ + µ − ) < 6.0 × 10 −9 , and also reported an excess of B 0 s → µ + µ − events, corresponding to B(B 0 s → µ + µ − ) = (1.8 +1.1 −0.9 ) × 10 −8 [8]. At the Large Hadron Collider (LHC), two experiments have published results: B(B 0 s → µ + µ − ) < 1.9 × 10 −8 and B(B 0 → µ + µ − ) < 4.6 × 10 −9 by the Compact Muon Solenoid (CMS) Collaboration [9], and B(B 0 s → µ + µ − ) < 1.4 × 10 −8 and B(B 0 → µ + µ − ) < 3.2 × 10 −9 by the LHCb Collaboration [10]. This paper reports on a new simultaneous search for B 0 s → µ + µ − and B 0 → µ + µ − decays using data collected in 2011 by the CMS experiment in pp collisions at √ s = 7 TeV at the LHC. The dataset corresponds to an integrated luminosity of 5 fb −1 . An event-counting experiment is performed in dimuon mass regions around the B 0 s and B 0 masses. To avoid potential bias, a "blind" analysis approach is applied where the signal region is not observed until all selection criteria are established. Monte Carlo (MC) simulations are used to estimate backgrounds due to B decays. Combinatorial backgrounds are evaluated from the data in dimuon invariant mass (m µµ ) sidebands. In the CMS detector, the mass resolution, which influences the separation between B 0 s → µ + µ − and B 0 → µ + µ − decays, depends on the pseudorapidity η of the reconstructed particles. The pseudorapidity is defined as η = − ln[tan(θ/2)], where θ is the polar angle with respect to the counterclockwise proton beam direction. The background level also depends significantly on the η of the B candidate. Therefore, the analysis is performed separately in two channels, "barrel" and "endcap", and then combined for the final result. 
The barrel channel contains the candidates where both muons have |η| < 1.4 and the endcap channel contains those where at least one muon has |η| > 1.4.
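The pseudorapidity definition and the channel assignment described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the treatment of a muon with |η| exactly equal to 1.4 (assigned to the endcap here) is an assumption, since the text does not specify it.

```python
import math

BARREL_ETA_MAX = 1.4  # channel boundary quoted in the text


def pseudorapidity(theta):
    """eta = -ln[tan(theta / 2)] for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))


def channel(eta_mu1, eta_mu2):
    """Return 'barrel' if both muons have |eta| < 1.4, otherwise 'endcap'."""
    if abs(eta_mu1) < BARREL_ETA_MAX and abs(eta_mu2) < BARREL_ETA_MAX:
        return "barrel"
    # At least one muon is outside the barrel region (boundary case assumed endcap).
    return "endcap"
```

For example, channel(0.3, -1.2) gives the barrel channel, while channel(0.3, 1.8) gives the endcap channel.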
A "normalization" sample of events with B + → J/ψK + decays (where J/ψ → µ + µ − ) is used to remove uncertainties related to the bb production cross section and the integrated luminosity. The signal and normalization efficiencies are determined through MC simulation studies. To validate the simulation distributions, such as the B 0 s transverse momentum (p T ) spectrum, and to evaluate potential effects resulting from differences in the fragmentation of B + and B 0 s , a "control" sample of reconstructed B 0 s → J/ψφ decays (with J/ψ → µ + µ − and φ → K + K − ) is used.
The dataset includes periods of high instantaneous luminosity conditions, with an average of 8 interactions per bunch crossing (later referred to as "pileup"). The analysis algorithms and the selection criteria have been optimized to mitigate the effects of pileup by reducing the influence of tracks coming from additional interactions in the event, as explained in Section 5. In parallel with the LHC luminosity increase, the CMS event triggering requirements also changed during the data-taking period. The analysis and simulations take these changes into account so that all MC samples incorporate the appropriate mixture of the trigger conditions, and the selection requirements applied in the data reconstruction are more restrictive than the most stringent trigger criteria.
The limits on the branching fractions depend on both systematic and statistical uncertainties. Several sources of systematic uncertainty can influence the estimated efficiency: the detector acceptance and the analysis, muon identification, and trigger efficiencies. The individual values are evaluated in the sections below, where the relevant efficiencies are discussed, and are then combined in Section 6.
The data analyzed here include the event sample corresponding to an integrated luminosity of 1.14 fb −1 , which was used to obtain the earlier CMS result [9]. The present analysis differs in several ways: the total dataset is almost five times larger; new selection variables are added to the analysis; the selection criteria are optimized for higher pileup and varying trigger requirements; and the description of rare backgrounds is improved. All these changes result in a better signal sensitivity.

Monte Carlo simulation
Simulated events are used to determine the efficiencies for the signal and normalization samples. We split the efficiency into four parts: detector acceptance, analysis efficiency, and muon identification and trigger efficiencies. The detector acceptance combines the geometrical detector acceptance and the tracking efficiency, and is defined by requiring tracks within |η| < 2.4 that satisfy p T > 1 GeV (p T > 0.5 GeV) for muons (kaons). The acceptance is about 25% (23%) for signal events in the barrel (endcap) channels. In the p T range relevant for this analysis the tracking efficiency for isolated muons and kaons is above 99.5% [11]. The analysis efficiency refers to the selection requirements described in Section 5, and is about 2.0% (1.2%) for signal events in the barrel (endcap) channels. The muon identification and trigger efficiencies are presented in Sections 4 and 3, respectively. The analysis, muon identification, and trigger efficiencies are all obtained from simulation and checked in data. Good agreement is found, and the residual differences are used to estimate systematic uncertainties on the efficiency estimates.
The simulated samples are also used to estimate the background from rare B decays where one or two hadrons are misidentified as muons. These decays include a variety of channels of the type B → h − µ + ν and B → h + h − , where h is a π, K or p and B stands for B 0 , B 0 s mesons or Λ b baryons. The most important backgrounds are from B 0 s → K − K + , B 0 → K + π − and from the semileptonic decays B 0 → π − µ + ν, B 0 s → K − µ + ν, and Λ 0 b → pµ − ν̄. The samples of simulated events are generated with PYTHIA 6.424 (TUNE Z2) [12], the unstable particles are decayed via EVTGEN [13], and the detector response is simulated with GEANT4 [14]. The signal and background events are selected from generic quantum chromodynamic (QCD) 2 → 2 sub-processes and provide a mixture of gluon-fusion, flavor-excitation, and gluon-splitting production. The evolution of the triggers used to collect the data is incorporated in the reconstruction of the simulated events. The number of simulated events in all the channels approximately matches the expected number given the integrated luminosity.

The CMS detector
The CMS detector is a general-purpose detector designed and built to study physics at the TeV scale. A detailed description can be found in Ref. [15]. For this analysis, the main subdetectors used are a silicon tracker, composed of pixel and strip detectors within a 3.8 T axial magnetic field, and a muon detector, which is divided into a barrel section and two endcaps, consisting of gas-ionization detectors embedded in the steel return yoke of the solenoid. The silicon tracker detects charged particles within the pseudorapidity range |η| < 2.5. The pixel detector is composed of three layers in the barrel region and two disks located on each side in the forward regions of the detector. In total, the pixel detector contains about 66 million 100 µm × 150 µm pixels. Further from the interaction region is a microstrip detector, which is composed of ten barrel layers, and three inner and nine outer disks on either end of the detector, with strip pitches between 80 and 180 µm. In total, the microstrip detector contains around 10 million strips and, together with the pixel detector, provides an impact parameter resolution of ∼ 15 µm. Due to the high granularity of the silicon tracker and to the strong magnetic field, a p T resolution of about 1.5% [16] is obtained for the charged particles in the p T range relevant for this analysis. The systematic uncertainty on the hadronic track reconstruction efficiency is estimated to be 4% [16]. Muons are detected in the pseudorapidity range |η| < 2.4 by detectors made of three technologies: drift tubes, cathode strip chambers, and resistive plate chambers. The analysis is nearly independent of pileup because of the high granularity of the CMS silicon tracker and the excellent three-dimensional (3D) hit resolution of the pixel detector.
The dimuon candidate events are selected with a two-level trigger system: the first level uses only the muon detector information, while the high-level trigger (HLT) uses additional information from the pixel and strip detectors. The first-level trigger requires two muon candidates without any explicit p T requirement, but there is an implicit selection since muons must reach the muon detectors (about 3.5 GeV in the barrel and 2 GeV in the endcap). The HLT imposes a p T requirement and uses additional information from the silicon tracker. As the LHC instantaneous luminosity increased, the trigger requirements were gradually tightened. This change in trigger requirements is also included in the trigger simulations. The most stringent HLT selection requires two muons each with p T > 4 GeV, the dimuon p T > 3.9 GeV (5.9 GeV in the endcap), dimuon invariant mass within 4.8 < m µµ < 6.0 GeV, and a 3D distance of closest approach to each other of d ca < 0.5 cm. For the entire dataset, the offline analysis selection is more restrictive than the most stringent trigger selections.
For the normalization (B + → J/ψK + ) and control (B 0 s → J/ψφ) samples, the data are collected by requiring the following: two muons each with p T > 4 GeV, dimuon p T > 6.9 GeV, |η| < 2.2, invariant mass within 2.9 < m µµ < 3.3 GeV, d ca < 0.5 cm, and the probability of the χ 2 per degree of freedom (χ 2 /dof) of the dimuon vertex fit greater than 15%. To reduce the rate of prompt J/ψ candidates, two additional requirements are imposed in the transverse plane: (i) the pointing angle α xy between the dimuon momentum and the vector from the beamspot (defined as the average interaction point) to the dimuon vertex must fulfill cos α xy > 0.9; and (ii) the flight distance significance ℓ xy /σ(ℓ xy ) must be larger than 3, where ℓ xy is the two-dimensional distance between the primary and dimuon vertices and σ(ℓ xy ) is its uncertainty.
The trigger efficiencies for the various samples are determined from the MC simulation. They are calculated after all muon identification selection criteria, as discussed in Section 4, have been applied. For the signal events the average trigger efficiency is 84% (74%) in the barrel (endcap) channel. The trigger efficiency for the normalization and control samples varies from 77% in the barrel channel to 60% in the endcap channel. This analysis depends on the ratio of the signal efficiency to the normalization sample efficiency. The systematic uncertainty on the trigger efficiency ratio is estimated as the sum in quadrature of two components. The first component is defined as the variation of the efficiency ratio when varying the muon p T threshold from 4 to 8 GeV in the MC simulation. The second one is the difference between the ratios determined in data and MC simulations using the tag-and-probe approach (described in Section 4). The systematic uncertainty on the ratio is estimated to be 3% in the barrel channel and 6% in the endcap channel.
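The combination of independent systematic components as a sum in quadrature, as used above for the trigger efficiency ratio, can be sketched as follows. The function name is illustrative, and the individual component values are not given in the text, so the values in the usage note are hypothetical placeholders.

```python
import math


def quadrature_sum(*components):
    """Combine independent uncertainty components as sqrt(sum of squares)."""
    return math.sqrt(sum(c * c for c in components))
```

As a purely hypothetical illustration, quadrature_sum(0.020, 0.022) combines a 2.0% threshold-variation component with a 2.2% data/simulation difference into a total of about 3%.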

Muon identification
Muon candidates are reconstructed by combining tracks found in the silicon tracker and the muon detector [17,18]. In order to ensure high-purity muons, the following additional requirements are applied: (i) muon candidates must have at least two track segments in the muon stations; (ii) they must have more than 10 hits in the silicon tracker, of which at least one must be in the pixel detector; (iii) the combined track must have χ 2 /dof < 10; and (iv) the impact parameter in the transverse plane d xy , calculated with respect to the beamspot, must be smaller than 0.2 cm. The systematic uncertainty on the muon track reconstruction efficiency is 2% [11] and is included in the uncertainty of the total efficiency.
The ratio of the muon identification efficiencies between the signal and normalization samples is used in this analysis. This ratio is determined in two ways. First, the MC event samples contain a full simulation of the muon detector, which allows an efficiency determination by counting the events that pass or fail the muon identification algorithm. Second, the muon identification efficiency is determined with a tag-and-probe method [17], which is applied to both data and MC event samples. To study the single-muon identification efficiency, the decays J/ψ → µ + µ − are used. In the tag-and-probe method, a "tag" muon, satisfying strict muon criteria, is paired with a "probe" track such that together they give the J/ψ invariant mass, thus indicating that the probe is in fact a muon. The single-muon efficiency is determined from the number of probe tracks passing or failing the muon identification algorithm. Dedicated trigger paths constructed using the tag muon and either a silicon track or a signal in the muon chambers are employed for this study, which ensures large event samples while avoiding potential bias of the efficiency measurement from using events triggered by the probe.
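The counting step of the tag-and-probe method described above can be sketched as follows. This is a simplified illustration, not the procedure of Ref. [17]: it assumes the passing and failing probe yields are already background-free (a real measurement would typically extract them from fits to the J/ψ mass peak), and it uses a simple binomial uncertainty. The function name and the counts in the usage note are hypothetical.

```python
import math


def tag_and_probe_efficiency(n_pass, n_fail):
    """Efficiency = passing probes / all probes, with a simple binomial uncertainty."""
    n_total = n_pass + n_fail
    if n_total <= 0:
        raise ValueError("need at least one probe")
    eff = n_pass / n_total
    # Binomial standard deviation; unreliable when eff is very close to 0 or 1.
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err
```

For hypothetical yields of 770 passing and 230 failing probes, the function returns an efficiency of 0.77 with an uncertainty of about 0.013.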
The muon identification efficiency is calculated after all selection criteria, including the detector acceptance, have been applied. For the signal events, the average efficiency is 71% (85%) in the barrel (endcap) channel based on the MC simulation. For the normalization and control samples, the muon identification efficiency is about 77% (78%) in the barrel (endcap). Pair-correlation effects influence these numbers [17]. The dimuon efficiency can be altered with respect to the product of single-muon efficiencies depending on the mutual proximity of the two muons in the muon system. This effect is included in the efficiency calculations in the detailed MC simulation of the muon detectors. The systematic uncertainty on the identification efficiency ratio is estimated in the same way as for the muon trigger efficiency ratio (Section 3), and is 4% in the barrel and 8% in the endcap.
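Since each efficiency is evaluated after the preceding requirements have been applied, the four parts can be chained into a total signal efficiency. The sketch below multiplies the approximate values quoted in the text for signal events; treating the total as this simple product is an assumption made for illustration, and the result should not be read as the value reported in Table 2.

```python
# Approximate signal efficiencies quoted in the text (barrel, endcap).
BARREL = {"acceptance": 0.25, "analysis": 0.020, "muon_id": 0.71, "trigger": 0.84}
ENDCAP = {"acceptance": 0.23, "analysis": 0.012, "muon_id": 0.85, "trigger": 0.74}


def total_efficiency(parts):
    """Multiply the sequential efficiency factors into a total efficiency."""
    eff = 1.0
    for value in parts.values():
        eff *= value
    return eff


if __name__ == "__main__":
    print(f"barrel: {total_efficiency(BARREL):.4f}")
    print(f"endcap: {total_efficiency(ENDCAP):.4f}")
```

With these rounded inputs the total signal efficiency is a few per mille in both channels, dominated by the acceptance and the analysis selection.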

Analysis
The reconstruction of B → µ + µ − candidates requires two oppositely-charged muons that originate from a common vertex and have an invariant mass in the range 4.9 < m µµ < 5.9 GeV. A fit of the B-candidate vertex is performed and its χ 2 /dof is evaluated. The two daughter muon tracks are combined to form the B-candidate track.
The primary vertex associated with a B candidate is chosen from all reconstructed primary vertices as the one which has minimal separation along the z axis from the z intercept of the extrapolated B candidate track. Reconstruction effects due to pileup are largely eliminated by the primary vertex matching procedure. The position of this primary vertex is then refit without the tracks of the B candidate with an adaptive vertex fit [16], where tracks are assigned a weight 0 < w < 1 based on their proximity to the primary vertex. After the refit, B candidates with badly reconstructed primary vertices are eliminated by requiring the average track weight w > 0.6. The 3D impact parameter of the B candidate δ 3D , its uncertainty σ(δ 3D ), and its significance δ 3D /σ(δ 3D ) are measured with respect to the primary vertex.
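The primary-vertex matching and the quality requirement on the refitted vertex can be sketched as follows. The inputs are plain Python lists, the function names are hypothetical, and the adaptive vertex fit itself is not reproduced: the track weights are assumed to be provided by that fit.

```python
def choose_primary_vertex(pv_z_positions, b_candidate_z):
    """Index of the primary vertex closest in z to the B-candidate z intercept."""
    if not pv_z_positions:
        raise ValueError("no reconstructed primary vertices")
    return min(
        range(len(pv_z_positions)),
        key=lambda i: abs(pv_z_positions[i] - b_candidate_z),
    )


def passes_vertex_quality(track_weights, min_average_weight=0.6):
    """Require the average adaptive-fit track weight to exceed the threshold (0.6)."""
    if not track_weights:
        return False
    return sum(track_weights) / len(track_weights) > min_average_weight
```

For example, with primary vertices at z = -4.2, 0.3, and 6.1 cm and a B-candidate z intercept of 0.9 cm, the second vertex (index 1) is selected.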
The isolation of the B candidate is an important criterion in separating the signal from background. Three variables are used for this purpose:
• The isolation variable I = p T (B)/(p T (B) + Σ trk p T ), which is calculated from the transverse momentum of the B candidate p T (B) and the transverse momenta of all other charged tracks satisfying ∆R = √((∆η) 2 + (∆φ) 2 ) < 0.7, where ∆η and ∆φ are the differences in pseudorapidity and azimuthal angle between a charged track and the B-candidate momentum. The sum includes all tracks with p T > 0.9 GeV that are (i) consistent with originating from the same primary vertex as the B candidate or (ii) have a distance of closest approach d ca < 0.05 cm with respect to the B vertex and are not associated with any other primary vertex.
• The number of tracks N close trk with p T > 0.5 GeV and d ca < 0.03 cm with respect to the B-candidate's vertex.
• The minimum distance of closest approach between tracks and the B-candidate's vertex, d 0 ca , for all tracks in the event that are either associated with the same primary vertex as the B-candidate or not associated with any other primary vertex.
The first variable describes the isolation primarily with respect to tracks coming from the primary vertex itself. The latter two variables quantify the isolation of the B vertex. They help to reject partly reconstructed B decays where there are other tracks in addition to the two muons associated with the B-candidate vertex.
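A minimal sketch of the first isolation variable is given below. It assumes the usual relative-isolation form I = p T (B)/(p T (B) + Σ p T ), with the sum running over tracks inside the ∆R < 0.7 cone with p T > 0.9 GeV. The vertex-association conditions (i) and (ii) are assumed to have been applied when building the input track list, and the function names and the track tuple layout are illustrative.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def isolation(b_pt, b_eta, b_phi, tracks, cone=0.7, min_track_pt=0.9):
    """Relative isolation of the B candidate.

    tracks: iterable of (pt, eta, phi) for charged tracks other than the two
    muons, assumed to already satisfy the vertex-association requirements.
    """
    sum_pt = sum(
        pt
        for pt, eta, phi in tracks
        if pt > min_track_pt and delta_r(b_eta, b_phi, eta, phi) < cone
    )
    return b_pt / (b_pt + sum_pt)
```

A fully isolated candidate has I = 1, and the value decreases as more track momentum is found inside the cone.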
The distributions of the variables described above are shown in Fig. 1 for signal events from the MC simulation and for data background events. These include the transverse momenta of the higher-momentum (leading) and lower-momentum (sub-leading) muons p T,µ1 and p T,µ2 , p T (B), the 3D pointing angle α 3D , the 3D flight length significance ℓ 3D /σ(ℓ 3D ), the χ 2 /dof, and the isolation variables (I, N close trk , and d 0 ca ). The data background events are defined as B candidates with a dimuon mass in the sidebands covering the range 4.9 < m µµ < 5.9 GeV, excluding the (blinded) signal window 5.20 < m µµ < 5.45 GeV. Events shown in Fig. 1 must pass a tight selection that is close to the final one: muon p T > 4 GeV, p T (B) > 7.5 GeV, α 3D < 0.05, d 0 ca > 0.015 cm, and N close trk < 2 tracks. For each distribution, the selection requirements for all variables, apart from the one plotted, are applied. This figure illustrates the differences in the distributions of signal and background events, and shows which variables are effective in reducing the background events, e.g., ℓ 3D /σ(ℓ 3D ). The analysis efficiency for each selection requirement is determined from the simulated events.
The reconstruction of B + → J/ψK + → µ + µ − K + and B 0 s → J/ψφ → µ + µ − K + K − events is very similar to the reconstruction of B → µ + µ − events. Candidates with two oppositely-charged muons sharing a common vertex and with invariant mass in the range 3.0-3.2 GeV are reconstructed. The selected candidates must have a dimuon p T > 7 GeV. Then they are combined with one or two tracks each assumed to be a kaon, with p T > 0.5 GeV and |η| < 2.4. The 3D distance of closest approach between all pairs among the three (four) tracks is required to be less than 0.1 cm. For B 0 s → J/ψφ candidates, the two assumed kaon tracks must form an invariant mass in the range 0.995-1.045 GeV and have ∆R(K + , K − ) < 0.25. The three (four) tracks from the decay are used in the vertex fit. All requirements listed above for B → µ + µ − events are also applied here, including the vertex-fit selection χ 2 /dof < 2, which eliminates poorly reconstructed candidates. Only B candidates with an invariant mass in the range 4.8-6.0 GeV are considered. Figures 2 and 3 show the MC simulation and sideband-subtracted data distributions for a number of variables for the B + → J/ψK + and B 0 s → J/ψφ candidates, respectively. For each distribution, the selection requirements for all variables, apart from the one plotted, are applied. The relative efficiency for each selection requirement is determined separately in data and MC simulation and compared. The largest relative differences are 2.5% for the isolation selection in the normalization sample and 1.6% for the χ 2 /dof selection in the control sample. We combine in quadrature the differences for all distributions to estimate the systematic uncertainty related to the selection efficiency and obtain 4% (3%) for the normalization (control) sample. The control sample uncertainty is used for the signal sample.
The dataset used in this analysis is affected by pileup, with an average of 8 reconstructed primary vertices per event. The distribution of the primary vertex z position has a Gaussian shape with an RMS of approximately 5.6 cm. To study a possible dependence on the amount of pileup, the efficiencies of all selection criteria are calculated as a function of the number of reconstructed primary vertices. In Fig. 4 this dependence is shown for the normalization sample. A horizontal line is superimposed to guide the eye, indicating that no significant dependence is observed. The same conclusion is also obtained in the MC simulation by comparing the selection efficiency for events with fewer than six primary vertices to that for events with more than ten primary vertices. Similar studies of the control sample, albeit with less precision, lead to the same conclusion: the analysis is not affected by pileup.
Variables sensitive to the underlying production processes (gluon fusion, flavor excitation, or gluon splitting) are also studied to validate the production process mixture in the MC simulation. The clearest distinction among the three processes is obtained by studying (i) the ∆R distribution between the B candidate and another muon and (ii) the p T spectrum of this muon. The MC simulation (PYTHIA) describes these distributions adequately.

Results
The present analysis differs significantly from the previous one [9]:
• The muon identification algorithm has changed. A tighter selection is used, which significantly decreases the rate at which kaons and pions are misidentified as muons.
• The definition of isolation is different and two additional isolation variables are used, which reduces the influence of event pileup and also lowers the combinatorial background.
• The requirement, for the normalization and control samples, that the two muons bend away from each other is removed, making the selection of these samples more similar to that for the signal.
• The rare backgrounds, discussed below, are taken into account when calculating the combinatorial background, thus improving the background estimate in the signal window.
The variables discussed in Section 5 are optimized to obtain the best expected upper limit using MC signal events and data sideband events for the background. The optimization procedure is based on a random-grid search of about 1.4 × 10 6 analysis selections. During this search, eleven variables are randomly sampled within predefined ranges. The resulting optimized requirements, which are used to obtain the final results, are summarized in Table 1. These requirements were established before observing the number of data events in the signal region. Hence, the analysis was blind to the signal events in the 5.20 < m µµ < 5.45 GeV mass range. In the endcap regions the selection is tighter than in the barrel because of the substantially larger background. The signal efficiencies ε tot for these selections are shown in Table 2. They include all selection requirements: the detector acceptance, and the analysis, muon identification, and trigger efficiencies. The quoted errors include all the systematic uncertainties. In general, the present analysis uses stricter selection requirements than the earlier analysis [9], resulting in a higher sensitivity and a better signal-to-background ratio, but also a lower signal efficiency.
As an additional test, the optimization was repeated to maximize the ratio S/√(S + B), where S is the number of signal events and B is the number of background events. This resulted in a similar set of parameters to the ones listed in Table 1, but without an improvement in the expected upper limit.
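The alternative figure of merit used in this cross-check can be written as a short function. The sketch below only evaluates S/√(S + B) for given yields; it does not reproduce the random-grid search, and the function name and the convention of returning zero for an empty selection are assumptions.

```python
import math


def significance_fom(n_signal, n_background):
    """Figure of merit S / sqrt(S + B); returns 0 when S + B is not positive."""
    total = n_signal + n_background
    if total <= 0:
        return 0.0
    return n_signal / math.sqrt(total)
```

In an optimization loop, each candidate selection would be scored with this function using the expected signal yield from simulation and the background estimate from the sidebands, and the selection with the largest score would be retained.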
To evaluate a possible bias due to the optimization of the selection criteria in the data sidebands and to validate our background expectation, the following crosscheck is performed. All candidates with I < 0.7, including those within the blinded region, are selected ("inverted isolation" selection), which generates a background-enriched sample with a very small expected signal contribution. From this sample, the candidate yields in the sidebands and in the blinded region are determined. The sideband yields are used to predict, through interpolation, the number of background candidates in the blinded region. Then the number of predicted background events can be compared to the number of observed candidates in the blinded region. This comparison is performed separately for the barrel and endcap channels and for the B 0 s and B 0 signal windows. Within statistical uncertainties, good agreement is found for all four cases, which means that no significant biases are present in the background interpolation.
The simulated dimuon mass resolution for signal events depends on the pseudorapidity of the B candidate and ranges from 37 MeV for η ∼ 0 to 77 MeV for |η| > 1.8. The dimuon mass scale and resolution in the MC simulation are compared with the measured detector performance by studying J/ψ → µ + µ − and Υ(1S) → µ + µ − decays. The residual differences between simulation and data are small and the uncertainty on the efficiency coming from these effects is estimated to be 3%.
Branching fractions are measured separately in the barrel and endcap channels using the following equation:
\mathcal{B}(B^0_s \to \mu^+\mu^-) = \frac{N_S}{N^{B^+}_{\mathrm{obs}}} \, \frac{f_u}{f_s} \, \frac{\varepsilon^{B^+}_{\mathrm{tot}}}{\varepsilon_{\mathrm{tot}}} \, \mathcal{B}(B^+), \qquad (1)
where ε tot is the total signal efficiency, N B + obs is the number of reconstructed B + → J/ψK + decays, ε B + tot is the total efficiency of B + reconstruction, B(B + ) is the branching fraction for B + → J/ψK + → µ + µ − K + , f u / f s is the ratio of the B + and B 0 s production cross sections, and N S is the background-subtracted number of observed B 0 s → µ + µ − candidates in the signal window 5.30 < m µµ < 5.45 GeV. The width of the signal windows is adjusted to maximize the efficiency for the B 0 s → µ + µ − decay, and it is approximately equal to twice the expected mass resolution in the endcap region. We use the value f s / f u = 0.267 ± 0.021, measured by LHCb for 2 < η < 5 [19] and B(B + ) ≡ B(B + → J/ψK + → µ + µ − K + ) = (6.0 ± 0.2) × 10 −5 [20]. An analogous equation is used to measure the B 0 → µ + µ − branching fraction, with the signal window 5.2 < m µµ < 5.3 GeV and the ratio f d / f u taken to be 1.
The number of reconstructed B + mesons N B + obs is (82.7 ± 4.2) × 10 3 in the barrel and (23.8 ± 1.2) × 10 3 in the endcap. The invariant mass distributions are fit with a double-Gaussian function for the signal and an exponential plus an error function for the background, as shown in Fig. 5. Partially reconstructed B 0 decays (e.g., B → J/ψK * with one of the K * decay products not reconstructed) lead to a step function-like behavior at a mass of m ≈ 5.15 GeV. This background shape was studied in detail in MC simulation and is parametrized with an error function of different width in the barrel and endcap. The systematic uncertainty on the fit yield, 5% in the barrel and in the endcap, is estimated by considering alternative fitting functions and by performing a fit with the dimuon invariant mass constrained to the J/ψ mass. The total efficiency ε B + tot , including the detector acceptance, is determined from MC simulation to be (11.0 ± 0.9) × 10 −4 for the barrel and (3.2 ± 0.4) × 10 −4 for the endcap, where the errors include statistical and systematic uncertainties. The detector acceptance part (which includes the track finding efficiency) of the total efficiency has a systematic uncertainty of 3.5% (5.0%) in the barrel (endcap). It is estimated by comparing the values obtained separately with three different bb production mechanisms: gluon splitting, flavor excitation, and flavor creation.
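The normalization to the B + sample can be illustrated numerically. The sketch below assumes the usual form of the normalization, B = (N S / N B + obs ) × (f u / f s ) × (ε B + tot / ε tot ) × B(B + ), and uses the barrel inputs quoted in the text. The signal efficiency is approximated by the product of the four barrel efficiencies quoted earlier, which is an assumption for illustration and not the Table 2 value, and uncertainties are ignored.

```python
def branching_fraction(n_signal, eff_signal, n_norm, eff_norm, fs_over_fu, br_norm):
    """B = (N_S / N_norm) * (f_u / f_s) * (eff_norm / eff_signal) * B_norm."""
    return (n_signal / n_norm) * (1.0 / fs_over_fu) * (eff_norm / eff_signal) * br_norm


# Barrel inputs quoted in the text (central values only).
N_BPLUS = 82.7e3      # reconstructed B+ -> J/psi K+ yield
EFF_BPLUS = 11.0e-4   # total B+ efficiency
FS_OVER_FU = 0.267    # f_s / f_u
BR_BPLUS = 6.0e-5     # B(B+ -> J/psi K+ -> mu mu K+)

# Assumption: acceptance * analysis * muon ID * trigger for barrel signal events.
EFF_SIGNAL = 0.25 * 0.020 * 0.71 * 0.84

# Branching fraction that would correspond to one signal event in the barrel.
single_event_sensitivity = branching_fraction(
    1.0, EFF_SIGNAL, N_BPLUS, EFF_BPLUS, FS_OVER_FU, BR_BPLUS
)

if __name__ == "__main__":
    print(f"single-event sensitivity (barrel): {single_event_sensitivity:.2e}")
```

Under these assumptions a single barrel signal event corresponds to a branching fraction of order 10 −9 , which is why only a handful of SM signal events are expected in this dataset.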
The branching fraction for the control decay B 0 s → J/ψφ, which was analyzed in parallel with the normalization and signal decays, has also been evaluated using Equation 1. The resulting branching fraction is in agreement with the world average [20]. Moreover, the results for the barrel and endcap channels agree within the statistical uncertainties, showing the validity of extending the f s / f u measurement from [19] to the barrel region.
Events in the signal window have several sources: (i) genuine signal decays, (ii) decays of the type B → hh′, where h and h′ are charged hadrons misidentified as muons (referred to as "peaking" background), (iii) rare semileptonic decays B → hµν, where h is misidentified as a muon, and (iv) combinatorial background. Note that events from the third category predominantly populate the lower sideband.
The expected numbers of signal events N exp signal for the barrel and endcap channels are shown in Table 2. They are calculated assuming the SM branching fractions [1] and are normalized to the measured B + yield.
The expected numbers of rare semileptonic decays and peaking background events, N exp peak , are also shown in Table 2. They are evaluated from a MC simulation, which is normalized to the measured B + yields, and from muon misidentification rates measured in D * + → D 0 π + , D 0 → K − π + , and Λ → pπ − samples [17]. The average misidentification probabilities in the kinematic range of this analysis are (0.10 ± 0.02)% for pions and kaons, and (0.05 ± 0.01)% for protons, where the uncertainties are statistical. The systematic uncertainty on the background includes the uncertainties on the production ratio (for B 0 s and Λ b decays), the branching fraction, and the misidentification probability.
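The role of the misidentification probabilities can be illustrated with a short calculation. The sketch below assumes that the misidentification of each hadron is independent, so a two-body hadronic decay needs two misidentifications while a semileptonic decay needs only one. The function name is hypothetical, and kinematic dependence of the rates is ignored.

```python
# Average misidentification probabilities quoted in the text.
MISID_PION_KAON = 0.0010  # 0.10% for pions and kaons
MISID_PROTON = 0.0005     # 0.05% for protons


def fake_dimuon_probability(*misid_probs):
    """Probability that every listed hadron is misidentified, assuming independence."""
    prob = 1.0
    for p in misid_probs:
        prob *= p
    return prob
```

For a decay with two final-state kaons or pions, fake_dimuon_probability(MISID_PION_KAON, MISID_PION_KAON) is about 10 −6 , whereas a semileptonic decay with a single proton has a probability of 5 × 10 −4 of yielding a fake dimuon candidate.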
Also shown in Table 2 are the expected numbers of combinatorial background events N exp comb . They are evaluated by interpolating into the signal window the number of events observed in the sideband regions, after subtracting the expected rare semileptonic background. The interpolation procedure assumes a flat background shape and has a systematic uncertainty of 4%, which is evaluated by varying the flight-length significance selections and by using a linear background shape with a variable slope.
Figure 6 shows the measured dimuon invariant-mass distributions. In the sidebands the observed number of events is equal to six (seven) for the barrel (endcap) channel. Six events are observed in the B 0 s → µ + µ − signal windows (two in the barrel and four in the endcap), while two events are observed in the B 0 → µ + µ − barrel channel and none in the endcap channel. As indicated by the numbers shown in Table 2, this observation is consistent with the SM expectation for signal plus background.
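The flat-background interpolation from the sidebands into a signal window can be sketched as follows. The mass ranges are those quoted in the text. The rare semileptonic contribution to be subtracted from the sidebands is not quoted here, so it is left as a parameter defaulting to zero, and the output is therefore not expected to reproduce the values in Table 2.

```python
MASS_RANGE = (4.9, 5.9)       # full dimuon mass range in GeV
BLIND_WINDOW = (5.20, 5.45)   # blinded region excluded from the sidebands
SIGNAL_WINDOWS = {"B0": (5.20, 5.30), "B0s": (5.30, 5.45)}


def interpolate_flat(n_sideband, window, n_rare_in_sidebands=0.0):
    """Scale the sideband yield to a signal window, assuming a flat mass shape."""
    sideband_width = (MASS_RANGE[1] - MASS_RANGE[0]) - (BLIND_WINDOW[1] - BLIND_WINDOW[0])
    window_width = window[1] - window[0]
    # Subtract the (assumed known) rare semileptonic yield before scaling.
    return (n_sideband - n_rare_in_sidebands) * window_width / sideband_width
```

The scale factor is simply the ratio of the window width to the total sideband width, 0.15/0.75 for the B 0 s window and 0.10/0.75 for the B 0 window.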

Summary
An analysis searching for the rare decays B 0 s → µ + µ − and B 0 → µ + µ − has been performed in pp collisions at √ s = 7 TeV. A data sample corresponding to an integrated luminosity of 5 fb −1 has been used. This result supersedes our previous measurement [9]. Stricter selection requirements were applied, resulting in a better sensitivity and a higher expected signal-to-background ratio. The observed number of events is consistent with background plus SM signals. The resulting upper limits on the branching fractions are B(B 0 s → µ + µ − ) < 7.7 × 10 −9 and B(B 0 → µ + µ − ) < 1.8 × 10 −9 at 95% CL. These upper limits can be used to improve bounds on the parameter space for a number of potential extensions to the standard model.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC machine. We thank the technical and administrative staff at CERN and other CMS institutes.