Improved determination of the sample composition of dimuon events produced in {\boldmath $p\bar{p}$} collisions at {\boldmath $\sqrt{s}=1.96$} TeV

We use a new method to estimate with 5% accuracy the contribution of pion and kaon in-flight-decays to the dimuon data set acquired with the CDF detector. Based on this improved estimate, we show that the total number and the properties of the collected dimuon events are not yet accounted for by ordinary sources of dimuons which also include the contributions, as measured in the data, of heavy flavor, $\Upsilon$, and Drell-Yan production in addition to muons mimicked by hadronic punchthrough.


Introduction
This article presents an improved determination of the composition of a dimuon sample recorded in $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV. The data sample consists of events containing two central ($|\eta| < 0.7$) primary (or trigger) muons, each with transverse momentum $p_T \geq 3$ GeV/$c$, and with invariant mass larger than 5 GeV/$c^2$ and smaller than 80 GeV/$c^2$. The sample may be dominated by real muon pairs due to semileptonic decays of heavy flavor, Drell-Yan production and $\Upsilon$ decays, but also contains events in which one or both muons are produced by hadrons that decay in flight or otherwise mimic a muon signal. Although the dimuon signature can be a powerful tool with which to search for new physics or sources of CP violation, the uncertainty of the in-flight-decay contribution makes the precise determination of the fractions of known processes a serious experimental challenge. In particular, it remains controversial whether muons originating from the decay of objects with a lifetime longer than that of heavy-flavored hadrons can be completely accounted for with ordinary sources such as in-flight-decays. Earlier and recent studies estimate the fraction of this type of event to be negligible [1][2][3]. Other studies find it significant and suppress it by selecting muons produced close to the beamline [4], but have estimated its size only with a very large uncertainty by using Monte Carlo simulations [5]. The present work is based on the same Monte Carlo simulated samples and the same analysis methods as Refs. [4,5], but we improve the method used to estimate the number of events due to in-flight-decays, achieving a 5% accuracy. Section 2 describes the CDF II detector. In Sect. 3, we review the present experimental situation. Sections 4 to 6 describe the procedure used to tune the simulation and estimate the contribution of ordinary sources to events in which muons are produced by objects with very long lifetimes. Based on these results, Sect. 7 updates the estimate of the rate of multi-muon events reported in Ref. [5]. Our conclusions are presented in Sect. 8.

CDF II detector and trigger
CDF II is a multipurpose detector, equipped with a charged particle spectrometer and a finely segmented calorimeter. In this section, we describe the detector components that are relevant to this analysis. A detailed description of these subsystems can be found in Refs. [6][7][8][9][10][11][12][13][14][15]. Two devices inside the 1.4 T solenoid are used for measuring the momentum of charged particles: the silicon vertex detector (SVXII and ISL) and the central tracking chamber (COT). The SVXII detector consists of microstrip sensors arranged in six cylindrical shells with radii between 1.5 and 10.6 cm, and with a total $z$ coverage of 90 cm. The first SVXII layer, also referred to as the L00 detector, is made of single-sided sensors mounted on the beryllium beam pipe. The remaining five SVXII layers are made of double-sided sensors and are divided into three contiguous five-layer sections along the beam direction $z$. The vertex $z$-distribution for $p\bar{p}$ collisions is approximately described by a Gaussian function with an rms of 28 cm. The transverse profile of the Tevatron beam is circular and has an rms spread of 25 μm in the horizontal and vertical directions. The SVXII single-hit resolution is approximately 11 μm and allows a track impact parameter resolution of approximately 35 μm, when also including the effect of the beam transverse size. The two additional silicon layers of the ISL help to link tracks in the COT to hits in the SVXII. The COT is a cylindrical drift chamber containing 96 sense wire layers grouped into eight alternating superlayers of axial and stereo wires. Its active volume covers $|z| \leq 155$ cm and 40 to 140 cm in radius. The transverse momentum resolution of tracks reconstructed using COT hits is $\sigma(p_T)/p_T^2 \simeq 0.0017\ [\mathrm{GeV}/c]^{-1}$. The trajectory of COT tracks is extrapolated into the SVXII detector, and tracks are refitted with additional silicon hits consistent with the track extrapolation.
The central muon detector (CMU) is located around the central electromagnetic and hadronic calorimeters, which have a thickness of 5.5 interaction lengths at normal incidence. The CMU detector covers a nominal pseudorapidity range $|\eta| \leq 0.63$ relative to the center of the detector, and is segmented into two barrels of 24 modules, each covering 15° in $\phi$. Every module is further segmented into three submodules, each covering 4.2° in $\phi$ and consisting of four layers of drift chambers. The smallest drift unit, called a stack, covers a 1.2° angle in $\phi$. Adjacent pairs of stacks are combined together into a tower. A track segment (hits in two out of four layers of a stack) detected in a tower is referred to as a CMU stub. A second set of muon drift chambers (CMP) is located behind an additional steel absorber of 3.3 interaction lengths. The chambers are 640 cm long and are arranged axially to form a box around the central detector. The CMP detector covers a nominal pseudorapidity range $|\eta| \leq 0.54$ relative to the center of the detector. Muons which produce a stub in both the CMU and CMP systems are called CMUP muons. The CMX muon detector consists of eight drift chamber layers and scintillation counters positioned behind the hadron calorimeter. The CMX detector extends the muon coverage to $|\eta| \leq 1$ relative to the center of the detector.
The luminosity is measured using gaseous Cherenkov counters (CLC) that monitor the rate of inelastic $p\bar{p}$ collisions. The inelastic $p\bar{p}$ cross section at $\sqrt{s} = 1960$ GeV is scaled from measurements at $\sqrt{s} = 1800$ GeV using the calculations in Ref. [16]. The integrated luminosity is determined with a 6% systematic uncertainty [17]. CDF uses a three-level trigger system. At Level 1 (L1), data from every beam crossing are stored in a pipeline capable of buffering data from 42 beam crossings. The L1 trigger either rejects events or copies them into one of the four Level 2 (L2) buffers. Events that pass the L1 and L2 selection criteria are sent to the Level 3 (L3) trigger, a cluster of computers running speed-optimized reconstruction code.
For this study, we select events with two muon candidates identified by the L1 and L2 triggers. The L1 trigger uses tracks with $p_T \geq 1.5$ GeV/$c$ found by a fast track processor (XFT). The XFT examines COT hits from the four axial superlayers and provides $r$-$\phi$ information in azimuthal sections of 1.25°. The XFT passes the track information to a set of extrapolation units that determine the CMU towers in which a CMU stub should be found if the track is a muon. If a stub is found, a L1 CMU primitive is generated. The L1 dimuon trigger requires at least two CMU primitives, separated by at least two CMU towers. The L2 trigger additionally requires that at least one of the muons also has a CMP stub matched to an XFT track with $p_T \geq 3$ GeV/$c$. All these trigger requirements are emulated by the detector simulation on a run-by-run basis. The L3 trigger requires a pair of CMUP muons with invariant mass larger than 5 GeV/$c^2$, and $|\delta z_0| \leq 5$ cm, where $z_0$ is the $z$ coordinate of the muon track at its point of closest approach to the beamline in the $r$-$\phi$ plane. These requirements define the dimuon trigger used in this analysis.
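The kinematic requirements quoted so far (two muons with $p_T \geq 3$ GeV/$c$ and $|\eta| < 0.7$, invariant mass between 5 and 80 GeV/$c^2$, and $|\delta z_0| \leq 5$ cm) can be summarized as a simple predicate. The following is a minimal sketch only: the `Muon` record, its field names, and both function names are hypothetical and do not correspond to CDF software, $\delta z_0$ is assumed to be the difference between the $z_0$ values of the two muons, and the CMU/CMP stub matching is not modeled.

```python
import math
from dataclasses import dataclass

MUON_MASS = 0.10566  # GeV/c^2


@dataclass
class Muon:
    """Hypothetical muon record; field names are illustrative assumptions."""
    pt: float   # transverse momentum, GeV/c
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle, rad
    z0: float   # z at closest approach to the beamline, cm


def invariant_mass(a: Muon, b: Muon) -> float:
    """Invariant mass of the muon pair in GeV/c^2."""
    def four_vector(m: Muon):
        px = m.pt * math.cos(m.phi)
        py = m.pt * math.sin(m.phi)
        pz = m.pt * math.sinh(m.eta)
        energy = math.sqrt(px * px + py * py + pz * pz + MUON_MASS ** 2)
        return energy, px, py, pz

    e1, x1, y1, z1 = four_vector(a)
    e2, x2, y2, z2 = four_vector(b)
    m2 = (e1 + e2) ** 2 - (x1 + x2) ** 2 - (y1 + y2) ** 2 - (z1 + z2) ** 2
    return math.sqrt(max(m2, 0.0))


def passes_dimuon_selection(a: Muon, b: Muon) -> bool:
    """Apply the kinematic cuts quoted in the text (stub matching not modeled)."""
    kinematics_ok = all(m.pt >= 3.0 and abs(m.eta) < 0.7 for m in (a, b))
    mass_ok = 5.0 < invariant_mass(a, b) < 80.0
    # Assumption: delta z0 is the z0 difference between the two muons.
    dz_ok = abs(a.z0 - b.z0) <= 5.0
    return kinematics_ok and mass_ok and dz_ok
```

For instance, two back-to-back central muons of 5 GeV/$c$ each have an invariant mass of about 10 GeV/$c^2$ and pass, whereas lowering one muon to 2.5 GeV/$c$ fails the $p_T$ requirement.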

Present understanding of the dimuon sample composition
The values of $\sigma_{b\to\mu,\bar{b}\to\mu}$ and $\sigma_{c\to\mu,\bar{c}\to\mu}$, the correlated cross sections for producing pairs of central heavy-flavored quarks that decay semileptonically, are derived in Ref. [4] by fitting the impact parameter distribution of the primary muons (footnote 2) with the expected shapes from all sources believed to be significant: semileptonic heavy flavor decays, prompt quarkonia decays, Drell-Yan production, and instrumental backgrounds due to punchthrough of prompt or heavy-flavored hadrons which mimic a muon signal. In the following, the sum of these processes will be referred to as the prompt plus heavy flavor (P + HF) contribution. The notation $K^{\rm puth} \to \mu$ and $\pi^{\rm puth} \to \mu$ will be used to indicate muon signals mimicked by punchthrough of kaons and pions, respectively. In order to properly model the data with the templates of the various P + HF sources, the study in Ref. [4] has used strict selection criteria, referred to as tight SVX selection in the following, by requiring muon tracks with hits in the two innermost layers of the SVX detector, and in at least two of the next four outer layers. The tight SVX requirements select events in which both muons arise from parent particles that have decayed within a distance of 1.5 cm from the $p\bar{p}$ interaction primary vertex in the plane transverse to the beamline. This requirement suppresses the yield of primary muons due to in-flight-decays of pions and kaons, in the following referred to as $\pi^{\rm ifd} \to \mu$ and $K^{\rm ifd} \to \mu$, respectively. This type of contribution to the dimuon dataset prior to any SVX requirement was considered negligible in previous [1,2] and recent [3] studies by the CDF and D0 collaborations.

Footnote 2: The impact parameter is defined as the distance of closest approach of a track to the primary event vertex in the transverse plane with respect to the beamline.
As shown by Fig. 1, the tight SVX sample is well modeled by fits using the prompt and heavy flavor contributions [4]. The sample composition determined by the fit and corrected for the appropriate efficiency of the tight SVX requirements (footnote 4) is listed in the first two columns of Table 1.

Fig. 1 The projection of the two-dimensional impact parameter distribution of muon pairs onto one of the two axes is compared to the fit result. The various P + HF components listed in the first column of Table 1 are shown separately

Table 1 Number of events attributed to the different dimuon sources by the fit to the muon impact-parameter distribution. The fit parameters BB, CC, and PP represent the $b\bar{b}$, $c\bar{c}$, and prompt dimuon contributions, respectively. The component BC represents events containing $b$ and $c$ quarks. The fit parameter BP (CP) estimates the number of events in which there is only one $b$ ($c$) quark in the detector acceptance and the second muon is produced by prompt hadrons in the recoiling jet that mimic a muon signal. Real muons are muons from semileptonic decay of heavy flavors, Drell-Yan production or quarkonia decays. The data correspond to an integrated luminosity of 742 pb$^{-1}$. The dimuon data set consists of 743,006 events

Columns: Component; No. of events; No. of real $\mu$-$\mu$; No. and type of misidentified $\mu$
The difference between the total number of dimuons and the P + HF component indicates the presence of an important source of dimuons produced beyond 1.5 cm which is suppressed by the tight SVX requirements. Because it went unnoticed by previous experiments, this source was whimsically referred to as the ghost contribution.
The relative size of the ghost and P + HF contributions depends upon the type of SVX requirement applied to the trigger muons. Reference [5] shows that neglecting the presence of ghost events affected previous measurements of $\sigma_{b\to\mu,\bar{b}\to\mu}$ [1,18] and of $\bar{\chi}$ [2] at the Tevatron. Finally, the ghost sample is shown to be the source of the dimuon invariant mass discrepancy observed in Ref. [19].
Reference [5] has studied a number of potential sources of muons originating beyond the beam pipe. Contrary to what was assumed by previous experiments, the one source found to contribute significantly arises from in-flight-decays of pions and kaons. Based upon a generic QCD simulation, that study estimates a contribution of 57,000 events. A smaller contribution (12,052 ± 466 events) from $K^0_S$ and hyperon decays in which the punchthrough of a hadronic prong mimics a muon signal was estimated using the data. Secondary inelastic interactions in the tracking volume were found to be a negligible source of ghost events. The final estimate of the size of possible sources of ghost events underpredicts the observed number by approximately a factor of two (154,000 observed and 69,000 accounted for), but the difference was not considered significant because of the simulation uncertainty.

Footnote 4: The efficiency of the tight SVX selection has been measured [5] to be 0.257 ± 0.004 for prompt dimuons and 0.237 ± 0.001 for dimuons produced by heavy flavor decays by using control samples of data from various sources (…).
The present study uses events selected with the tight SVX requirements to tune the QCD simulation. Since these data are well modeled by the impact parameter templates of the P + HF components, misidentified muons can only arise from the punchthrough of prompt hadrons or hadrons produced by heavy-flavor decays. The numbers of misidentified muons in the data are derived by subtracting the expected number of real muons, listed in the third column of Table 1, from the corresponding components in the second column. We then compare these differences to the rate of $K^{\rm puth} \to \mu$ and $\pi^{\rm puth} \to \mu$ misidentifications predicted by the simulation, and listed in the fourth column of the same table. The simulation is tuned by adjusting the predicted rate of pions and kaons to reproduce the observed number of muon misidentifications. Then, the tuned simulation is used to predict the number of muons due to in-flight-decays with 5% accuracy.

Rates of misidentified muons in the data and simulation
We make use of three different samples of simulated events generated with the HERWIG parton-shower Monte Carlo program [20,21], the settings of which are described in Appendix A of Ref. [4]. We use option 1500 of the HERWIG program to generate final states produced by hard scattering of partons with transverse momentum larger than 3 GeV/$c$ (sample A = generic QCD). Hadrons with heavy flavors are subsequently decayed using the EVTGEN Monte Carlo program [22]. The detector response to particles produced by the above generators is modeled with the CDF II detector simulation that in turn is based on the GEANT Monte Carlo program [23,24]. The values of the heavy flavor cross sections predicted by the generator are scaled to the measured values $\sigma_{b\to\mu,\bar{b}\to\mu}$ = 1549 ± 133 pb and $\sigma_{c\to\mu,\bar{c}\to\mu}$ = 624 ± 104 pb [4]. The next simulated sample (sample B = single $b$ + single $c$) is extracted from A by requiring the presence of at least one trigger muon generated from heavy-flavor semileptonic decays. The simulated sample C = $b\bar{b}$ + $c\bar{c}$ is extracted from B by requiring the presence of at least two trigger muons generated from heavy flavor decays. This sample has been used to construct impact parameter templates and estimate kinematic acceptances in Ref. [4].
In the various simulations, we evaluate the number of dimuons from heavy flavor decays and the number of pairs of tracks of different type that pass the same kinematic selection. The ratios of these numbers are listed in Tables 2, 3 and 4. The rates of pairs of tracks of different type predicted by the simulation are normalized to the data by multiplying these ratios by the number of dimuons from $b\bar{b}$ or $c\bar{c}$ production observed in the data.
The probability $P^{\rm puth}_{K(\pi)}$ that a kaon (pion) is not contained by the calorimeter and mimics a muon signal has been measured in Ref. [4] by using kaons and pions from $D^{*\pm} \to \pi^{\pm} D^0$ with $D^0 \to K^+ \pi^-$ decays. The probability that kaon (pion) in-flight-decays mimic a trigger muon, $P^{\rm ifd}_{K(\pi)}$, has been derived in Ref. [5] by using the simulated sample C. These probabilities depend on the particle transverse momentum. Table 5 lists the average probabilities that kaons (pions) mimic a primary muon when applying the $P^{\rm puth}_{K(\pi)}$ and $P^{\rm ifd}_{K(\pi)}$ probabilities to simulated kaon (pion) tracks with $p_T \geq 3$ GeV/$c$ and $|\eta| < 0.7$.

Table 3 Ratio of the numbers of $\mu - K(\pi)$ combinations to that of primary dimuons from $b\bar{b}$ and $c\bar{c}$ production in the single-$b$ and single-$c$ simulated samples (221,096 and 83,590 dimuons, respectively)

Single c 40.7 173.7
By weighting simulated pion (kaon) tracks that pass the muon kinematic selection with the corresponding $P^{\rm puth}_{K(\pi)}$ probability, we obtain the prediction of misidentified primary muons for the various P + HF components that is listed in the fourth column of Table 1. The third column of the same table lists the number of real muons for the various P + HF contributions. The sum of real plus misidentified muon pairs is in general agreement with the data listed in the second column of the table. Therefore, it is reasonable to use the observed rate of dimuons, the knowledge of the fraction of real dimuons due to semileptonic decay of heavy flavors, Drell-Yan or $\Upsilon$ mesons, and the knowledge of the $P^{\rm puth}_{K(\pi)}$ probabilities to normalize the absolute yields of pions and kaons predicted by the simulation. The simulation fitted to the data is then used to predict the rate of events due to in-flight-decay misidentifications by weighting simulated tracks with the $P^{\rm ifd}_{K(\pi)}$ probabilities, the average of which is listed in Table 5. In addition, the total rate of $K \to \mu = K^{\rm puth} \to \mu + K^{\rm ifd} \to \mu$ misidentifications predicted by the simulation can be further constrained with data. This is done in the next section by using the number of primary muons due to misidentification of $K^{*0}$, $K^{*\pm}$, and $K^0_S$ decays.

We first describe the evaluation of the content of real muons in the various P + HF components and the function used to fit the simulation to the data. Reference [4] estimates that the fraction $R_{b\bar{b}}$ = 0.96 ± 0.04 of the BB component is due to real muons from $b$-quark semileptonic decays whereas the remaining 4% is due to muons mimicked by the punchthrough of hadrons produced by heavy flavor decays. Similarly, the fraction $R_{c\bar{c}}$ = 0.81 ± 0.09 of the CC component is due to real muons from $c$-quark semileptonic decays whereas the remaining 19% is due to muons mimicked by the punchthrough of hadrons produced by heavy flavor decays. The uncertainty of the fraction of real muons due to $b\bar{b}$ ($c\bar{c}$) production is accounted for by multiplying $R_{b\bar{b}}$ ($R_{c\bar{c}}$) by the fit parameter $f_{bb}$ ($f_{cc}$) constrained to 1 with a 4% (11%) Gaussian error. The number of $\Upsilon$ mesons contributing to the PP component ($\Upsilon$ = 51,680 ± 649 candidates) has been determined in Ref. [4] by fitting the dimuon invariant mass spectrum with three Gaussian functions to model the signal and a straight line to model the combinatorial background. The Drell-Yan contribution is evaluated as $DY = \Upsilon \times \sigma_{DY}/\sigma_{\Upsilon}$. The cross section $\sigma_{DY}$ in the 5-80 GeV/$c^2$ mass range is evaluated with a NLO calculation [25,26], and we use the measured value of $\sigma_{\Upsilon}$ [27]. The ratio $\sigma_{DY}/\sigma_{\Upsilon}$ is 1.05 with a 10% error mostly due to the measurement in Ref. [27]. To account for the uncertainty, we weight the DY contribution with the fit parameter $f_{dy}$ constrained to 1 with a 10% Gaussian error.
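As a quick numerical illustration of the Drell-Yan normalization described above, the sketch below evaluates $DY = \Upsilon \times \sigma_{DY}/\sigma_{\Upsilon}$ with the quoted inputs. The function and variable names are illustrative assumptions; only the numbers (51,680 candidates, a ratio of 1.05, a 10% error) come from the text.

```python
def drell_yan_estimate(n_upsilon: float, xsec_ratio: float, f_dy: float = 1.0) -> float:
    """Drell-Yan yield, DY = f_dy * Upsilon * sigma_DY / sigma_Upsilon.

    f_dy is the fit parameter constrained to 1; it is kept at its nominal
    value here, since the fit itself is not reproduced.
    """
    return f_dy * n_upsilon * xsec_ratio


n_upsilon = 51680        # Upsilon candidates quoted in the text
ratio = 1.05             # sigma_DY / sigma_Upsilon quoted in the text
dy_nominal = drell_yan_estimate(n_upsilon, ratio)
dy_error = 0.10 * dy_nominal   # 10% uncertainty carried by f_dy

print(f"DY = {dy_nominal:.0f} +/- {dy_error:.0f} events")
```

With the nominal value of $f_{dy}$ this gives about 54,264 Drell-Yan events, with an uncertainty of roughly 5,400 events from the 10% error on the cross section ratio.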
The magnitude of the BP (CP) component, predicted with the single-$b$ (single-$c$) simulation, with respect to that of the BB (CC) contribution depends on the ratio of NLO to LO terms evaluated by the HERWIG generator. Because of the dependence on the renormalization and factorization scales, the uncertainty of the single-$b$ ($c$) cross section relative to that of the $b\bar{b}$ ($c\bar{c}$) cross section is estimated [28] to be 20 (30)% (footnote 5). We account for this uncertainty by weighting the rate of pion and kaon tracks predicted by the single-$b$ (single-$c$) simulation with the additional fit parameter $f_{sb}$ ($f_{sc}$) constrained to 1 with a 20% (30%) Gaussian error.
The simulation prediction of the number of muons mimicked by the punchthrough of pions (kaons) is weighted with the free fit parameter $f_{\pi}$ ($f_K$). These fit parameters provide the absolute normalization of the pion (kaon) rate predicted by the simulation including the uncertainties of the punchthrough probabilities.
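The Gaussian constraints on the fit parameters introduced in this section can be pictured as penalty terms added to a $\chi^2$. The sketch below shows only these penalty terms and assumes the usual form $((f - 1)/\sigma_f)^2$; it is not the $\chi^2$ function used in the analysis. The dictionary keys and the function name are illustrative, while the widths are those quoted in the text. The free parameters $f_{\pi}$ and $f_K$ carry no penalty and are therefore absent.

```python
# Gaussian widths of the constrained fit parameters, as quoted in the text.
CONSTRAINTS = {
    "f_bb": 0.04,  # real-muon fraction of the BB component
    "f_cc": 0.11,  # real-muon fraction of the CC component
    "f_dy": 0.10,  # Drell-Yan to Upsilon cross section ratio
    "f_sb": 0.20,  # single-b relative to b-bbar cross section
    "f_sc": 0.30,  # single-c relative to c-cbar cross section
}


def constraint_penalty(params: dict) -> float:
    """Sum of ((f - 1) / sigma)^2 over the constrained parameters.

    Assumed form of the constraint terms; the data terms of the chi^2
    are not modeled here.
    """
    return sum(((params[name] - 1.0) / sigma) ** 2
               for name, sigma in CONSTRAINTS.items())


nominal = {name: 1.0 for name in CONSTRAINTS}
shifted = dict(nominal, f_sb=1.2)  # one-sigma shift of f_sb
```

A parameter sitting at its nominal value contributes nothing, while a shift of one quoted width (for example $f_{sb}$ = 1.2) adds one unit to the penalty.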

Measurement of the K → μ contribution
The small rate of $K \to \mu = K^{\rm puth} \to \mu + K^{\rm ifd} \to \mu$ misidentifications is measured using a higher statistics sample of dimuon events corresponding to an integrated luminosity of 3.9 fb$^{-1}$. The number of $K \to \mu$ misidentifications, $N_K$, is derived from $N_{K^{*0}}$, the number of identified $K^{*0} \to K^+ \pi^-$ decays with $K^+ \to \mu^+$ (and charge-conjugate states). The number $N_{K^{*0}}$ is related to $N_K$ by

$$N_{K^{*0}} = N_K \times R(K^{*0}) \times \epsilon_0 ,$$

where $R(K^{*0})$ is the fraction of kaons that result from $K^{*0} \to K^+ \pi^-$ decays and $\epsilon_0$ is the efficiency to reconstruct the pion.

We also select $K^0_S \to \pi^+ \pi^-$ with $\pi \to \mu$ candidates and reconstruct $K^{*\pm} \to K^0_S \pi^{\pm}$ decays. The number of $K^{*\pm}$ is related to that of $K^0_S$ by

$$N_{K^{*\pm}} = N_{K^0_S} \times R(K^{*\pm}) \times \epsilon_1 ,$$

where $R(K^{*\pm})$ is the fraction of $K^0_S$ resulting from $K^{*\pm} \to K^0_S \pi^{\pm}$ decays and $\epsilon_1$ is the efficiency to reconstruct the additional pion. We use isospin invariance to set $R(K^{*\pm}) = R(K^{*0})$. Since the additional pion used to search for the $K^{*\pm}$ and $K^{*0}$ candidates is selected with the same kinematic requirements, we set $\epsilon_0 = \epsilon_1$. It follows that

$$N_K = \frac{N_{K^0_S}}{N_{K^{*\pm}}} \times N_{K^{*0}} .$$

We search for $K^{*0}$ decays by combining primary muons, assumed to be kaons, with all opposite charge tracks, assumed to be pions, with $p_T \geq 0.5$ GeV/$c$ and in an angular cone with $\cos\theta \geq 0.6$ around the direction of the primary muon. We require tracks with at least 10 axial and 10 stereo COT hits. We constrain the pair to arise from a common three-dimensional point, and reject combinations if the probability of the vertex-constrained fit is smaller than 0.001. The invariant mass spectrum of the selected $K^{*0} \to K^+ \pi^-$ candidates is shown in Fig. 2. We fit the invariant mass distribution with a Breit-Wigner function smeared by the detector mass resolution to model the signal. We fix the mass and width of the Breit-Wigner function to 896 and 51 MeV/$c^2$ [30], respectively. We use a fourth order polynomial to model the combinatorial background under the signal, and the fitted range of invariant mass is conveniently chosen to yield a fit with 50% probability. The size of the signal is not affected by the arbitrary choice of the function used to model the combinatorial background or of the fitted mass range, and is solely determined by the accurate knowledge of the signal shape. The fit yields $N_{K^{*0}}$ = 87,471 ± 2217 $K^{*0}$ mesons.

Fig. 2 Invariant mass distribution of $K^{*0} \to K^+ \pi^-$ candidates passing our selection criteria. The line represents the fit described in the text

Footnote 5: However, the study in Ref. [29] shows that the HERWIG generator predicts the observed single and correlated heavy-flavor cross sections to better than 10%.
We search for $K^0_S \to \pi^+ \pi^-$ with a $\pi \to \mu$ misidentification by combining primary muons with tracks passing the same requirements as those used in the $K^{*0}$ search. In this case, both tracks are assumed to be pions. As in the previous case, we select pairs consistent with arising from a common three-dimensional vertex. We take advantage of the long $K^0_S$ lifetime to suppress the combinatorial background. We further require that the distance between the $K^0_S$ vertex and the event primary vertex, corrected by the $K^0_S$ Lorentz boost, corresponds to $ct > 0.1$ cm. The invariant mass of the $K^0_S$ candidates is shown in Fig. 3.
We fit the signal with two Gaussian functions and the combinatorial background with a straight line in the mass range 0.4-0.6 GeV/$c^2$. Having fixed the peak of the Gaussian functions at 0.497 GeV/$c^2$ [30], the fit returns an averaged $\sigma$ of 8.4 MeV/$c^2$, consistent with what is expected from simulated events (footnote 7), and a signal of 32,445 ± 421 $K^0_S$ mesons in the mass range 0.474-0.522 GeV/$c^2$.
We search for $K^{*\pm}$ by combining $K^0_S$ candidates with mass between 0.474 and 0.522 GeV/$c^2$ and $ct > 0.1$ cm with any additional track, assumed to be a pion, that passes the same selection as pion tracks used to find $K^{*0}$ candidates. We constrain the $K^0_S$ mass to 0.497 GeV/$c^2$ and require that the $K^0_S$ candidate and the pion track are consistent with arising from a common three-dimensional vertex. The invariant mass distribution of $K^{*\pm}$ candidates is shown in Fig. 4.
We fit the invariant mass distribution with a Breit-Wigner function to model the signal and a fourth order polynomial to model the combinatorial background. We fix the mass and width of the Breit-Wigner function to 892 and 51 MeV/$c^2$ [30], respectively (footnote 8). The fit returns a signal of 3326 ± 246 $K^{*\pm}$ mesons.
The signals obtained by analyzing the 3.9 fb$^{-1}$ sample are rescaled to estimate the number of $K \to \mu$ misidentifications present in the 742 pb$^{-1}$ dataset. After rescaling, we obtain $N_K = N_{K^0_S}/N_{K^{*\pm}} \times N_{K^{*0}}$ = 164,769 ± 13,067.
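The propagation of the three fitted yields into $N_K$ can be sketched as follows. This is an illustration under stated assumptions: the three yields are treated as uncorrelated, the function name is hypothetical, and the luminosity rescaling factor is left as a free argument because its exact value is not quoted in the text. The default of 1.0 therefore gives the unscaled 3.9 fb$^{-1}$ estimate.

```python
import math


def kaon_misid_estimate(n_k0s, n_kstar_pm, n_kstar0, lumi_scale=1.0):
    """N_K = lumi_scale * N_K0S / N_K*+- * N_K*0 with uncorrelated errors.

    Each yield is a (value, error) pair. Returns (value, error, relative_error).
    """
    (v_k0s, e_k0s), (v_pm, e_pm), (v_0, e_0) = n_k0s, n_kstar_pm, n_kstar0
    value = lumi_scale * v_k0s / v_pm * v_0
    # Relative errors of a product/quotient add in quadrature.
    rel = math.sqrt((e_k0s / v_k0s) ** 2 + (e_pm / v_pm) ** 2 + (e_0 / v_0) ** 2)
    return value, rel * value, rel


# Fitted yields quoted in the text for the 3.9 fb^-1 sample.
value, error, rel = kaon_misid_estimate((32445, 421), (3326, 246), (87471, 2217))
print(f"unscaled N_K = {value:.0f}, relative uncertainty = {100 * rel:.1f}%")
```

The relative uncertainty obtained this way is about 7.9%, dominated by the $K^{*\pm}$ yield, and is consistent with the quoted 13,067/164,769.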
This number is used to constrain the total number $N^{\rm sim}_K$ of $K \to \mu$ misidentifications predicted by the simulation.

Footnote 7: Because of the long $K^0_S$ decay path, reconstructed track segments may be shorter than the available tracking detector length. When $K^0_S$ mesons decay before entering the COT volume, the mass resolution is 4 MeV/$c^2$.

Footnote 8: In simulated events, when constraining the $K^0_S$ mass to the PDG value, the mass-constrained $K^0_S$ momentum is measured as accurately as that of a track corresponding to a $K \to \mu$ decay. The resulting $K^{*\pm}$ mass resolution is approximately 5 MeV/$c^2$.

Fig. 3 Invariant mass distribution of $K^0_S$ candidates passing our selection criteria. The line represents the fit described in the text

Fig. 4 Invariant mass distribution of $K^{*\pm} \to K^0_S \pi^{\pm}$ candidates passing our selection criteria. The line represents the fit described in the text

Fit of the simulation prediction to the data
We fit the simulation prediction with a $\chi^2$-minimization method [31]. The $\chi^2$ function is defined as [...]. In addition, the sum $\sum_{i=1}^{5} D[i]$ is constrained to the observed number of P + HF − BC events within its error. The fit results are shown in Tables 6, 7 and 8. In Table 6, the fit parameters that tune the various cross sections predicted by the HERWIG generator are very close to their nominal values, indicating that the default simulation provides a quite accurate modeling of the data.
The fit returns 163,501 $K \to \mu$ candidates (164,769 ± 13,067 are measured in the data), 51% of which are due to punchthrough and 49% to in-flight-decays. We verify this result by measuring the fraction of $K \to \mu$ decays in identified $K^{*0} \to K^+ \pi^-$ decays that pass the tight SVX requirements. The efficiencies of the tight SVX requirement applied to primary muons are 0.356 ± 0.002 for muons due to punchthrough of prompt and heavy-flavored hadrons and 0.166 ± 0.005 for muons arising from in-flight-decays (footnote 9). Based on the kaon composition returned by the fit, we estimate the efficiency of the tight SVX requirement applied to $K \to \mu$ misidentifications to be 0.263 ± 0.008, where the error includes the uncertainty of the efficiencies and that of the kaon composition returned by the fit. Figure 5 shows the invariant mass distribution of $K^{*0}$ candidates after applying the tight SVX requirement. We fit the invariant mass distribution with the same function used to fit the $K^{*0}$ mass distribution in Fig. 2. The fit returns 22,689 ± 985 $K^{*0}$ candidates, to be compared with 87,741 ± 2219 $K^{*0}$ candidates before applying the tight SVX requirement. The resulting efficiency of the tight SVX requirement is 0.253 ± 0.013, in agreement with what is expected (0.263 ± 0.008) using the composition of the kaon sample returned by the fit.

Fig. 5 Invariant mass distribution of $K^{*0}$ candidates in which the $K \to \mu$ misidentification passes the tight SVX requirement. The line represents the fit described in the text

Footnote 9: In Ref. [5], which uses 0.74 fb$^{-1}$ of data, these efficiencies have been measured to be 0.45 and 0.21, respectively. In 3.9 fb$^{-1}$ of data, by using $\Upsilon$ candidates, we measure a smaller efficiency of the tight SVX selection. The efficiency loss comes from periods of data taking in which the pedestals of the L00 channels were miscalibrated.
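The expected tight SVX efficiency for $K \to \mu$ misidentifications follows from weighting the two per-source efficiencies with the kaon composition returned by the fit. The sketch below reproduces only the central value; the function name is illustrative and the uncertainty propagation is not attempted.

```python
def mixed_efficiency(frac_puth: float, eff_puth: float, eff_ifd: float) -> float:
    """Efficiency of a mixture of punchthrough and in-flight-decay muons."""
    return frac_puth * eff_puth + (1.0 - frac_puth) * eff_ifd


# 51% punchthrough and 49% in-flight-decays, with the efficiencies quoted in the text.
expected_eff = mixed_efficiency(0.51, 0.356, 0.166)
print(f"expected tight SVX efficiency = {expected_eff:.3f}")
```

The result, 0.263, matches the expected efficiency quoted above.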
Continuing with the analysis of the results returned by the fit, the total number of $\pi \to \mu$ misidentifications is 240,915, 64% of which are due to punchthrough and 36% to in-flight-decays. The fractional composition of the $K \to \mu$ and $\pi \to \mu$ misidentifications is summarized in Table 9.
The total fraction of misidentified muons in the dataset is 27%. The number of misidentified muons due to in-flight-decays of pions and kaons (ghost events) is 113,613 ± 5332. Since the number of muons from in-flight-decays is derived from that of muons mimicked by hadron punchthrough using the fake probabilities listed in Table 5, the uncertainty of these probabilities yields an additional error of 3845 events.
After adding the 12,052 ± 466 events from $K^0_S$ and hyperon decays, we predict 125,665 ± 5351 ghost events, whereas the dimuon dataset contains 153,991 ± 5074 events of this type. The number of unaccounted events (28,326 ± 7374) is (12.8 ± 3.2)% of the $b\bar{b}$ production and (18.3 ± 4.7)% of the ghost sample.
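The bookkeeping behind the unaccounted ghost events is a subtraction with errors combined in quadrature. The sketch below uses the quoted numbers and assumes the observed and predicted yields are uncorrelated; the helper name is illustrative.

```python
import math


def add_in_quadrature(*errors: float) -> float:
    """Combine uncorrelated uncertainties."""
    return math.sqrt(sum(e * e for e in errors))


predicted = 113613 + 12052        # in-flight-decays plus K0S and hyperon decays
predicted_err = 5351              # uncertainty quoted in the text
observed, observed_err = 153991, 5074

unaccounted = observed - predicted
unaccounted_err = add_in_quadrature(observed_err, predicted_err)
ghost_fraction = unaccounted / observed

print(f"unaccounted = {unaccounted} +/- {unaccounted_err:.0f} "
      f"({100 * ghost_fraction:.1f}% of the ghost sample)")
```

This reproduces the 28,326 ± 7374 unaccounted events, about 18% of the ghost sample.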

Revised estimate of the rate of additional real muons in ghost events
As a cross-check of its $b\bar{b}$ content, Ref. [5] has investigated the rate of sequential semileptonic decays of single $b$ quarks in the dimuon sample. We provide here a summary of that study and its conclusions. That study searches for additional muons with $p_T \geq 2$ GeV/$c$ and $|\eta| \leq 1.1$ in a dimuon sample corresponding to an integrated luminosity of 1426 pb$^{-1}$. The sample of 1,426,571 events contains 1,131,090 ± 9271 P + HF events in which both muons originate inside the beam pipe and 295,481 ± 9271 ghost events in which at least one muon is produced outside. The study selects pairs of primary and additional muons with opposite charge (OS) and invariant mass smaller than 5 GeV/$c^2$.
In the case of Drell-Yan or quarkonia production, which was not simulated, the rate of same-charge pairs (SS) is a measure of the fake muon contribution since misidentified muons arise from the underlying event, which has no charge correlation with primary muons. The rate of additional muons mimicked by hadronic punchthrough is also estimated with a probability per track derived by using kaons and pions from $D^{*\pm} \to \pi^{\pm} D^0$ with $D^0 \to K^+ \pi^-$ decays. This misidentification probability is approximately ten times larger than that for primary muons, which have to penetrate twice as many interaction lengths. The punchthrough probabilities for pions and kaons differ by a factor of two. In addition, in simulated events due to heavy flavor production, the pion to kaon ratio depends on the invariant mass and the charge of the muon-hadron pairs. Therefore, for P + HF events, the rate of OS − SS pairs is compared to that predicted by the heavy flavor simulation in which pions and kaons are weighted with the corresponding probabilities of mimicking a muon signal. In P + HF events, the number of sequential semileptonic decay candidates (29,262 ± 850) is correctly modeled by the rate of sequential decays of single $b$ quarks predicted by the simulation (29,190 ± 1236). This number is 2.5% of the P + HF total contribution and (6.9 ± 0.4)% of the $b\bar{b}$ contribution (424,506 ± 18,454 events).
In the remaining 295,481 ± 9271 ghost events, the number of additional muons in an angular cone with $\cos\theta \geq 0.8$ around a primary muon is 49,142 ± 519. In the absence of a simulation of ghost events, that study assumes that tracks in ghost events are a 50-50% mixture of pions and kaons, and estimates the number of misidentified additional muons to be 20,902 ± 284. The resulting number of unaccounted ghost events with three or more muons is 27,970 ± 538, (9.5 ± 0.4)% of the ghost events.
As shown by Table 9, one half of the ghost events arise from heavy flavor production acquired with a misidentified muon. This type of event should contain an appreciable fraction of additional muons due to semileptonic decays of heavy quarks. This contribution was not included in the estimate of Ref. [5].
We estimate this contribution using $K^0_S$ and $K^{*0}$ candidates due to $\pi \to \mu$ and $K \to \mu$ misidentification, respectively. As shown in Sect. 5, there are 32,445 ± 421 and 87,471 ± 2217 candidates, respectively. We measure the fraction of these candidates in which at least one of the primary muons is accompanied by an additional muon in a $\cos\theta \geq 0.8$ angular cone around its direction. We also estimate the contribution of fake additional muons by weighting all hadronic tracks that pass the additional muon selection criteria, assumed to be a 50-50% mixture of pions and kaons, with the corresponding misidentification probabilities [5]. Figures 6 and 7 show the invariant mass distributions of $K^0_S$ and $K^{*0}$ candidates, respectively, when at least one primary muon is accompanied by an additional muon or a predicted misidentified muon. As previously done, we fit the $K^0_S$ distributions with two Gaussian functions to model the signal and a straight line to model the background. The $K^{*0}$ distribution is fitted with a Breit-Wigner function plus a fourth-order polynomial.
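The two fit models described above can be written down explicitly. The sketch below is only an illustration of the functional forms: the parameter names are hypothetical, the two Gaussians are given independent means, and a non-relativistic Breit-Wigner (Lorentzian) shape is assumed, since the text does not specify which form was used.

```python
import math

def gaussian(x, mean, sigma):
    """Unit-area Gaussian."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def k0s_model(x, n1, mean1, sigma1, n2, mean2, sigma2, a0, a1):
    """Two Gaussians for the signal plus a straight line for the background."""
    signal = n1 * gaussian(x, mean1, sigma1) + n2 * gaussian(x, mean2, sigma2)
    return signal + a0 + a1 * x

def breit_wigner(x, mass, width):
    """Unit-area non-relativistic Breit-Wigner (an assumed form)."""
    return (width / (2.0 * math.pi)) / ((x - mass) ** 2 + (width / 2.0) ** 2)

def kstar_model(x, n, mass, width, coeffs):
    """Breit-Wigner signal plus a fourth-order polynomial background.

    coeffs holds the five polynomial coefficients c0..c4.
    """
    if len(coeffs) != 5:
        raise ValueError("a fourth-order polynomial needs five coefficients")
    background = sum(c * x ** i for i, c in enumerate(coeffs))
    return n * breit_wigner(x, mass, width) + background

# Illustrative evaluation with made-up parameter values (masses in GeV/c^2)
print(k0s_model(0.4976, 100.0, 0.4976, 0.004, 50.0, 0.4976, 0.010, 5.0, 0.0))
print(kstar_model(0.896, 100.0, 0.896, 0.05, [1.0, 0.0, 0.0, 0.0, 0.0]))
```

In an actual fit, the normalizations of the signal terms (`n1 + n2` and `n` here) would play the role of the fitted yields.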
The fits return 4572 ± 91 and 1954 ± 109 events in which a $K^0_S$ meson is accompanied by an additional muon and by a fake muon, respectively. The fits return 10,176 ± 739 and 5230 ± 493 events in which a $K^{*0}$ meson is accompanied by an additional muon and by a fake muon, respectively.
As shown in Fig. 8 for events triggered by $K^0_S$ misidentifications, the additional muon is sometimes contributed by the second prong of the $K^0_S$ decay. Figure 8 shows the invariant mass distribution of primary and additional muons that pass the analysis selection. The usual fit yields 403 ± 33 events in which the additional muon is mimicked by the second leg of the $K^0_S$ decay that also produced the primary muon. We remove this contribution to evaluate the fraction of real muons accompanying $K \to \mu$ misidentifications. We will add it for the fraction of events triggered by misidentified $K^0_S$ decays. After removing the predicted numbers of fake muons, the fraction of events acquired with an identified $K^0_S$ meson that contain additional real muons is (6.83 ± 0.45)%. It is (5.6 ± 1.05)% for events acquired with an identified $K^{*0}$ meson. The average of the two fractions is (6.6 ± 0.4)%. We multiply this fraction by the number of ghost events due to ordinary sources (241,507 ± 10,284) to predict the number of real muons in events due to heavy flavor that are classified as ghost events because one of the primary muons was produced by a pion or kaon in-flight-decay. This procedure yields a slight overestimate because, according to the simulation tuned with the data, (53.4 ± 0.4)% of the $K \to \mu$ misidentifications are due to events with heavy flavors, whereas (51.3 ± 0.6)% of the ghost events arise from heavy flavor production. As shown by Table 10, the improved estimate still does not account for 12,169 ± 1319 ghost events with additional real muons.

Fig. 8 Invariant mass distribution of $K^0_S$ candidates reconstructed using primary and additional muons. The line represents the fit described in the text.
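The fractions quoted above can be approximately reproduced from the fit yields. The sketch below makes several assumptions that are not stated in the text: the yields are treated as uncorrelated, the uncertainty on the number of candidates is neglected, and the two fractions are combined with an inverse-variance weighted average. With these choices the $K^0_S$ fraction matches the quoted value, the $K^{*0}$ fraction comes out near 5.65% (close to the quoted 5.6%), and the combination of the quoted fractions gives 6.6 ± 0.4.

```python
import math

def quad(*errors):
    """Sum uncertainties in quadrature (assumes they are uncorrelated)."""
    return math.sqrt(sum(e * e for e in errors))

def weighted_average(values, errors):
    """Inverse-variance weighted mean and its uncertainty."""
    weights = [1.0 / (e * e) for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))

# K0S: additional muons minus predicted fakes minus second-leg contribution
k0s_candidates = 32445
k0s_real = 4572 - 1954 - 403
k0s_real_err = quad(91, 109, 33)
f_k0s = 100.0 * k0s_real / k0s_candidates          # in percent
f_k0s_err = 100.0 * k0s_real_err / k0s_candidates  # numerator uncertainty only

# K*0: additional muons minus predicted fakes
kstar_candidates = 87471
kstar_real = 10176 - 5230
kstar_real_err = quad(739, 493)
f_kstar = 100.0 * kstar_real / kstar_candidates
f_kstar_err = 100.0 * kstar_real_err / kstar_candidates

# Combine the fractions as quoted in the text
avg, avg_err = weighted_average([6.83, 5.6], [0.45, 1.05])

print(f"K0S fraction: ({f_k0s:.2f} +/- {f_k0s_err:.2f})%")
print(f"K*0 fraction: ({f_kstar:.2f} +/- {f_kstar_err:.2f})%")
print(f"weighted average: ({avg:.1f} +/- {avg_err:.1f})%")
```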

Conclusions
This article reports an improved understanding of the dimuon samples acquired by the CDF experiment. One dataset, corresponding to an integrated luminosity of 742 pb$^{-1}$, consists of 743,006 events containing two central ($|\eta| < 0.7$) primary (or trigger) muons, each with transverse momentum $p_T \geq 3$ GeV/$c$, and with invariant mass larger than 5 GeV/$c^2$ and smaller than 80 GeV/$c^2$. These data are split into two subsets: one, referred to as P + HF, consisting of 589,015 ± 5074 events in which both muons originate inside the beam pipe of radius 1.5 cm; and one, referred to as ghost, consisting of 153,991 ± 5074 events in which at least one muon originates beyond the beam pipe. The study in Ref. [4] shows that the number and properties of P + HF events are correctly modeled by the expected contributions of semileptonic heavy flavor decays, prompt quarkonia decays, Drell-Yan production, and instrumental backgrounds due to punchthrough of prompt or heavy-flavored hadrons which mimic a muon signal. A previous study [5] has investigated significant sources of ghost events, such as in-flight-decays of pions and kaons and hyperon decays. That study could account for approximately half of the ghost events but was unable to assess the uncertainty of the in-flight-decay prediction. The present study shows that the HERWIG parton-shower generator provides an accurate model of the data. The large discrepancy in the previous study was generated by not including the contribution of final states in which a b(c) hadron decays semileptonically and the second muon is produced by the in-flight-decay of a particle in the recoiling jet. After tuning by a few percent the pion and kaon rates predicted by the simulation with a fit to the data, we show that ordinary sources, mostly in-flight-decays, account for 125,665 ± 5351 of the 153,991 ± 5074 ghost events isolated in the sample of 743,006 dimuons.
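The event bookkeeping in this summary can be checked with a few lines of arithmetic. This is a minimal sketch using only the central values quoted above; uncertainties are omitted because the correlation between the quoted errors is not specified in the text.

```python
# Central values quoted in the text
n_dimuon = 743006          # all dimuon events
n_p_hf = 589015            # both muons originate inside the beam pipe
n_ghost = 153991           # at least one muon originates beyond the beam pipe
n_ghost_ordinary = 125665  # ghost events attributed to ordinary sources

# The two subsets must add up to the full sample
subsets_add_up = (n_p_hf + n_ghost == n_dimuon)

ghost_fraction = n_ghost / n_dimuon
accounted_fraction = n_ghost_ordinary / n_ghost
n_unaccounted = n_ghost - n_ghost_ordinary

print(f"subsets add up: {subsets_add_up}")
print(f"ghost fraction of the sample: {100 * ghost_fraction:.1f}%")
print(f"ghost events accounted for: {100 * accounted_fraction:.1f}%")
print(f"ghost events not accounted for: {n_unaccounted}")
```

With these inputs, ordinary sources cover about 82% of the ghost events, leaving 28,326 events unaccounted for.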
For comparison, a D0 study has used a similar dimuon sample to set a limit [3] of (0.4 ± 0.26 stat ± 0.53 syst)% on the fraction of muons produced at a distance larger than 1.6 cm from the beamline, including pion and kaon in-flight-decays. This appears to be in contradiction with the present result, and also with a recent estimate [32] of the fraction of $K \to \mu$ and $\pi \to \mu$ contributions in the D0 subset of same-charge dimuons ( 40%).
The present study also improves a previous estimate [5] of the content of additional muons with $p_T \geq 2$ GeV/$c$ and $|\eta| \leq 1.1$ in ghost events. We find that (23 ± 6)% of the unaccounted ghost events contain additional real muons. For comparison, the fraction of $b\bar{b}$ events that contain additional muons due to sequential semileptonic decays is (6.9 ± 0.4)%.
Both results presented in this article have implications for measurements derived from dimuon datasets without properly accounting for the presence of ghost events. As an example, the measurement of the dimuon charge asymmetry performed by the D0 experiment [32] estimates the fraction of $K \to \mu$ and $\pi \to \mu$ misidentifications with a similar method. After removing this background, the remaining muon pairs with same charge are attributed to $b\bar{b}$ production. The present study shows that, after removing this type of misidentified muons, the data set still contains an additional component that cannot be accounted for with ordinary sources. The size of this component, equally split in opposite- and same-sign pairs [5], is (12.8 ± 3.2)% of the total number of dimuons due to $b\bar{b}$ production.
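The two fractions quoted in these conclusions are close to what one obtains from yields given earlier in the text. The sketch below is one possible reading, not necessarily the procedure used in the analysis: it assumes that the unaccounted ghost events are the difference between the 295,481 ± 9271 ghost events and the 241,507 ± 10,284 attributed to ordinary sources, and that all uncertainties are uncorrelated. Under these assumptions the results are about (23 ± 6)% and (12.7 ± 3.3)%, to be compared with the quoted (23 ± 6)% and (12.8 ± 3.2)%.

```python
import math

def ratio_with_error(num, num_err, den, den_err):
    """Ratio num/den with its uncertainty, treating the inputs as uncorrelated."""
    ratio = num / den
    return ratio, ratio * math.hypot(num_err / num, den_err / den)

# Yields quoted earlier in the text
ghost, ghost_err = 295481, 9271            # ghost events
ordinary, ordinary_err = 241507, 10284     # ghost events from ordinary sources
extra_mu, extra_mu_err = 12169, 1319       # unaccounted events with real additional muons
bb_events, bb_events_err = 424506, 18454   # b-bbar contribution

# Assumed definition of the unaccounted ghost events
unaccounted = ghost - ordinary
unaccounted_err = math.hypot(ghost_err, ordinary_err)

# Fraction of unaccounted ghost events with additional real muons
f_extra, f_extra_err = ratio_with_error(extra_mu, extra_mu_err,
                                        unaccounted, unaccounted_err)

# Unaccounted ghost events relative to the b-bbar dimuon yield
f_bb, f_bb_err = ratio_with_error(unaccounted, unaccounted_err,
                                  bb_events, bb_events_err)

print(f"with additional real muons: ({100 * f_extra:.0f} +/- {100 * f_extra_err:.0f})%")
print(f"relative to b-bbar dimuons: ({100 * f_bb:.1f} +/- {100 * f_bb_err:.1f})%")
```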