Measurement of the production cross sections for a Z boson and one or more b jets in pp collisions at sqrt(s) = 7 TeV

The production of a Z boson, decaying into two leptons and produced in association with one or more b jets, is studied using proton-proton collisions delivered by the LHC at a centre-of-mass energy of 7 TeV. The data were recorded in 2011 with the CMS detector and correspond to an integrated luminosity of 5 inverse femtobarns. The Z(ll) + b-jets cross sections (where ll = mu mu or ee) are measured separately for a Z boson produced with exactly one b jet and with at least two b jets. In addition, a cross section ratio is extracted for a Z boson produced with at least one b jet, relative to a Z boson produced with at least one jet. The measured cross sections are compared to various theoretical predictions, and the data favour the predictions in the five-flavour scheme, where b quarks are assumed massless. The kinematic properties of the reconstructed particles are compared with the predictions from the MADGRAPH event generator using the PYTHIA parton shower simulation.


Introduction
Z bosons and jets originating from bottom quarks (b jets) are produced copiously in proton-proton collisions at the Large Hadron Collider (LHC). The production of a Z boson with at least one b jet in the detector acceptance, Z+b-jets production, is useful for precision tests of perturbative QCD [1][2][3]. The production of a Z boson with a single b jet, Z+1b-jet production, provides information relating to the b-quark content of the proton. The study of the production of a Z boson in association with at least two b jets, Z+2b-jets production, is of interest since it is a background in many searches for yet unobserved processes, such as the production of heavier supersymmetry-like Higgs bosons via vector boson fusion, and in studies of the standard model Higgs boson produced in association with a Z boson and decaying to b quarks [4,5].
The production of a Z boson with b jets originates in proton-proton collisions from gluon-gluon and quark-antiquark interactions, the former being the dominant contribution [3]. A smaller contribution, expected to be less than 5% based on measurements of the effective area for hard double-parton interactions [6,7], originates from multiple parton interactions (MPIs). The production cross section for a Z boson with at least one b jet has been measured previously at the LHC at √s = 7 TeV by the ATLAS [8] and CMS [9] Collaborations and by the CDF [10] and D0 [11] Collaborations at the Tevatron proton-antiproton collider, at √s = 1.96 TeV, where the dominant contribution comes from quark-antiquark interactions. The characteristics of the production of a Z boson in association with b hadrons have been studied at the LHC by the CMS Collaboration [12].
In this paper, measurements are reported of the cross sections at √s = 7 TeV for the production of a Z boson with exactly one b jet and separately for the production of a Z boson with at least two b jets. Two event categories are defined according to the b-jet multiplicity, and the yields are corrected for the respective backgrounds and efficiencies, taking into account possible migrations of events between the two categories. The cross sections are estimated at the level of stable final-state particles and are compared with predictions from MADGRAPH [13] in the five-flavour (5F) scheme, where b quarks are assumed massless, and the four-flavour (4F) scheme, where massive b quarks are used, as well as with the next-to-leading-order (NLO) predictions from aMC@NLO [14]. The inclusive Z+b-jets cross section is compared to the production of a Z boson in association with jets of any type. The resulting ratio has smaller theoretical and experimental uncertainties than the absolute cross section [15] and is used to elucidate the apparent difference between the measured Z+b-jets cross section [9] and the prediction at the parton level from the MCFM NLO generator [2].
In addition, the distributions of reconstructed kinematic observables for jets and leptons in the Z+2b-jets final state are compared to a Monte Carlo (MC) simulation using the matrix element calculations of MADGRAPH in the five-flavour scheme and using PYTHIA [16] for the simulation of the parton shower and hadronization processes. Understanding the details of the kinematics is important in the search for undiscovered particles as well as for the study of the newly discovered Higgs boson [17][18][19] in similar topologies.

CMS detector and event samples
The data used in this analysis were collected with the Compact Muon Solenoid (CMS) detector. The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter that provides a magnetic field of 3.8 T. Within the field volume are a silicon pixel and strip tracker, a crystal electromagnetic calorimeter (ECAL), and a brass/scintillator hadron calorimeter. Muons are detected in gas-ionisation detectors embedded in the steel flux return yoke of the magnet. A more detailed description of the CMS detector can be found elsewhere [20]. A right-handed coordinate system is used in CMS, with the origin at the nominal interaction point, the x axis pointing to the centre of the LHC ring and the y axis pointing up, perpendicular to the plane of the LHC ring. The polar angle θ is measured from the positive z axis, which points along the anticlockwise beam direction, and the azimuthal angle φ is measured in the x-y plane. The pseudorapidity is given by η = −ln[tan(θ/2)].
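The pseudorapidity definition above can be made concrete with a short sketch. The function names `pseudorapidity` and `polar_angle` are illustrative choices and are not taken from any CMS software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln(tan(theta/2)) for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Inverse mapping: theta = 2 * atan(exp(-eta)), in radians."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    # A particle emitted perpendicular to the beam has eta close to 0.
    print(pseudorapidity(math.pi / 2.0))
    # Polar angle (in degrees) at the lepton acceptance edge |eta| = 2.4.
    print(math.degrees(polar_angle(2.4)))
```

With this definition, the lepton acceptance edge |η| = 2.4 corresponds to a polar angle of roughly 10 degrees from the beam axis.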
The data were collected in 2011 at a proton-proton centre-of-mass energy of 7 TeV and correspond to an integrated luminosity of L = 5.05 ± 0.11 fb⁻¹ [21]. During the course of data taking, the instantaneous luminosity increased from 10³² to 3.5 × 10³³ cm⁻² s⁻¹, resulting in an average number of proton-proton interactions per bunch crossing (pileup) of 9.7 with an RMS of 4.7.
Events are selected using dimuon and dielectron triggers. The dimuon trigger transverse momentum (p_T) thresholds were increased from 7 GeV on both muons to 13 and 8 GeV on the leading and subleading muons, respectively, as the instantaneous luminosity increased during the data taking period [22]. The dielectron trigger has p_T thresholds of 17 and 8 GeV, loose identification criteria, and very loose isolation requirements [23].
In order to compare the data to the theoretical expectations, signal events and the expected backgrounds (Z+jets, tt̄, and ZZ) are generated by MC simulation and simulated within the CMS detector using GEANT4 [24]. Inclusive Z+jets and tt̄ events are simulated with MADGRAPH 5.1.1.0, using PYTHIA 6.424 with the Z2 tune [25,26] for the parton showers, hadronization, and MPIs. The CTEQ6L1 parton distribution functions (PDFs) [27] are used. The ZZ sample is simulated using PYTHIA. The Z+jets sample is also used to extract the signal efficiencies and for the comparison of kinematic distributions.
The simulated samples used for comparison with data are normalized to the cross sections expected from theory in the full acceptance. The cross section for the Z+jets sample, 3048 pb, is normalized to match the next-to-NLO prediction for inclusive Z production obtained with FEWZ [28] and the CTEQ6m PDFs [27]. NLO predictions obtained from MCFM are used for the normalization of the tt̄ sample, 157.5 pb, and the ZZ sample, 6.2 pb [29]. The simulated Z+jets sample is split into three subsamples, according to the underlying production of b jets, c jets, or jets originating only from gluons or u, d, s quarks (hereafter called light-parton jets), with no requirement on the p_T or η of the jets. These subsamples are labelled Z+b, Z+c, and Z+l, respectively.

Event reconstruction and selection
The reconstruction and selection of events with a Z boson that decays into a pair of muons or electrons, and one or more b jets, are based on the criteria used in the measurement of the inclusive Z+b-jets cross section at CMS [9]. For the identification of muons, jets, and missing transverse energy, the CMS particle-flow event reconstruction is used. This algorithm combines the information from all subdetectors to identify and reconstruct the individual particles produced in the collision [30,31].
The leptons in the analysis are required to originate from the primary vertex, which is chosen as the vertex with the largest quadratic sum of the p_T of its constituent tracks. Muons are reconstructed by combining the information from both the silicon tracker and the muon spectrometer in a global fit. Tight requirements, including particle-flow identification, are applied to the muon candidates to ensure high purity [22]. Electrons are identified by combining tracker tracks and ECAL clusters, including the ECAL deposits from bremsstrahlung [23]. An isolation variable, which is defined as the sum of the magnitudes of the transverse momenta of the particles reconstructed in a cone around the lepton candidate, ∆R = √((∆η)² + (∆φ)²) < 0.4 (0.3), relative to the transverse momentum of the lepton, is used to reject muons (electrons) that are embedded in jets. Charged particles not associated with the primary vertex are not considered in forming the isolation variable. To reduce the effect from pileup, the contribution of neutral particles is corrected by subtracting the energy deposited in the isolation cone by charged particles not associated with the primary vertex, multiplied by a factor of 0.5. This factor corresponds approximately to the ratio of neutral to charged hadron production in the hadronization process of pileup interactions [22,23]. After this correction, the isolation variable is required to be less than 20% for muons and 15% for electrons.
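The pileup-corrected isolation described above can be sketched as follows. This is an illustration under stated assumptions, not the collaboration's implementation: the function and argument names are hypothetical, the inputs are taken to be lists of transverse momenta (in GeV) already restricted to the isolation cone, and clamping the corrected neutral sum at zero is an assumption that the text does not state.

```python
def relative_isolation(lepton_pt, charged_pv_pt, neutral_pt,
                       charged_pileup_pt, pileup_factor=0.5):
    """Relative isolation with a pileup correction on the neutral component.

    charged_pv_pt:     p_T of charged particles from the primary vertex
    neutral_pt:        p_T of neutral particles
    charged_pileup_pt: p_T of charged particles NOT from the primary vertex
                       (used only to estimate the neutral pileup contribution)
    """
    charged_sum = sum(charged_pv_pt)
    neutral_corrected = sum(neutral_pt) - pileup_factor * sum(charged_pileup_pt)
    # Assumption: a negative corrected neutral sum is set to zero.
    neutral_corrected = max(0.0, neutral_corrected)
    return (charged_sum + neutral_corrected) / lepton_pt


def is_isolated(rel_iso, flavour):
    """Apply the thresholds quoted in the text: 20% (muons), 15% (electrons)."""
    thresholds = {"muon": 0.20, "electron": 0.15}
    return rel_iso < thresholds[flavour]


if __name__ == "__main__":
    iso = relative_isolation(40.0, [2.0], [3.0], [4.0])
    print(iso, is_isolated(iso, "muon"), is_isolated(iso, "electron"))
```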
Both leptons are required to have p_T > 20 GeV and pseudorapidity |η| < 2.4. Opposite charges for the leptons are required when forming pairs. In the case of multiple lepton combinations, the lepton pair with the invariant mass closest to the nominal Z-boson mass is selected as the Z candidate. The efficiency of the dilepton selection is estimated using the tag-and-probe method [32] in events with at least two leptons and a jet passing the requirements detailed below. The offline selection efficiencies are estimated from data and simulations, and data/simulation 'scale factors' are estimated to correct for the differences; trigger efficiencies are estimated from data alone. All simulated events are corrected for differences between data and simulation by applying the trigger efficiencies and the data/simulation scale factors as a function of p_T and η for each lepton.
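The choice of the Z candidate among several lepton pairs can be illustrated with a minimal sketch. The dictionary layout of a lepton, the massless-lepton approximation for the invariant mass, and the function names are assumptions made for this example; the value 91.1876 GeV is used for the nominal Z-boson mass.

```python
import math
from itertools import combinations

Z_MASS = 91.1876  # GeV, nominal Z-boson mass


def invariant_mass(l1, l2):
    """Invariant mass of two leptons, treating them as massless."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(0.0, m2))


def select_z_candidate(leptons, pt_min=20.0, eta_max=2.4):
    """Return (mass, lepton1, lepton2) for the opposite-charge pair whose
    invariant mass is closest to Z_MASS, or None if no pair qualifies."""
    best = None
    for l1, l2 in combinations(leptons, 2):
        if l1["charge"] * l2["charge"] >= 0:
            continue  # require opposite charges
        if min(l1["pt"], l2["pt"]) <= pt_min:
            continue
        if max(abs(l1["eta"]), abs(l2["eta"])) >= eta_max:
            continue
        mass = invariant_mass(l1, l2)
        if best is None or abs(mass - Z_MASS) < abs(best[0] - Z_MASS):
            best = (mass, l1, l2)
    return best


if __name__ == "__main__":
    leptons = [
        {"pt": 45.5, "eta": 0.0, "phi": 0.0, "charge": +1},
        {"pt": 45.5, "eta": 0.0, "phi": math.pi, "charge": -1},
        {"pt": 30.0, "eta": 0.0, "phi": math.pi / 2.0, "charge": -1},
    ]
    print(select_z_candidate(leptons)[0])
```

The mass-window requirement (76 to 106 GeV) is applied separately in the analysis and is deliberately left out of this helper.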
Jets are reconstructed by clustering individual particle-flow objects using the anti-k_T jet clustering algorithm [33] with a distance parameter of 0.5, as implemented in the FASTJET program [34,35]. Jets are calibrated using photon+jet, Z+jet, and dijet events to ensure a uniform energy response in p_T and η [36]. The contribution to the jet transverse energy from pileup is estimated on an event-by-event basis using the jet-area method [37] and is subtracted. The reconstructed jets are required to have p_T^j > 25 GeV and to be separated from each of the selected leptons by at least ∆R(ℓ, j) = 0.5. Furthermore, jets are required to have |η^j| < 2.1 to ensure optimal b-tagging performance. Loose identification criteria [36] are applied in order to reject jets coming from beam background, calorimeter noise, and isolated photons. Jets originating from pileup in the Z+jets sample, and thereby contributing falsely to the cross section ratio, are suppressed by requiring that the momentum carried by tracks originating from the selected primary vertex be at least 10% of the jet momentum. The remaining background caused by jets from pileup is ∼2% in the Z+jets data sample.
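The kinematic jet requirements and the lepton-jet separation can be sketched as below. The helper names and the dictionary layout are hypothetical; only the thresholds (25 GeV, 2.1, 0.5) come from the text. The azimuthal difference is wrapped into [−π, π) so that ∆R is computed correctly across the φ = ±π boundary.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def select_jets(jets, leptons, pt_min=25.0, eta_max=2.1, dr_min=0.5):
    """Keep jets passing the p_T and eta requirements that are separated
    from every selected lepton by more than dr_min."""
    selected = []
    for jet in jets:
        if jet["pt"] <= pt_min or abs(jet["eta"]) >= eta_max:
            continue
        if all(delta_r(jet["eta"], jet["phi"], lep["eta"], lep["phi"]) > dr_min
               for lep in leptons):
            selected.append(jet)
    return selected


if __name__ == "__main__":
    leptons = [{"eta": 0.0, "phi": 0.0}]
    jets = [
        {"pt": 30.0, "eta": 0.0, "phi": 0.1},  # overlaps with the lepton
        {"pt": 40.0, "eta": 1.0, "phi": 2.0},  # kept
        {"pt": 20.0, "eta": 0.5, "phi": -2.0},  # too soft
        {"pt": 50.0, "eta": 2.3, "phi": 1.0},  # outside the eta acceptance
    ]
    print(len(select_jets(jets, leptons)))
```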
Jets originating from b quarks are tagged by taking advantage of the long b-hadron lifetime. The 'Simple Secondary Vertex' (SSV) b-tagging algorithm employs a three-dimensional flight distance significance between the primary vertex and a secondary vertex in a jet. To maximise the selection efficiency of the Z+b-jets process for multiple b jets, the high-efficiency version of the SSV b-tagging algorithm is used, which considers secondary vertices built from two or more tracks. The discriminant value to define b-tagged jets is chosen such that the probability of tagging a light-parton jet (mistagging fraction) is less than 1%, with a b-tag efficiency of ∼55%. The b-tagging efficiencies and mistagging fractions are measured in the data and simulation as functions of the p_T and η of the jet using inclusive jet samples, where the tagging efficiency in the data is ∼5% smaller than the efficiency in the simulations [38]. Simulated events are corrected for this difference, taking into account the data/simulation scale factor for each b-tagged jet, depending on the generator-level flavour.
After the application of the b-tagging requirement, the sample is divided into nonoverlapping categories according to the number of b-tagged jets in the sample: the Z+1b-jet sample contains events with exactly one b-tagged jet, while the Z+2b-jets sample contains the events with at least two b-tagged jets. In order to suppress background from tt̄ production in both samples, the reconstructed dilepton invariant mass M_ℓℓ is required to have a value between 76 and 106 GeV. In Fig. 1 the dielectron invariant mass distribution shows the effectiveness of this requirement.
To further suppress the tt̄ background in the Z+2b-jets sample, the missing transverse energy (E_T^miss) is evaluated and events with a value significantly different from zero are vetoed. The E_T^miss is calculated by forming the negative vector sum of the transverse momenta of all particles in the events. The E_T^miss significance is more robust than the E_T^miss itself against pileup, and offers an event-by-event assessment of the likelihood that the observed E_T^miss is consistent with zero given the reconstructed content of the event and known measurement resolutions of the CMS detector [39]. In Fig. 2 the E_T^miss significance distribution is shown after requiring a Z candidate and two b-tagged jets. The distributions for the Z+b and tt̄ components motivate the selection of events with a reconstructed E_T^miss significance less than 10, which results in a high signal efficiency and small systematic uncertainty.
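The negative vector sum that defines the missing transverse energy can be written out explicitly. The sketch below covers only that sum; the significance used for the veto depends on the measurement resolutions [39] and is not reproduced here. The function name and the (p_T, φ) input format are assumptions for illustration.

```python
import math


def missing_transverse_energy(particles):
    """Return (magnitude, phi) of the negative vector sum of the transverse
    momenta of all particles, each given as a (pt, phi) pair."""
    px = -sum(pt * math.cos(phi) for pt, phi in particles)
    py = -sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(px, py), math.atan2(py, px)


if __name__ == "__main__":
    # Two back-to-back particles with unequal p_T leave a net imbalance
    # pointing opposite to the harder one.
    met, met_phi = missing_transverse_energy([(30.0, 0.0), (10.0, math.pi)])
    print(met, met_phi)
```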
All simulated events are corrected for the differences between data and simulation in the pileup distributions, b-tagging efficiencies, and lepton reconstruction efficiencies. The data yields as well as the predicted yields are summarised in Table 1.

Backgrounds
Events not originating from the Z+b-jets production process, but nevertheless contributing to the final reconstructed event yield after the full selection, are expected to originate from tt̄, Z+jets, and ZZ production. For the Z+1b-jet sample, the main background originates from Z bosons produced in association with non-b jets; for the Z+2b-jets sample, another sizable background originates from tt̄ production, with another nonnegligible contribution from ZZ production.
Table 1: Data yields in the selected samples and a comparison to the expectation from various sources based on MC simulations. The expected yields are estimated using the theoretical predictions for the cross sections. Uncertainties are statistical only.
The background originating from tt̄ production is estimated by means of a binned fit to the wide dilepton invariant-mass spectrum, 61 < M_ℓℓ < 121 GeV, as shown for the electron channel in Fig. 1. The shape of the invariant-mass distribution for Z+jets events is taken from Z-boson-enriched data samples, while the distribution (template) for tt̄ is based on simulation.
As a control method, two other distinct parameterizations are employed for the probability density functions of the Z+jets and tt̄ contributions: (i) Z+jets templates based on simulation together with tt̄ templates based on distributions in data samples, and (ii) an empirical parameterization. These tt̄ templates are acquired from an opposite-flavour (µ/e) dilepton sample in which the tt̄ contribution is enriched. The empirical parameterizations employ a relativistic Breit-Wigner distribution to describe the Z+jets contribution and a polynomial distribution to describe the tt̄ contribution; the parameters of both probability density functions are free to vary in the fit. As another control method, a multivariate matrix-element approach [40] is used to distinguish signal and background.
In all channels the results obtained with the various parameterizations and methods are consistent with each other and with the expectations from simulation. The fraction of events from tt̄, f_tt̄, estimated from the fit within the wide mass window is interpolated to the signal mass window (76 < M_ℓℓ < 106 GeV). The differences between the tt̄ estimates derived from alternative parameterizations are used to estimate the related systematic uncertainty.
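A two-template binned fit of this kind, and the transfer of the fitted fraction from the wide window to the signal window, can be sketched as follows. This is a simplified stand-in, not the fit used in the analysis: it scans a single fraction on a grid and maximises a binned Poisson likelihood, the templates are plain lists of bin contents, and all names and numbers are hypothetical.

```python
import math


def fit_background_fraction(data, signal_template, background_template,
                            steps=1000):
    """Grid scan of the background fraction f in [0, 1] that minimises the
    binned Poisson negative log-likelihood (constant terms dropped)."""
    n_total = sum(data)
    s_norm = sum(signal_template)
    b_norm = sum(background_template)
    best_f, best_nll = 0.0, float("inf")
    for i in range(steps + 1):
        f = i / steps
        nll = 0.0
        for n, s, b in zip(data, signal_template, background_template):
            mu = n_total * ((1.0 - f) * s / s_norm + f * b / b_norm)
            mu = max(mu, 1e-12)  # guard against log(0) in empty bins
            nll += mu - n * math.log(mu)
        if nll < best_nll:
            best_f, best_nll = f, nll
    return best_f


def fraction_in_window(f_wide, signal_template, background_template, window):
    """Convert the fraction fitted over all bins into the fraction inside
    the bins selected by the slice `window`, using the template shapes."""
    s_in = sum(signal_template[window]) / sum(signal_template)
    b_in = sum(background_template[window]) / sum(background_template)
    numerator = f_wide * b_in
    return numerator / (numerator + (1.0 - f_wide) * s_in)


if __name__ == "__main__":
    peak = [1.0, 8.0, 1.0]   # toy Z-like template, peaked in the middle bin
    flat = [1.0, 1.0, 1.0]   # toy flat template
    data = [170.0, 660.0, 170.0]
    f = fit_background_fraction(data, peak, flat)
    print(f, fraction_in_window(f, peak, flat, slice(1, 2)))
```

Because the peaking template concentrates its events in the central bins while the flat one does not, the background fraction inside the narrower window is smaller than the fraction fitted over the full range.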
The background due to mistagged c and light-parton jets is estimated from the mass distribution of the secondary vertices (M_SV) of the b-tagged jets. For the Z+1b-jet sample, exactly one jet per event is b-tagged, and hence one secondary vertex per event is reconstructed and analyzed. For the Z+2b-jets sample the distributions of the M_SV of both the leading (in p_T) and subleading b-tagged jets are used.
As described in detail in [9], templates are obtained from simulations to model the M_SV distributions for the various jet flavours; separate templates are constructed for b jets, c jets, and light-parton jets. These templates are used in maximum-likelihood fits to extract the fractions of b, c, and light-parton jets from the data for both the Z+1b-jet and the Z+2b-jets samples. In the Z+2b-jets sample the distributions of the leading and subleading jets are fitted separately. The results of fits to the one-dimensional M_SV distributions after the Z+2b-jets selection are shown in Fig. 3. The fractions of correctly tagged b jets in the Z+1b-jet and Z+2b-jets samples are estimated to be ∼55% and 80-85%, respectively. The estimated fraction of correctly tagged b jets is checked by comparing the fit results to (i) the results obtained with templates constructed from an independent MC sample, and (ii) the direct expectations from simulation, and is found to be in agreement. Effects due to gluon splitting in the modelling of the distributions have been [...]. [...], i.e. the fractions of events in the two samples that contain correctly tagged b jets; events in the Z+2b-jets sample with two correctly tagged b jets are considered as Z+2b-jets signal events, whereas events with one mistagged jet in the Z+2b-jets sample are considered for the Z+1b-jet signal yield. In order to estimate these ratios from the results of the one-dimensional fits, the various combinations in which two jets are b-tagged in the Z+2b-jets sample are studied in simulations. The systematic uncertainty related to the b purity is evaluated by varying the mistagging rates and production rates within their uncertainties. As a cross-check, a fit is performed to the two-dimensional distribution of the M_SV values for the leading and subleading b-tagged jets, and consistent results are obtained.
A small background from ZZ events is expected in the Z+2b-jets sample. This contribution (N_ZZ) is estimated from MC simulations, using the cross section and uncertainty from the CMS measurement [41] for the normalization. The yield from a SM Higgs boson with mass of 125 GeV [18,19,42] that decays into two b jets, and is produced in association with a Z boson, is expected to be approximately 20% of the ZZ contribution, i.e. 2.1 events in the Z(µµ)+2b-jets final state and 1.7 events in the Z(ee)+2b-jets final state. The resulting effect on the Z+2b-jets cross section is expected to be ∼0.6%.
The background contributions are summarized in Table 2. The backgrounds due to tt̄ and ZZ production increase when requiring two b-tagged jets, because of the relatively harder spectra of these sources of background compared to the signal. At the same time, the backgrounds due to light-parton jets decrease, since the probability of mistagging two jets is smaller. The corrected signal yield (N_sig) is obtained by subtracting the backgrounds from the number of selected events (N_rec), as given by Eq. (1). One of the quantities entering Eq. (1) is the fraction of events in the Z+2b-jets sample for which one jet is mistagged, which is (16 ± 5)%. The resulting contribution to the Z+1b-jet cross section is ∼1%.

Efficiencies and migrations
In order to extract a cross section at the particle level, the background-subtracted yields for the Z+1b-jet and the Z+2b-jets categories in Eq. (1) are corrected for the efficiencies in the selection of the dilepton pair and the b-tagged jets, as well as for the detector resolution effects. Both the application of b tagging and jet reconstruction may induce migrations between the category of events containing one b jet and that containing more than one, since the number of generated b jets and the number of correctly reconstructed b jets are, in general, not the same. In order to estimate the cross sections for the different b-jet multiplicities, the efficiency corrections (or 'unfolding') are performed as a function of the number of b jets.
Particle-level b jets are defined by matching generated jets to a b hadron within ∆R < 0.5 of the jet axis. No requirement is placed on the p_T of the hadron, and the generated jet is constructed from particle-level objects which include invisible particles. The generated jets are clustered and selected with the same criteria used for the jets reconstructed in data. Particle-level leptons are defined as 'dressed' leptons, i.e. all generator-level photons within a cone of ∆R < 0.1 around the lepton are added to it.
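The lepton dressing described above can be sketched as a sum of four-momenta. The representation of a particle as a `(px, py, pz, E)` tuple, the function names, and the choice to measure ∆R with respect to the bare lepton direction are assumptions made for this illustration.

```python
import math


def eta_phi(p):
    """Pseudorapidity and azimuth of a four-momentum (px, py, pz, E)."""
    px, py, pz, _ = p
    return math.asinh(pz / math.hypot(px, py)), math.atan2(py, px)


def delta_r(p1, p2):
    """Angular distance between two four-momenta."""
    eta1, phi1 = eta_phi(p1)
    eta2, phi2 = eta_phi(p2)
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def dress_lepton(lepton, photons, cone=0.1):
    """Add to the bare lepton the four-momenta of all photons within
    Delta R < cone of the bare lepton direction."""
    dressed = list(lepton)
    for photon in photons:
        if delta_r(lepton, photon) < cone:
            dressed = [a + b for a, b in zip(dressed, photon)]
    return tuple(dressed)


if __name__ == "__main__":
    lepton = (30.0, 0.0, 0.0, 30.0)
    photons = [
        (2.0, 0.1, 0.0, math.hypot(2.0, 0.1)),  # nearly collinear: added
        (0.0, 5.0, 0.0, 5.0),                   # at 90 degrees: ignored
    ]
    print(dress_lepton(lepton, photons))
```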
The selection efficiency is factorised into two parts: the b-tagging efficiency (E_b) and the lepton selection efficiency (E_ℓ). The correction for the detector resolution effects (E_r) is dominated by the jet energy resolution. Finally, E_m corrects for the efficiency loss associated with the selection criterion on the E_T^miss significance in the Z+2b-jets event selection.
To account for migrations between different b-jet multiplicities, a 2 × 2 matrix equation is used. Each efficiency factor is represented by a matrix (the matrices E_ℓ and E_m are diagonal). The matrices are applied in an order reflecting the order of the selection requirements.
This equation is used to obtain the cross sections for the production of a Z boson in association with exactly one b jet (σ_Z+1b) or at least two b jets (σ_Z+2b) from the numbers of reconstructed signal events in the Z+1b-jet and Z+2b-jets categories.
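The structure of such a 2 × 2 unfolding can be sketched numerically. The sketch assumes a relation of the form N_sig = L × (E_m · E_b · E_ℓ · E_r) × σ, with N_sig and σ as two-component vectors for the two b-jet multiplicity categories; the ordering of the product and every matrix element below are assumptions chosen for illustration, not values or conventions taken from the analysis.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combined_matrix(matrices):
    """Multiply the matrices in the order given (leftmost first)."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    for m in matrices:
        total = matmul2(total, m)
    return total


def fold(sigma, luminosity, matrices):
    """Predict signal yields from cross sections (forward direction)."""
    m = combined_matrix(matrices)
    return [luminosity * (m[i][0] * sigma[0] + m[i][1] * sigma[1])
            for i in range(2)]


def unfold(n_sig, luminosity, matrices):
    """Solve the 2x2 system for the cross sections given the yields."""
    m = combined_matrix(matrices)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < 1e-15:
        raise ValueError("combined efficiency matrix is singular")
    x0 = (m[1][1] * n_sig[0] - m[0][1] * n_sig[1]) / det
    x1 = (m[0][0] * n_sig[1] - m[1][0] * n_sig[0]) / det
    return [x0 / luminosity, x1 / luminosity]


if __name__ == "__main__":
    # Hypothetical matrices; the off-diagonal terms describe migrations
    # between the one-b-jet and two-b-jets categories.
    e_m = [[1.0, 0.0], [0.0, 0.98]]    # diagonal
    e_b = [[0.50, 0.40], [0.0, 0.30]]
    e_l = [[0.60, 0.0], [0.0, 0.60]]   # diagonal
    e_r = [[0.90, 0.05], [0.02, 0.85]]
    lumi = 5050.0                      # pb^-1, i.e. 5.05 fb^-1
    yields = fold([3.5, 0.4], lumi, [e_m, e_b, e_l, e_r])
    print(yields, unfold(yields, lumi, [e_m, e_b, e_l, e_r]))
```

Solving the coupled system, rather than dividing each yield by a single efficiency, is what lets events that migrate between the two categories be assigned to the correct cross section.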
The MC signal sample is used to build the matrices, with efficiencies from the simulation rescaled to match the efficiencies observed in the data. The p_T distributions for the leading (in p_T) and subleading b jets after the Z+2b-jets selection are shown in Fig. 4. The agreement between data and simulations in Figs. 2 and 4 justifies the use of this sample for the unfolding procedure. The inclusive cross section for the production of a Z boson in association with at least one b jet is the sum of the two cross sections in Eq. (2), namely, σ_Z+b ≡ σ_Z+1b + σ_Z+2b. The ratio of this cross section to the cross section for the production of a Z boson with any kind of jet is denoted σ_Z+b/Z+j. The cross sections are defined using the same acceptance for the different lepton flavours: events have leptons with p_T > 20 GeV and |η| < 2.4, a dilepton invariant mass 76 < M_ℓℓ < 106 GeV, and jets with p_T^j > 25 GeV and |η^j| < 2.1, and a separation between the leptons and the jets of ∆R(ℓ, j) > 0.5.
The terms in Eq. (2) related to the b-tagging and E_T^miss efficiencies are found to be very similar for the muon and the electron channels, as expected. For the lepton selection efficiencies, results are found to be almost identical between the two b-jet multiplicity bins, which is expected since the requirement of ∆R(ℓ, j) > 0.5 effectively renders the lepton selection insensitive to the jet multiplicity.

Systematic uncertainties
The following sources of systematic uncertainties are considered:
• Background from light-parton jets: For the estimate of the background due to jets mistagged as b jets, the main source of uncertainty arises from the fit uncertainty in the fraction of b jets in the Z+1b-jet and Z+2b-jets samples. Another source of uncertainty originates from the ambiguity when estimating the number of events containing zero, one, or two b jets in the Z+2b-jets sample. The corresponding systematic uncertainty is estimated by varying the (mis)tagging efficiencies according to their uncertainties. Studies show that no significant differences are observed when comparing the M_SV templates obtained with different MC generators, and that the template acquired from simulation correctly describes the distribution observed in data [38,43].
• Background from tt̄: The main source of uncertainty in the estimate of the tt̄ background is the statistical uncertainty from the fit. An additional uncertainty originates from the modelling of the signal and background shapes. The probability density functions used in the estimate of the tt̄ background are obtained in three distinct ways: with templates based on simulation and on data, and by modelling the contributions with an empirical parameterization. The systematic uncertainty is estimated from the differences between the three methods.
• The ZZ background: The uncertainty in the overall normalization is taken from the CMS measurement [41]. Correlated sources of uncertainties (such as the luminosity) are ignored to avoid double counting.
All background-related systematic uncertainties are listed in Table 2, and are propagated to the cross section estimate following Eq. (1). Other systematic uncertainties, estimated via Eq. (2), are:
• The b-tagging efficiency and the mistagging fraction: The uncertainties of the b-tagging efficiencies and mistagging fractions are estimated in the data as functions of the p_T and η of the jet, combining the various methods discussed in Ref. [38]. These uncertainties affect the b-tagging efficiencies as described in Section 5. The p_T-dependent uncertainties in the jet tagging efficiency, 3-8% for p_T > 30 GeV and 12% for p_T < 30 GeV, are propagated to the b-tagging data/simulation scale factors, by varying these according to the corresponding uncertainties with the flavour of each jet. The uncertainty in the mistagging fraction, which enters the calculation of the event weight at second order, is found to have a negligible impact.
• Jet energy scale (JES) and resolution (JER): The jet energy calibration is based on MC simulations, while residual corrections are used to account for the small differences between data and simulation. The JES uncertainty is taken from Ref. [36] and amounts to 3-5% depending on the p_T and η of the jets. The JER uncertainty is taken to be 10%, after degrading the simulated resolution by 10% to match that measured in the data. Both affect E_r. Studies of simulated samples show that these JES corrections are also adequate for jets from bottom quarks.
• Effect from pileup: The total inelastic cross section used to infer the pileup in data from the instantaneous luminosity is varied by ±5%, thereby affecting the pileup distribution in the simulated samples and covering the uncertainties due to pileup modelling. It is then propagated to the estimation of the unfolding matrices where it affects mainly the lepton efficiency factors through the lepton isolation requirements.
• Requirement on E_T^miss: The requirement on the E_T^miss significance removes ∼2% of the Z+2b-jets signal contribution, which is evaluated from simulation. The systematic uncertainty is estimated by varying each component entering the E_T^miss calculation within its uncertainty. This includes contributions from JES and JER as discussed above, unclustered energy (10%), τ leptons (3%), electrons and photons (0.6-1.5%), and muons (0.2%) [39].
• MC statistics: While the MC statistics suffice for the Z+1b-jet sample, they lead to uncertainties of several percent in correction factors involving the Z+2b-jets sample.
• Luminosity: The uncertainty of the integrated luminosity recorded by CMS is 2.2% in the 2011 data set [21].
• Dilepton selection efficiencies: The systematic uncertainty of the scale factor per lepton, which is applied to simulated events to compensate for data/simulation differences, is obtained with the tag-and-probe method, and is less than 0.4% for muons and 1.0% for electrons.
• Theory: The effect of uncertainties in the renormalization and factorization scales is estimated using MCFM [2]. The impact of scale variations on the p_T of the b jets is used in the unfolding procedure to estimate the effect on the cross section. Similarly, the p_T of the dilepton pair is varied according to the difference observed between data and simulation to estimate the impact on the unfolding.
Furthermore, the effect due to MPIs on the acceptance of Z+b-jets events is studied by artificially reducing their contribution by a factor of two. This is done by applying a veto on the azimuthal angle ∆φ_Z,bb, which has been shown to be a discriminant observable for MPIs [44]. The effect of this requirement on the cross sections has been found to be less than 0.5%.
Together, this leads to an uncertainty of at most 3% in the cross sections.
• Vertex association: For the estimate of the cross section ratio σ_Z+b/Z+j, an additional uncertainty arises from the contribution of jets not associated with the primary vertex. After the requirement on the momentum fraction of tracks originating from the primary vertex, the background due to pileup is estimated from simulation to be 2.2%. The efficiency of the requirement is estimated from the distribution of this observable in data before applying the requirement. The corresponding systematic uncertainty is evaluated by comparing the distributions of this observable in data and simulation; it is assumed that the difference observed for the variable used for the vertex association is entirely due to events originating from pileup. This assumption results in a systematic uncertainty of 18% in the pileup contamination, fully correlated between the electron and muon channels.
The systematic uncertainties are summarized in Table 3. The uncertainties are presented separately for the muon and electron channels, and for the Z+1b-jet and Z+2b-jets measurements.

Kinematic observables
One of the observables of interest for searches in the Z+2b-jets final state is the invariant mass of the b-jet pair (M_bb). This observable is, for example, used in the study of the Higgs boson produced in association with a Z boson and decaying into two b jets, in the Z(ℓℓ)H(bb) final state [4,5]. Other kinematic observables in the Z+2b-jets final state relevant to searches for undiscovered processes are the transverse momentum of the dilepton (p_T^Z) and the dijet (p_T^bb) pair, and the angle between the dilepton pair and the dijet pair (∆φ_Z,bb). The distributions of these observables are compared with the predictions from MADGRAPH, including uncertainties due to the jet energy scale and the b-tagging efficiencies, as well as the uncertainties due to limited MC statistics. More than two jets are b-tagged in less than 2% of the Z+2b-jets events, and in this case the two highest-p_T jets are considered. The distributions of M_bb and p_T^bb, presented in Fig. 5 (top left and top right, respectively), show agreement with the predictions. The excess of data in the overflow bin at high values of p_T^bb is not concentrated in any particular region. The distribution of ∆φ_Z,bb, shown in Fig. 5 (bottom left), shows agreement with the predictions as well, both in the collinear and back-to-back regions. This is especially relevant with respect to contributions from MPIs, which are expected to have less correlated kinematics than those from the Z+2b-jets process, and will therefore give a uniform distribution in ∆φ_Z,bb.
On the other hand, the $p_T^Z$ distribution shows a harder spectrum in data than predicted, as shown in Fig. 5 (bottom right). An overall excess of events is observed for $p_T^Z > 80$ GeV, in particular in the region around 100 GeV. This trend is consistent with the earlier CMS publication [9], where a similar discrepancy is observed for the $p_T^Z$ observable in the Z+b-jets final state. A harder spectrum for the $p_T^Z$ observable is predicted in four-flavour calculations with massive b quarks at NLO [14], which might explain the observed disagreement.
The effect of the disagreement on the estimate of the cross sections has been studied and is included in the systematic uncertainties, as described in Section 6. Furthermore, a bin-by-bin reweighting of the predictions according to the observed discrepancy in the $p_T^Z$ observable has been performed, and this improves the agreement in other observables where differences are observed.
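The bin-by-bin reweighting mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the bin edges, the treatment of empty bins, and all function names are hypothetical, and the sketch ignores background subtraction and any normalisation choices made in the actual analysis.

```python
from bisect import bisect_right

def bin_index(value, edges):
    """Index of the bin containing value; underflow and overflow are
    assigned to the first and last bin, respectively."""
    i = bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)

def histogram(values, edges, weights=None):
    """Weighted histogram with len(edges) - 1 bins."""
    counts = [0.0] * (len(edges) - 1)
    if weights is None:
        weights = [1.0] * len(values)
    for value, weight in zip(values, weights):
        counts[bin_index(value, edges)] += weight
    return counts

def bin_ratios(data_counts, mc_counts):
    """Per-bin data/simulation ratio; empty simulated bins keep a ratio of 1
    (an assumption of this sketch)."""
    return [d / m if m > 0 else 1.0 for d, m in zip(data_counts, mc_counts)]

def reweight(mc_pt_z, mc_weights, edges, ratios):
    """Scale each simulated event weight by the ratio of its pT(Z) bin."""
    return [w * ratios[bin_index(pt, edges)] for pt, w in zip(mc_pt_z, mc_weights)]
```

After reweighting, the simulated $p_T^Z$ histogram matches the data by construction in every populated bin, so any improvement seen in other observables reflects their correlation with $p_T^Z$.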

Cross sections
The cross sections are estimated per b-jet multiplicity bin and for each lepton flavour separately. The results are summarized in Table 4.
Using the best linear unbiased estimator [45], results for the µµ and ee channels are found to be consistent, with a $\chi^2$ probability of 42% for the Z+1b and 78% for the Z+2b cases. They are therefore combined into a single measurement using the optimal set of coefficients that minimise the total uncertainty in the combined result, taking into account statistical and systematic uncertainties and their correlations. The results are summarized in Table 5 and are then compared with various predictions.

Table 4 (fragment): 3.91 ± 0.04 ± 0.23 | 3.84 ± 0.04 ± 0.24; $\sigma_{Z+b/Z+j}$ (%): 5.23 ± 0.04 ± 0.24 | 5.08 ± 0.05 ± 0.24.
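For two measurements of the same quantity, the best linear unbiased estimator has a closed form, sketched below. The split into uncorrelated statistical and partially correlated systematic uncertainties, the single correlation coefficient `rho_syst`, and the function name are assumptions of this example; the analysis itself groups the systematic sources by their correlation between channels (Table 3).

```python
import math

def blue_combine(x1, stat1, syst1, x2, stat2, syst2, rho_syst):
    """Combine two measurements with the best linear unbiased estimator.

    Statistical uncertainties are treated as uncorrelated; systematic
    uncertainties share the correlation coefficient rho_syst (assumed input).
    Returns the combined value, its uncertainty, the weights, the chi2 of
    the combination (1 degree of freedom), and the chi2 probability.
    """
    var1 = stat1**2 + syst1**2
    var2 = stat2**2 + syst2**2
    cov12 = rho_syst * syst1 * syst2
    denom = var1 + var2 - 2.0 * cov12      # variance of (x1 - x2)

    w1 = (var2 - cov12) / denom            # weights minimising the variance
    w2 = 1.0 - w1
    combined = w1 * x1 + w2 * x2
    variance = (var1 * var2 - cov12**2) / denom

    chi2 = (x1 - x2) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi2 survival function, 1 dof
    return combined, math.sqrt(variance), (w1, w2), chi2, p_value

# Illustrative placeholder inputs, not values from this measurement
value, unc, weights, chi2, prob = blue_combine(1.00, 0.05, 0.10, 1.10, 0.06, 0.10, 0.5)
```

The weights sum to one by construction, and a large correlation between the systematic uncertainties reduces the gain from combining the channels.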
The expectations from MADGRAPH, in both the 5F and the 4F schemes, are estimated using a global K factor to correct the inclusive Drell-Yan cross section for next-to-NLO effects [28]. The expectations from aMC@NLO, at NLO, are also estimated using both 5F calculations and 4F calculations with massive b quarks [14]. The events simulated with MADGRAPH and aMC@NLO are interfaced with the PYTHIA parton shower simulation. The settings used for the predictions from MADGRAPH and aMC@NLO are described in detail in [12].
The NLO prediction from MCFM is at the parton level. The MCFM calculations are estimated with the CTEQ6mE PDF, and the renormalization and factorization scales are set to the invariant mass of the dilepton pair.

Table 5: Cross sections for the production of a Z boson with exactly one b jet, with at least two b jets, with at least one b jet, and the ratio with respect to at least one jet of any flavour, showing the statistical and systematic uncertainties. The expectations from MADGRAPH, MCFM and aMC@NLO include uncertainties due to scale variations.

Uncertainties in the theoretical predictions are estimated by varying the renormalization and factorization scales by a factor two up and down. For the MADGRAPH 5F prediction, the scales are varied in a correlated manner, whereas the scales are varied in an uncorrelated way for the other predictions, which leads to a larger estimate for the uncertainty. The uncertainties in the 4F predictions amount to 15-20%, as expected [46]. Variations of the PDFs (using MSTW2008 [47], CTEQ6, and CT10 [48] PDF sets), jet matching scale (up to a factor of two), and mass of the b quark (between 4.4 and 5.0 GeV) all result in smaller uncertainties. A more detailed description of the methods to estimate these uncertainties is given in [12].
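The difference between correlated and uncorrelated scale variations can be illustrated with a toy model. In the sketch below, `toy_cross_section` is a purely hypothetical scale dependence invented for this example, and the use of the full 3×3 grid for the uncorrelated case is an assumption rather than the prescription used for the predictions in Table 5.

```python
import itertools
import math

def toy_cross_section(k_r, k_f):
    """Hypothetical cross section (arbitrary units) as a function of the
    renormalization (k_r) and factorization (k_f) scale multipliers."""
    return 1.0 - 0.10 * math.log(k_r) + 0.06 * math.log(k_f)

def envelope(sigma, points):
    """Central value and (up, down) envelope over a set of scale choices."""
    central = sigma(1.0, 1.0)
    values = [sigma(k_r, k_f) for k_r, k_f in points]
    return central, max(values) - central, central - min(values)

factors = (0.5, 1.0, 2.0)
correlated = [(k, k) for k in factors]                    # both scales move together
uncorrelated = list(itertools.product(factors, repeat=2)) # each scale moves independently

central, up_corr, down_corr = envelope(toy_cross_section, correlated)
_, up_uncorr, down_uncorr = envelope(toy_cross_section, uncorrelated)
```

Because the correlated points are a subset of the uncorrelated grid, the uncorrelated envelope can never be smaller. When the two scale dependences have opposite signs, as in this toy model, they partly cancel along the correlated diagonal and the uncorrelated estimate is noticeably larger.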
The measured cross sections are consistent, within uncertainties, with the expectations in the 5F scheme from both MADGRAPH and aMC@NLO. Compared to the predictions from MADGRAPH and aMC@NLO in the 5F scheme, the predictions from MCFM are approximately 20% lower. The predictions by MADGRAPH and aMC@NLO from calculations in the 4F scheme, compared to the predictions in the 5F scheme, show a reduction of the Z+1b-jet production rate, when the other b jet in the final state is produced outside of the acceptance.
A difference of approximately two standard deviations is observed when comparing to the parton-level prediction from MCFM for the Z+b-jets cross section. Since the correction factor from parton level to hadron level is smaller than one [9], this difference is not explained by hadronization effects. The difference remains when measuring the cross section ratio, which excludes an explanation based on experimental systematic effects that are shared between the Z+jets and the Z+b-jets final states, such as luminosity, and the reconstruction of jets and leptons. These results indicate that the difference observed with MCFM is specific to the modelling of the Z+b-jets final state.
The largest discrepancy is observed when comparing the measured Z+1b-jet cross section with the predictions in the 4F scheme. In particular, the prediction from aMC@NLO in the 4F scheme shows a discrepancy of more than two standard deviations compared to the measurement.
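The argument about the parton-to-hadron correction can be made concrete with a small sketch. It assumes that measurement and prediction uncertainties are uncorrelated and added in quadrature; the function names and all numerical inputs are illustrative placeholders, not values from Table 5.

```python
import math

def pull(measured, measured_unc, predicted, predicted_unc):
    """Difference between measurement and prediction in standard deviations,
    assuming uncorrelated uncertainties added in quadrature."""
    return (measured - predicted) / math.hypot(measured_unc, predicted_unc)

def to_hadron_level(parton_level, correction):
    """Apply a multiplicative parton-to-hadron correction factor."""
    return parton_level * correction

# Illustrative placeholder inputs (arbitrary units)
measured, measured_unc = 4.0, 0.3
parton_pred, parton_unc = 3.2, 0.4
correction = 0.9  # hypothetical factor smaller than one

pull_parton = pull(measured, measured_unc, parton_pred, parton_unc)
pull_hadron = pull(
    measured, measured_unc,
    to_hadron_level(parton_pred, correction),
    to_hadron_level(parton_unc, correction),
)
```

When the prediction already lies below the measurement, a correction factor smaller than one lowers it further, so the pull grows rather than shrinks.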

Conclusions
The production of Z($\ell\ell$)+b-jets, with $\ell\ell$ = µµ or ee, has been studied for events containing leptons with $p_T > 20$ GeV, $|\eta| < 2.4$, a dilepton invariant mass $76 < M_{\ell\ell} < 106$ GeV, jets with $p_T^j > 25$ GeV and $|\eta^j| < 2.1$, and a separation between the leptons and the jets of $\Delta R(\ell, j) > 0.5$. The Z+b-jets cross sections have been measured, at the level of stable final-state particles, for a Z boson produced with exactly one or at least two b jets. In addition, a cross section ratio has been extracted for a Z boson produced with at least one b jet relative to a Z boson produced with at least one jet.
The cross section measurements are in agreement with the expectations from MADGRAPH and aMC@NLO in the five-flavour scheme. A difference of approximately two standard deviations is observed when comparing the cross sections with the predictions from MCFM at the parton level, and the comparison with the cross section ratio indicates that the difference is specific to the modelling of the Z+b-jets final state. Comparisons with the predictions in the four-flavour scheme, in particular from aMC@NLO, show a disagreement of more than two standard deviations in the Z+1b-jet final state.
Comparisons of the kinematic properties of Z+2b-jets production with the predictions from MADGRAPH in the five-flavour scheme show potential limitations of the existing MC event generators that employ the matrix element plus parton shower approach at leading order with massless b quarks. While these observations should be confirmed with more data, next-to-leading-order simulations and/or simulations with massive quarks could possibly provide a better description of the data in certain regions of phase space.

Figure 1: Distribution of the invariant mass of the electron pair in a sample of events containing two electrons and two b-tagged jets and requiring $E_T^{miss}$ significance < 10. Overlaid are the distributions after a fit of the $t\bar{t}$ fraction within the wide dilepton invariant-mass window: $61 < M_{\ell\ell} < 121$ GeV.

Figure 2: Distribution of the $E_T^{miss}$ significance variable in a sample of events containing two leptons and two b-tagged jets and within the default mass window, $76 < M_{\ell\ell} < 106$ GeV. The simulated distributions are normalized using the theoretical predictions. The last bin contains the overflow.

Figure 3: Distributions of the secondary vertex mass of the leading (in $p_T$) b-tagged jet of the dimuon Z+2b-jets sample (left) and the subleading b-tagged jet of the dielectron Z+2b-jets sample (right). The overlaid distributions are the results of the fit described in the text.

Figure 4: The combined muon+electron distributions of the $p_T$ of the leading-$p_T$ (left) and subleading-$p_T$ (right) b-tagged jet for the Z+2b-jets sample. The simulated samples are normalized to the theoretical predictions. The last bin in both distributions contains the overflow, and the uncertainties in the simulations are shown as a hatched band. The data/simulation ratio shows the separate contributions to this uncertainty: the band represents the statistical uncertainty in the simulated yield, and the lines indicate the uncertainties related to the jet energy scale (dashed) and the b-tag scale factors (solid).

Figure 5: Distributions of kinematic observables for the Z+2b-jets selection of the combined electron and muon samples, and a comparison with the simulated samples that are normalized to the theoretical predictions. Top left: the dijet mass of the two b-tagged jets. Top right: the $p_T$ distribution of the dijet pair. Bottom left: the azimuthal angle between the Z boson and the dijet system, $\Delta\phi_{Z,bb}$. Bottom right: the $p_T$ distribution of the dilepton pair. The right-most bin in the last three plots contains the overflow. Uncertainties in the predictions are shown as a hatched band. The data/simulation ratio shows the separate contributions to this uncertainty: the band represents the statistical uncertainty in the simulated yield, and the lines indicate the uncertainties related to the jet energy scale (dashed) and the b-tagging scale factors (solid).
tifique, and Fonds voor Wetenschappelijk Onderzoek; the Brazilian Funding Agencies (CNPq, CAPES, FAPERJ, and FAPESP); the Bulgarian Ministry of Education and Science; CERN; the Chinese Academy of Sciences, Ministry of Science and Technology, and National Natural Science Foundation of China; the Colombian Funding Agency (COLCIENCIAS); the Croatian Ministry of Science, Education and Sport, and the Croatian Science Foundation; the Research Promotion Foundation, Cyprus; the Ministry of Education and Research, Recurrent financing contract SF0690030s09 and European Regional Development Fund, Estonia; the Academy of Finland, Finnish Ministry of Education and Culture, and Helsinki Institute of Physics; the Institut National de Physique Nucléaire et de Physique des Particules / CNRS, and Commissariat à l'Énergie Atomique et aux Énergies Alternatives / CEA, France; the Bundesministerium für Bildung und Forschung, Deutsche Forschungsgemeinschaft, and Helmholtz-Gemeinschaft Deutscher Forschungszentren, Germany; the General Secretariat for Research and Technology, Greece; the National Scientific Research Foundation, and National Innovation Office, Hungary; the Department of Atomic Energy and the Department of Science and Technology, India; the Institute for Studies in Theoretical Physics and Mathematics, Iran; the Science Foundation, Ireland; the Istituto Nazionale di Fisica Nucleare, Italy; the Korean Ministry of Education, Science and Technology and the World Class University program of NRF, Republic of Korea; the Lithuanian Academy of Sciences; the Ministry of Education, and University of Malaya (Malaysia); the Mexican Funding Agencies (CINVESTAV, CONACYT, SEP, and UASLP-FAI); the Ministry of Business, Innovation and Employment, New Zealand; the Pakistan Atomic Energy Commission; the Ministry of Science and Higher Education and the National Science Centre, Poland; the Fundação para a Ciência e a Tecnologia, Portugal; JINR, Dubna; the Ministry of Education and Science of the Russian Federation, the Federal Agency of Atomic Energy of the Russian Federation, Russian Academy of Sciences, and the Russian Foundation for Basic Research; the Ministry of Education, Science and Technological Development of Serbia; the Secretaría de Estado de Investigación, Desarrollo e Innovación and Programa Consolider-Ingenio 2010, Spain; the Swiss Funding Agencies (ETH Board, ETH Zurich, PSI, SNF, UniZH, Canton Zurich, and SER); the National Science Council, Taipei; the Thailand Center of Excellence in Physics, the Institute for the Promotion of Teaching Science and Technology of Thailand, Special Task Force for Activating Research and the National Science and Technology Development Agency of Thailand; the Scientific and Technical Research Council of Turkey, and Turkish Atomic Energy Authority; the Science and Technology Facilities Council, UK; the US Department of Energy, and the US National Science Foundation.
Individuals have received support from the Marie-Curie programme and the European Research Council and EPLANET (European Union); the Leventis Foundation; the A. P. Sloan Foundation; the Alexander von Humboldt Foundation; the Belgian Federal Science Policy Office; the Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture (FRIA-Belgium); the Agentschap voor Innovatie door Wetenschap en Technologie (IWT-Belgium); the Ministry of Education, Youth and Sports (MEYS) of Czech Republic; the Council of Science and Industrial Research, India; the Compagnia di San Paolo (Torino); the HOMING PLUS programme of Foundation for Polish Science, cofinanced by EU, Regional Development Fund; and the Thalis and Aristeia programmes cofinanced by EU-ESF and the Greek NSRF.

Table 2: The estimates of the purities, the $t\bar{t}$ fractions, and the ZZ backgrounds for the various b-jet multiplicities and lepton flavours, including statistical and systematic uncertainties.

Table 3: Fractional uncertainties in the measured cross sections, grouped according to the correlation between the channels.

Table 4: Cross sections at the particle level for the production of a Z boson with exactly one b jet, with at least two b jets, and with at least one b jet, and the ratio with respect to the production of a Z boson in association with at least one jet of any flavour. The first uncertainty is statistical, and the second systematic.