Measurement of jet multiplicity distributions in tt production in pp collisions at √s = 7 TeV

The normalised differential top quark-antiquark production cross section is measured as a function of the jet multiplicity in proton-proton collisions at a centre-of-mass energy of 7 TeV at the LHC with the CMS detector. The measurement is performed in both the dilepton and lepton+jets decay channels using data corresponding to an integrated luminosity of 5.0 fb−1. Using a procedure to associate jets to decay products of the top quarks, the differential cross section of the tt production is determined as a function of the additional jet multiplicity in the lepton+jets channel. Furthermore, the fraction of events with no additional jets is measured in the dilepton channel, as a function of the threshold on the jet transverse momentum. The measurements are compared with predictions from perturbative quantum chromodynamics and no significant deviations are observed.


Introduction
Precise measurements of the top quark-antiquark (tt) production cross section and top-quark properties performed at the CERN Large Hadron Collider (LHC) provide crucial information for testing the predictions of perturbative quantum chromodynamics (QCD) at large energy scales and in processes with multiparticle final states.
About half of the tt events are expected to be accompanied by additional hard jets that do not originate from the decay of the tt pair (tt+jets). In this paper, these jets will be referred to as additional jets. These processes typically arise from either initial- or final-state QCD radiation, providing an essential handle to test the validity and completeness of higher-order QCD calculations of processes leading to multijet events. Calculations at next-to-leading order (NLO) are available for tt production in association with one [1] or two [2] additional jets. The correct description of tt+jets production is important to the overall LHC physics program since it constitutes an important background to processes with multijet final states, such as associated Higgs-boson production with a tt pair, with the Higgs boson decaying into a bb pair, or final states predicted in supersymmetric theories. Anomalous production of additional jets accompanying a tt pair could be a sign of new physics beyond the standard model [3].
This paper presents studies of tt production with additional jets in the final state using data collected in proton-proton (pp) collisions at a centre-of-mass energy of √s = 7 TeV with the Compact Muon Solenoid (CMS) detector [4]. The analysis uses data recorded in 2011, corresponding to a total integrated luminosity of 5.0 ± 0.1 fb$^{-1}$. For the first time, the tt cross section is measured differentially as a function of jet multiplicity and characterised both in terms of the total number of jets in the event and in terms of the number of additional jets with respect to the leading-order hard-interaction final state. Kinematic properties of the additional jets are also investigated. The results are corrected for detector effects and compared at particle level with theoretical predictions obtained using different Monte Carlo (MC) event generators.
The differential cross sections as a function of jet multiplicity are measured in both the dilepton (ee, µµ, and eµ) and ℓ+jets (ℓ = e or µ) channels. For the dilepton channel, data containing two oppositely charged leptons and at least two jets in the final state are used, while for the ℓ+jets channel, data containing a single isolated lepton and at least three jets are used. Following the analysis strategy applied to the measurement of other tt differential cross sections [5], the results are normalised to the inclusive cross section measured in situ, eliminating systematic uncertainties related to the normalisation. Lastly, the fraction of events that do not contain additional jets (gap fraction), first measured by ATLAS [6], is determined in the dilepton channel as a function of the threshold on the transverse momentum ($p_T$) of the leading additional jet and of the scalar sum of the $p_T$ of all additional jets.
The measurements are performed in the visible phase space, defined as the kinematic region in which all selected final-state objects are produced within the detector acceptance. This avoids additional model uncertainties due to the extrapolation of the measurements into experimentally inaccessible regions of phase space.
The paper is structured as follows. A brief description of the CMS detector is provided in Sect. 2. Section 3 gives a description of the event simulation, followed by details of the object reconstruction and event selection in Sect. 4. A discussion of the sources of systematic uncertainties is given in Sect. 5. The measurement of the differential cross section is presented as a function of the jet multiplicity in Sect. 6 and as a function of the additional jet multiplicity in Sect. 7. The study of the additional jet gap fraction is described in Sect. 8. Finally, a summary is given in Sect. 9.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid, 13 m in length and 6 m in diameter, which provides an axial magnetic field of 3.8 T. The bore of the solenoid is outfitted with various particle detection systems. Charged-particle trajectories are measured with silicon pixel and strip trackers, covering 0 ≤ φ < 2π in azimuth and |η| < 2.5 in pseudorapidity, where η is defined as η = − ln[tan(θ/2)], with θ being the polar angle of the trajectory of the particle with respect to the anticlockwise-beam direction. A lead tungstate crystal electromagnetic calorimeter (ECAL) and a brass/scintillator hadron calorimeter (HCAL) surround the tracking volume. The calorimetry provides excellent resolution in energy for electrons and hadrons within |η| < 3.0. Muons are measured up to |η| < 2.4 using gas-ionisation detectors embedded in the steel flux return yoke outside the solenoid. The detector is nearly hermetic, providing accurate measurements of any imbalance in momentum in the plane transverse to the beam direction. The two-level trigger system selects the most interesting final states for further analysis. A detailed description of the CMS detector can be found in Ref. [4].
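As a small illustration of the pseudorapidity definition above, the following Python sketch converts a polar angle into η and compares it with the tracker coverage quoted in the text. The function names are illustrative only and are not taken from any CMS software.

```python
import math

def pseudorapidity(theta):
    """eta = -ln[tan(theta/2)] for a polar angle theta in radians, 0 < theta < pi."""
    return -math.log(math.tan(theta / 2.0))

def in_tracker_acceptance(theta, eta_max=2.5):
    """True if |eta| lies inside the silicon tracker coverage quoted in the text."""
    return abs(pseudorapidity(theta)) < eta_max

# A particle emitted perpendicular to the beam has eta close to zero.
eta_central = pseudorapidity(math.pi / 2.0)

# Polar angle corresponding to eta = 3, which is outside the tracker coverage.
theta_forward = 2.0 * math.atan(math.exp(-3.0))
```

Large |η| corresponds to directions close to the beam line, which is why the acceptance requirements in the analysis are expressed as upper limits on |η|.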

Event simulation
The reference simulated tt sample used in the analysis is generated with the MadGraph (v. 5.1.1.0) matrix element generator [7], with up to three additional partons. The generated events are subsequently processed using pythia (v. 6.424) [8] to add parton showering using the MLM prescription [9] for removing the overlap in phase space between the matrix element and the parton shower approaches. The pythia Z2 tune is used to describe the underlying event [10]. The top-quark mass is assumed to be m t = 172.5 GeV. The proton structure is described by the CTEQ6L1 [11] parton distribution functions (PDFs).
Additional tt and W+jets MadGraph samples are generated using different choices for the common factorisation and renormalisation scale ($\mu_F^2 = \mu_R^2 = Q^2$) and for the jet-parton matching threshold. These are used to determine the systematic uncertainties associated with the modelling and for comparisons with the measured distributions. The nominal $Q^2$ scale is defined as $m_t^2 + \sum p_T^2(\mathrm{jet})$. This is varied between $4Q^2$ and $Q^2/4$. For the reference MadGraph sample, a jet-parton matching threshold of 20 GeV is chosen, while for the up and down variations, thresholds of 40 and 10 GeV are used, respectively.
In addition to MadGraph, samples of tt events are generated with powheg and mc@nlo (v. 3.41) [16]. The CTEQ6M [11] PDF set is used in both cases. Both powheg and mc@nlo match calculations to full NLO accuracy with parton shower MC generators. For powheg, pythia is chosen for hadronisation and parton shower simulation, with the same Z2 tune utilised for other samples. For mc@nlo, herwig (v. 6.520) [17] with the default tune is used.
All generated samples are passed through a full detector simulation using Geant4 [29], and the number of additional pp collisions (pileup) is matched to the real distribution as inferred from data.

Event reconstruction and selection
The event selection is based on the reconstruction of the tt decay products. The top quark decays almost exclusively into a W boson and a b quark. Only the subsequent decays of one or both W bosons to a charged lepton and a neutrino are considered here. Candidate events are required to contain the corresponding reconstructed objects: isolated leptons and jets. The requirement of the presence of jets associated with b quarks or antiquarks (b jets) is used to increase the purity of the selected sample. The selection has been optimised independently in each channel to maximise the signal content and background rejection.
Electron candidates are required to have a transverse momentum $p_T > 20$ GeV in the dilepton channel and $p_T > 30$ GeV in the ℓ+jets channel. In both cases they are required to be reconstructed within |η| < 2.4, and electrons from identified photon conversions are rejected. As an additional quality criterion, a relative isolation variable $I_{rel}$ is computed. This is defined as the sum of the $p_T$ of all neutral and charged reconstructed particle-flow (PF) candidates inside a cone around the lepton (excluding the lepton itself) in the η-φ plane with radius $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} < 0.3$, divided by the $p_T$ of the lepton. In the dilepton (e+jets) channel, electrons are selected as isolated if $I_{rel} < 0.12$ (0.10).
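The relative isolation defined above can be sketched in a few lines of Python. This is a minimal illustration, assuming that each candidate is a plain dictionary with hypothetical keys `pt`, `eta`, and `phi`; the real particle-flow objects and any pileup corrections are not modelled.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def relative_isolation(lepton, candidates, cone=0.3):
    """Sum of candidate pT inside the cone (lepton excluded), divided by the lepton pT."""
    cone_sum = sum(
        c["pt"] for c in candidates
        if c is not lepton
        and delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )
    return cone_sum / lepton["pt"]

# Hypothetical event content, for illustration only.
lepton = {"pt": 40.0, "eta": 0.0, "phi": 0.0}
candidates = [
    lepton,
    {"pt": 2.0, "eta": 0.1, "phi": 0.1},                   # inside the cone
    {"pt": 1.0, "eta": 0.0, "phi": 2.0 * math.pi - 0.1},   # inside after phi wrap-around
    {"pt": 10.0, "eta": 1.0, "phi": 0.0},                  # outside the cone
]
i_rel = relative_isolation(lepton, candidates)
passes_dilepton_electron_isolation = i_rel < 0.12
```

Wrapping the azimuthal difference matters because two directions at φ ≈ 0 and φ ≈ 2π are physically adjacent.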
Muon candidates are reconstructed from tracks that can be matched between the silicon tracker and the muon system [34]. They are required to have a transverse momentum $p_T > 20$ GeV within the pseudorapidity interval |η| < 2.4 in the dilepton channel, and to have $p_T > 30$ GeV and |η| < 2.1 in the ℓ+jets channel. Isolated muon candidates are selected by demanding a relative isolation of $I_{rel} < 0.20$ (0.125) in the dilepton (μ+jets) channel.
Jets are reconstructed by clustering the particle-flow candidates [35] using the anti-$k_T$ algorithm with a distance parameter of 0.5 [36,37]. An offset correction is applied to take into account the extra energy clustered in jets due to pileup, using the FastJet algorithm [38] based on average pileup energy density in the event. The raw jet energies are corrected to establish a relative uniform response of the calorimeter in η and a calibrated absolute response in $p_T$. Jet energy corrections are derived from the simulation, and are confirmed with in situ measurements with the energy balance of dijet and photon+jet events [35]. Jets are selected within |η| < 2.4 and with $p_T > 30$ (35) GeV in the dilepton (ℓ+jets) channel.
Jets originating from b quarks or antiquarks are identified with the Combined Secondary Vertex algorithm [39], which provides a b-tagging discriminant by combining secondary vertices and track-based lifetime information. The working point used in the dilepton channel corresponds to an efficiency for tagging a b jet of about 80-85 %, while the probability to misidentify light-flavour or gluon jets as b jets (mistag rate) is around 10 %. In the ℓ+jets channel, a tighter requirement is applied, corresponding to a b-tagging efficiency of about 65-70 % with a mistag rate of 1 %. The probability to misidentify a c jet as a b jet is about 40 % and 20 % for the working points used in the dilepton and ℓ+jets channels, respectively [39].
The missing transverse energy ($E_T^{miss}$) is defined as the magnitude of the vector sum of the momenta of all reconstructed particle-flow candidates in the plane transverse to the beams.

Event selection
Dilepton events are collected using combinations of triggers which require two leptons fulfilling $p_T$ and isolation criteria. During reconstruction, events are selected if they contain at least two isolated leptons (electrons or muons) of opposite charge and at least two jets, of which at least one is identified as a b jet. Events with a lepton pair invariant mass smaller than 12 GeV are removed in order to suppress events from heavy-flavour resonance decays. In the ee and µµ channels, the dilepton invariant mass is required to be outside a Z-boson mass window of 91 ± 15 GeV (Z-boson veto), and $E_T^{miss}$ is required to be larger than 30 GeV.
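The mass and missing-energy requirements of the dilepton selection can be summarised in a short Python sketch. The function name and the channel labels are illustrative, and the treatment of values exactly on a boundary is an assumption, since the text does not specify it.

```python
def passes_dilepton_mass_and_met_cuts(channel, m_ll, met):
    """Apply the dilepton invariant-mass and missing-energy cuts (all values in GeV).

    channel: "ee", "mumu" or "emu" (labels chosen for this sketch).
    """
    # Suppress heavy-flavour resonance decays in every channel.
    if m_ll < 12.0:
        return False
    if channel in ("ee", "mumu"):
        # Z-boson veto: reject events inside the 91 +- 15 GeV window.
        if abs(m_ll - 91.0) <= 15.0:
            return False
        # Missing transverse energy requirement for same-flavour channels.
        if met <= 30.0:
            return False
    return True

# Hypothetical examples: an e-mu event on the Z peak is kept, an ee event is not.
keep_emu = passes_dilepton_mass_and_met_cuts("emu", 91.0, 10.0)
keep_ee = passes_dilepton_mass_and_met_cuts("ee", 91.0, 50.0)
```

The Z-boson veto and the missing-energy cut are only applied to the same-flavour channels, where Z/γ*+jets production is the dominant background.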
A kinematic reconstruction method [5] is used to determine the kinematic properties of the tt pair and to identify the two b jets originating from the decay of the top quark and antiquark. In the kinematic reconstruction the following constraints are imposed: the $E_T^{miss}$ originates entirely from the two neutrinos; the reconstructed W-boson invariant mass is 80.4 GeV [40]; and the reconstructed top quark and antiquark masses are equal. The remaining ambiguities are resolved by prioritising event solutions with two or one b-tagged jets over solutions using untagged jets. Finally, the physical solutions are ranked according to how well the neutrino energies match a simulated neutrino energy spectrum, and the highest ranked one is chosen. The kinematic reconstruction yields no valid solution for about 11 % of the events. These are excluded from further analysis. A possible bias due to rejected solutions has been studied and found to be negligible.
In the e+jets channel, events are triggered by an isolated electron with $p_T > 25$ GeV and at least three jets with $p_T > 30$ GeV. Events in the μ+jets channel are triggered by the presence of an isolated muon with $p_T > 24$ GeV fulfilling η requirements. Only triggered events that have exactly one high-$p_T$ isolated lepton are retained in the analysis. In the e+jets channel, events are rejected if any additional electron is found with $p_T > 20$ GeV, |η| < 2.5, and relative isolation $I_{rel} < 0.20$. In the μ+jets channel, events are rejected if any electron candidate with $p_T > 15$ GeV, |η| < 2.5 and $I_{rel} < 0.20$ is reconstructed. In both ℓ+jets channels, events with additional muons with $p_T > 10$ GeV, |η| < 2.5, and relative isolation $I_{rel} < 0.20$ are rejected. The presence of at least three reconstructed jets is required. At least two of them are required to be b-tagged.
Only tt events from the decay channel under study are considered as signal. All other tt events are considered as background, including those containing leptons from τ decays, which are the dominant contribution to this background.

Background estimation
After the full event selection is applied, the dominant background in the eμ channel comes from other tt decay modes, estimated using simulation. In the ee and µµ channels, it arises from Z/γ*+jets production. The normalisation of this background contribution is derived from data using the events rejected by the Z-boson veto, scaled by the ratio of the numbers of events passing and failing this veto, estimated in simulation ($R_{out/in}$) [41]. The number of Z/γ*+jets → ee/µµ events near the Z-boson peak, $N^{in}_{Z/\gamma^*}$, is given by the number of all events failing the Z-boson veto, $N^{in}$, after subtracting the contamination from non-Z/γ*+jets processes. This contribution is extracted from eμ events passing the same selection, $N^{in}_{e\mu}$, and corrected for the differences between the electron and muon identification efficiencies using a correction factor k. The Z/γ*+jets contribution is thus given by
$$N^{out}_{Z/\gamma^*} = R_{out/in}\, N^{in}_{Z/\gamma^*},$$
where $N^{in}_{Z/\gamma^*}$ is $N^{in}$ minus the k-corrected contamination estimated from $N^{in}_{e\mu}$. The factor k is estimated from $k^2 = N_{e\mu}/N_{ee}$ ($N_{e\mu}/N_{\mu\mu}$) for the Z/γ* → e⁺e⁻ (µ⁺µ⁻)+jets contribution, respectively. Here $N_{ee}$ ($N_{\mu\mu}$) is the number of ee (µµ) events in the Z-boson region, without the requirement on $E_T^{miss}$. The remaining backgrounds, including single-top-quark, W+jets, diboson, and QCD multijet events, are estimated from simulation.
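The data-driven Z/γ*+jets estimate described above can be sketched as follows. This is only an illustration: the text does not spell out how the eμ yield enters the subtraction, so the factor of one half used here (half of the k-corrected eμ yield, as commonly used for flavour-symmetric backgrounds) is an assumption, exposed as the hypothetical parameter `emu_fraction`. All input numbers are invented.

```python
def drell_yan_outside_window(n_in, n_in_emu, k, r_out_in, emu_fraction=0.5):
    """Estimate the Z/gamma*+jets yield passing the Z-boson veto.

    n_in        : same-flavour events failing the Z-boson veto (inside the window)
    n_in_emu    : e-mu events inside the same window
    k           : electron/muon efficiency correction factor
    r_out_in    : simulated ratio of events outside and inside the window
    emu_fraction: ASSUMED share of the e-mu yield attributed to one same-flavour channel
    """
    # Remove the non-Z/gamma* contamination from the events inside the window.
    n_in_dy = n_in - emu_fraction * k * n_in_emu
    # Extrapolate from inside the window to the region passing the veto.
    return r_out_in * n_in_dy

# Invented inputs, for illustration only.
n_out_dy = drell_yan_outside_window(n_in=1000.0, n_in_emu=100.0, k=1.0, r_out_in=0.1)
```

Using eμ events for the subtraction works because Z/γ* → ee/µµ production does not populate the eμ final state directly, while flavour-symmetric processes such as tt do.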
In the ℓ+jets channel, the main background contributions arise from W+jets and QCD multijet events, which are greatly suppressed by the b-tagging requirement. A procedure based on control samples in data is used to extract the QCD multijet background. The leptons in QCD multijet events are expected to be less isolated than leptons from other processes. Thus, inverting the selection on the lepton relative isolation provides a relatively pure sample of QCD multijet events in data. Events passing the standard event selection but with an $I_{rel}$ between 0.3 and 1.0, and with at least one b-tagged jet are selected. The sample is divided in two: the sideband region (one b jet) and the signal region (≥2 b jets). The shape of the QCD multijet background is taken from the signal region, and the normalisation is determined from the sideband region. In the sideband region, the $E_T^{miss}$ distribution of the QCD multijet model, other sources of background (determined from simulation), and the tt signal are fitted to data. The resulting scaling of QCD multijet background is applied to the QCD multijet shape from the signal region.
Since the initial state of LHC collisions is enriched in up quarks with respect to down quarks, more W bosons are produced with positive charge than negative charge. In leptonic W-boson decays, this translates into a lepton charge asymmetry A. Therefore, a difference between the number of events with a positively charged lepton and those with a negatively charged lepton ($\Delta^{\pm}$) is observed. In data, this quantity ($\Delta^{\pm}_{data}$) is proportional to the number of W+jets events when assuming that only the charge asymmetry from W-boson production is significant. The charge asymmetry has been measured by CMS [42] and found to be well described by the simulation, thus the simulated value can be used to extract the number of W+jets events from data: $N^{data}_{W+jets} = \Delta^{\pm}_{data}/A$. The correction factor on the W+jets normalisation, calculated before any b-tagging requirement, is between 0.81 and 0.92 depending on the W decay channel and the jet selection. Subsequently, b-tagging is applied to obtain the number of W+jets events in the signal region.
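The charge-asymmetry method amounts to a one-line calculation, sketched below in Python. The sketch assumes that A is defined as the difference over the sum of positively and negatively charged lepton yields in simulated W+jets events; the event counts are invented for illustration.

```python
def wjets_yield_from_asymmetry(n_plus, n_minus, asymmetry):
    """Number of W+jets events from the observed charge difference and the asymmetry A.

    Assumes A = (N+ - N-) / (N+ + N-) for W+jets events, and that all other
    processes contribute equally to both lepton charges.
    """
    delta_pm = n_plus - n_minus
    return delta_pm / asymmetry

# Invented counts: charge-symmetric processes cancel in the difference.
n_wjets = wjets_yield_from_asymmetry(n_plus=5600.0, n_minus=5400.0, asymmetry=0.2)
```

Charge-symmetric processes such as tt drop out of the difference, so only the asymmetric W+jets component survives, up to the contamination from other asymmetric processes that is treated as a systematic uncertainty.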
In addition, a heavy-flavour correction must be applied to the W+jets sample to account for the differences observed between data and simulation [43]. Using the matching between selected jets and generated partons, simulated events are classified as containing at least one b jet (W+bX), at least one c jet and no b jets (W+cX), or containing neither b jets nor c jets (W+light-jets). The rate of W+bX events is multiplied by 2 ± 1 and the rate of W+cX events is multiplied by $1^{+1.0}_{-0.5}$. No correction is applied to W+light-jets events. These correction factors are calculated in [43] in a phase space which is close to the one used in the analysis. The uncertainties in the correction factors are taken into account as systematic uncertainties. The overall normalisation is then adjusted so that the total number of W+jets events is conserved when applying the heavy-flavour corrections.
The remaining backgrounds, originating from single-top-quark, diboson, and Z/γ*+jets processes, are small and their contributions are estimated using simulation.
The multiplicity and the $p_T$ distributions of the selected reconstructed jets are shown for the dilepton and ℓ+jets channels in Fig. 1. Good agreement for the jet multiplicity is observed between data and simulation for up to 5 (6) jets in the dilepton (ℓ+jets) channels. For higher jet multiplicities, the simulation predicts slightly more events than observed in data. The jet $p_T$ spectrum in data is shifted towards smaller values with respect to the simulation, a difference that is covered by the systematic uncertainties. The uncertainty from all systematic sources, which are described in Sect. 5, is determined by estimating their effect on both the normalisation and the shape. The size of these global uncertainties does not reflect those in the final measurements, since the latter are normalised and, therefore, only affected by shape uncertainties.
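The heavy-flavour correction of the W+jets sample described earlier in this section can be illustrated with the following Python sketch. It assumes that conserving the total means rescaling all three categories by a common factor after the flavour-dependent factors are applied; the category labels and yields are invented for illustration.

```python
def apply_heavy_flavour_correction(yields, factors):
    """Scale W+jets flavour categories, then restore the original total yield.

    yields : dict mapping category name to simulated yield
    factors: dict mapping category name to correction factor (default 1.0)
    """
    total_before = sum(yields.values())
    scaled = {name: n * factors.get(name, 1.0) for name, n in yields.items()}
    # Common factor that brings the corrected sum back to the original total.
    renormalisation = total_before / sum(scaled.values())
    return {name: n * renormalisation for name, n in scaled.items()}

# Invented yields; central correction factors as quoted in the text.
simulated = {"W+bX": 10.0, "W+cX": 20.0, "W+light": 70.0}
corrected = apply_heavy_flavour_correction(simulated, {"W+bX": 2.0, "W+cX": 1.0})
```

With this choice the correction changes only the flavour composition, so it does not interfere with the overall W+jets normalisation obtained from the charge asymmetry.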

Systematic uncertainties
Systematic uncertainties in the measurement arise from detector effects, background modelling, and theoretical assumptions. Each systematic uncertainty is investigated separately and estimated for each bin of the measurement by varying the corresponding efficiency, resolution, or scale within its uncertainty. For each variation, the measured normalised differential cross section is recalculated, and the difference between the varied result and the nominal result in each bin is taken as systematic uncertainty. The overall uncertainty in the measurement is obtained by adding all contributions in quadrature. The sources of systematic uncertainty, described below, are assumed to be uncorrelated. In the dilepton channels, the contribution from Z/γ*+jets processes as determined from data is varied in normalisation by ±30 % [41].
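The per-bin combination of uncorrelated systematic sources described above can be sketched in Python as follows. The source names and all numbers are invented, and each source is represented by a single varied result for brevity.

```python
import math

def total_systematic_per_bin(nominal, varied_results):
    """Quadrature sum over sources of (varied - nominal) in each bin.

    nominal       : list of nominal normalised cross section values, one per bin
    varied_results: dict mapping source name to the recalculated values per bin
    """
    totals = []
    for i, nom in enumerate(nominal):
        squared = sum((varied[i] - nom) ** 2 for varied in varied_results.values())
        totals.append(math.sqrt(squared))
    return totals

# Invented normalised distributions (each sums to one).
nominal = [0.50, 0.30, 0.20]
varied_results = {
    "source_a": [0.53, 0.29, 0.18],
    "source_b": [0.54, 0.30, 0.16],
}
syst = total_systematic_per_bin(nominal, varied_results)
```

Adding in quadrature is only appropriate because the sources are assumed to be uncorrelated, as stated in the text.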
In the ℓ+jets channels, the uncertainty in the W+jets background arises from the contamination of other processes with a lepton charge asymmetry when extracting the rate from data, and from the uncertainty in the heavy-flavour correction factors. The rate uncertainty is estimated to range from 10 to 20 %, depending on the channel. The model uncertainty is estimated using samples with varied renormalisation and factorisation scales and jet-parton matching threshold. The QCD multijet background modelling uncertainty arises from the choice of the relative isolation requirement on the anti-isolated lepton used for the extraction of the background from data, the influence of the contamination from other processes on the shape, and the extrapolation from the sideband to the signal region. The total uncertainty ranges from about 15 % to more than 100 %, depending on the channel.
Other systematic uncertainties: the uncertainty associated with the pileup model is determined by varying the minimum bias cross section within its uncertainty of ±8 %. Further uncertainties taken into account originate from lepton trigger, isolation, and identification efficiencies; b-jet tagging efficiency and misidentification probability; integrated luminosity [49]; and the kinematic reconstruction algorithm used in the dilepton channels.
In the dilepton channels, the total systematic uncertainty is about 3 % at low jet multiplicities, and increases to about 20 % in the bins with at least five jets. In the ℓ+jets channels, the total systematic uncertainty is about 6 % at the lowest jet multiplicity, and increases to 34 % for events with at least 8 jets.
The dominant systematic uncertainties for both dilepton and ℓ+jets channels arise from the jet energy scale (JES), with typical values from 2 to 20 % depending on the jet multiplicity bin and cross section measurement, and from the signal model, including hadronisation, renormalisation and factorisation scales, and jet-parton matching threshold (from 3 to 30 %). The typical systematic uncertainty due to the jet energy resolution (JER) ranges from 0.2 to 3 %, b-tagging from 0.3 to 2 %, pileup from 0.1 to 1.4 %, and background normalisation from 1.6 to 3.8 %. The uncertainty from other sources is below 0.5 %. The remaining uncertainties in the model arise from PDF and colour reconnection, varying from 0.1 to 1.5 % and from 1 to 5.8 %, respectively. In all channels, the systematic uncertainty for larger jet multiplicities is dominated by the statistical uncertainty of the simulated samples that are used for the evaluation of modelling uncertainties.

Normalised differential cross section as a function of jet multiplicity
The differential tt production cross section as a function of the jet multiplicity is measured from the number of signal events after background subtraction and correction for the detector efficiencies and acceptances. The estimated number of background events arising from processes other than tt production ($N^{\mathrm{non}\,t\bar{t}}_{BG}$) is directly subtracted from the number of events in data ($N$). The contribution from other tt decay modes is taken into account by correcting $N - N^{\mathrm{non}\,t\bar{t}}_{BG}$ with the signal fraction, defined as the ratio of the number of selected tt signal events to the total number of selected tt events. This avoids the dependence on the inclusive tt cross section used for normalisation. The normalised differential cross section is derived by scaling to the total integrated luminosity and by dividing the corrected number of events by the cross section measured in situ for the same phase space. Because of the normalisation, those systematic uncertainties that are correlated across all bins of the measurement, and therefore only affect the normalisation, cancel out.
In order to avoid additional uncertainties due to the extrapolation of the measurement outside of the phase space region probed experimentally, the differential cross section is determined in a visible phase space defined at the particle level by the kinematic and geometrical acceptance of the final-state leptons and jets. The visible phase space at particle level is defined as follows. The charged leptons from the tt decays are selected with |η| < 2.4 in dilepton events and |η| < 2.5 (2.1) in e+jets (μ+jets) final states, and with $p_T > 20$ (30) GeV in the dilepton (ℓ+jets) channels. A jet is defined at the particle level in a similar way as described in Sect. 4 for the reconstructed jets, by applying the anti-$k_T$ clustering algorithm to all stable particles (including neutrinos not coming from the hard interaction). Particle-level jets are rejected if the selected leptons are within a cone of ΔR = 0.4 with respect to the jet, to avoid counting leptons misidentified as jets. A jet is defined as a b jet if it contains the decay products of a b hadron. The two b jets from the tt decay have to fulfil the kinematic requirements |η| < 2.4 and $p_T > 30$ (35) GeV in the dilepton (ℓ+jets) events. In the ℓ+jets channels, a third jet with the same properties is also required.
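The chain from event counts to a normalised distribution described in this section can be sketched in Python as follows. This is a deliberately simplified illustration: a per-bin efficiency stands in for the regularised unfolding used in the measurement, the integrated luminosity is omitted because it cancels in the normalised result, and all inputs are invented.

```python
def normalised_differential(n_data, n_bkg, signal_fraction, efficiency):
    """Normalised distribution from per-bin counts.

    n_data         : observed events per jet-multiplicity bin
    n_bkg          : estimated non-tt background per bin
    signal_fraction: selected tt signal / all selected tt events, per bin
    efficiency     : per-bin correction, a simplified stand-in for unfolding
    """
    corrected = [
        (n - b) * f / eff
        for n, b, f, eff in zip(n_data, n_bkg, signal_fraction, efficiency)
    ]
    total = sum(corrected)
    # Dividing by the in situ total removes all bin-independent scale factors.
    return [c / total for c in corrected]

# Invented inputs for two bins.
shape = normalised_differential(
    n_data=[1000.0, 500.0],
    n_bkg=[100.0, 50.0],
    signal_fraction=[0.9, 0.9],
    efficiency=[0.5, 0.25],
)
```

Any factor common to all bins, such as the luminosity or a global efficiency, drops out of the ratio, which is why fully correlated normalisation uncertainties cancel.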
Fig. 2 Normalised differential tt production cross section as a function of the jet multiplicity for jets with $p_T > 30$ GeV (top) and $p_T > 60$ GeV (bottom) in the dilepton channel. The measurements are compared to predictions from MadGraph+pythia, powheg+pythia, and mc@nlo+herwig (left), as well as from MadGraph with varied renormalisation and factorisation scales, and jet-parton matching threshold (right). The inner (outer) error bars indicate the statistical (combined statistical and systematic) uncertainty. The shaded band corresponds to the combined statistical and systematic uncertainty
Effects from trigger and detector efficiencies and resolutions, leading to migrations of events across bin boundaries and statistical correlations among neighbouring bins, are corrected by using a regularised unfolding method [5,50,51]. A response matrix that accounts for migrations and efficiencies is calculated from simulated tt events using the reference MadGraph sample. The event migration in each bin is controlled by the purity (number of events reconstructed and generated in one bin divided by the total number of reconstructed events in that bin) and the stability (number of events reconstructed and generated in one bin divided by the total number of generated events in that bin). In these measurements, the purity and stability in the bins are typically 60 % or higher. The generalised inverse of the response matrix is used to obtain the unfolded distribution from the measured distribution by applying a $\chi^2$ technique. To avoid non-physical fluctuations, a smoothing prescription (regularisation) is applied [5,52]. The unfolded data are subsequently corrected to take into account the acceptance in the particle-level phase space.
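The purity and stability used to control bin migrations can be computed from a migration matrix, as in the Python sketch below. It assumes a square matrix of simulated events that are both generated and reconstructed in one of the bins, so events lost to inefficiency are not included; the matrix entries are invented.

```python
def purity_and_stability(migration):
    """Per-bin purity and stability.

    migration[g][r]: simulated events generated in bin g and reconstructed in bin r.
    """
    n_bins = len(migration)
    purity, stability = [], []
    for i in range(n_bins):
        reconstructed_in_i = sum(migration[g][i] for g in range(n_bins))
        generated_in_i = sum(migration[i])
        diagonal = migration[i][i]
        purity.append(diagonal / reconstructed_in_i)
        stability.append(diagonal / generated_in_i)
    return purity, stability

# Invented two-bin migration matrix.
migration = [
    [80.0, 20.0],
    [10.0, 90.0],
]
purity, stability = purity_and_stability(migration)
```

Low purity or stability in a bin signals strong migrations, in which case the unfolded result depends more heavily on the simulated response matrix.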
The measured normalised differential cross sections are consistent among the different dilepton and ℓ+jets channels. The final results in the dilepton and ℓ+jets channels are obtained from the weighted average of the individual measurements, using the statistical uncertainty as the weight. The result from the combination of e+jets and μ+jets channels is defined for the pseudorapidity range |η| < 2.1, i.e. according to the selection criterion of the μ+jets channel. The difference between this result and that for the pseudorapidity range |η| < 2.5 has been estimated to be less than 0.4 % in any of the bins of the jet multiplicity distribution. In the combination, the differences in the |η| range between μ+jets and e+jets channels are therefore neglected.
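A weighted average of per-channel results in one bin can be sketched as below. The text states only that the statistical uncertainty is used as the weight; the inverse-variance form used here is an assumption, and the numbers are invented.

```python
import math

def weighted_average(values, stat_uncertainties):
    """Combine measurements of one bin using ASSUMED inverse-variance weights."""
    weights = [1.0 / s ** 2 for s in stat_uncertainties]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    combined_uncertainty = 1.0 / math.sqrt(sum(weights))
    return mean, combined_uncertainty

# Invented e+jets and mu+jets results for a single jet-multiplicity bin.
mean, unc = weighted_average(values=[0.40, 0.44], stat_uncertainties=[0.02, 0.02])
```

With equal uncertainties the result reduces to the arithmetic mean, and the combined statistical uncertainty shrinks by a factor of √2.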
The normalised differential tt production cross section, $1/\sigma\, d\sigma/dN_{jets}$, as a function of the jet multiplicity, $N_{jets}$, is shown in Tables 1 and 2, and Fig. 2 for the dilepton channel and jets with $p_T > 30$ (60) GeV. For the ℓ+jets channel it is shown in Table 3 and Fig. 3 for jets with $p_T > 35$ GeV. In the tables, the experimental uncertainties are divided between the dominant (JES) and other (JER, b-tagging, pileup, lepton identification, isolation, and trigger efficiencies, background contribution and integrated luminosity) contributions. The model uncertainties are also divided between the dominant (renormalisation and factorisation scales, jet-parton matching threshold, and hadronisation) and other (PDF and colour reconnection) contributions. The measurements are compared to the predictions from MadGraph and powheg, both interfaced with pythia, and from mc@nlo interfaced with herwig.
Table 3 Normalised differential tt production cross section as a function of the jet multiplicity for jets with $p_T > 35$ GeV in the ℓ+jets channel. The statistical, systematic, and total uncertainties are also shown. The main experimental and model systematic uncertainties are displayed: JES and the combination of renormalisation and factorisation scales, jet-parton matching threshold, and hadronisation
The predictions from MadGraph+pythia and powheg+pythia are found to provide a reasonable description of the data. In contrast, mc@nlo+herwig generates fewer events in bins with large jet multiplicities. The effect of the variation of the renormalisation and factorisation scales and jet-parton matching threshold in MadGraph+pythia is compared with the reference MadGraph+pythia simulation. The choice of lower values for both these parameters seems to provide a worse description of the data for higher jet multiplicities.

Normalised differential cross section as a function of the additional jet multiplicity
The normalised differential tt production cross section is also determined as a function of the number of additional jets accompanying the tt decays in the ℓ+jets channel. This measurement provides added value to the one presented in Sect. 6 by distinguishing jets from the tt decay products and jets coming from additional QCD radiation. This is particularly interesting in final states with many jets. For this measurement, the event selection follows the prescription discussed in Sect. 4, and requires at least four jets (in order to perform a full event reconstruction later) with $p_T > 30$ GeV and |η| < 2.4. The $p_T$ requirement is lowered to gain more data and reduce the statistical uncertainty. The particle-level jets, defined as described in Sect. 6 but with $p_T > 30$ GeV, are counted as additional jets if their distance to the tt decay products is ΔR > 0.5. We consider the following objects as tt decay products: two b quarks, two light quarks from the hadronically decaying W boson, and the lepton from the leptonically decaying W boson; the neutrino is not included. The simulated tt events are classified into three categories according to the number of additional jets (0, 1, and ≥2) selected according to this definition. Figure 4 illustrates the contributions of tt events with 0, 1, and ≥2 additional jets to the number of reconstructed jets in the simulation.
A full event reconstruction of the tt system is performed in order to create a variable sensitive to additional jets, taking into account all possible jet permutations. The most likely permutation is determined using a $\chi^2$ minimisation, where the $\chi^2$ is given by
$$\chi^2 = \left(\frac{m^{rec}_{t^{had}} - m^{true}_{t^{had}}}{\sigma_{t^{had}}}\right)^2 + \left(\frac{m^{rec}_{t^{lep}} - m^{true}_{t^{lep}}}{\sigma_{t^{lep}}}\right)^2 + \left(\frac{m^{rec}_{W^{had}} - m^{true}_{W^{had}}}{\sigma_{W^{had}}}\right)^2,$$
where $m^{rec}_{t^{had}}$ and $m^{rec}_{t^{lep}}$ are the reconstructed invariant masses of the hadronically and the leptonically decaying top quark, respectively, and $m^{rec}_{W^{had}}$ is the reconstructed invariant mass of the W boson from the hadronic top-quark decay. The parameters $m^{true}$ and $\sigma_{t^{had}}$, $\sigma_{t^{lep}}$, and $\sigma_{W^{had}}$ are the mean values and standard deviations of the reconstructed mass distributions in the tt simulation. In each event, all jet permutations in which only b-tagged jets are assigned to b quarks are considered. The permutation with the smallest $\chi^2$ value is chosen as the best hypothesis. For events containing the same number of reconstructed jets ($N_{jets}$) the variable $\chi^2$ provides good discrimination between events classified as tt + 0, 1, and ≥2 additional jets. The discrimination power is due to the sensitivity of the event reconstruction to the relation between $N_{jets}$ and the number of additional jets: when $N_{jets}$ is too small to contain all four jets from the tt decay together with the additional jets, the event is likely to get a large $\chi^2$ value because one of the four jets from the tt decay partons is missing for a correct event reconstruction.
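The selection of the best jet permutation can be sketched in Python as a three-term χ² built from the hadronic top-quark, leptonic top-quark, and hadronic W-boson masses. The reference means and widths below are placeholders chosen for illustration, not the values used in the analysis, and each hypothesis is reduced to its three reconstructed masses rather than built from jet four-vectors.

```python
def chi2(hypothesis, reference):
    """Sum of squared pulls of the three reconstructed masses (GeV)."""
    total = 0.0
    for name in ("t_had", "t_lep", "w_had"):
        pull = (hypothesis["m_" + name] - reference["m_" + name]) / reference["sigma_" + name]
        total += pull ** 2
    return total

def best_permutation(hypotheses, reference):
    """Return the hypothesis with the smallest chi2."""
    return min(hypotheses, key=lambda h: chi2(h, reference))

# PLACEHOLDER means and widths, not taken from the analysis.
reference = {
    "m_t_had": 172.5, "sigma_t_had": 20.0,
    "m_t_lep": 172.5, "sigma_t_lep": 20.0,
    "m_w_had": 80.4, "sigma_w_had": 10.0,
}
# Invented masses for two jet assignments of the same event.
hypotheses = [
    {"m_t_had": 250.0, "m_t_lep": 150.0, "m_w_had": 120.0},
    {"m_t_had": 180.0, "m_t_lep": 165.0, "m_w_had": 83.0},
]
best = best_permutation(hypotheses, reference)
```

An event in which one of the decay jets is missing has no assignment that reproduces all three masses at once, so even its best permutation tends to have a large χ².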
The measurement of the fractions of tt events with 0, 1, and ≥2 additional jets is performed using a binned maximum-likelihood fit of the χ 2 templates to data, simultaneously in both lepton+jets channels. The normalisations of the signal templates (tt + 0, 1, and ≥2 additional jets) are free parameters in the fit. For the normalisations of the background processes, Gaussian constraints corresponding to the uncertainties of the background predictions are applied. It has been verified that the use of log-normal constraints gives similar results. The result of the fit is shown in Fig. 5. The QCD multijet and W+jets templates are estimated using the data-based methods described in Sect. 4.
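The structure of such a fit can be illustrated by the objective function alone. The sketch below assumes Poisson statistics per bin, free signal scale factors, and background scale factors constrained to 1 by Gaussian terms with given relative uncertainties; the function name, argument layout, and single-channel form are assumptions of this illustration.

```python
import math

def negative_log_likelihood(signal_scales, bkg_scales,
                            signal_templates, bkg_templates,
                            bkg_rel_unc, data):
    """Binned negative log-likelihood with Gaussian-constrained backgrounds.

    signal_templates / bkg_templates: lists of per-bin expected yields.
    bkg_rel_unc: relative uncertainty on each background normalisation.
    data: observed event counts per bin.
    """
    total = 0.0
    for i, n_obs in enumerate(data):
        mu = sum(s * t[i] for s, t in zip(signal_scales, signal_templates))
        mu += sum(b * t[i] for b, t in zip(bkg_scales, bkg_templates))
        if mu <= 0.0:
            return float("inf")  # unphysical expectation
        # Poisson term; the constant log(n_obs!) is dropped
        total += mu - n_obs * math.log(mu)
    for b, unc in zip(bkg_scales, bkg_rel_unc):
        # Gaussian constraint pulling each background scale towards 1
        total += 0.5 * ((b - 1.0) / unc) ** 2
    return total
```

In practice this function would be handed to a numerical minimiser, and a simultaneous fit to both channels would add the corresponding terms with shared signal scale factors.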
The normalisations for the three signal templates are applied to the predicted differential cross section in the visible phase space, calculated using the simulated tt sample from MadGraph+pythia. This phase space is defined as in Sect. 6 with the requirement of four particle-level jets with p T > 30 GeV. This provides the differential cross section as a function of the number of additional jets, which is finally normalised to the total cross section measured in the same phase space. The results are shown in Fig. 6 and summarised in Table 4.
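As a minimal illustration of this last step, the fitted scale factors can be applied to the predicted cross sections per category and the result normalised to their sum. The function name and the inputs used to call it are hypothetical.

```python
def normalised_cross_section(predicted, fitted_scales):
    """Scale the predicted cross section per category and normalise to unit sum.

    predicted: predicted visible cross sections for tt + 0, 1, >=2 additional jets.
    fitted_scales: fitted normalisation of the corresponding signal templates.
    """
    scaled = [p * s for p, s in zip(predicted, fitted_scales)]
    total = sum(scaled)
    return [x / total for x in scaled]
```

Because of the normalisation, a common shift of all three scale factors cancels, so only their relative sizes affect the result.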
For each tt + additional jet template used in the maximum-likelihood fit, a full correlation is assumed between the rate of events that fulfill the particle-level selection and the rate of events that do not. Therefore, a single template is used for both parts. Including an additional template made from events that are not inside the visible phase space leads to fit results that are compatible within the estimated uncertainties. To check the model dependency, the fit is repeated using simulated data from mc@nlo+herwig and powheg+pythia instead of MadGraph+pythia. The results are stable within the uncertainties.
The sources of systematic uncertainties are the same as those discussed in Sect. 5, except for the background normalisations, which are constrained in the fit. Their effect is propagated to the fit uncertainty, which is quoted as the statistical uncertainty. The impact of the systematic uncertainties on the extracted fractions of tt + 0, 1, and ≥2 additional jets is evaluated using pseudo-experiments. The most important contributions to the systematic uncertainties originate from JES (up to 7 %) and modelling uncertainties: hadronisation (up to 6 %), jet-parton matching threshold (up to 5 %), and renormalisation and factorisation scales (up to 4 %).
The mc@nlo+herwig prediction produces fewer events with ≥2 additional jets than data, which are well described by MadGraph+pythia and powheg+pythia. The prediction from MadGraph+pythia with lower renormalisation and factorisation scales provides a worse description of the data. These observations are in agreement with those presented in Sect. 6.

Additional jet gap fraction
An alternative way to investigate the jet activity arising from quark and gluon radiation produced in association with the tt system is to determine the fraction of events that do not contain additional jets above a given threshold. This measurement is performed using events in the dilepton decay channel after fulfilling the event reconstruction and selection requirements discussed in Sect. 4. The additional jets are defined as those not assigned to the tt system by the kinematic reconstruction described in Sect. 4.2.
A threshold observable, referred to as gap fraction [6], is defined as:

$$f(p_T) = \frac{N(p_T)}{N_{\mathrm{total}}}$$

where N total is the number of selected events and N ( p T ) is the number of events that do not contain additional jets above a p T threshold in the whole pseudorapidity range used in the analysis (|η| < 2.4). The pseudorapidity and p T distributions of the first and second leading (in p T ) additional reconstructed jets are presented in Fig. 7. The distributions show good agreement between data and the simulation.

Fig. 6
Normalised differential tt production cross section as a function of the number of additional jets in the lepton+jets channel. The measurement is compared to predictions from MadGraph+pythia, powheg+pythia, and mc@nlo+herwig (top), as well as from MadGraph with varied renormalisation and factorisation scales, and jet-parton matching threshold (bottom). The inner (outer) error bars indicate the statistical (combined statistical and systematic) uncertainty. The shaded band corresponds to the combined statistical and systematic uncertainty.
The veto can be extended beyond the additional leading jet criteria by defining the gap fraction as

$$f(H_T) = \frac{N(H_T)}{N_{\mathrm{total}}}$$

where N (H T ) is the number of events in which H T , the scalar sum of the p T of the additional jets (with p T > 30 GeV), is less than a certain threshold.
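Both gap-fraction definitions reduce to simple event counting, as the following sketch shows. It assumes that each event is represented by the list of p T values (in GeV) of its additional jets; this representation and the function names are choices made for the illustration.

```python
def gap_fraction_pt(events, threshold):
    """Fraction of events with no additional jet above the pT threshold.

    events: list of events, each a list of additional-jet pT values in GeV.
    """
    n_pass = sum(1 for jets in events if all(pt <= threshold for pt in jets))
    return n_pass / len(events)

def gap_fraction_ht(events, threshold, pt_min=30.0):
    """Fraction of events whose HT (scalar pT sum of additional jets
    with pT > pt_min) is below the threshold."""
    n_pass = sum(
        1 for jets in events
        if sum(pt for pt in jets if pt > pt_min) < threshold
    )
    return n_pass / len(events)
```

An event without any additional jet has H T = 0 and passes both vetoes for every threshold. For the same threshold, the H T veto is at least as strict as the leading-jet veto, since H T is never smaller than the p T of the leading additional jet above 30 GeV.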
For each value of p T and H T thresholds, the gap fraction is evaluated at particle level in the visible phase space defined in Sect. 6. The additional jets at particle level are defined as all jets within the kinematic acceptance not including the two highest-p T b jets containing the decay products of different b hadrons. They are required to fulfill the condition that they are not within a cone of ΔR = 0.4 from either of the two isolated leptons, as described in Sect. 6.
Given the large purity of the selected events for any value of p T and H T , a correction for detector effects is applied following a simpler approach than the unfolding method used in Sect. 6. Here, the ratio of the particle-level to the simulated gap fraction distributions, obtained with the tt sample from MadGraph, provides the correction which is applied to the data.
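The correction described here amounts to a multiplicative factor per threshold value. A minimal sketch, assuming the three gap-fraction distributions are given as equally long lists evaluated at the same thresholds (the function and argument names are hypothetical):

```python
def correct_to_particle_level(f_data, f_particle_mc, f_reco_mc):
    """Apply the simulation-based correction to the measured gap fraction.

    For each threshold, the factor is the ratio of the particle-level to the
    reconstructed gap fraction in the tt simulation.
    """
    return [d * p / r for d, p, r in zip(f_data, f_particle_mc, f_reco_mc)]
```

If the reconstructed gap fraction in the simulation equals the measured one at some threshold, the corrected value there is simply the particle-level prediction, which is why the dependence on the simulated sample enters the systematic uncertainty.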
The measured gap-fraction distribution is compared to predictions from MadGraph+pythia, powheg+pythia, and mc@nlo+herwig, and to the predictions from the MadGraph samples with varied renormalisation and factorisation scales and jet-parton matching threshold. In Fig. 8 the gap fraction is measured as a function of the p T of the leading additional jet (left) and as a function of H T (right), with the thresholds (defined at the abscissa where the data point is shown) varied between 35 and 380 GeV. The results are summarised in Tables 5 and 6, respectively. The measurements are consistent among the three dilepton channels. The gap fraction is lower as a function of H T than as a function of p T , showing that the measurement is probing quark and gluon emission beyond the first emission. The gap fraction is better described by mc@nlo+herwig than by MadGraph+pythia and powheg+pythia. This result is not incompatible with the observation described above, because the gap fraction requires the jets to have a certain p T above the threshold, which does not necessarily imply large jet multiplicities. Decreasing the renormalisation and factorisation scales or matching threshold in the MadGraph sample worsens the agreement between data and simulation. The total systematic uncertainty is about 3.5 % for values of the threshold ( p T or H T ) below 40 GeV, and decreases to 0.2 % for values of the thresholds above 200 GeV.

Table 4
Normalised differential tt production cross section as a function of the jet multiplicity for jets with p T > 30 GeV in the dilepton channel. The statistical, systematic, and total uncertainties are also shown. The main experimental and model systematic uncertainties are displayed: JES and the combination of renormalisation and factorisation scales, jet-parton matching threshold, and hadronisation.
Dominant sources of systematic uncertainty arise from the uncertainty in the JES and the background contamination, corresponding to approximately 2 % and 1 % systematic uncertainty, respectively, for the smallest p T and H T values. Other sources with smaller impact on the total uncertainty are the b-tagging efficiency, JER, pileup, and the procedure used to correct the data to particle level.

Fig. 8
Measured gap fraction as a function of the p T of the leading additional jet (left) and as a function of H T (right) in the dilepton channels. Data are compared to predictions from MadGraph+pythia, powheg+pythia, and mc@nlo+herwig (top), as well as from MadGraph with varied renormalisation and factorisation scales, and jet-parton matching threshold (bottom). The error bars on the data points indicate the statistical uncertainty. The shaded band corresponds to the combined statistical and total systematic uncertainty (added in quadrature).

Summary
Measurements of the normalised differential tt production cross section as a function of the number of jets in the dilepton (ee, μμ, and eμ) and lepton+jets (e+jets, μ+jets) channels are presented. The measurements are performed using a data sample corresponding to an integrated luminosity of 5.0 fb −1 collected in pp collisions at √ s = 7 TeV with the CMS detector. The results are presented in the visible phase space and compared with predictions of perturbative quantum chromodynamics from MadGraph and powheg interfaced with pythia, and mc@nlo interfaced with herwig, as well as MadGraph with varied renormalisation and factorisation scales, and jet-parton matching threshold. The normalised differential tt production cross section is also measured as a function of the number of jets radiated in addition to the tt decay products in the lepton+jets channel. The MadGraph+pythia and powheg+pythia predictions describe the data well up to high jet multiplicities, while mc@nlo+herwig predicts fewer events with a large number of jets. The gap fraction is measured in dilepton events as a function of the p T of the leading additional jet and of the scalar sum of the p T of the additional jets, and is also compared to different theoretical predictions. No significant deviations are observed between data and simulation. The mc@nlo+herwig model describes the gap fraction more accurately than MadGraph+pythia and powheg+pythia for all values of the thresholds.