Measurement of jet activity in top quark events using the $e\mu$ final state with two $b$-tagged jets in $pp$ collisions at $\sqrt{s}=8$ TeV with the ATLAS detector

Measurements of the jet activity in $t\bar{t}$ events produced in proton--proton collisions at $\sqrt{s}=8$ TeV are presented, using $20.3\,$fb$^{-1}$ of data collected by the ATLAS experiment at the Large Hadron Collider. The events were selected in the dilepton $e\mu$ decay channel with two identified $b$-jets. The numbers of additional jets for various jet transverse momentum ($p_T$) thresholds, and the normalised differential cross-sections as a function of $p_T$ for the five highest-$p_T$ additional jets, were measured in the jet pseudorapidity range $|\eta|<4.5$. The gap fraction, the fraction of events which do not contain an additional jet in a central rapidity region, was measured for several rapidity intervals as a function of the minimum $p_T$ of a single jet or the scalar sum of $p_T$ of all additional jets. These fractions were also measured in different regions of the invariant mass of the $e\mu b\bar{b}$ system. All measurements were corrected for detector effects, and found to be mostly well-described by predictions from next-to-leading-order and leading-order $t\bar{t}$ event generators with appropriate parameter choices. The results can be used to further optimise the parameters used in such generators.


Introduction
The top quark plays a special role in the Standard Model and in some theories of physics beyond the Standard Model. The large top quark mass and large $t\bar{t}$ pair-production cross-section in $pp$ collisions ($242 \pm 10$ pb at $\sqrt{s} = 8$ TeV [1]) make top quark production at the Large Hadron Collider (LHC) a unique laboratory for studying the behaviour of QCD at the highest accessible energy scales. The decays of top quarks to charged leptons, neutrinos and $b$-quarks also make such events a primary source of background in many searches for new physics. Therefore, the development of accurate modelling for events involving top quark production forms an important part of the LHC physics programme. Measurements of the activity of additional jets in $t\bar{t}$ events, i.e. jets not originating from the decay of the top quark and antiquark, but arising from quark and gluon radiation produced in association with the $t\bar{t}$ system, have been made by ATLAS [2,3] and CMS [4] using $pp$ data at $\sqrt{s} = 7$ TeV, and by CMS [5] at $\sqrt{s} = 8$ TeV. These data are typically presented as particle-level results in well-defined fiducial regions, corrected to remove detector efficiency and resolution effects, and compared to the predictions of Monte Carlo (MC) generators through tools such as the Rivet framework [6]. Such comparisons indicate that some state-of-the-art generators have difficulties in reproducing the data, whilst for others agreement with data can be improved with an appropriate choice of generator parameter values or 'tune', including those controlling QCD factorisation and renormalisation scales, and matching to the parton shower [7-11].
This paper presents two studies of the additional jet activity in $t\bar{t}$ events collected with the ATLAS detector in $pp$ collisions at a centre-of-mass energy of $\sqrt{s} = 8$ TeV. Top quark pairs are selected in the same way in both measurements, using the dilepton $e\mu$ final state with two jets identified ('tagged') as likely to contain $b$-hadrons. Distributions of the properties of additional jets in these events are normalised to the cross-section ($\sigma_{e\mu b\bar{b}}$) for events passing this initial selection, requiring the electron, muon and two $b$-tagged jets to have transverse momentum $p_T > 25$ GeV and pseudorapidity$^{1}$ $|\eta| < 2.5$.
In the first study, the normalised particle-level cross-sections for additional jets with $|\eta| < 4.5$ and $p_T > 25$ GeV are measured differentially in jet rank and $p_T$:

$$\frac{1}{\sigma_{e\mu b\bar{b}}} \frac{\mathrm{d}\sigma_{\mathrm{jet}_i}}{\mathrm{d}p_T} \qquad (1)$$

with rank $i = 1$ to 5, where $i = 1$ denotes the leading (highest $p_T$) additional jet. These normalised differential cross-sections are then used to obtain the multiplicity distributions for additional jets as a function of the minimum $p_T$ threshold for such extra jets.
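The step from rank-ordered $p_T$ spectra to jet multiplicities can be illustrated with a minimal sketch. It assumes that integrating the normalised spectrum of the rank-$i$ jet above a threshold gives the fraction of events with at least $i$ additional jets above that threshold, which follows from the $p_T$ ordering of the ranks. The function names, the shared binning and the toy values are hypothetical and are not the measured results.

```python
import numpy as np

def at_least_fractions(edges, xsec_by_rank, threshold):
    """Fraction of events with at least i additional jets above `threshold`.

    edges: pT bin edges in GeV, shared by all ranks (hypothetical binning);
           `threshold` is assumed to coincide with one of the edges, and the
           spectra are assumed to be fully contained within the edges.
    xsec_by_rank: one array per rank i = 1, 2, ..., holding
           (1/sigma) dsigma_i/dpT in each bin, in units of 1/GeV.
    """
    edges = np.asarray(edges, dtype=float)
    widths = np.diff(edges)
    above = edges[:-1] >= threshold  # bins lying entirely above the threshold
    return np.array([
        np.sum(np.asarray(x, dtype=float)[above] * widths[above])
        for x in xsec_by_rank
    ])

def multiplicity_fractions(edges, xsec_by_rank, threshold):
    """Fractions of events with exactly 0, 1, ..., n-1 and at least n jets."""
    at_least = at_least_fractions(edges, xsec_by_rank, threshold)
    exactly = np.empty(len(at_least) + 1)
    exactly[0] = 1.0 - at_least[0]                # no jet above threshold
    exactly[1:-1] = at_least[:-1] - at_least[1:]  # exactly i jets
    exactly[-1] = at_least[-1]                    # last entry is inclusive
    return exactly

# Toy input with two ranks (values invented for illustration only).
toy_edges = [25.0, 50.0, 100.0]
toy_xsec = [[0.012, 0.004], [0.004, 0.002]]
print(multiplicity_fractions(toy_edges, toy_xsec, 25.0))
```

Raising the threshold moves events from the higher multiplicity entries into the zero-jet entry, which is the behaviour the multiplicity distributions as a function of the $p_T$ threshold are meant to expose.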
The additional-jet differential cross-section measurements are complemented by a second study measuring the jet 'gap fraction', i.e. the fraction of events where no additional jet is present within a particular interval of jet rapidity, denoted by $\Delta y$. The gap fraction is measured as a function of the jet $p_T$ threshold $Q_0$,

$$f(Q_0) \equiv \frac{\sigma(Q_0)}{\sigma_{e\mu b\bar{b}}},$$

starting from a minimum $Q_0$ of 25 GeV, where $\sigma(Q_0)$ is the cross-section for events having no additional jets with $p_T > Q_0$ within the rapidity interval $\Delta y$. Following the corresponding measurement at $\sqrt{s} = 7$ TeV [2], four rapidity intervals $\Delta y$ are defined: $|y| < 0.8$, $0.8 < |y| < 1.5$, $1.5 < |y| < 2.1$ and the inclusive interval $|y| < 2.1$. These intervals are more restrictive than for the normalised additional jet cross-sections, which are measured over the wider angular range $|\eta| < 4.5$ corresponding to the full acceptance of the detector.

$^{1}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point in the centre of the detector, and the $z$ axis along the beam line. Pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln\tan(\theta/2)$, and transverse momentum and energy are defined relative to the beam line as $p_T = p\sin\theta$ and $E_T = E\sin\theta$. The azimuthal angle around the beam line is denoted by $\phi$, and distances in $(\eta, \phi)$ space by $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$. The rapidity is defined as $y = \frac{1}{2}\ln\left(\frac{E+p_z}{E-p_z}\right)$, where $p_z$ is the $z$-component of the momentum and $E$ is the energy of the relevant object.
As well as $f(Q_0)$, the gap fraction is measured as a function of a threshold $Q_{\mathrm{sum}}$ placed on the scalar sum of the $p_T$ of all additional jets with $p_T > 25$ GeV within the same rapidity intervals $\Delta y$:

$$f(Q_{\mathrm{sum}}) \equiv \frac{\sigma(Q_{\mathrm{sum}})}{\sigma_{e\mu b\bar{b}}},$$

where, by analogy with $\sigma(Q_0)$, $\sigma(Q_{\mathrm{sum}})$ denotes the cross-section for events in which this scalar sum does not exceed $Q_{\mathrm{sum}}$. The gap fraction measured as a function of $Q_0$ is sensitive to the leading $p_T$ emission accompanying the $t\bar{t}$ system, whereas the gap fraction based on $Q_{\mathrm{sum}}$ is sensitive to all accompanying hard emissions. Finally, the gap fractions $f(Q_0)$ and $f(Q_{\mathrm{sum}})$ in the inclusive rapidity region $|y| < 2.1$ are also measured separately for four subsets of the invariant mass of the $e\mu b\bar{b}$ system $m_{e\mu b\bar{b}}$, which is related to the invariant mass of the produced $t\bar{t}$ system and is on average higher if produced from quark-antiquark rather than gluon-gluon initial states.
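The two gap fraction definitions can be illustrated with a short sketch that simply counts events in a toy sample. It assumes unweighted events, each represented by a list of `(pt, y)` pairs for its additional jets. The function names and toy events are hypothetical, and the corrections for detector effects applied in the measurement are not modelled.

```python
def gap_fraction_q0(events, q0, y_min=0.0, y_max=2.1):
    """Fraction of events with no additional jet with pT > q0 in y_min <= |y| < y_max."""
    n_gap = 0
    for jets in events:
        vetoed = any(pt > q0 and y_min <= abs(y) < y_max for pt, y in jets)
        if not vetoed:
            n_gap += 1
    return n_gap / len(events)

def gap_fraction_qsum(events, q_sum, y_min=0.0, y_max=2.1, pt_min=25.0):
    """Fraction of events whose scalar pT sum of additional jets
    (pT > pt_min, y_min <= |y| < y_max) does not exceed q_sum."""
    n_gap = 0
    for jets in events:
        scalar_sum = sum(pt for pt, y in jets
                         if pt > pt_min and y_min <= abs(y) < y_max)
        if scalar_sum <= q_sum:
            n_gap += 1
    return n_gap / len(events)

# Toy sample: each event lists (pT in GeV, rapidity) of its additional jets.
toy_events = [
    [],                           # no additional jets
    [(30.0, 0.5)],                # one soft central jet
    [(60.0, 1.0), (40.0, -1.8)],  # two central jets
    [(45.0, 3.0)],                # one jet outside |y| < 2.1
]
print(gap_fraction_q0(toy_events, 25.0), gap_fraction_qsum(toy_events, 50.0))
```

In this toy sample the third event passes a $Q_0 = 60$ GeV veto but still fails a $Q_{\mathrm{sum}} = 60$ GeV veto, which illustrates why $f(Q_{\mathrm{sum}})$ is sensitive to all hard emissions rather than only the leading one.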
This paper is organised as follows. Section 2 describes the ATLAS detector and the data sample used for these measurements. Section 3 provides information about the Monte Carlo simulated samples used to model signal and background processes, and to compare with the measured results. The common object and event selection criteria are presented in Section 4, and sources of systematic uncertainty are discussed in Section 5. The measurement of the normalised jet differential cross-sections by rank and p T is described in Section 6 and the measurement of the gap fraction is presented in Section 7, in both cases including comparisons with the predictions of various tt event generators. Section 8 gives a summary and conclusions.

Detector and data sample
The ATLAS detector [12] at the LHC covers almost the full solid angle around the collision point, and consists of an inner tracking detector surrounded by a thin superconducting solenoid magnet producing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and an external muon spectrometer incorporating three large toroidal magnet systems. The inner detector consists of a high-granularity silicon pixel detector and a silicon microstrip tracker, together providing precision tracking in the pseudorapidity range |η| < 2.5, complemented by a transition radiation tracker providing tracking and electron identification information for |η| < 2.0. A lead/liquid-argon (LAr) electromagnetic calorimeter covers the region |η| < 3.2, and hadronic calorimetry is provided by steel/scintillator tile calorimeters for |η| < 1.7 and copper/LAr hadronic endcap calorimeters covering 1.5 < |η| < 3.2. The calorimeter system is completed by forward LAr calorimeters with copper and tungsten absorbers which extend the coverage to |η| = 4.9. The muon spectrometer consists of precision tracking chambers covering the region |η| < 2.7, and separate trigger chambers covering |η| < 2.4. A three-level trigger system, using custom hardware followed by two software-based levels, is used to reduce the event rate to about 400 Hz for offline storage.
The analyses were performed on the 2012 ATLAS proton-proton collision data sample, corresponding to an integrated luminosity of 20.3 fb$^{-1}$ at $\sqrt{s} = 8$ TeV after the application of detector status and data quality requirements. The integrated luminosity was measured using the methodology described in Ref. [13] applied to beam separation scans performed in November 2012, and has a relative uncertainty of 2.8 %. Events were required to pass either a single-electron or single-muon trigger, with thresholds chosen such that the efficiency plateau is reached for leptons with $p_T > 25$ GeV passing offline selections. Each triggered event also includes the signals from an average of 20 additional inelastic $pp$ collisions in the same bunch crossing (referred to as pile-up).
Alternative $t\bar{t}$ simulation samples were used to evaluate systematic uncertainties, and were compared with the data measurements after unfolding for detector effects. Samples were produced with Powheg with $h_{\mathrm{damp}} = \infty$ interfaced to Herwig (version 6.520) [37,38] with the ATLAS AUET2 tune [39] and Jimmy (version 4.31) [40] for underlying-event modelling. Samples with $h_{\mathrm{damp}} = m_t$, which softens the $t\bar{t}$ $p_T$ spectrum, improving the agreement between data and simulation at $\sqrt{s} = 7$ TeV [7], were generated by combining Powheg with either Pythia6 with the P2011C tune or Pythia8 (version 8.186) with the A14 tune [41]. Samples were also produced with MC@NLO (version 4.01) [42,43] interfaced to Herwig and Jimmy, with the generator's default renormalisation and factorisation scales of $\sqrt{m_t^2 + (p_{T,t}^2 + p_{T,\bar{t}}^2)/2}$, where $p_{T,t}$ and $p_{T,\bar{t}}$ are the transverse momenta of the top quark and antiquark. Several leading-order 'multi-leg' generators were also studied. The Alpgen generator (version 2.13) [44] was used with leading-order matrix elements for $t\bar{t}$ production accompanied by up to three additional light partons, and dedicated matrix elements for $t\bar{t}$ plus $b\bar{b}$ or $c\bar{c}$ production, interfaced to Herwig and Jimmy. An alternative sample was generated with Alpgen interfaced to Pythia6 with the P2011C tune, including up to four additional light partons. The MLM parton-jet matching scheme [44] was applied to avoid double-counting of configurations generated by both the parton shower and the matrix-element calculation. A further sample was generated using MadGraph 5 (version 1.5.11) [45] with up to three additional partons and using MLM matching, interfaced to Pythia6 with the P2011C tune. Finally, three pairs of samples with matching scale and parton shower parameters tuned to explicitly vary the amount of additional radiation in $t\bar{t}$ events were used, generated using AcerMC (version 3.8) [46], Alpgen or MadGraph, each interfaced to Pythia6 with either the RadLo or RadHi P2011C tunes [26].
The parameters of these samples were tuned to span the variations in radiation compatible with the ATLAS tt gap fraction measurements at √ s = 7 TeV [2] as discussed in detail in Ref. [7].
After the eµbb event selection, the expected non-tt contribution is dominated by Wt, the associated production of a W boson and a single top quark. This process is distinct from tt production when considered at leading order. But at NLO in QCD the two processes cannot be separated once the top quarks decay to Wb: the resulting WbWb final state can appear for example through both gg → tt → WbWb and gg → Wtb → WbWb, and the two processes interfere to an extent depending on the kinematics of the final state. However, the currently available generators do not allow a full treatment of this interference; instead they consider tt and Wt production as separate processes. Within this approximation, the 'diagram removal' and 'diagram subtraction' schemes have been proposed as alternatives for approximately handling the interference between the tt and Wt processes [47,48]. For this paper, Wt production was simulated as a process separate from tt, using Powheg + Pythia6 with the CT10 PDFs and the P2011C tune. The diagram removal scheme was used as the baseline and the diagram subtraction scheme was used to assess systematic uncertainties. A cross-section of 22.4 ± 1.5 pb was assumed for Wt production, determined by using the approximate NNLO prediction described in Ref. [49].
Other backgrounds with two prompt leptons arise from diboson production ($WW$, $WZ$ and $ZZ$) accompanied by $b$-tagged jets, modelled using Alpgen + Herwig + Jimmy with CTEQ6L1 PDFs and with total cross-sections calculated using MCFM [50]; and $Z \to \tau\tau (\to e\mu)$+jets, modelled using Alpgen + Pythia6 with CTEQ6L1 PDFs, and including leading-order matrix elements for $Zb\bar{b}$ production. The normalisation of this background was determined from data using $Z \to ee/\mu\mu$ events with two $b$-tagged jets as described in Ref. [1]. The remaining background originates from events with one prompt and one misidentified lepton, e.g. a non-prompt lepton from the decay of a bottom or charm hadron, an electron from a photon conversion, hadronic jet activity misidentified as an electron, or a muon produced from an in-flight decay of a pion or kaon. Such events can arise from $t\bar{t}$ production with one hadronically decaying $W$, modelled as for dileptonic $t\bar{t}$ production with Powheg + Pythia6; $W$+jets production, modelled as for $Z$+jets; and $t$-channel single-top production, modelled using AcerMC + Pythia6 with CTEQ6L1 PDFs. Previous studies have shown that these simulation samples provide a good model of the rate and kinematic distributions of $e\mu b\bar{b}$ events with one real and one misidentified lepton [1]. The expected contributions to the additional-jet distributions from $t\bar{t}$ production in association with a $W$, $Z$ or Higgs boson are below the percent level. Other backgrounds, including processes with two misidentified leptons, are negligible.

Object and event selection
The two analyses use the same object and event selection as employed in the ATLAS inclusive $t\bar{t}$ cross-section analysis at $\sqrt{s} = 8$ TeV [1]. Electrons were identified as described in Ref. [51], required to have transverse energy $E_T > 25$ GeV and pseudorapidity $|\eta| < 2.47$, and to be isolated to reduce backgrounds from non-prompt and misidentified electrons. Electron candidates within the transition region between the barrel and endcap electromagnetic calorimeters, $1.37 < |\eta| < 1.52$, were removed. Muons were identified as described in Ref. [52], required to have $p_T > 25$ GeV and $|\eta| < 2.5$, and also required to be isolated.
Jets were reconstructed using the anti-$k_t$ algorithm [53,54] with radius parameter $R = 0.4$, starting from clusters of energy deposits in the calorimeters, calibrated using the local cluster weighting method [55]. Jets were calibrated using an energy- and $\eta$-dependent simulation-based scheme, with the effects of pile-up on the jet energy measurement being reduced using the jet-area method described in Ref. [56]. After the application of in situ corrections based on data [57], jets were required to satisfy $p_T > 25$ GeV and $|\eta| < 4.5$. To suppress the contribution from low-$p_T$ jets originating from pile-up interactions, a jet vertex fraction (JVF) requirement was applied to jets with $p_T < 50$ GeV and $|\eta| < 2.4$ [58]. Such jets were required to have at least 50 % of the scalar sum of the $p_T$ of tracks associated with the jet originating from tracks associated with the event primary vertex, the latter being defined as the reconstructed vertex with the highest sum of associated track $p_T^2$. Jets with no associated tracks were also selected. To prevent double-counting of electron energy deposits as jets, jets within $\Delta R = 0.2$ of a reconstructed electron were removed. Finally, to further suppress non-isolated leptons from heavy-flavour decays inside jets, electrons and muons within $\Delta R = 0.4$ of selected jets were also discarded.
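The jet requirements and overlap removal described above can be summarised in a minimal sketch. The dictionary layout (`pt`, `eta`, `phi`, `jvf`, `name`), the function names and the toy objects are assumptions made for illustration; `jvf` is set to `None` for jets with no associated tracks. This is not the ATLAS reconstruction software.

```python
import math

def delta_r(a, b):
    """Distance in (eta, phi) space, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)

def select_jets(jets, electrons):
    """Apply the pT, eta, JVF and electron-overlap requirements to reconstructed jets."""
    selected = []
    for jet in jets:
        if jet["pt"] <= 25.0 or abs(jet["eta"]) >= 4.5:
            continue
        # JVF requirement only for low-pT jets within the tracking acceptance;
        # jets with no associated tracks (jvf is None) are kept.
        if jet["pt"] < 50.0 and abs(jet["eta"]) < 2.4:
            if jet["jvf"] is not None and jet["jvf"] < 0.5:
                continue
        # Remove jets that are likely electron energy deposits.
        if any(delta_r(jet, el) < 0.2 for el in electrons):
            continue
        selected.append(jet)
    return selected

def remove_leptons_near_jets(leptons, jets, max_dr=0.4):
    """Discard electrons or muons within max_dr of any selected jet."""
    return [lep for lep in leptons
            if all(delta_r(lep, jet) >= max_dr for jet in jets)]

# Toy objects (values invented for illustration only).
toy_electrons = [{"eta": 0.1, "phi": -3.1}]
toy_jets = [
    {"name": "j1", "pt": 30.0, "eta": 0.0, "phi": 0.0, "jvf": 0.2},   # fails JVF
    {"name": "j2", "pt": 30.0, "eta": 0.0, "phi": 1.0, "jvf": None},  # no tracks, kept
    {"name": "j3", "pt": 60.0, "eta": 0.1, "phi": 3.1, "jvf": 0.9},   # overlaps electron
    {"name": "j4", "pt": 30.0, "eta": 3.0, "phi": 0.0, "jvf": 0.1},   # forward, no JVF cut
    {"name": "j5", "pt": 20.0, "eta": 0.5, "phi": 2.0, "jvf": 0.9},   # below pT threshold
]
kept = select_jets(toy_jets, toy_electrons)
print([jet["name"] for jet in kept])
```

The wrapping of the azimuthal difference matters for the third toy jet, which sits on the other side of the $\phi = \pm\pi$ boundary from the electron but is only about 0.08 away in $\Delta R$.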
Jets containing b-hadrons were tagged using the MV1 algorithm, a multivariate discriminant making use of track impact parameters and reconstructed secondary vertices [59]. Jets were defined to be b-tagged if the MV1 discriminant value was larger than a threshold corresponding to a 70 % efficiency for tagging b-quark jets in tt events, giving a rejection factor of about 140 against light-quark and gluon jets, and about five against jets originating from charm quarks.
Events were required to have a reconstructed primary vertex with at least five associated tracks. Events with any jets failing jet quality requirements [57], or with any muons compatible with cosmic-ray interactions or suffering substantial energy loss through bremsstrahlung in the detector material, were removed. An event preselection was then applied, requiring exactly one electron and one muon selected as described above, with opposite-sign electric charges. At least one of the leptons was required to be matched to an electron or muon object triggering the event. Finally, selected events were required to have at least two $b$-tagged jets. The resulting $e\mu b\bar{b}$ event selection is similar to that of the $\sqrt{s} = 8$ TeV sample with two $b$-tagged jets used in Ref. [1], except that events with three or more $b$-tagged jets are also accepted.$^{2}$ The numbers of preselected opposite-sign $e\mu$ and selected $e\mu b\bar{b}$ events are shown in Table 1. The observed event count after requiring at least two $b$-tagged jets is in good agreement with the prediction from the baseline simulation.
Table 1: Selected numbers of events with an opposite-sign $e\mu$ pair, and with an opposite-sign $e\mu$ pair and at least two $b$-tagged jets in data, compared with the predictions from the baseline simulation, broken down into contributions from $t\bar{t}$, $Wt$ and minor background processes. The predictions are normalised to the same integrated luminosity as the data.

Additional jets were defined as those other than the two $b$-tagged jets used to select the event. For the jet normalised differential cross-section measurements, in the 3 % of selected events with three or more $b$-tagged jets, the jets with the two highest MV1 $b$-tagging weight values were taken to be the $b$-jets from the top quark decays, and any other $b$-tagged jets were considered as additional jets, along with all untagged jets. Distributions of the number of additional jets are shown for various jet $p_T$ thresholds in Figure 1. The $p_T$ distributions for reconstructed additional jets are shown in Figure 2, with the estimated contribution from 'unmatched jets' (defined in Section 4.2 below) shown separately. In both cases, the data are shown compared to the predictions from simulation with the baseline Powheg + Pythia6 ($h_{\mathrm{damp}} = \infty$) $t\bar{t}$ sample plus backgrounds, and the predictions from alternative $t\bar{t}$ simulation samples generated with Powheg + Pythia6 and Powheg + Pythia8 with $h_{\mathrm{damp}} = m_t$, Powheg + Herwig with $h_{\mathrm{damp}} = \infty$ and MC@NLO + Herwig. The jet multiplicity distributions and $p_T$ spectra in the simulation samples are generally in reasonable agreement with those from data, except for MC@NLO + Herwig, which underestimates the number of events with three or more extra jets, and also predicts significantly softer jet $p_T$ spectra.
The gap fraction measurements use the same basic eµbb event selection, but restricting the additional jets to the central rapidity region, |y| < 2.1. If three or more jets were b-tagged, the two highest-p T jets were considered as the b-jets from the top quark decays, and the others as additional jets. This definition follows the p T -ordered selection used at particle level, and is different from that used in the differential cross-section analysis, as discussed in Sections 4.1 and 4.2 below. Distributions of the p T and |y| of the leading additional jet according to this definition are shown in Figure 3. The predictions generally describe the data well, and the trends seen are similar to those seen for the leading jet over the full rapidity region in Figure 2(a).

Particle-level selection
To facilitate comparisons with theoretical predictions, the measured jet differential cross-sections and gap fractions were corrected to correspond to the particle level in simulation, thus removing reconstruction efficiency and resolution effects. At particle level, electrons and muons were defined as those originating from $W$ decays, including via the leptonic decay of a $\tau$ lepton ($W \to \tau \to e/\mu$). The electron and muon four-momenta were defined after final-state radiation, and 'dressed' by adding the four-momenta of all photons within a cone of size $\Delta R = 0.1$ around the lepton direction, excluding photons from hadron decays or interactions with detector material. Jets were reconstructed using the anti-$k_t$ algorithm with radius parameter $R = 0.4$ from all final-state particles with mean lifetime greater than $3 \times 10^{-11}$ s, excluding dressed leptons and neutrinos not originating from the decays of hadrons. Particles from the underlying event were included, but those from overlaid pile-up collisions were not. Selected jets were required to have $p_T > 25$ GeV and $|\eta| < 4.5$, and those within $\Delta R = 0.2$ of a particle-level electron were removed. Particle-level jets containing $b$-hadrons were identified using a ghost-matching procedure [60], where the four-momenta of $b$-hadrons were scaled to a negligible magnitude and included in the set of particles on which the jet clustering algorithm was run. Jets whose constituents included $b$-hadrons after this procedure were labelled as $b$-jets.
The particle-level eµbb event selection was defined by requiring one electron and one muon with p T > 25 GeV and |η| < 2.5, each separated from the nearest jet by ∆R > 0.4, and at least two b-jets with p T > 25 GeV and |η| < 2.5. This closely matches the event selection used at reconstruction level.
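As an illustration, the particle-level fiducial requirements could be expressed as a single predicate. The sketch below assumes that 'one electron and one muon' means exactly one of each passing the cuts, and that the input jets have already passed the particle-level jet selection ($p_T > 25$ GeV, $|\eta| < 4.5$, electron overlap removal). The dictionary keys and function names are hypothetical.

```python
import math

def delta_r(a, b):
    """Distance in (eta, phi) space, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)

def passes_fiducial(electrons, muons, jets):
    """Particle-level e-mu-b-bbar selection under the assumptions stated above."""
    def good_lepton(lep):
        return (lep["pt"] > 25.0 and abs(lep["eta"]) < 2.5
                and all(delta_r(lep, jet) > 0.4 for jet in jets))

    n_electrons = sum(1 for el in electrons if good_lepton(el))
    n_muons = sum(1 for mu in muons if good_lepton(mu))
    n_bjets = sum(1 for jet in jets
                  if jet["is_b"] and jet["pt"] > 25.0 and abs(jet["eta"]) < 2.5)
    return n_electrons == 1 and n_muons == 1 and n_bjets >= 2

# Toy event (values invented for illustration only).
toy_electrons = [{"pt": 40.0, "eta": 0.3, "phi": 0.0}]
toy_muons = [{"pt": 30.0, "eta": -1.0, "phi": 2.0}]
toy_jets = [
    {"pt": 60.0, "eta": 0.5, "phi": 1.5, "is_b": True},
    {"pt": 35.0, "eta": -0.2, "phi": -2.0, "is_b": True},
    {"pt": 28.0, "eta": 3.5, "phi": 0.5, "is_b": False},  # forward additional jet
]
print(passes_fiducial(toy_electrons, toy_muons, toy_jets))
```

Note that the forward jet in the toy event does not affect the decision: additional jets enter the measured distributions but not the event selection itself.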

Jet matching
For the definition of the gap fraction at particle level, if three or more $b$-jets were found, the two highest-$p_T$ jets were considered to be the $b$-jets from the top decays, and all other jets were considered to be additional jets, whether labelled $b$-jets or not. In contrast, the differential jet cross-section measurements require an explicit jet-by-jet matching of particle-level to reconstructed jets. This was achieved by first calculating the $\Delta R$ between each particle-level jet passing a looser requirement of $p_T > 10$ GeV and each reconstructed $b$-tagged jet, considering the two with highest MV1 weight if more than two reconstructed jets were $b$-tagged. Ordering the $b$-tagged jets by MV1 weight was found to give a greater fraction of correct matches than the jet $p_T$ ordering used for the gap fraction measurements, where no jet matching is needed. If the closest reconstructed $b$-tagged jet was within $\Delta R < 0.4$, the particle-level and reconstructed jets were considered matched. The procedure was then repeated with the remaining particle-level and reconstructed jets, allowing each particle-level and reconstructed jet to be matched only once. Reconstructed jets which remained unassociated with particle-level jets after this procedure are referred to as 'unmatched' jets; these originate from single particle-level jets which are split in two at reconstruction level (only one of which is matched), and from pile-up (since particles from pile-up collisions are not considered in the particle-level jet clustering). The contributions from such unmatched jets are shown separately in Figure 2.
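One possible reading of this two-stage matching is sketched below. It assumes a greedy assignment in which candidate pairs are accepted in order of increasing $\Delta R$, first against the two chosen $b$-tagged reconstructed jets and then against the remaining reconstructed jets. The greedy ordering within each stage, the data layout and the function names are assumptions for illustration and may differ from the procedure actually used.

```python
import math

def delta_r(a, b):
    """Distance in (eta, phi) space, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)

def _greedy_stage(particle, reco, free_particle, reco_indices, max_dr):
    """Accept closest pairs first; each jet is used at most once."""
    candidates = sorted((delta_r(particle[p], reco[r]), p, r)
                        for p in free_particle for r in reco_indices)
    pairs, used_reco = [], set()
    for dr, p, r in candidates:
        if dr >= max_dr:
            break  # candidates are sorted, so no closer pair remains
        if p in free_particle and r not in used_reco:
            pairs.append((p, r))
            free_particle.remove(p)
            used_reco.add(r)
    return pairs

def match_jets(particle, reco, btag_indices, max_dr=0.4, min_particle_pt=10.0):
    """Return (matched (particle, reco) index pairs, unmatched reco indices)."""
    free_particle = {p for p, jet in enumerate(particle)
                     if jet["pt"] > min_particle_pt}
    pairs = _greedy_stage(particle, reco, free_particle, list(btag_indices), max_dr)
    others = [r for r in range(len(reco)) if r not in btag_indices]
    pairs += _greedy_stage(particle, reco, free_particle, others, max_dr)
    matched_reco = {r for _, r in pairs}
    unmatched = [r for r in range(len(reco)) if r not in matched_reco]
    return pairs, unmatched

# Toy jets (values invented for illustration only).
toy_particle = [
    {"pt": 80.0, "eta": 0.0, "phi": 0.0},
    {"pt": 60.0, "eta": 1.0, "phi": 2.0},
    {"pt": 30.0, "eta": -1.0, "phi": -2.0},
    {"pt": 8.0, "eta": 2.0, "phi": 1.0},  # below the 10 GeV matching threshold
]
toy_reco = [
    {"eta": 0.05, "phi": 0.0},   # b-tagged
    {"eta": 1.0, "phi": 2.1},    # b-tagged
    {"eta": -1.1, "phi": -2.0},
    {"eta": 2.0, "phi": 1.0},    # only a too-soft particle-level jet nearby
    {"eta": 3.0, "phi": 0.0},    # e.g. a pile-up jet
]
toy_pairs, toy_unmatched = match_jets(toy_particle, toy_reco, btag_indices=[0, 1])
print(toy_pairs, toy_unmatched)
```

In the toy example the last two reconstructed jets end up in the 'unmatched' category, one because its only nearby particle-level jet is too soft and one because nothing lies within $\Delta R < 0.4$.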

Evaluation of systematic uncertainties
Monte Carlo simulation was used to determine selection efficiencies, detector resolution effects and backgrounds. The corresponding systematic uncertainties were evaluated as discussed in detail below, and propagated through the jet differential cross-section and gap fraction measurements.
$t\bar{t}$ modelling: Although the analyses measure the properties of additional jets in $t\bar{t}$ events, they are still slightly sensitive to the modelling of such jets in simulation due to the finite jet energy resolution and reconstruction efficiency, as well as the modelling of other $t\bar{t}$ event properties related to the leptons and $b$-jets from the top quark decays. The corresponding uncertainties were assessed by comparing samples from the different generator configurations described in Section 3. In the differential cross-section measurement, which is sensitive to the modelling of multiple additional jets, the uncertainty due to the choice of matrix-element generator was determined by comparing the NLO generator Powheg with the leading-order multi-leg generator MadGraph, both interfaced to Pythia6. In the gap fraction measurements, which are more sensitive to an accurate modelling of the first additional jet, the corresponding uncertainty was assessed by comparing the NLO generators Powheg and MC@NLO, both interfaced to Herwig. The choice of parton shower and hadronisation model was studied for both analyses by comparing samples with Powheg interfaced either to Pythia6 or to Herwig. In all these cases, the full difference between the predictions from the two compared samples was assigned as the corresponding systematic uncertainty. The uncertainty due to the modelling of additional radiation was calculated as half the difference between the results using MadGraph + Pythia6 (differential cross-section) or Alpgen + Pythia6 (gap fraction) samples with tunes giving more or less parton shower radiation, spanning the results from the $\sqrt{s} = 7$ TeV gap fraction measurement [2]. These three systematic components were added in quadrature to give the total $t\bar{t}$ modelling uncertainty.
Simulation statistical uncertainty: In addition to the modelling uncertainties discussed above, the size of the tt simulation samples was also taken into account.
Parton distribution functions: The uncertainties due to limited knowledge of the proton PDFs were evaluated by reweighting the MC@NLO + Herwig simulated tt sample based on the x and Q 2 values of the partons participating in the hard scattering in each event. The samples were reweighted using the eigenvector variations of the CT10 [23], MSTW2008 [18] and NNPDF 2.3 [36] NLO PDF sets. The final uncertainty was calculated as half the envelope encompassing the predictions from all three PDF sets along with their associated uncertainties, following the PDF4LHC recommendations [33].
Jet energy scale: The uncertainty due to the jet energy scale (JES) was evaluated by varying it in simulation using a model with 23 separate orthogonal uncertainty components [57]. These components cover in situ measurement uncertainties, the cross-calibration of different η regions, and the dependence on pile-up and the flavour of the jets. The total jet energy scale uncertainty varies in the range 1-6 % with a dependence on both jet p T and |η|.
Jet energy resolution/efficiency: The jet energy resolution (JER) was found to be well-modelled in simulation [61], and residual uncertainties were assessed by applying additional smearing to the simulated jet energies. The calorimeter jet reconstruction efficiency was measured in data using track-based jets, and found to be generally well-described by the simulation. Residual uncertainties were assessed by discarding 2 % of jets with $p_T < 30$ GeV; the uncertainties for higher-momentum jets are negligible. Both these uncertainties were symmetrised about the nominal value. The uncertainty due to the veto on events failing jet quality requirements is negligible.
Unmatched jets modelling: The modelling of the component of unmatched jets from pile-up collisions was checked by comparing the predictions from simulated tt events combined with either Powheg+Pythia8 pile-up simulation or 'zero-bias' data. The latter were recorded from randomly triggered bunch crossings throughout the data-taking period, and reweighted to match the instantaneous luminosity distribution in the simulated tt sample. The estimated number of additional jets per event from pile-up is 0.017 ± 0.002 in the central region used by the gap fraction measurements (|y| < 2.1) and 0.038 ± 0.005 over the full region used by the differential cross-section measurements (|η| < 4.5). The uncertainties represent the full difference between the rate in zero-bias data and simulation. The rate of unmatched jets in simulation was varied by these uncertainties in order to determine the effect on the results. In the differential cross-section measurements, the full rate of particle-level jets that were split in two at reconstruction level in the baseline simulation was taken as an additional uncertainty on the rate of unmatched jets.
Jet vertex fraction: In both measurements, the contribution of jets from pile-up within $|\eta| < 2.4$ was reduced by the JVF requirement described in Section 4. The uncertainties in the efficiency of the JVF requirement for non-pile-up jets were assessed by varying the cut value in simulation, based on studies of $Z \to ee$ and $Z \to \mu\mu$ events [56].
Other detector uncertainties: The modelling of the electron and muon trigger and identification efficiencies, energy scales and resolutions were studied using Z → ee/µµ, J/ψ → ee/µµ and W → eν events in data and simulation, using the techniques described in Refs. [51,62,63]. The uncertainties in the efficiencies for b-tagging b, c and light-flavour jets were assessed using studies of b-jets containing muons, jets containing D * mesons, and inclusive jet events [59]. The resulting uncertainties in the measured normalised differential jet distributions and gap fractions are very small, since these uncertainties typically affect the numerators and denominators in a similar way.
Backgrounds: As shown in Table 1, the most significant background comes from $Wt$ single-top events. The uncertainty due to this background was assessed by conservatively doubling and removing the estimated $Wt$ contribution, taking half the difference in the result between these extreme variations. The sensitivity to the modelling of $Wt$ single-top events was also assessed by using a sample simulated with Powheg + Pythia6 using the diagram subtraction scheme [47,48] instead of the baseline diagram removal scheme. The uncertainty due to $Z$+jets and diboson background is negligible in comparison. In the gap fraction measurements, the additional background uncertainty from events with a misidentified lepton was also assessed by doubling and removing it, a conservative range according to the studies of Ref. [1]. In the jet differential cross-section measurements, the misidentification of jets as leptons induces migration in the additional-jet rank distributions, and is corrected for as part of the unfolding procedure. The resulting effects on the unfolding corrections are significantly smaller than the uncertainties from considering different $t\bar{t}$ generators, and no additional uncertainty was included.
Each independent uncertainty was evaluated according to the prescription above and then added in quadrature to obtain the total systematic uncertainty in the final measurements. Since both measurements are effectively ratios of cross-sections, normalised to the total number of selected eµbb events, many of the systematic uncertainties that typically contribute to a tt cross-section measurement cancel, such as those in the integrated luminosity, lepton trigger and identification efficiencies, lepton momentum scales and resolution, and b-jet energy scale and tagging efficiency. Instead, the significant systematic uncertainties are those that directly affect the measured additional-jet activity, i.e. systematic uncertainties in the jet energy scale and resolution, and the modelling of unmatched jets.
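The combination rules stated in this section (the full difference between two compared samples, half the difference between two extreme variations, and addition in quadrature) can be illustrated with a minimal sketch. The function names and all numerical inputs are hypothetical and are not taken from the measurement.

```python
import math

def full_difference(nominal, alternative):
    """Uncertainty taken as the full difference between two predictions."""
    return abs(alternative - nominal)

def half_difference(variation_a, variation_b):
    """Uncertainty taken as half the difference between two extreme variations."""
    return 0.5 * abs(variation_a - variation_b)

def add_in_quadrature(components):
    """Combine independent uncertainty components."""
    return math.sqrt(sum(c * c for c in components))

# Toy values of one measured quantity (invented for illustration only).
nominal = 0.100
generator = full_difference(nominal, 0.103)  # alternative matrix-element generator
shower = full_difference(nominal, 0.096)     # alternative parton shower
radiation = half_difference(0.112, 0.088)    # more vs. less radiation tunes
modelling = add_in_quadrature([generator, shower, radiation])
print(round(modelling, 4))
```

The same `half_difference` rule applies to the background variations, where the two inputs would be the results obtained with the $Wt$ contribution doubled and removed.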

Measurement of jet multiplicities and $p_T$ spectra
The normalised differential cross-sections for additional jets, corrected to the particle level, were measured as a function of jet multiplicity and $p_T$ as defined in Equation (1). The fiducial requirements for event and object selection are defined in Section 4.1, and include additional jets in the range $|\eta| < 4.5$. As discussed in Section 3, the fiducial region receives contributions from both the $t\bar{t}$ and $Wt$ processes. Although the requirement for two $b$-tagged jets ensures that $t\bar{t}$ is dominant, once the $Wt$ process is considered at NLO, the two processes cannot in principle be cleanly separated. Therefore the results are presented both with the $Wt$ contribution subtracted, to allow comparison with the $t\bar{t}$ generators discussed in Section 3, and for the combined $t\bar{t} + Wt$ final state, which may be compared with future NLO calculations treating $t\bar{t}$ and $Wt$ concurrently. In practice, since the results are normalised to the number of selected $e\mu b\bar{b}$ events, from $t\bar{t}$ or $t\bar{t} + Wt$ as appropriate in each case, and the predicted additional-jet distributions in simulated $t\bar{t}$ and $Wt$ events are rather similar, the results from the two definitions are very close.

Correction to particle level
The correction procedure transforms the measured spectra shown in Figure 2, after background subtraction, to the particle-level spectra for events that pass the fiducial requirements. The unfolding was performed using a one-dimensional distribution encoding both the rank and p T of each additional jet in each selected eµbb event, as shown in Table 2 and graphically in Figure 4. The integral of the input (measured) distribution is the number of measured jets in the eµbb sample and the integral of the output (unfolded) distribution is the number of particle-level jets passing the fiducial requirements. This procedure involves several steps, as defined in the equation:

$$\frac{1}{\sigma_{e\mu b\bar{b}}}\,\frac{\mathrm{d}\sigma^{\mathrm{jet}}_{i}}{\mathrm{d}p_T} = \frac{1}{N_{\mathrm{events}}}\,\frac{1}{\Delta^{k}}\, f^{k} \sum_{j} \left(M^{-1}\right)^{\mathrm{unfolded},\,k}_{\mathrm{reco},\,j}\, g^{j} \left(N^{j}_{\mathrm{reco}} - N^{j}_{\mathrm{bkgd}}\right) \qquad (4)$$

Here, the bin indices j and k are functions of both jet p T and rank, with k corresponding to the appropriate p T bin of the jet of rank i at particle level under consideration. The expression $\frac{1}{\sigma_{e\mu b\bar{b}}}\frac{\mathrm{d}\sigma^{\mathrm{jet}}_{i}}{\mathrm{d}p_T}$ represents the measured differential cross-section, i.e. the final number of corrected jets per event in each bin divided by $\Delta^{k}$, the width of the p T bin in units of GeV. The number of events in data passing the eµbb selection requirements is represented by $N_{\mathrm{events}}$. The raw data event count reconstructed in bin j is represented by $N^{j}_{\mathrm{reco}}$. The estimated additional-jet background, $N^{j}_{\mathrm{bkgd}}$, is subtracted from this raw distribution. The factor $g^{j}$ corrects for migration across the fiducial boundaries in p T and η (e.g. cases where the reconstructed jet has p T > 25 GeV but the particle-level jet has p T < 25 GeV). The expression $\left(M^{-1}\right)^{\mathrm{unfolded},\,k}_{\mathrm{reco},\,j}$ represents the application of an unfolding procedure mapping the number of jets reconstructed in bin j to the number of jets in bin k at particle level in events which pass both the reconstruction- and particle-level selections. The correction factor $f^{k}$ removes the bias in the unfolded additional-jet spectrum coming from the reconstruction-level selection, as discussed further below.
The response matrix $M^{\mathrm{unfolded},\,k}_{\mathrm{reco},\,j}$ encodes the fractions of jets in particle-level bin k which get reconstructed in bin j, with both k and j being obtained from the corresponding jet p T and rank. The matrix is filled from simulated events that pass both the reconstruction-level and particle-level selection requirements. Figure 4(a) provides a graphical representation of $M^{\mathrm{unfolded},\,k}_{\mathrm{reco},\,j}$. The matrix is largely diagonal, showing that jets are most likely to be reconstructed with the correct p T and rank. However, there are significant numbers of particle-level subleading jets reconstructed as leading jets and particle-level leading jets reconstructed as subleading jets, particularly when several jets in the event have similar low p T values. This type of migration motivates the simultaneous binning in both rank and p T .
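The simultaneous binning in rank and p T can be illustrated with a short sketch. It flattens (rank, p T) into a single bin index and fills a response matrix from matched particle-level and reconstructed jets. The bin edges, function names and the input format are hypothetical choices made for illustration; they are not the binning of Table 2 or the analysis code.

```python
import numpy as np

# Hypothetical pT bin edges in GeV for each jet rank (illustrative only).
PT_EDGES = {
    1: [25.0, 50.0, 100.0, 250.0],
    2: [25.0, 50.0, 150.0],
}

def build_offsets(pt_edges):
    """Map each rank to the index of its first bin on the flattened 1D axis."""
    offsets, total = {}, 0
    for rank in sorted(pt_edges):
        offsets[rank] = total
        total += len(pt_edges[rank]) - 1
    return offsets, total

def flat_bin(rank, pt, pt_edges, offsets):
    """Flattened bin index for a jet, or None if it falls outside the binning.
    The last bin of each rank absorbs overflow."""
    if rank not in pt_edges:
        return None
    edges = pt_edges[rank]
    if pt < edges[0]:
        return None
    local = min(int(np.searchsorted(edges, pt, side="right")) - 1, len(edges) - 2)
    return offsets[rank] + local

def fill_response(pairs, pt_edges):
    """pairs: iterable of ((truth_rank, truth_pt), (reco_rank, reco_pt)).
    Returns R[j, k], the fraction of jets in truth bin k reconstructed in bin j."""
    offsets, n_bins = build_offsets(pt_edges)
    counts = np.zeros((n_bins, n_bins))
    for (t_rank, t_pt), (r_rank, r_pt) in pairs:
        k = flat_bin(t_rank, t_pt, pt_edges, offsets)
        j = flat_bin(r_rank, r_pt, pt_edges, offsets)
        if j is not None and k is not None:
            counts[j, k] += 1.0
    col = counts.sum(axis=0, keepdims=True)
    # Normalise each truth column; empty columns stay at zero.
    return np.divide(counts, col, out=np.zeros_like(counts), where=col > 0)
```

A particle-level leading jet that is reconstructed as the subleading jet then appears as an off-diagonal entry linking the rank-1 and rank-2 blocks of the matrix, which is the migration described above.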

A Bayesian iterative unfolding method [64] implemented in the RooUnfold [65] software package was used. The response matrix M is not unitary because in mapping from particle to reconstruction level, some events and objects are lost due to inefficiencies and some are gained due to misreconstruction or migration of objects from outside the fiducial acceptance into the reconstructed distribution. This results in the response matrix being almost singular, and it is therefore not possible to obtain stable unfolded results by inverting the response matrix and applying it to the measured data. Instead, an assumed particle-level distribution (the 'prior') was chosen, the response matrix applied and the resulting trial reconstruction set was compared to the observed reconstruction set. A new prior was then constructed from the old prior and the difference between the trial and the observed distributions. The procedure was iterated until the result became stable. For this analysis, two iterations were found to be sufficient, based on studies of the unfolding performance in simulated samples with reweighted jet p T distributions and from different generators.
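The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration of the general iterative Bayesian update, not the RooUnfold implementation used in the analysis; the function name, argument layout and the toy numbers in the usage note are assumptions.

```python
import numpy as np

def iterative_bayes_unfold(response, measured, prior, n_iter=2):
    """Iterative Bayesian unfolding sketch.

    response[j, k] = P(reconstructed in bin j | particle-level bin k);
    columns may sum to less than one because of inefficiencies.
    measured: background-subtracted reconstructed counts.
    prior: assumed particle-level distribution used as the starting point.
    """
    response = np.asarray(response, dtype=float)
    measured = np.asarray(measured, dtype=float)
    truth = np.asarray(prior, dtype=float).copy()
    efficiency = response.sum(axis=0)  # P(reconstructed in any bin | truth bin k)
    for _ in range(n_iter):
        folded = response @ truth      # trial reconstruction-level distribution
        ratio = np.divide(measured, folded,
                          out=np.zeros_like(measured), where=folded > 0)
        # Bayes' theorem gives P(truth k | reco j) = R[j, k] * truth[k] / folded[j];
        # summing over j and dividing by the efficiency yields the new prior.
        update = response.T @ ratio
        truth = truth * np.divide(update, efficiency,
                                  out=np.zeros_like(update), where=efficiency > 0)
    return truth
```

For example, with a hypothetical 2×2 response `[[0.8, 0.2], [0.1, 0.7]]` and measured counts obtained by folding a true spectrum of `[100, 50]`, starting from a flat prior moves the estimate towards the true spectrum with each iteration. Stopping after a small number of iterations acts as a regularisation, which is why the number of iterations has to be validated on simulated samples as described above.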
This unfolding procedure gives unbiased additional-jet distributions for events passing both the particle-level and reconstruction-level event selections. However, the reconstruction-level selection results in the unfolded distributions differing from those obtained using the particle-level selection alone. An additional contribution to the bias results from events where one of the two reconstructed b-tagged jets is actually a mistagged light jet. These biases were corrected using a bin-by-bin correction factor $f^{k} = N^{k}_{\mathrm{truth}} / N^{k}_{\mathrm{unfolded}}$, where $N^{k}_{\mathrm{truth}}$ is the number of jets in bin k at particle level without the application of the reconstruction-level event selection. The correction was applied after the unfolding, as shown in Equation (4). Figure 4(b) shows the values of f for both the baseline and some alternative tt generators. The corresponding systematic uncertainty was assessed as part of the tt modelling uncertainty as discussed in Section 5.
The procedure described above provides the absolute numbers of additional jets in the number of events passing the eµbb fiducial requirements ($N_{\mathrm{events}}$). This result was then normalised relative to $N_{\mathrm{events}}$ to obtain the final distribution $\frac{1}{\sigma_{e\mu b\bar{b}}}\frac{\mathrm{d}\sigma^{\mathrm{jet}}_{i}}{\mathrm{d}p_T}$, which was finally integrated over jet p T to obtain the jet multiplicity distributions.
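The normalisation and the integration over p T can be sketched as follows. Since each event contains at most one jet of a given rank, the integral of the normalised rank-i spectrum is the fraction of events with at least i additional jets above the lowest p T threshold. The function names and the numbers in the usage note are illustrative assumptions, not values from the measurement.

```python
import numpy as np

def normalised_spectrum(n_jets, bin_widths, n_events):
    """Jets per event per GeV in each pT bin: N_k / (N_events * Delta_k)."""
    return np.asarray(n_jets, dtype=float) / (
        n_events * np.asarray(bin_widths, dtype=float))

def inclusive_fraction(spectrum, bin_widths):
    """Integrate one rank's normalised spectrum over pT.
    The result is the fraction of events with at least that many jets."""
    return float(np.sum(np.asarray(spectrum, dtype=float)
                        * np.asarray(bin_widths, dtype=float)))

def exclusive_fractions(inclusive):
    """Convert fractions with >= 1, >= 2, ..., >= n jets into fractions with
    exactly 0, 1, ..., n-1 jets, followed by the fraction with >= n jets."""
    inc = np.concatenate(([1.0], np.asarray(inclusive, dtype=float)))
    return np.append(inc[:-1] - inc[1:], inc[-1])
```

For instance, with 1000 selected events, hypothetical unfolded leading-jet counts of 300, 150 and 50 in three bins give an inclusive fraction of 0.5, meaning half of the events have at least one additional jet.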

Determination of systematic uncertainties
All systematic uncertainties were evaluated as full covariance matrices including bin-to-bin correlations. The majority of uncertainties from Section 5 are defined in terms of an RMS width, with the assumption that the true distribution is Gaussian with a mean at the nominal value. In these cases, the covariance matrix was calculated from pseudo-experiments drawn from this distribution. Each pseudo-experiment was constructed by choosing the size of the systematic uncertainty randomly according to a Gaussian distribution, calculating the resulting effect at the reconstruction level and propagating it through the unfolding procedure. The covariance was then given by

$$C_{ij} = \frac{1}{N_{\mathrm{pseudo}}} \sum_{x=1}^{N_{\mathrm{pseudo}}} \left(N^{i}_{x} - N^{i}\right)\left(N^{j}_{x} - N^{j}\right) \qquad (5)$$

where $N_{\mathrm{pseudo}}$ is the number of pseudo-experiments (typically 1000), $N^{i}$ is the nominal number of jets in bin i, and $N^{i}_{x}$ is the number of jets in bin i for pseudo-experiment x. Some systematic uncertainties were evaluated by comparing an alternative model to the baseline. In these cases, the covariance was approximated by

$$C_{ij} = \delta_{i}\,\delta_{j} \qquad (6)$$

where $\delta_{i}$ is the bias in bin i. This bias was determined by analysing the alternative model using Equation (4), with the response matrix and correction factors taken from the baseline.
The uncertainties calculated using Equation (5) include all detector modelling effects (e.g. jet energy scale and resolution), PDFs, the Wt cross-section and statistical uncertainties associated with the simulated samples. Uncertainties evaluated using Equation (6) include generator, radiation, parton shower and hadronisation contributions to the tt modelling uncertainty, and modelling of the unmatched jet background. Figure 5 shows the fractional uncertainties in the corrected jet distributions. In most bins, the statistical uncertainty dominates, with the largest systematic uncertainty coming from the jet energy scale.
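The two covariance constructions can be sketched as below. The sketch assumes that the pseudo-experiment covariance is the mean product of deviations from the nominal result, and that a model-comparison covariance is the outer product of the per-bin bias, as described in the text for Equations (5) and (6). Function names and array layouts are hypothetical.

```python
import numpy as np

def covariance_from_pseudo(nominal, pseudo):
    """Covariance from pseudo-experiments.
    nominal: shape (n_bins,), the nominal unfolded result.
    pseudo: shape (n_pseudo, n_bins), one unfolded result per pseudo-experiment."""
    dev = np.asarray(pseudo, dtype=float) - np.asarray(nominal, dtype=float)
    return dev.T @ dev / dev.shape[0]

def covariance_from_bias(delta):
    """Covariance for a model-comparison uncertainty with per-bin bias delta.
    The outer product makes all bins fully (anti)correlated."""
    delta = np.asarray(delta, dtype=float)
    return np.outer(delta, delta)

def total_covariance(*matrices):
    """Sum covariances of independent sources, which is the matrix form of
    adding uncertainties in quadrature."""
    return np.sum(matrices, axis=0)
```

The diagonal of the summed matrix gives the squared total uncertainty in each bin, while the off-diagonal terms carry the bin-to-bin correlations needed for a χ 2 comparison.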
6.3 Jet multiplicity and p T spectra results

Figures 8-9 show the normalised differential cross-sections $\frac{1}{\sigma_{e\mu b\bar{b}}}\frac{\mathrm{d}\sigma^{\mathrm{jet}}_{i}}{\mathrm{d}p_T}$ for jets of rank i from one to four. In both cases, the expected contributions from Wt events were subtracted from the event counts before normalising the distributions, based on the baseline Powheg+Pythia6 Wt simulation sample. The same data are presented numerically in Table 2, both with and without subtraction of the Wt contribution, and including two p T bins for the fifth jet. The highest p T bin for each jet rank includes overflows, but the differential cross-sections are normalised using the bin widths ∆ derived from the upper p T bin limits listed in Table 2 and shown in Figures 8 and 9.

Table 2: Normalised particle-level differential jet cross-sections as a function of jet rank and p T , both without (σ tt+Wt ) and with (σ tt ) the Wt contribution subtracted. The additional jets are required to have |η| < 4.5, corresponding to the full pseudorapidity range. The boundaries of each bin are given, together with the mean jet p T in each bin. The last bin for every jet rank includes overflows, but the differential cross-section values are determined using the upper bin limit given for that bin.

All the NLO generators provide a reasonable description of the leading jet, which might be expected since they include one additional jet in the matrix-element calculation of the tt process. Differences among the generators become larger with increasing jet rank, where the prediction from the NLO generators is determined mainly by the parton shower. In this region, the generators predict significantly different rates of additional-jet production. They also predict some differences in the shapes of the jet p T spectra. The MC@NLO + Herwig sample predicts the lowest rate of additional-jet production and underestimates the number of events with at least four additional jets by 40 %.
The same fully corrected data are compared to the leading-order multi-leg generators Alpgen + Pythia6, Alpgen + Herwig and MadGraph + Pythia6 in the second set of ratio plots in Figures 6-9. In all cases, the renormalisation and factorisation scales are set to the defaults provided by the code authors. For leading-order generators, the predicted cross-section can depend strongly on the choice of QCD scales and parton shower parameters; Figures 6-9 also show the effects of the variations discussed in Section 3 for samples generated with AcerMC + Pythia6, Alpgen + Pythia6 and MadGraph + Pythia6. The measurement gives an uncertainty in the differential cross-sections that is smaller than the range spanned by these variations in the leading-order generators.
Figure 8: Unfolded normalised distributions of particle-level jet p T for the first and second additional jet in selected eµbb events. The data are shown as points with error bars indicating the statistical uncertainty, and are compared to simulation from several NLO tt generator configurations. The Wt contribution taken from Powheg + Pythia6 is subtracted from the data. The lower plots show the ratios of the different simulation predictions to data, with the shaded bands including both the statistical and systematic uncertainties of the data.

Figure 9: Unfolded normalised distributions of particle-level jet p T for the third and fourth additional jet in selected eµbb events. The data are shown as points with error bars indicating the statistical uncertainty, and are compared to simulation from several NLO tt generator configurations. The Wt contribution taken from Powheg + Pythia6 is subtracted from the data. The lower plots show the ratios of the different simulation predictions to data, with the shaded bands including both the statistical and systematic uncertainties of the data.

The level of agreement between the generator predictions and the data was assessed quantitatively using a χ 2 test taking into account all bins of the measured jet p T distributions with rank one to five. Since the systematic uncertainties and unfolding corrections induce large correlations between bins, the χ 2 was calculated from the full covariance matrix. Table 3 presents the resulting χ 2 values. Among the NLO generators, Powheg+Herwig, and Powheg + Pythia6 with h damp = ∞ or m t , agree reasonably well with the data. Powheg+Pythia8 is disfavoured and MC@NLO+Herwig gives a very poor description of the data. The leading-order multi-leg generators Alpgen + Pythia6 and MadGraph+Pythia6 agree reasonably well with data, whilst Alpgen+Herwig is slightly disfavoured.
Of the three variations of Alpgen + Pythia6, the 'RadLo' variation with less radiation agrees best with data, suggesting that the scale used in the baseline ATLAS tune predicts too much radiation in the fiducial region of this measurement. For MadGraph+Pythia6 the opposite is true, and the 'q 2 down' tune, which corresponds to more radiation than the baseline tune, agrees best with data. The AcerMC + Pythia6 samples do not reproduce the data well, regardless of parameter choice.
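A χ 2 comparison using the full covariance matrix, of the kind used for the generator comparisons in this section, can be sketched as follows. The function name and the toy inputs in the usage note are assumptions for illustration; the sketch solves the linear system rather than inverting the covariance matrix explicitly, which is numerically more stable.

```python
import numpy as np

def chi2_full_covariance(data, prediction, covariance):
    """chi2 = r^T C^{-1} r, with r = data - prediction and C the full
    covariance matrix (statistical plus systematic) of the data."""
    r = np.asarray(data, dtype=float) - np.asarray(prediction, dtype=float)
    cov = np.asarray(covariance, dtype=float)
    return float(r @ np.linalg.solve(cov, r))
```

With residuals of 1 and 2 in two bins of unit variance, the χ 2 is 5 for uncorrelated bins but 4 when the bins have a correlation of 0.5, which shows why the correlations induced by the unfolding and by the systematic uncertainties cannot be neglected.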

Gap fraction measurements
The gap fraction f (Q 0 ) as defined in Equation (2) was measured by using the analogous definition for reconstructed jets, counting the number of selected eµbb events N and the number n(Q 0 ) of them that have no additional jets with p T > Q 0 within the rapidity interval ∆y:

$$f^{\mathrm{reco}}(Q_0) = \frac{n(Q_0)}{N} \qquad (7)$$

and similarly for the gap fraction based on Q sum . The values of N and n were first corrected to remove the background contributions estimated from simulation, including the Wt contribution, as this study focuses on the comparison of measured gap fractions with the predictions from the tt generators discussed in Section 3. The measured gap fraction f reco (Q 0 ) was then multiplied by a correction factor C(Q 0 ) to obtain the particle-level gap fraction f part (Q 0 ) defined as in Equation (2) using the particle-level definitions given in Section 4.1. The correction factor was evaluated using the values of f reco (Q 0 ) and f part (Q 0 ) obtained from the baseline Powheg + Pythia6 tt simulation sample:

$$C(Q_0) = \frac{f^{\mathrm{part}}(Q_0)}{f^{\mathrm{reco}}(Q_0)} \qquad (8)$$

Systematic uncertainties arise in this procedure from the uncertainties in C(Q 0 ) and the backgrounds subtracted before the calculation of N and n.
The gap fractions f (Q 0 ) and f (Q sum ) were measured for the same rapidity regions as used in Ref. [2], namely |y| < 0.8, 0.8 < |y| < 1.5, 1.5 < |y| < 2.1 and the inclusive region |y| < 2.1. The sets of Q 0 and Q sum threshold values chosen also correspond to those in Ref. [2], and the steps correspond approximately to one standard deviation of the jet energy resolution. The values of the correction factor C(Q 0 ) (and similarly for Q sum ) deviate by at most 5 % from unity at low Q 0 and Q sum , and approach unity at higher threshold values. The small corrections reflect the high selection efficiency and high purity of the event samples; at each threshold Q 0 , the baseline simulation predicts that around 80 % of the selected reconstructed events that do not have a jet with p T > Q 0 also have no particle-level jet with p T > Q 0 . Therefore, a simple bin-by-bin correction method is adequate, rather than a full unfolding as used for the differential jet cross-section measurement.
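The gap fraction and its bin-by-bin correction can be sketched as below. The sketch takes, for each event, the p T of the highest-p T additional jet inside the veto region (zero if there is none), and assumes that the correction factor is the ratio of particle-level to reconstruction-level gap fractions in simulation. Background subtraction is omitted, and all names and numbers are hypothetical.

```python
import numpy as np

def gap_fraction(lead_pt_in_veto, thresholds):
    """Fraction of events with no additional jet above each threshold Q0.
    lead_pt_in_veto: per-event pT of the leading additional jet in the
    veto region, in GeV, with 0 for events that have no such jet."""
    lead = np.asarray(lead_pt_in_veto, dtype=float)
    return np.array([np.mean(lead <= q0) for q0 in thresholds])

def correction_factor(f_part_mc, f_reco_mc):
    """Bin-by-bin factor C(Q0) taken from simulation."""
    return np.asarray(f_part_mc, dtype=float) / np.asarray(f_reco_mc, dtype=float)

def particle_level_gap_fraction(f_reco_data, f_part_mc, f_reco_mc):
    """Apply C(Q0) to the measured reconstruction-level gap fraction."""
    return correction_factor(f_part_mc, f_reco_mc) * np.asarray(f_reco_data, dtype=float)
```

The same functions give the Q sum version if the per-event input is replaced by the scalar sum of the p T of all additional jets in the veto region. Because the gap fraction is cumulative in the threshold, neighbouring thresholds share most of their events, which is consistent with the strong correlations between nearby Q 0 points.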
The systematic uncertainties in the gap fraction measurements were evaluated as discussed in Section 5, and the uncertainties from different sources added in quadrature. The results are shown in Figure 10 as relative uncertainties ∆ f / f in the measured gap fraction for two illustrative rapidity intervals, |y| < 0.8 and |y| < 2.1. Figures 11 and 12 show the resulting measurements of the gap fraction f (Q 0 ) in data, corrected to the particle level. Figure 13 shows the analogous results for f (Q sum ), for the |y| < 0.8 and |y| < 2.1 regions only. The gap fraction plots and the first sets of ratio plots compare the data to the same NLO generator configurations as studied in Section 6.3. The middle ratio plots compare the data to the predictions of the leading-order multi-leg generators Alpgen + Pythia6, Alpgen + Herwig and MadGraph + Pythia6. The lower ratio plots compare the data to AcerMC + Pythia6, Alpgen + Pythia6 and MadGraph + Pythia6 samples with increased and decreased levels of parton shower radiation. The numerical values of the gap fraction measurements are presented as a function of Q 0 in Table 4 and as a function of Q sum in Table 5, together with the values predicted by the generators shown in the upper plots of Figures 11, 12 and 13. The matrix of statistical and systematic correlations is shown in Figure 14 for the gap fraction measurement at different values of Q 0 for the full central |y| < 2.1 rapidity region. Nearby points in Q 0 are highly correlated, while well-separated Q 0 points are less correlated. The full covariance matrix including correlations was used to calculate a χ 2 value for the consistency of each of the NLO generator predictions with the data in each veto region. The results are shown in Tables 6 and 7.

Gap fraction results in rapidity regions
Table 4: […] and 150 GeV in data compared to the predictions from various tt simulation samples. The combination of statistical and systematic correlations between measurements at Q 0 = i and Q 0 = j is given as ρ i j .

Table 5: […] for different veto-region rapidity intervals and Q sum values of 55, 150 and 300 GeV in data compared to the predictions from various tt simulation samples. The combination of statistical and systematic correlations between measurements at Q sum = i and Q sum = j is given as ρ i j .

Figure 11: The measured gap fraction f (Q 0 ) as a function of Q 0 in different veto-region rapidity intervals ∆y, for (a) |y| < 0.8 and (b) 0.8 < |y| < 1.5. The data are shown by the points with error bars indicating the total uncertainty, and compared to the predictions from various tt simulation samples (see text) shown as smooth curves. The lower plots show the ratio of predictions to data, with the data uncertainty being indicated by the shaded band, and the Q 0 thresholds corresponding to the left edges of the histogram bins, except for the first bin.

All the NLO generators provide a reasonable description of the f (Q 0 ) distribution in the regions |y| < 0.8 and 0.8 < |y| < 1.5. All these generators are also consistent with the data in the most forward region (1.5 < |y| < 2.1), whereas at √ s = 7 TeV, they tended to lie below the data [2]. However, the current measurements are significantly more precise in this region, thanks in particular to improvements in the jet energy scale calibration. Over the full rapidity range (|y| < 2.1), Powheg + Pythia8 provides the best description of the data, whilst Powheg + Pythia6 with h damp = m t and MC@NLO + Herwig predict slightly less radiation, and Powheg + Pythia6 with h damp = ∞ and Powheg + Herwig predict slightly more. Powheg + Pythia8 also provides the best description across the individual |y| regions. The results for f (Q sum ), which are sensitive to all the additional jets within the rapidity interval, show somewhat larger differences between the generators than those for f (Q 0 ). Over the rapidity region |y| < 2.1, Powheg + Pythia6 with h damp = m t and MC@NLO + Herwig are disfavoured. The latter generator combination also performs poorly for the differential cross-section measurements discussed in Section 6.
The leading-order generators Alpgen + Pythia6, Alpgen + Herwig and MadGraph + Pythia6 also provide a reasonable description of the gap fraction as a function of Q 0 and Q sum . The pairs of samples with increased/decreased radiation also bracket the data in all rapidity regions, except for AcerMC + Pythia6, which always predicts higher gap fractions than observed at high Q 0 and Q sum . As in the differential cross-section measurements, the data show a clear preference for the 'RadLo' variation for Alpgen + Pythia6 and the 'q 2 down' variation for MadGraph+Pythia6, across all rapidity regions. These data should therefore allow the uncertainties due to radiation modelling in tt events to be significantly reduced, once the models are tuned to these more precise √ s = 8 TeV results rather than the √ s = 7 TeV results used previously [2].

Gap fraction results in eµbb mass regions
The gap fraction was also measured over the full rapidity veto region |y| < 2.1 after dividing the data sample into four regions of m eµbb . The distribution of reconstructed m eµbb in the selected eµbb events is shown in Figure 15, and is reasonably well-reproduced by the baseline tt simulation sample. The distribution was divided into four regions at both reconstruction and particle level: m eµbb < 300 GeV, 300 < m eµbb < 425 GeV, 425 < m eµbb < 600 GeV and m eµbb > 600 GeV. These boundaries were chosen to minimise migration between the regions; in the baseline simulation, around 85 % of the reconstructed events in each m eµbb region come from the corresponding region at particle level. The corresponding correction factors C m (Q 0 ), which translate the measured gap fraction in the reconstruction-level m eµbb region to the corresponding particle-level gap fractions f m (Q 0 ) and f m (Q sum ), are of similar size to C(Q 0 ), with the exception of the highest m eµbb region, where they reach about 1.1 at low Q 0 . The systematic uncertainties in the gap fraction measurement in two m eµbb regions are shown in Figure 16. The magnitudes of the systematic uncertainties are comparable to those in the full m eµbb range, except for the highest m eµbb region where they are significantly larger. Figures 17 and 18 show the resulting measurements of the gap fractions as a function of Q 0 in the four m eµbb regions in data, compared to the same set of predictions as shown in Figures 11, 12 and 13. Tables 8 and 9 show the gap fractions at selected Q 0 and Q sum values in each invariant mass region, again compared to predictions from the first set of generators. Figure 19 gives an alternative presentation of the gap fraction f m (Q 0 ) as a function of m eµbb for four different Q 0 values. The χ 2 values for the consistency of the prediction from each NLO generator with data in the four mass regions are given in Tables 10 and 11.
In general, the different generator configurations provide a good model of the evolution of the gap fraction distributions with m eµbb , and similar trends in the predictions of individual generators are seen as for the inclusive |y| < 2.1 results discussed in Section 7.1. However, it can be seen from Figures 18 and 19 that in the 425 < m eµbb < 600 GeV region, the NLO generator predictions split into two groups, with Powheg + Herwig and Powheg + Pythia6 with h damp = ∞ being consistent with the data, and Powheg + Pythia6 with h damp = m t , Powheg + Pythia8 and MC@NLO + Herwig predicting a slightly larger gap fraction (and hence less radiation). In the region with m eµbb > 600 GeV, the measurement uncertainties are too large to discriminate between the predictions.

Figure 17: The measured gap fraction f m (Q 0 ) as a function of Q 0 in the veto region |y| < 2.1 for the invariant mass regions (a) m eµbb < 300 GeV and (b) 300 < m eµbb < 425 GeV. The data are shown by the points with error bars indicating the total uncertainty, and compared to the predictions from various tt simulation samples (see text) shown as smooth curves. The lower plots show the ratio of predictions to data, with the data uncertainty being indicated by the shaded band, and the Q 0 thresholds corresponding to the left edges of the histogram bins, except for the first bin.

Conclusions
Studies of the additional jet activity in dileptonic tt events with an opposite-sign eµ pair and two b-tagged jets have been presented, using 20.3 fb −1 of √ s = 8 TeV pp collision data collected by the ATLAS detector at the LHC. The measurements were corrected to the particle level and defined in a fiducial region corresponding closely to the experimental acceptance, facilitating comparisons with the predictions of different Monte Carlo tt event generators. The additional-jet multiplicity for various jet p T thresholds has been measured in the pseudorapidity region |η| < 4.5, together with the normalised differential cross-sections as a function of the first to the fourth jet p T . The gap fraction, the fraction of events with no additional jet above a certain p T threshold, has also been measured in the central rapidity region |y| < 2.1, for subsets of this y region, and as a function of the invariant mass of the eµbb system. Taken together, these measurements can help to characterise the production of additional jets in tt events, an important test of QCD and a significant source of systematic uncertainty in many measurements and searches for new physics at the LHC. The results will be made available in the HepData repository and through the Rivet analysis framework.
The measurements are generally well-described by the predictions of the next-to-leading-order generators used in ATLAS physics analyses. Both Powheg (interfaced to Pythia6, Pythia8 or Herwig) and MC@NLO + Herwig give good descriptions of the p T spectrum of the first additional jet, although MC@NLO + Herwig does not describe higher jet multiplicities, or the gap fraction as a function of a threshold on the sum of the p T of all additional jets. The leading-order multi-leg generators Alpgen, interfaced to Pythia6 or Herwig, and MadGraph interfaced to Pythia6, are also generally compatible with the data. The predictions of these generators are sensitive to the choice of QCD scale and parton shower parameters, and tuning to the precise measurements presented here offers considerable scope for reducing the range of parameter variations which need to be considered when evaluating tt modelling uncertainties, compared to the ranges derived from previous analyses based on smaller √ s = 7 TeV ATLAS data samples.
[3] ATLAS Collaboration, Measurement of the tt production cross-section as a function of jet multiplicity and jet transverse momentum in 7 TeV proton-proton collisions with the ATLAS detector, JHEP 01 (2015) 020, arXiv:1407.0891 [hep-ex].