Measurements of differential cross sections of top quark pair production in association with jets in ${pp}$ collisions at $\sqrt{s}=13$ TeV using the ATLAS detector

Measurements of differential cross sections of top quark pair production in association with jets by the ATLAS experiment at the LHC are presented. The measurements are performed as functions of the top quark transverse momentum, the transverse momentum of the top quark-antitop quark system and the out-of-plane transverse momentum using data from $pp$ collisions at $\sqrt{s}=13$ TeV collected by the ATLAS detector at the LHC in 2015 and corresponding to an integrated luminosity of 3.2 fb$^{-1}$. The top quark pair events are selected in the lepton (electron or muon) + jets channel. The measured cross sections, which are compared to several predictions, allow a detailed study of top quark production.


Introduction
The large number of top quark pair ($t\bar{t}$) events produced at the Large Hadron Collider (LHC) allows detailed studies of the characteristics of $t\bar{t}$ production as a function of different kinematic variables. In this paper, the data collected by the ATLAS experiment in 2015 are used to measure differential cross sections for $t\bar{t}$ production in association with jets. The measurement of differential cross sections in different bins of jet multiplicity provides a better understanding of the effect of gluon radiation on $t\bar{t}$ kinematic variables than the differential cross sections inclusive in the number of jets previously published by the ATLAS Collaboration at $\sqrt{s} = 13$ TeV [1].
Since the top quark decays almost always to a W boson and a b-quark, the decay of a top quark pair produces six particles in the final state, whose identity depends on the decays of the intermediate W bosons. The channel considered in this analysis is characterised by the leptonic decay of one W boson and the hadronic decay of the other W boson; this is commonly referred to as the semileptonic decay mode or $\ell$+jets channel. The final-state configuration contains one electron or muon, one neutrino giving rise to missing transverse momentum ($E_T^{\mathrm{miss}}$) and four jets, two of which originate from b-quarks. Events may include additional jets from gluon radiation off initial- or final-state quarks. To study the dependence of this emission on the observables, three configurations are defined depending on the number of additional jets produced within the detector acceptance in association with the top quark pair: the "4-jet exclusive configuration" (no additional jets); the "5-jet exclusive configuration" (only one additional jet); and the "6-jet inclusive configuration" (two or more additional jets). The last configuration is of particular interest since it provides a similar phase space to the one used by measurements such as Higgs boson production in association with two top quarks and searches with high jet multiplicity.
The three configurations with increasing number of additional jets are expected to provide a better understanding of the effect of gluon radiation on the kinematic variables of top quark pair production. ATLAS already published differential cross section measurements as a function of the number of additional jets [2-4] and of several kinematic variables [1,5-8]. The results presented in this paper combine the two types of measurements to provide additional information about top quark production and explore the effect of gluon radiation on $t\bar{t}$ kinematic variables. The CMS Collaboration published similar measurements [9-12].
The observables studied here are the transverse momentum ($p_T$)$^{1}$ of the top quark-antitop quark system ($p_T^{t\bar{t}}$) and the absolute value of the out-of-plane momentum ($|p_{\mathrm{out}}^{t\bar{t}}|$), defined as the projection of the top quark three-momentum onto the direction perpendicular to a plane defined by the other top quark and the beam axis ($\hat{z}$) in the laboratory frame [13]:
$$p_{\mathrm{out}}^{t\bar{t}} = \vec{p}^{\,t,\mathrm{had}} \cdot \frac{\vec{p}^{\,t,\mathrm{lep}} \times \hat{z}}{|\vec{p}^{\,t,\mathrm{lep}} \times \hat{z}|},$$
where $\vec{p}^{\,t,\mathrm{lep}}$ and $\vec{p}^{\,t,\mathrm{had}}$ are the momenta of the semileptonically and hadronically decaying top quarks, respectively. This observable is complementary to $p_T^{t\bar{t}}$ since $p_{\mathrm{out}}^{t\bar{t}}$ is expected to be more sensitive to the direction of gluon radiation; for example, the emission of a low-$p_T$ jet at a large angle with respect to the plane defined by the two top quarks is expected to be better measured with $p_{\mathrm{out}}^{t\bar{t}}$ than with $p_T^{t\bar{t}}$. In addition, the differential cross section as a function of the transverse momentum of the hadronic top quark ($p_T^{t,\mathrm{had}}$) is measured. In previous publications [1,8], differences between the data and the predictions by several Standard Model Monte Carlo (MC) event generators were observed. By measuring the differential cross section of this observable in different jet multiplicities it is possible to identify the regions of phase space in which the discrepancy is largest. The measured differential cross sections as functions of these three observables are compared to predictions from several MC event generators, namely Powheg-Box [14], MadGraph5_aMC@NLO [15] and Sherpa [16].
$^{1}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upward. Cylindrical coordinates ($r$, $\phi$) are used in the transverse plane, $\phi$ being the azimuthal angle around the beam pipe. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln\tan(\theta/2)$ and the angular separation between particles is defined as $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2}$. The transverse momentum is the projection of the momentum on the transverse plane.
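The definition of the out-of-plane momentum can be illustrated with a minimal Python sketch. It assumes that the two top quark three-momenta are available as plain `(px, py, pz)` tuples in GeV; the function names `p_out` and `pt_system` are hypothetical and are not taken from the analysis software.

```python
import math

def cross(a, b):
    """Cross product of two three-vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def p_out(p_had, p_lep):
    """Signed projection of p_had onto the unit normal of the plane
    spanned by p_lep and the beam axis (z)."""
    z_hat = (0.0, 0.0, 1.0)
    normal = cross(p_lep, z_hat)
    norm = math.sqrt(sum(c * c for c in normal))
    if norm == 0.0:
        raise ValueError("p_lep is parallel to the beam axis")
    return sum(h * n for h, n in zip(p_had, normal)) / norm

def pt_system(p_had, p_lep):
    """Transverse momentum of the summed two-top system."""
    return math.hypot(p_had[0] + p_lep[0], p_had[1] + p_lep[1])

# Illustrative momenta only (GeV), not taken from data.
p_lep = (100.0, 0.0, 50.0)
p_had = (-90.0, 20.0, -30.0)
print(abs(p_out(p_had, p_lep)), pt_system(p_had, p_lep))
```

For two top quarks that are exactly back-to-back in the transverse plane, both quantities vanish; a radiated jet leaving that plane shows up directly in $|p_{\mathrm{out}}^{t\bar{t}}|$.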

ATLAS detector
ATLAS is a multipurpose detector [17] that provides nearly full solid angle coverage around the interaction point. Charged-particle trajectories with pseudorapidity |η| < 2.5 are reconstructed in the inner detector, which comprises a silicon pixel detector, a silicon microstrip detector and a transition radiation tracker. The innermost pixel layer, the insertable B-layer [18], was added before the start of 13 TeV LHC operation at an average radius of 33 mm around a new, thinner beam pipe. The inner detector is embedded in a superconducting solenoid generating a 2 T axial magnetic field, allowing precise measurements of charged-particle momenta. Sampling calorimeters with several different designs span the pseudorapidity range up to |η| = 4.9. High-granularity liquid argon (LAr) electromagnetic (EM) calorimeters are used up to |η| = 3.2. Hadronic calorimeters based on scintillator-tile active material cover |η| < 1.7 while LAr technology is used for hadronic calorimetry in the region 1.5 < |η| < 4.9. The calorimeters are surrounded by a muon spectrometer within a magnetic field provided by air-core toroid magnets with a bending integral of about 2.5 Tm in the barrel and up to 6 Tm in the end-caps. Three stations of precision drift tubes and cathode-strip chambers provide an accurate measurement of the muon track curvature in the region |η| < 2.7. Resistive-plate and thin-gap chambers provide muon triggering capability up to |η| = 2.4.
Data are selected from inclusive pp interactions using a two-level trigger system [19]. A hardware-based trigger uses custom-made hardware and coarser-granularity detector data to initially reduce the trigger rate to approximately 75 kHz from the original 40 MHz LHC bunch crossing rate. Next, a software-based high-level trigger, which has access to full detector granularity, is applied to further reduce the event rate to 1 kHz.

Data and simulation
The differential cross sections are measured using a data set collected during the 2015 LHC $pp$ run at $\sqrt{s} = 13$ TeV and with 25 ns bunch spacing. The average number of $pp$ interactions per bunch crossing ranged from approximately 5 to 25, with a mean of 14. After applying data-quality assessment criteria based on beam, detector and data-taking quality, the available data correspond to a total integrated luminosity of 3.2 fb$^{-1}$.
The data were collected using a combination of multiple single-muon and single-electron triggers. For each lepton type, multiple trigger conditions are combined to maintain good efficiency in the full momentum range, while controlling the trigger rate. For electrons, the $p_T$ thresholds are 24 GeV, 60 GeV and 120 GeV, while for muons the thresholds are 20 GeV and 50 GeV. Isolation requirements are applied to the triggers with the lowest $p_T$ thresholds.
The signal and background processes are modelled with various MC event generators described below and summarised in Table 1. Multiple overlaid $pp$ collisions were simulated with the soft QCD processes of Pythia 8.186 [20] using parameter values from tune A2 [21] and the MSTW2008LO [22] set of parton distribution functions (PDFs). The EvtGen v1.2.0 program [23] was used to simulate the decay of bottom and charm hadrons, except for samples produced with the Sherpa event generator. The detector response was simulated [24] in GEANT4 [25].

Signal simulation samples
In this section the MC samples used for the generation of $t\bar{t}$ events are described for the nominal sample, the alternative samples used to estimate systematic uncertainties and the other samples used in the post-unfolding comparison. The top quark mass ($m_t$) was set to 172.5 GeV in all MC event generators.
For the generation of $t\bar{t}$ events, the Powheg-Box v2 event generator [14,26,27], from now on called Powheg, with the CT10 PDF set [28] was used for the matrix element calculations. The factorisation and renormalisation scales are set to $\sqrt{m_t^2 + p_{T,t}^2}$, where $m_t$ and $p_{T,t}$ are the top quark mass and the transverse momentum of the top quark, respectively, evaluated for the underlying Born configuration. Events in which both W bosons decay hadronically were not included. For this process, the top quarks were decayed using MadSpin [29] to preserve all spin correlations, while parton shower, hadronisation, and the underlying event were simulated using Pythia 6.428 [30] with the CTEQ6L1 PDF set [31] and the Perugia2012 tune [32]. The $h_{\mathrm{damp}}$ parameter, which controls the $p_T$ of the first gluon or quark emission beyond the Born configuration in Powheg, was set to the mass of the top quark [33]. The main effect of this parameter is to regulate the high-$p_T$ emission against which the $t\bar{t}$ system recoils. Signal $t\bar{t}$ events generated with those settings are referred to as the nominal signal sample.
To estimate the effect of the parton shower algorithm, a Powheg+Herwig++ sample was generated with the same Powheg settings as for the nominal sample. The parton shower, hadronisation and underlying event simulation were produced with Herwig++ [34] (version 2.7.1) using the UE-EE-5 tune [35] and the CTEQ6L1 PDF set.
The impact of the matrix element (ME) event generator choice is evaluated using events generated with MadGraph5_aMC@NLO+Herwig++ with the UE-EE-5 tune. The events were generated with version 2.1.1 of MadGraph5_aMC@NLO. NLO matrix elements and the CT10 PDF set were used for the $t\bar{t}$ hard-scattering process. These events were passed through a fast simulation using a parametrisation of the performance of the ATLAS electromagnetic and hadronic calorimeters [36] and full simulation of the response in the inner detector and muon spectrometer.
The effects of different levels of gluon radiation are evaluated using two samples with different factorisation and renormalisation scales relative to the nominal sample, as well as a different $h_{\mathrm{damp}}$ parameter value. Specifically, in one sample the factorisation and renormalisation scales were reduced by a factor of 0.5, the $h_{\mathrm{damp}}$ parameter was increased to $2m_t$ and the 'radHi' tune variation from the Perugia2012 tune set was used. In the second sample, the factorisation and renormalisation scales were increased by a factor of two, the $h_{\mathrm{damp}}$ parameter was unchanged and the 'radLo' tune variation from the Perugia2012 tune set was used.
The measured differential cross sections are compared to several additional $t\bar{t}$ MC samples [33,37,38]:
• Two MadGraph5_aMC@NLO+Pythia 8 samples with two different hard-scattering scales, $H_T/2$ and $\sqrt{m_t^2 + p_{T,t}^2}$, both using the A14 tune.
• Two Powheg+Pythia 8 samples simulated with different values of the $h_{\mathrm{damp}}$ parameter ($h_{\mathrm{damp}} = m_t$ and $h_{\mathrm{damp}} = 1.5m_t$), also using the A14 tune.
• Two additional Powheg+Pythia 8 samples with alternative radiation settings: the factorisation and renormalisation scales are coherently varied by a factor of 2.0 (0.5) and the A14 tune 'Var3c Down' ('Var3c Up') variation is used.
• A Powheg+Herwig 7 sample generated with the $h_{\mathrm{damp}}$ parameter set to $1.5m_t$ and using the H7-UE-MMHT tune, with the NNPDF3.0 PDF set [39] used for the ME.
• A Sherpa 2.2.1 sample in which events were generated with a $t\bar{t}$ matrix element and up to one additional parton simulated at NLO and two, three and four partons at LO. The CT10 PDF set was used.
The $t\bar{t}$ samples are normalised using $\sigma_{t\bar{t}} = 832^{+20}_{-29}$ (scale) $\pm\,35$ (PDF) pb as calculated with the Top++2.0 program to next-to-next-to-leading order (NNLO) in perturbative QCD, including soft-gluon resummation to next-to-next-to-leading-log order (NNLL) (see Ref. [40] and references therein), and assuming a top quark mass $m_t = 172.5$ GeV. The first uncertainty comes from the independent variation of the factorisation and renormalisation scales, $\mu_F$ and $\mu_R$, while the second one is associated with variations in the PDF and $\alpha_S$, following the PDF4LHC prescription with the MSTW2008 68% CL NNLO, CT10 NNLO and NNPDF2.3 5f FFN PDF sets (see Refs. [28,41-43]).
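As a rough illustration of what this normalisation implies, the sketch below converts the quoted cross section and the integrated luminosity of 3.2 fb$^{-1}$ into an expected number of produced $t\bar{t}$ events, before any selection or acceptance. The helper names and the number of generated events are hypothetical, and generator weights and filter efficiencies are deliberately ignored.

```python
def expected_events(xsec_pb, lumi_fb_inv):
    """Expected number of produced events, N = sigma * L.
    1 fb^-1 equals 1000 pb^-1."""
    return xsec_pb * lumi_fb_inv * 1000.0

def mc_event_weight(xsec_pb, lumi_fb_inv, n_generated):
    """Per-event weight that scales an unweighted MC sample of
    n_generated events to the expected yield (simplified sketch)."""
    return expected_events(xsec_pb, lumi_fb_inv) / n_generated

# 832 pb and 3.2 fb^-1 are quoted in the text; 5 million generated
# events is an illustrative assumption.
print(expected_events(832.0, 3.2))
print(mc_event_weight(832.0, 3.2, 5_000_000))
```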

Background simulation samples
Several processes can produce the same final state as the $t\bar{t}$ semileptonic channel. The events produced by these backgrounds need to be estimated and subtracted from data to calculate the top quark pair cross sections. They are fully estimated using MC simulation with the exception of the W+jets background, for which data-driven techniques complement the MC simulation prediction. The processes considered are single-top quark production, W+jets and Z+jets production, diboson final states and top quark pairs produced in association with weak bosons ($t\bar{t}$ + W/Z/WW, denoted by $t\bar{t}V$).
The simulation of single-top quark events from Wt and s-channel production was performed using the configuration described above for the nominal $t\bar{t}$ sample. The overlap between the Wt and $t\bar{t}$ samples was handled using the diagram-removal scheme [44]. Electroweak t-channel single-top quark events were generated using the Powheg-Box v1 event generator. The single-top quark cross sections for the t- and s-channels are normalised using their NLO predictions, while that for the Wt channel is normalised using its NLO+NNLL prediction [45][46][47].
Inclusive samples containing single W or Z bosons in association with jets were simulated using the Sherpa 2.1.1 event generator [16]. Matrix elements were calculated with up to two partons at NLO and four partons at leading order (LO) using the Comix [48] and OpenLoops [49] matrix element event generators and merged with the Sherpa parton shower [50] using the ME+PS@NLO prescription [51]. The CT10 PDF sets were used in conjunction with dedicated parton shower tuning developed by the authors of Sherpa. The Z+jets events are normalised using the NNLO cross sections [52] while the normalisation for the W+jets events is obtained with a data-driven method described in Section 5.
Diboson processes, with one of the bosons decaying hadronically and the other leptonically, were simulated using the Sherpa 2.1.1 event generator [16,53]. They are calculated for up to one (ZZ) or zero (WW, WZ) additional partons at NLO and up to three additional partons at LO using the Comix and OpenLoops matrix element event generators and merged with the Sherpa parton shower using the ME+PS@NLO prescription. The CT10 PDF sets were used in conjunction with dedicated parton shower tuning developed by the authors of Sherpa. The event generator cross sections, which are already at NLO accuracy, are used in this case.
The $t\bar{t}V$ events were simulated using the MadGraph5_aMC@NLO event generator at LO interfaced to the Pythia 8.186 parton shower model [54]. The matrix elements were simulated with up to two ($t\bar{t}$ + W), one ($t\bar{t}$ + Z) or no ($t\bar{t}$ + WW) extra partons. The ATLAS underlying-event tune A14 was used together with the NNPDF2.3LO PDF sets. The events are normalised using their respective NLO cross sections [15].

Object reconstruction and event selection
The following sections describe the reconstruction- and particle-level objects used to characterise the final-state event topology and to define the fiducial phase space regions for the measurements. The reconstruction-level selection is applied to both data and MC samples.

Detector-level object reconstruction
Primary vertices are formed from reconstructed tracks which are spatially compatible with the interaction region. The hard-scatter primary vertex is chosen to be the one with at least two associated tracks and the highest $\sum p_T^2$, where the sum extends over all tracks with $p_T > 0.4$ GeV matched to the vertex.
Electron candidates are reconstructed by matching tracks in the inner detector to energy deposits in the EM calorimeter. They must satisfy a "tight" likelihood-based identification criterion based on shower shapes in the EM calorimeter, track quality and detection of transition radiation produced in the transition radiation tracker detector [55]. The reconstructed EM clusters are required to have a transverse energy $E_T > 25$ GeV and a pseudorapidity |η| < 2.47, excluding the transition region between the barrel and the end-cap calorimeters (1.37 < |η| < 1.52). The associated track must have a longitudinal impact parameter $|z_0 \sin\theta| < 0.5$ mm and a transverse impact parameter significance $|d_0|/\sigma(d_0) < 5$, where $d_0$ is measured with respect to the beam line. Isolation requirements based on calorimeter and tracking quantities are used to reduce the background from non-prompt and fake electrons [56]. The isolation criteria are $p_T$- and η-dependent, and ensure an efficiency of 90% for electrons with $p_T$ of 25 GeV and 99% efficiency for electrons with $p_T$ of 60 GeV. The identification, isolation and trigger efficiencies are measured using electrons from Z boson decays [57].
Muon candidates are identified by matching tracks in the muon spectrometer to tracks in the inner detector [58]. The track $p_T$ is determined through a global fit of the hits which takes into account the energy loss in the calorimeters. Muons are required to have $p_T > 25$ GeV and |η| < 2.5. To reduce the background from muons originating from heavy-flavour decays inside jets, muons are required to be isolated using track quality and isolation criteria similar to those applied to electrons. If a muon shares a track with an electron, it is likely to have undergone bremsstrahlung and hence the electron is not selected. The muon reconstruction and isolation efficiencies are obtained using muons from J/ψ and Z decays; for muon candidates with $p_T > 25$ GeV these efficiencies are 99%.
Jets are reconstructed using the anti-$k_t$ algorithm [59] with radius parameter R = 0.4 as implemented in the FastJet package [60]. Jet reconstruction in the calorimeter starts from topological clustering of individual calorimeter cell signals calibrated to be consistent with electromagnetic or hadronic cluster shapes using corrections determined in simulation and inferred from test-beam data [61]. Jet four-momenta are then corrected for pile-up effects using the jet-area method [62]. To reduce the number of jets originating from pile-up, an additional selection criterion based on a jet-vertex tagging technique is applied. The jet-vertex tagging is a likelihood discriminant that combines information from several track-based variables [63] and the criterion is only applied to jets with $p_T < 60$ GeV and |η| < 2.4. Jets are calibrated using an energy- and η-dependent simulation-based calibration scheme with in situ corrections based on data [64,65], and are accepted if they have $p_T > 25$ GeV and |η| < 2.5.
For objects satisfying both the jet and lepton selection criteria, a procedure called "overlap removal" is applied to assign objects to a unique hypothesis. To prevent double-counting of electron energy deposits as jets, the jet closest to a reconstructed electron is discarded if they are ∆R < 0.2 apart. Subsequently, to reduce the impact of non-prompt electrons, if an electron is ∆R < 0.4 from a jet, then that electron is removed. If a jet has fewer than three tracks and is ∆R < 0.4 from a muon, the jet is removed. Finally, the muon is removed if it is ∆R < 0.4 from a jet with at least three tracks.
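The sequence of overlap-removal steps can be sketched in Python as follows. This is a simplified illustration, not the ATLAS implementation: objects are assumed to be dictionaries with hypothetical keys `eta`, `phi` and (for jets) `n_tracks`, and the steps are applied in the order in which the text lists them.

```python
import math

def delta_r(a, b):
    """Angular separation, with the phi difference wrapped to [-pi, pi)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)

def overlap_removal(electrons, muons, jets):
    jets = list(jets)
    # 1. Discard the jet closest to each electron if within dR < 0.2.
    for el in electrons:
        if jets:
            closest = min(jets, key=lambda j: delta_r(el, j))
            if delta_r(el, closest) < 0.2:
                jets.remove(closest)
    # 2. Remove electrons within dR < 0.4 of any remaining jet.
    electrons = [el for el in electrons
                 if all(delta_r(el, j) >= 0.4 for j in jets)]
    # 3. Remove jets with fewer than three tracks within dR < 0.4 of a muon.
    jets = [j for j in jets
            if not (j["n_tracks"] < 3
                    and any(delta_r(j, mu) < 0.4 for mu in muons))]
    # 4. Remove muons within dR < 0.4 of a jet with at least three tracks.
    muons = [mu for mu in muons
             if not any(j["n_tracks"] >= 3 and delta_r(mu, j) < 0.4
                        for j in jets)]
    return electrons, muons, jets
```

The ordering matters: a jet removed in the first step can no longer cause an electron to be rejected in the second.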
The purity of the selected $t\bar{t}$ sample is improved by identifying jets containing b-hadrons, so-called b-tagged jets. This identification exploits the long lifetime of b-hadrons and the invariant mass of tracks from the corresponding reconstructed secondary vertex, which is on average several GeV larger than that in jets originating from gluons or light-flavour quarks. Information from the track impact parameters, secondary-vertex location and decay topology is combined in a multivariate algorithm (MV2c20) [66]. The operating point corresponds to an overall 77% b-tagging efficiency in $t\bar{t}$ events, with a corresponding rejection of charm-quark jets (light-flavour and gluon jets) by a factor of 4.5 (140) [66]. Jets that pass this selection are identified as b-tagged jets.
The $E_T^{\mathrm{miss}}$ vector is computed from the sum of the transverse momenta of the reconstructed calibrated physics objects (electrons, photons, hadronically decaying τ-leptons, jets and muons) together with the transverse energy deposited in the calorimeter cells, calibrated using tracking information, not associated with these objects [67]. To avoid double-counting of energy, the muon energy loss in the calorimeters is subtracted in the $E_T^{\mathrm{miss}}$ calculation. This variable is not used in the selection but is used in the top quark reconstruction described below.

Particle-level object definition
Particle-level objects are defined in simulated events using only stable particles, i.e. particles with a mean lifetime τ > 30 ps. The fiducial phase space for the measurements presented in this paper is defined using a series of requirements applied to particle-level objects analogous to those used in the selection of the reconstruction-level objects, described above.
Electrons and muons must not originate, either directly or through a τ decay, from a hadron in the MC event record. This ensures that the lepton is from the decay of a real W boson without requiring a direct match to it. The four-momenta of leptons are modified by adding the four-momenta of all photons within ∆R = 0.1 and not originating from hadron decays, to take into account final-state photon radiation. Such leptons are then required to have p T > 25 GeV and |η| < 2.5.
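The photon-dressing step can be sketched as below. The sketch assumes four-momenta stored as `(E, px, py, pz)` tuples in GeV and a photon list that has already been filtered to exclude photons from hadron decays; the function names are hypothetical, and whether the cone is evaluated around the bare lepton (as done here) is an assumption.

```python
import math

def eta_phi(p):
    """Pseudorapidity and azimuth of a four-momentum (E, px, py, pz).
    Assumes non-zero transverse momentum."""
    _, px, py, pz = p
    pt = math.hypot(px, py)
    return math.asinh(pz / pt), math.atan2(py, px)

def dress_lepton(lepton, photons, cone=0.1):
    """Add the four-momenta of all photons within dR < cone of the
    bare lepton direction."""
    l_eta, l_phi = eta_phi(lepton)
    dressed = list(lepton)
    for photon in photons:
        eta, phi = eta_phi(photon)
        dphi = (phi - l_phi + math.pi) % (2.0 * math.pi) - math.pi
        if math.hypot(eta - l_eta, dphi) < cone:
            dressed = [a + b for a, b in zip(dressed, photon)]
    return tuple(dressed)

def passes_fiducial(lepton):
    """Fiducial lepton requirements quoted in the text."""
    eta, _ = eta_phi(lepton)
    return math.hypot(lepton[1], lepton[2]) > 25.0 and abs(eta) < 2.5
```

A lepton with bare $p_T$ slightly below 25 GeV can therefore enter the fiducial region once a collinear photon is added.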
Particle-level jets are reconstructed using the same anti-$k_t$ algorithm used at reconstruction level. The jet-reconstruction procedure takes as input all stable particles, except for leptons not from hadron decay as described above, inside a radius R = 0.4. Particle-level jets are required to have $p_T > 25$ GeV and |η| < 2.5. A jet is identified as a b-jet if a hadron containing a b-quark is matched to the jet through a ghost-matching technique described in Ref. [62]; the hadron must have $p_T > 5$ GeV. No overlap removal criteria are applied to particle-level objects. Neutrinos and charged leptons from hadron decays are included in particle-level jets.

Event selection and fiducial phase space definition
Events at both reconstruction and particle levels are required to contain exactly one electron or muon and at least four jets, with at least two tagged as b-jets. Each event is then unequivocally assigned to the 4-jet, 5-jet or 6-jet-inclusive configurations, depending on the number of reconstructed jets.
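The assignment of a selected event to one of the three configurations amounts to a simple classification, sketched here under the assumption that the lepton, jet and b-jet counts have already been computed after all object selections. The function name and return labels are illustrative only.

```python
def jet_configuration(n_leptons, n_jets, n_bjets):
    """Return the configuration label for an event, or None if the
    event fails the selection (exactly one lepton, at least four
    jets, at least two b-tagged jets)."""
    if n_leptons != 1 or n_jets < 4 or n_bjets < 2:
        return None
    if n_jets == 4:
        return "4-jet exclusive"
    if n_jets == 5:
        return "5-jet exclusive"
    return "6-jet inclusive"
```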
Dilepton $t\bar{t}$ events, where only one lepton satisfies the fiducial selection, are included by definition in the fiducial measurement. In the fiducial phase space definition, semileptonic $t\bar{t}$ decays into τ-leptons are considered as signal only if the τ-lepton decays leptonically.

Background determination and event yields
After the event selection, various backgrounds still contribute to the event yields. The different background contributions are estimated by using MC simulations or data-driven techniques as detailed below for each source. The latter are used when the accuracy of the MC simulation is not adequate, as in the case of W boson production in association with multiple jets and the background originating from jets mimicking the signature of charged leptons.
The single-top quark background is the largest background contribution in all considered regions, amounting to 5% of the total event yield and 30% of the total background estimate. This background is modelled with a MC simulation, and the event yields are normalised using calculations of their cross sections, as described in Section 3.
Multijet production processes, including $t\bar{t}$ production with all-hadronic decays and $t\bar{t}$ decays into τ-leptons which then decay hadronically, have a large cross section and can mimic the lepton+jets signature due to hadrons misidentified as prompt leptons (fake leptons), conversion of photons for the electron channel or semileptonic decays of heavy-flavour hadrons (non-prompt real leptons). The multijet background is estimated directly from data by using a matrix method [68] in which signal and control regions are defined using lepton identification criteria. The method depends on the probability of a real (fake) lepton to pass the tight selection criteria, which is referred to as the real (fake) efficiency. These efficiencies are measured in data control regions dominated by real or fake lepton events. In the e+jets channel, the fake efficiency is parametrised as a function of $p_T$ and η, as well as the azimuthal angle difference between the lepton and the $E_T^{\mathrm{miss}}$ vector, ∆φ. In the µ+jets channel, the fake efficiency is calculated for low and high lepton $p_T$. The low-$p_T$ parametrisation depends on ∆φ, $p_T$ and $E_T^{\mathrm{miss}}$, whereas the high-$p_T$ parametrisation only uses $p_T$. The real efficiencies are measured with $Z \to \ell\ell$ events using the tag-and-probe method. In the e+jets channel, the efficiency is parametrised as a function of $p_T$, whereas in the µ+jets channel the parametrisation depends on ∆φ and $p_T$. The multijet background amounts to approximately 4% of the total event yield and 30% of the total background estimate.
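The idea behind a matrix method can be illustrated with the textbook two-equation version, in which a loose sample (containing the tight one) is decomposed into real and fake leptons using two constant efficiencies. This is a minimal sketch under those assumptions; the analysis itself uses efficiencies parametrised in the kinematic variables listed above, and the function name is hypothetical.

```python
def multijet_tight_yield(n_loose, n_tight, eff_real, eff_fake):
    """Estimate the number of fake-lepton events passing the tight
    selection by solving
        n_loose = n_real + n_fake
        n_tight = eff_real * n_real + eff_fake * n_fake
    for n_fake and multiplying by eff_fake."""
    if eff_real <= eff_fake:
        raise ValueError("eff_real must exceed eff_fake")
    n_fake_loose = (eff_real * n_loose - n_tight) / (eff_real - eff_fake)
    return eff_fake * n_fake_loose

# Illustrative closure check: 900 real and 100 fake loose leptons,
# with eff_real = 0.9 and eff_fake = 0.2, give n_tight = 830.
print(multijet_tight_yield(1000.0, 830.0, 0.9, 0.2))
```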
The W+jets background represents the third largest background, amounting to 2-3% of the total event yield and 20% of the total background estimate. The estimation of this background is performed using a combination of MC simulations and data-driven techniques; the Sherpa MC event generator is used to estimate the contribution from the W+jets process. The normalisation and the heavy-flavour fractions of this process, which are affected by large theoretical uncertainties, are determined from data. The overall W+jets normalisation is obtained by exploiting the expected charge asymmetry in the production of $W^+$ and $W^-$ bosons in $pp$ collisions. This asymmetry is predicted by theory [69] and evaluated using MC simulations, assuming other processes are symmetric in charge except for a small contamination from single-top quark, $t\bar{t}V$ and WZ events, which is subtracted using MC simulations. The total number of W+jets events with a positively or negatively charged W boson ($N_{W^+} + N_{W^-}$) in the sample is thus estimated using the following equation:
$$N_{W^+} + N_{W^-} = \left(\frac{r_{\mathrm{MC}} + 1}{r_{\mathrm{MC}} - 1}\right)(D_+ - D_-), \qquad (1)$$
where $r_{\mathrm{MC}}$ is the ratio of the number of events with positively charged leptons to the number of events with negatively charged leptons in the MC simulations, and $D_+$ and $D_-$ are the numbers of events with positive and negative leptons in the data, respectively, corrected for the aforementioned non-W+jets charge-asymmetric contributions using simulation. The corrections due to event generator mis-modelling of W boson production in association with jets of different flavour (W+bb, W+cc, W+c, W+light flavours) are estimated using a dedicated control sample in data which uses the same lepton requirements as for the signal but requires exactly two jets. In their determination, the overall normalisation scaling factor obtained using Eq. (1) is applied first. Then heavy-flavour scaling factors obtained in the two-jet control region are extrapolated to the signal region using MC simulations, assuming constant relative rates for the signal and control regions. Taking into account the heavy-flavour scale factors, the overall normalisation factor is calculated again using Eq. (1). This iterative procedure is repeated until the total predicted W+jets yield in the two-jet control configuration agrees with the data yield. The procedure is explained in detail in Ref. [70].
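The charge-asymmetry normalisation can be illustrated numerically. If the W+jets yields satisfy $N_{W^+}/N_{W^-} = r_{\mathrm{MC}}$ and all remaining contributions to the data are charge-symmetric, so that $D_+ - D_- = N_{W^+} - N_{W^-}$, then $N_{W^+} + N_{W^-} = \frac{r_{\mathrm{MC}}+1}{r_{\mathrm{MC}}-1}(D_+ - D_-)$. The sketch below evaluates this relation for a single step only; the iterative heavy-flavour correction is not reproduced, and the function name and input numbers are hypothetical.

```python
def wjets_total(r_mc, d_plus, d_minus):
    """Total W+jets yield from the lepton charge asymmetry.
    r_mc: MC ratio of positive to negative lepton events.
    d_plus, d_minus: data yields, already corrected for
    charge-asymmetric non-W contributions."""
    if r_mc == 1.0:
        raise ValueError("no charge asymmetry: r_mc must differ from 1")
    return (r_mc + 1.0) / (r_mc - 1.0) * (d_plus - d_minus)

# Illustrative closure check: 600 W+ and 400 W- events (r_mc = 1.5)
# on top of 500 charge-symmetric events of each sign.
print(wjets_total(1.5, 1100.0, 900.0))
```

Charge-symmetric processes such as $t\bar{t}$ cancel in the difference $D_+ - D_-$, which is why the method is insensitive to their normalisation.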
The background contributions from Z+jets, $t\bar{t}V$ and diboson events are obtained from MC simulation, and the event yields are normalised using the theoretical calculations of their cross sections, as described in Section 3. The total contribution from these processes is 1-2% of the total event yield or 11-14% of the total background.
Dilepton top quark pair events can satisfy the event selection if one lepton does not satisfy the requirements listed above and at least two additional jets are produced. Events with at least one top quark decaying to a τ-lepton which subsequently decays leptonically can also pass the event selection. These events contribute 3-5% to the total event yield, and are considered in the analysis at both reconstruction and particle levels.
Cases where both top quarks decay semileptonically into τ-leptons, and where both τ-leptons decay hadronically, are accounted for in the multijet background.
The event yields in the three configurations are displayed in Table 2 for data, simulated signal, and backgrounds. Figure 1 shows the comparison between data and predictions for the 4-jet configuration for different distributions. All of the distributions are shown for the combined $\ell$+jets channel (combining electron and muon channels). The background contributions in the other configurations are similar, as shown in Figure 2. The event selection results in a total background contamination of 10-15%, depending on the configuration. A constant difference between data and prediction is observed in Figures 2(b) and 2(c); the same effect is also seen in the distribution of the number of jets, shown in Figure 3. This discrepancy has also been observed in studies of associated production of jets with top quark pairs [4].

Reconstruction of top quark kinematic properties
The two top quarks are reconstructed from their decay products so that the differential cross sections can be measured as functions of observables involving the top quark and the $t\bar{t}$ system. In the following, the leptonic (hadronic) top quark refers to the one that decays into a leptonically (hadronically) decaying W boson.
The pseudo-top algorithm [7] reconstructs the four-momenta of the top quarks and their complete decay chain from final-state objects, namely the charged lepton (electron or muon), missing transverse momentum, and four jets, two of which are b-tagged. Only about 14% of the selected events contain more than two b-tagged jets, in which case the two with the highest transverse momentum are considered as coming from the top quarks, while the others are considered for the W reconstruction. The same algorithm is used to reconstruct the kinematic properties of top quarks at reconstruction level and particle level in the three configurations.
The algorithm starts with the reconstruction of the neutrino four-momentum. While the x and y components of the neutrino momentum are set to the corresponding components of the missing transverse momentum, the z component is calculated by imposing a W boson mass constraint on the invariant mass of the charged-lepton-neutrino system. If the resulting quadratic equation has two real solutions, the one with the smaller value of $|p_z|$ is chosen. If the discriminant of the equation is negative, only the real part is considered.
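The W mass constraint leads to a quadratic equation in the neutrino $p_z$, which can be sketched as follows. The sketch treats the neutrino as massless, takes the lepton four-momentum as `(E, px, py, pz)` in GeV, and uses $m_W = 80.4$ GeV; that numerical value and the function name are assumptions, since the text does not quote them.

```python
import math

M_W = 80.4  # GeV, assumed W boson mass

def neutrino_pz(lepton, met_x, met_y, m_w=M_W):
    """Solve (p_lepton + p_neutrino)^2 = m_w^2 for the neutrino pz.
    Two real solutions: return the one with smaller |pz|.
    Negative discriminant: return the real part of the solution."""
    e_l, px_l, py_l, pz_l = lepton
    mu = 0.5 * m_w ** 2 + px_l * met_x + py_l * met_y
    a = e_l ** 2 - pz_l ** 2           # equals pT^2 for a massless lepton
    b = mu * pz_l
    c = e_l ** 2 * (met_x ** 2 + met_y ** 2) - mu ** 2
    disc = b * b - a * c
    if disc < 0.0:
        return b / a
    root = math.sqrt(disc)
    return min((b + root) / a, (b - root) / a, key=abs)

def lepton_neutrino_mass(lepton, met_x, met_y, pz_nu):
    """Invariant mass of the lepton plus massless-neutrino system."""
    e_nu = math.sqrt(met_x ** 2 + met_y ** 2 + pz_nu ** 2)
    e = lepton[0] + e_nu
    px, py, pz = lepton[1] + met_x, lepton[2] + met_y, lepton[3] + pz_nu
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
```

When the discriminant is negative, which happens if the transverse mass of the lepton and missing momentum exceeds $m_W$, the returned value no longer reproduces $m_W$ exactly.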
The leptonically decaying W boson is reconstructed from the charged lepton and the reconstructed neutrino. The leptonic top quark is reconstructed from the leptonic W boson and the b-tagged jet closest in ∆R to the charged lepton. The hadronic W boson is reconstructed from the two jets whose invariant mass is closest to the mass of the W boson; only jets that do not pass the b-tagging requirements are considered. Finally, the hadronic top quark is reconstructed from the hadronic W boson and the other b-jet. This choice yields the best performance of the algorithm in terms of the correspondence between the reconstruction level and particle level.
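The choice of the hadronic W boson candidate can be sketched as follows. This is an illustration only: jets are represented as hypothetical `(E, px, py, pz)` tuples, the list passed in is assumed to contain only jets failing the b-tagging requirement, and the W boson mass value is an assumption.

```python
import itertools
import math


def inv_mass(jet_a, jet_b):
    """Invariant mass of two jets given as (E, px, py, pz) tuples."""
    e, px, py, pz = (a + b for a, b in zip(jet_a, jet_b))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def pick_hadronic_w(untagged_jets, m_w=80.4):
    """Return the index pair of the two jets whose invariant mass is closest to m_w."""
    pairs = itertools.combinations(range(len(untagged_jets)), 2)
    return min(
        pairs,
        key=lambda ij: abs(inv_mass(untagged_jets[ij[0]], untagged_jets[ij[1]]) - m_w),
    )
```

With exactly two untagged jets there is a single candidate pair; the minimisation only matters in the 5- and 6-jet configurations, where several pairings exist.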
The performance of the algorithm was studied in each of the three configurations. The algorithm reconstructs the masses of the hadronic W boson and the top quark with similar performance in all three configurations. Hence, the presence of additional jets in the 5- and 6-jet configurations, where different combinations in the jet assignment to the W boson are possible, does not impact the reconstruction significantly.

Measured observables
The goal of this analysis is to measure differential cross sections for observables in regions of phase space sensitive to gluon radiation. Three observables are chosen because they are shown to be sensitive to radiation or other effects correlated with the number of jets: $p_{\mathrm{T}}^{t,\mathrm{had}}$, $p_{\mathrm{T}}^{t\bar{t}}$ and $p_{\mathrm{out}}^{t\bar{t}}$. Figure 4 shows the $p_{\mathrm{T}}^{t\bar{t}}$ distributions for the three configurations. The $p_{\mathrm{T}}^{t\bar{t}}$ distribution is expected to depend strongly on gluon radiation; if no additional jets beyond those of the tt decay are produced, the tt system should have small $p_{\mathrm{T}}^{t\bar{t}}$. If an additional jet is produced, the tt system recoils against it and hence should take larger $p_{\mathrm{T}}^{t\bar{t}}$ values. This effect is more pronounced with more additional jets, as observed in Figure 4. The $p_{\mathrm{T}}^{t,\mathrm{had}}$ distributions for the three configurations are shown in Figure 5. The predictions tend to underestimate (overestimate) the data at low (high) $p_{\mathrm{T}}^{t,\mathrm{had}}$. This effect is most clearly observed in the 4-jet configuration; a stress test performed on the unfolding (described in Section 8) demonstrated that the difference between data and prediction does not affect the results. The $p_{\mathrm{out}}^{t\bar{t}}$ distributions are shown in Figure 6; the shape of the measured distribution displays a small dependence on the number of additional jets.
The individual electron and muon channels have very similar corrections and give compatible results at reconstruction level. They are therefore combined by summing the distributions before the unfolding procedure.
For each observable, the unfolding procedure starts from the number of events at reconstruction level in bin j of the distribution ($N^{j}_{\mathrm{reco}}$), after subtracting the background events estimated as described in Section 5 ($N^{j}_{\mathrm{bg}}$). Next, the acceptance correction $f^{j}_{\mathrm{acc}}$ is defined as the ratio of the number of events passing both the particle- and reconstruction-level selections to the number of events passing the reconstruction-level selection. This factor corrects for events that are generated outside the fiducial phase space region but pass the reconstruction-level selection.
The reconstruction-level objects used to reconstruct the top quarks are required to be angularly matched to the corresponding particle-level object as assigned by the pseudo-top algorithm. The jets assigned to the W boson can be swapped. This requirement leads to a better correspondence between the particle and reconstruction levels. The matching requirement for the lepton, using the direction given by its associated track, is ∆R < 0.02 while jets are required to be within ∆R < 0.35. The matching correction $f^{j}_{\mathrm{match}}$ is defined as the ratio of events matched among the events passing both the particle-level and reconstruction-level selections for the same number of jets; it corrects for events in which a match is not found.
The unfolding step uses a migration matrix (M) derived from simulated tt events which maps the binned particle-level events to the binned reconstruction-level events. The probability for particle-level events to be reconstructed in the same bin is therefore represented by the elements on the diagonal, and the off-diagonal elements describe the fraction of particle-level events that migrate into other bins. Therefore, the elements of each row add up to unity (within rounding). The number of bins is optimised for maximum information extraction under stable unfolding conditions. This is achieved by requiring that closure and stress tests are satisfied without introducing any bias. The unfolding is performed using four iterations to balance the unfolding stability with respect to the previous iteration (below 0.1%) and the growth of the statistical uncertainty. The effect of varying the number of iterations by one was found to be negligible. Finally, the efficiency is defined as the ratio of the number of matched events to the number of events passing the particle-level selection. This factor corrects for the inefficiency of the reconstruction.
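As an illustration of the iterative step, the following is a minimal sketch of a D'Agostini-style Bayesian unfolding with a migration matrix whose rows (particle-level bins) sum to unity. It is not the implementation used in the analysis; the function name `bayes_unfold`, the explicit prior argument and the handling of empty bins are assumptions, and the efficiency correction is deliberately left out because the text applies it as a separate factor.

```python
def bayes_unfold(migration, reco, prior, n_iter=4):
    """Iterative Bayesian unfolding sketch.

    migration[i][j]: probability that an event in particle-level bin i
                     is reconstructed in bin j (each row sums to one).
    reco[j]:         corrected, background-subtracted reconstruction-level counts.
    prior[i]:        initial guess for the particle-level spectrum.
    """
    n_truth, n_reco = len(migration), len(reco)
    estimate = list(prior)
    for _ in range(n_iter):
        total = sum(estimate)
        p = [e / total for e in estimate]
        # Expected reconstruction-level shape for the current prior.
        folded = [sum(migration[i][j] * p[i] for i in range(n_truth))
                  for j in range(n_reco)]
        # Bayes' theorem gives P(particle bin i | reco bin j); apply it to the data.
        estimate = [
            sum(migration[i][j] * p[i] / folded[j] * reco[j]
                for j in range(n_reco) if folded[j] > 0.0)
            for i in range(n_truth)
        ]
    return estimate
```

In this form the total number of events is conserved at each iteration, and a prior equal to the true spectrum is a fixed point, which is the property probed by a closure test.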
The unfolding procedure for an observable X at particle level is summarised by the following expression for the absolute differential cross section:
$$\frac{\mathrm{d}\sigma^{\mathrm{fid}}}{\mathrm{d}X^{i}} \equiv \frac{1}{L \cdot \Delta X^{i}} \cdot \frac{1}{\varepsilon^{i}} \cdot \sum_{j} M^{-1}_{ij} \cdot f^{j}_{\mathrm{match}} \cdot f^{j}_{\mathrm{acc}} \cdot \left(N^{j}_{\mathrm{reco}} - N^{j}_{\mathrm{bg}}\right),$$
where the index j labels bins at reconstruction level while the i index labels bins at particle level; $\Delta X^{i}$ is the bin width, $\varepsilon^{i}$ is the efficiency, L is the integrated luminosity, and the Bayesian unfolding is symbolised by $M^{-1}_{ij}$. The integrated fiducial cross section $\sigma^{\mathrm{fid}}$ is obtained by integrating the unfolded cross section over the bins, and its value is used to compute the normalised differential cross section:
$$\frac{1}{\sigma^{\mathrm{fid}}} \cdot \frac{\mathrm{d}\sigma^{\mathrm{fid}}}{\mathrm{d}X^{i}}.$$
The unfolding of the observables is carried out independently in each configuration taking into account the bin-to-bin correlations within the distributions but not across jet multiplicity bins or among different observables within one jet multiplicity. Events that have a different number of jets at particle level and reconstruction level do not enter any migration matrix but are considered by the acceptance correction.
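The last steps of this chain, from unfolded event counts to absolute and normalised differential cross sections, can be sketched as below. The function name `cross_sections` and the example numbers are hypothetical; the sketch assumes that the unfolded counts have not yet been corrected for efficiency and that the luminosity is given in units matching the desired cross-section units.

```python
def cross_sections(unfolded, efficiency, widths, lumi):
    """Return (absolute, fiducial, normalised) cross sections per particle-level bin.

    unfolded[i]:   unfolded event count in bin i (before the efficiency correction)
    efficiency[i]: efficiency in bin i
    widths[i]:     bin width in the units of the observable
    lumi:          integrated luminosity (e.g. in pb^-1)
    """
    absolute = [n / (eff * lumi * w)
                for n, eff, w in zip(unfolded, efficiency, widths)]
    # Integrate the differential cross section over the bins.
    fiducial = sum(a * w for a, w in zip(absolute, widths))
    normalised = [a / fiducial for a in absolute]
    return absolute, fiducial, normalised


# Illustrative numbers only (not measured values).
absolute, fiducial, normalised = cross_sections(
    unfolded=[80.0, 20.0], efficiency=[0.5, 0.4], widths=[50.0, 100.0], lumi=3200.0
)
```

By construction the normalised cross section integrates to unity over the bins, which is why the normalised spectra have one fewer degree of freedom than the absolute ones.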

Systematic uncertainties
This section describes the estimation of systematic uncertainties related to object reconstruction and calibration, MC event generator modelling and background estimation. The uncertainty in the unfolded distribution is evaluated as follows. The considered distribution is varied at reconstruction level, unfolded using corrections from the nominal tt signal sample, and the unfolded distribution is compared to the particle-level distribution. All reconstruction- and background-related systematic uncertainties are evaluated using the nominal event generator, while alternative event generators are employed to assess uncertainties in the tt system modelling as discussed in Section 9.2. In these cases, the corrections derived from the nominal event generator are used to unfold the reconstruction-level spectra of the alternative event generator.
The covariance matrix incorporating statistical and systematic uncertainties is obtained for each observable by summing two covariance matrices. The first covariance matrix includes statistical and systematic uncertainties from detector effects and background estimation by using pseudo-experiments to combine the sources. The second covariance matrix is derived by adding four separate covariance matrices corresponding to the effects of the signal modelling: event generator, parton shower and hadronisation, initial- and final-state radiation (ISR/FSR) and PDF uncertainties. The bin-to-bin correlation values are set to unity for all these matrices.
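A covariance matrix with unit bin-to-bin correlation can be written directly from the per-bin uncertainties, and the matrices are then summed element by element. The helper names `fully_correlated_cov` and `add_cov` are hypothetical, and the inputs are assumed to be absolute per-bin uncertainties.

```python
def fully_correlated_cov(sigma):
    """Covariance with correlation 1 between all bins: cov[i][j] = sigma_i * sigma_j."""
    return [[si * sj for sj in sigma] for si in sigma]


def add_cov(*matrices):
    """Element-wise sum of square matrices of equal size."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


# Illustrative per-bin uncertainties for two modelling sources (not measured values).
generator = fully_correlated_cov([1.0, 2.0])
shower = fully_correlated_cov([0.5, 0.5])
modelling = add_cov(generator, shower)
```

Each fully correlated matrix has rank one, so the sum over modelling sources only becomes invertible once the first (pseudo-experiment) matrix is added.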
The covariance matrices due to the statistical and systematic uncertainties are obtained for each observable by evaluating the covariance between the kinematic bins using pseudo-experiments. In particular, the correlations due to statistical fluctuations from the size of both the data sample and the simulated signal samples are evaluated by varying the event counts independently in every bin before unfolding, and then propagating the resulting variations through the unfolding. The full description of the method is provided in Ref. [73].

Experimental uncertainties
The jet energy scale (JES) uncertainty is estimated using a combination of simulations, test-beam data and in situ measurements [64, 74, 75]. Additional contributions from jet-flavour composition, η-intercalibration, hadrons passing through the calorimeter without interacting (punch-through), single-particle response, calorimeter response to different jet flavours, and pile-up are considered, resulting in 19 eigenvector uncertainty components. The uncertainty in the jet energy resolution (JER) is obtained with an in situ measurement of the jet response in dijet events [76].
The efficiency to tag jets containing b-hadrons is corrected in simulated events by applying scale factors, extracted from a tt dilepton sample, to account for the residual difference between data and simulation. Scale factors are also applied for jets originating from light or charm quarks that are misidentified as b-jets. The associated flavour-tagging uncertainties, split into eigenvector components, are computed by varying the scale factors within their uncertainties [77][78][79].
The lepton reconstruction efficiency in simulated events is corrected by scale factors derived from measurements of these efficiencies in data using a control region enriched in Z → ℓ⁺ℓ⁻ events.

Signal modelling uncertainties
Uncertainties in the signal modelling affect the kinematic properties of simulated tt events as well as reconstruction- and particle-level efficiencies. To assess the uncertainty related to the matrix-element model and matching algorithm used in the MC event generator for the tt signal process, events simulated with MadGraph5_aMC@NLO+Herwig++ are unfolded using the migration matrix and correction factors derived from an alternative Powheg+Herwig++ sample. The difference between the unfolded distribution and the known particle-level distribution of the MadGraph5_aMC@NLO+Herwig++ sample is assigned as the uncertainty, which is then symmetrised.
To assess the impact of different parton-shower models, events simulated with Powheg interfaced to Herwig++ are unfolded using the migration matrix and correction factors derived with the nominal sample. The difference between the unfolded distribution and the known particle-level distribution of the Powheg+Herwig++ sample is assigned as the relative uncertainty, which is then symmetrised.
To evaluate the uncertainties related to the modelling of the initial- and final-state gluon radiation (ISR/FSR), tt MC samples with modified ISR/FSR modelling are used. The MC samples used for the evaluation of this uncertainty are generated using the Powheg event generator interfaced to the Pythia shower model, where the parameters are varied as described in Section 3.
The impact of the uncertainty related to the PDF is assessed using the tt sample generated with MadGraph5_aMC@NLO interfaced to Herwig++. PDF-varied corrections and response matrix for the unfolding procedure are obtained by reweighting the central PDF4LHC15 PDF set to the full set of its 30 eigenvectors as described in Ref. [42]. Using these corrections, the central MadGraph5_aMC@NLO+Herwig++ distribution is unfolded, the relative difference is computed with respect to the expected central particle-level spectrum, and the total uncertainty is obtained by adding these relative differences in quadrature. In addition, the difference between the central PDF4LHC15 and CT10 is evaluated in a similar way and added in quadrature to the PDF uncertainty.
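The combination of the PDF variations in quadrature can be sketched per bin as follows. The function name `pdf_uncertainty` and the input layout (one list of per-bin relative differences per eigenvector, plus one list for the CT10 comparison) are assumptions made for illustration.

```python
import math


def pdf_uncertainty(eigen_rel_diffs, ct10_rel_diff):
    """Per-bin relative PDF uncertainty.

    eigen_rel_diffs: one list of per-bin relative differences for each eigenvector
                     variation (30 lists for PDF4LHC15 in the text above).
    ct10_rel_diff:   per-bin relative difference between the central PDF4LHC15
                     and CT10 results.
    """
    n_bins = len(ct10_rel_diff)
    return [
        math.sqrt(sum(var[b] ** 2 for var in eigen_rel_diffs) + ct10_rel_diff[b] ** 2)
        for b in range(n_bins)
    ]
```

For example, two eigenvector variations of 3% and 4% in a bin with no CT10 difference combine to a 5% uncertainty in that bin.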

Background modelling uncertainties
Systematic uncertainties affecting the backgrounds evaluated with MC simulation are estimated using an alternative background MC sample produced by rescaling the nominal background sample. The alternative sample, instead of the nominal one, is subtracted from data. The uncertainty is evaluated as the difference between the unfolded distribution using the alternative background MC sample and the nominal one.
A 15% normalisation uncertainty is applied to the single-top quark background. This includes the uncertainty associated with the emission of additional radiation which is evaluated to be smaller than 15%. The 5% theoretical uncertainty in the normalisation is also included.
In the case of the Z+jets and diboson backgrounds, the uncertainties include a contribution from the overall cross section normalisation as well as an additional 24% uncertainty per additional jet [80]: 48%, 72% and 96% in the 4-jet, 5-jet and 6-jet configurations, respectively.
The systematic uncertainties due to the overall normalisation and the heavy-flavour fractions of W+jets events are obtained by varying the data-driven scale factors. The overall impact of these uncertainties is less than 2%. Each detector-related systematic uncertainty includes its impact on the W+jets estimate. In addition, a 24% uncertainty per radiated jet, as described for the Z+jets and diboson samples, is applied to the W+jets background uncertainty.
The uncertainty in the background from non-prompt and fake leptons is evaluated by changing the selection used to define the control region and propagating the statistical uncertainty of the parametrisations of the efficiency to pass the tighter lepton requirements for real and fake leptons.
In addition, an extra 50% normalisation uncertainty is applied to this background to account for the remaining mis-modelling observed in various control regions. This systematic uncertainty also includes the impact of the normalisation on the estimation of the W+jets background.

Size of the simulated samples and luminosity uncertainty
Test distributions, created with independent Poisson fluctuations of the event count in each bin, are unfolded to account for the size of the simulated samples. The uncertainty is the standard deviation given by all unfolded distributions.
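The pseudo-experiment procedure can be sketched with the standard library alone. Since Python's `random` module has no Poisson sampler, Knuth's multiplication method is used here as a stand-in; it is adequate only for moderate bin contents. The names `poisson` and `toy_uncertainty`, and the `unfold` callable passed in, are hypothetical, and for simplicity the sketch fluctuates a single input spectrum rather than every simulated ingredient of the corrections.

```python
import math
import random
import statistics


def poisson(rng, mean):
    """Draw a Poisson-distributed integer with Knuth's method (moderate means only)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def toy_uncertainty(counts, unfold, n_toys=1000, seed=1):
    """Standard deviation per bin of the unfolded Poisson-fluctuated spectra."""
    rng = random.Random(seed)
    toys = [unfold([poisson(rng, c) for c in counts]) for _ in range(n_toys)]
    return [statistics.stdev([toy[b] for toy in toys]) for b in range(len(counts))]
```

With an identity `unfold`, the result approaches the square root of each bin content, which provides a simple sanity check before a real unfolding function is plugged in.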
The uncertainty in the integrated luminosity is 2.1% and is derived, following techniques similar to those described in Ref. [81], from the luminosity scale calibration using a pair of x-y beam-separation scans performed in August 2015.
Figure 7 presents the uncertainties in the differential cross sections as a function of $p_{\mathrm{T}}^{t,\mathrm{had}}$ in the tt fiducial phase space. The uncertainties are between 8% and 25% for the absolute cross sections and between 4% and 9% in almost the full range of the normalised cross sections. In all configurations the uncertainties are larger at the low and high ends of the spectrum; this shape is due to the combination of different components. The background and JES uncertainties are larger at low $p_{\mathrm{T}}$ and decrease with increasing $p_{\mathrm{T}}$, while the signal uncertainties have the opposite behaviour. Comparing Figures 7(b), 7(c) and 7(d) shows that the JES uncertainty increases with the number of jets and is the dominant uncertainty in the 6-jet configuration. The uncertainties for the other observables have similar values and behaviour. In the 4-jet configuration, the dominant uncertainty is due to flavour-tagging. The total uncertainties are reduced for the normalised cross sections because correlated uncertainties, such as the flavour-tagging and the JES uncertainties, cancel out, as seen by comparing Figures 7(a) and 7(b).

Results and comparisons with predictions
The measured differential cross sections as functions of $p_{\mathrm{T}}^{t,\mathrm{had}}$, $p_{\mathrm{T}}^{t\bar{t}}$ and $p_{\mathrm{out}}^{t\bar{t}}$ are shown in Figures 8-10 for the three configurations. All absolute differential cross sections are presented while only a selection of the normalised results is presented in which shape effects are more visible (this includes the $p_{\mathrm{T}}^{t,\mathrm{had}}$ results in the 4-jet configuration and the $p_{\mathrm{T}}^{t\bar{t}}$ and $p_{\mathrm{out}}^{t\bar{t}}$ results in the 6-jet configuration). Several MC predictions are compared to data; a subset of the most relevant predictions is shown in the figures while the compatibility to data is tested for a comprehensive list of MC predictions and shown in Tables 3-8.
The level of agreement between the measured differential cross sections and the predictions is quantified using $\chi^2$ values which are evaluated employing the full covariance matrices of the uncertainties; the uncertainties in the theoretical predictions are not included in this evaluation. The p-values (probabilities that the $\chi^2$ is larger than or equal to the observed value) are then evaluated from the $\chi^2$ and the number of degrees of freedom (NDF). The detailed procedure for the calculation of the $\chi^2$ and p-values is described in Ref. [1].
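A minimal sketch of such a compatibility test is given below, assuming the usual form of the $\chi^2$ as the quadratic form of the data-prediction differences with the inverse covariance matrix; the detailed procedure of Ref. [1] is not reproduced here. The helper names are hypothetical. To stay self-contained, the linear system is solved by Gaussian elimination and the p-value uses the closed-form recurrence of the upper incomplete gamma function for integer NDF.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def chi2(data, prediction, cov):
    """chi2 = v^T cov^-1 v with v = data - prediction."""
    v = [d - p for d, p in zip(data, prediction)]
    return sum(vi * xi for vi, xi in zip(v, solve(cov, v)))


def chi2_pvalue(chi2_value, ndf):
    """Probability of a chi2 at least as large as chi2_value for integer ndf >= 1."""
    y = 0.5 * chi2_value
    if ndf % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < 0.5 * ndf:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q
```

For a normalised spectrum the covariance matrix is singular, so in practice one bin has to be removed before the inversion, consistent with the NDF being the number of bins minus one.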
The differential cross section as a function of $p_{\mathrm{T}}^{t,\mathrm{had}}$ is shown in Figure 8. All MC predictions underestimate (overestimate) the data at low (high) values of $p_{\mathrm{T}}^{t,\mathrm{had}}$; this tendency is reduced at higher jet multiplicity. This is consistent with the CMS results for the same observable and various jet multiplicities [12]. In addition, the results obtained here improve the understanding of similar effects observed in previous ATLAS analyses [1,8]; the effect is mainly due to events with exactly four jets. The $\chi^2$ values for all predictions and all configurations are shown in Tables 3 and 4 for the absolute and normalised differential cross sections, respectively. In general, all predictions are compatible with the data in the 5- and 6-jet configurations for both the absolute and normalised differential cross sections while there is tension for the 4-jet configuration, especially for the absolute differential cross section. The main exception is the prediction obtained from the Powheg+Herwig++ calculation, which is inconsistent with the measured differential cross sections for the 4- and 6-jet configurations.
The differential cross sections as a function of $p_{\mathrm{T}}^{t\bar{t}}$ for different jet multiplicities are shown in Figure 9 and the $\chi^2$ values are presented in Tables 5 and 6. In general, good agreement is observed in the 4- and 5-jet configurations while there is some tension in the 6-jet configuration. However, the $\chi^2$ values show that the MadGraph5_aMC@NLO event generator is not compatible with data in the 4- and 5-jet configurations in both the absolute and normalised differential cross sections. This was not observed in the measurement inclusive in the number of jets [1] because different configurations are dominant at different values of $p_{\mathrm{T}}^{t\bar{t}}$. Indeed, the absolute cross section in the first two bins of the 4-jet configuration is larger than in the other configurations while the cross section in the last two bins is largest in the 6-jet configuration. Since the mis-modelling is observed in regions of $p_{\mathrm{T}}$ in which the cross section in that configuration is subdominant, it could not be observed in the previous measurement. The Powheg+Herwig++ prediction does not model the data well in all configurations. Furthermore, both Powheg+Pythia calculations with additional radiation ('radHi' and 'Var3c Up') are not compatible with the data for the 4- and 6-jet configurations for the absolute differential cross sections as shown in Table 5.
The differential cross sections as functions of $p_{\mathrm{out}}^{t\bar{t}}$ are shown in Figure 10 and confirm the mis-modelling of the MadGraph5_aMC@NLO prediction for the 4- and 5-jet configurations observed for $p_{\mathrm{T}}^{t\bar{t}}$. The p-values shown in Tables 7 and 8 drop significantly at higher jet multiplicity for all predictions. Several predictions are not compatible with the absolute cross sections in the 6-jet configuration but have better agreement with the normalised cross sections; nevertheless, some discrimination is still observed with the normalised cross sections. As before, the Powheg+Herwig++ prediction is not compatible with data in the 5-jet configuration.
The complementarity of $p_{\mathrm{out}}^{t\bar{t}}$ and $p_{\mathrm{T}}^{t\bar{t}}$ is highlighted by the different agreement with data of the Powheg+Herwig7 prediction in the 6-jet configuration; in $p_{\mathrm{out}}^{t\bar{t}}$ the agreement is poor (p-value of 0.06) while it is good in the $p_{\mathrm{T}}^{t\bar{t}}$ observable (p-value of 0.54). Conversely, in the 4-jet configuration the Powheg+Pythia 6.428 'radHi' prediction has a poor agreement in the $p_{\mathrm{T}}^{t\bar{t}}$ variable (p-value 0.01) while it is in good agreement in the $p_{\mathrm{out}}^{t\bar{t}}$ observable (p-value 0.32). An example of the discriminating power of the analysis is given in Figure 11; several predictions with different values of the factorisation and renormalisation scales and of the $h_{\mathrm{damp}}$ parameter are compared to the measured differential cross sections for the 6-jet configuration. From the comparison shown in Figure 11(a), it can be seen that among the three Pythia6 predictions the best agreement is obtained by the 'radLo' calculation, which is tuned to yield a lower amount of gluon radiation. This sample has $h_{\mathrm{damp}} = m_t$ and the factorisation and renormalisation scales increased by a factor of two compared to their nominal value. Since the $h_{\mathrm{damp}}$ parameter in the 'radLo' calculation is the same as the one in the nominal sample, it is possible to conclude that the different behaviour is due to the scale variation. A similar conclusion can be drawn for the comparison of the Powheg+Pythia8 samples in Figure 11(c), where the 'Var3c Down' calculation shows the best agreement. Changing $h_{\mathrm{damp}}$ has a small impact, as shown in the comparison of the two Powheg+Pythia8 predictions with different $h_{\mathrm{damp}}$ presented in Figure 11(b). The relative levels of agreement between data and the 'radHi' and 'radLo' predictions in the 6-jet configuration are opposite to what was observed in Ref. [4], where the 'radHi' prediction was observed to have a better agreement with data, e.g. for the jet multiplicity spectrum. This is not the only difference between the results of the two analyses; for example, in Ref. [4] MadGraph5_aMC@NLO+Herwig++ was compatible with data while it is not compatible in some of the combinations of variables and jet multiplicity considered in this paper. It is clear that MC models have difficulty describing the sets of observables listed in the two papers simultaneously, but it is hoped that the sensitivity of the measurements shown here with respect to various MC parameters will provide constraints for any future MC models.
Figure 12 shows the ratio of the data to the nominal prediction for the normalised $p_{\mathrm{T}}^{t,\mathrm{had}}$ and $p_{\mathrm{T}}^{t\bar{t}}$ differential cross sections for the three configurations. It can be seen that the differences between the data and the prediction are largest for the 4-jet configuration. The description of the 5- and 6-jet configurations by the prediction is slightly better. For $p_{\mathrm{T}}^{t\bar{t}}$, the conclusions are less clear, and a reduction of the uncertainties would help to discriminate between the different predictions.
Table 3: Comparison of the measured fiducial phase space absolute differential cross sections as a function of $p_{\mathrm{T}}^{t,\mathrm{had}}$ and the predictions from several MC generators in different n-jet configurations. For each prediction a $\chi^2$ and a p-value are calculated using the covariance matrix of the measured spectrum. The number of degrees of freedom (NDF) is equal to the number of bins in the distribution.
Table 4: Comparison of the measured fiducial phase space normalised differential cross sections as a function of $p_{\mathrm{T}}^{t,\mathrm{had}}$ and the predictions from several MC generators in different n-jet configurations. For each prediction a $\chi^2$ and a p-value are calculated using the covariance matrix of the measured spectrum. The number of degrees of freedom (NDF) is equal to the number of bins in the distribution minus one.
Table 5: Comparison of the measured fiducial phase space absolute differential cross sections as a function of $p_{\mathrm{T}}^{t\bar{t}}$ and the predictions from several MC generators in different n-jet configurations. For each prediction a $\chi^2$ and a p-value are calculated using the covariance matrix of the measured spectrum. The number of degrees of freedom (NDF) is equal to the number of bins in the distribution.
Table 6: Comparison of the measured fiducial phase space normalised differential cross sections as a function of $p_{\mathrm{T}}^{t\bar{t}}$ and the predictions from several MC generators in different n-jet configurations. For each prediction a $\chi^2$ and a p-value are calculated using the covariance matrix of the measured spectrum. The number of degrees of freedom (NDF) is equal to the number of bins in the distribution minus one.
Table 7: Comparison of the measured fiducial phase space absolute differential cross sections as a function of $p_{\mathrm{out}}^{t\bar{t}}$ and the predictions from several MC generators in different n-jet configurations. For each prediction a $\chi^2$ and a p-value are calculated using the covariance matrix of the measured spectrum. The number of degrees of freedom (NDF) is equal to the number of bins in the distribution.
Table 8: Comparison of the measured fiducial phase space normalised differential cross sections as a function of $p_{\mathrm{out}}^{t\bar{t}}$ and the predictions from several MC generators in different n-jet configurations. For each prediction a $\chi^2$ and a p-value are calculated using the covariance matrix of the measured spectrum. The number of degrees of freedom (NDF) is equal to the number of bins in the distribution minus one.
Figure 11: Normalised differential cross sections as a function of $p_{\mathrm{T}}^{t\bar{t}}$ in the 6-jet inclusive configuration in the fiducial phase space. The dark shaded area is the statistical uncertainty and the light shaded area represents the total uncertainty.

Conclusions
Measurements of differential cross sections for top quark pair production in association with jets are presented using data from the 13 TeV pp collisions collected by the ATLAS detector at the LHC in 2015, corresponding to an integrated luminosity of 3.2 fb$^{-1}$. Both the absolute and normalised differential cross sections are measured as functions of the top quark transverse momentum, the transverse momentum of the top quark pair system and the out-of-plane transverse momentum. The top quark pair events are selected in the lepton (electron or muon) + jets channel and three mutually exclusive configurations are defined according to the number of additional jets reconstructed in each event. Regions of phase space sensitive to the effects of gluon radiation are identified. The predictions of several Monte Carlo calculations are compared to the measurements. Differences between the data and some of the predictions are observed. The measured $p_{\mathrm{out}}^{t\bar{t}}$ and $p_{\mathrm{T}}^{t\bar{t}}$ distributions in the 6-jet configuration disfavour several predictions. The measured $p_{\mathrm{T}}^{t,\mathrm{had}}$ distribution in the 4-jet configuration is underestimated by the predictions at low values and overestimated at high values; this tendency of the predictions is reduced at higher jet multiplicity. Overall, the measurements presented here improve the discriminating power of previous ATLAS results and the data have the potential to further constrain the MC models used to describe top quark pair production.