Measurements of top-quark pair differential cross-sections in the lepton+jets channel in $pp$ collisions at $\sqrt{s}=8$ TeV using the ATLAS detector

Measurements of normalized differential cross-sections of top-quark pair production are presented as a function of the top-quark, $t\bar{t}$ system and event-level kinematic observables in proton-proton collisions at a centre-of-mass energy of $\sqrt{s}=8$ TeV. The observables have been chosen to emphasize the $t\bar{t}$ production process and to be sensitive to effects of initial- and final-state radiation, to the different parton distribution functions, and to non-resonant processes and higher-order corrections. The dataset corresponds to an integrated luminosity of 20.3 fb$^{-1}$, recorded in 2012 with the ATLAS detector at the CERN Large Hadron Collider. Events are selected in the lepton+jets channel, requiring exactly one charged lepton and at least four jets with at least two of the jets tagged as originating from a $b$-quark. The measured spectra are corrected for detector effects and are compared to several Monte Carlo simulations. The results are in fair agreement with the predictions over a wide kinematic range. Nevertheless, most generators predict a harder top-quark transverse momentum distribution at high values than what is observed in the data. Predictions beyond NLO accuracy improve the agreement with data at high top-quark transverse momenta. Using the current settings and parton distribution functions, the rapidity distributions are not well modelled by any generator under consideration. However, the level of agreement is improved when more recent sets of parton distribution functions are used.

Introduction
The large top-quark pair production cross-section at the LHC allows detailed studies of the characteristics of tt production to be performed with respect to different kinematic variables, providing a unique opportunity to test the Standard Model (SM) at the TeV scale. Furthermore, effects beyond the SM can appear as modifications of tt differential distributions with respect to the SM predictions [1] which may not be detectable with an inclusive cross-section measurement. A precise measurement of the tt differential cross-section therefore has the potential to enhance the sensitivity to possible effects beyond the SM, as well as to clarify how well the theoretical calculations describe the cross-section.
The ATLAS [2-4] and CMS [5] experiments have published measurements of the tt differential cross-sections at a centre-of-mass energy √s = 7 TeV in pp collisions, both in the full phase space using parton-level variables and in fiducial phase-space regions using observables constructed from final-state particles (particle level); the CMS experiment also published measurements of the tt differential cross-sections with data taken at √s = 8 TeV [6]. The results presented here represent the natural extension of the previous ATLAS measurements of the tt differential cross-sections to the √s = 8 TeV dataset, and benefit from higher statistics and reduced detector uncertainties.
In the SM, the top quark decays almost exclusively into a W boson and a b-quark. The signature of a tt decay is therefore determined by the W boson decay modes. This analysis makes use of the lepton+jets tt decay mode, where one W boson decays into an electron or a muon and a neutrino and the other W boson decays into a pair of quarks, with the two decay modes referred to as the e+jets and µ+jets channel, respectively. Events in which the W boson decays to an electron or muon through a τ lepton decay are also included. This paper presents a set of measurements of the tt production cross-section as a function of different properties of the reconstructed top quark and of the tt system. The results, unfolded both to a fiducial particle-level phase space and to the full phase space, are compared to the predictions of Monte Carlo (MC) generators and to perturbative QCD calculations beyond the next-to-leading-order (NLO) approximation. The goal of unfolding to a fiducial particle-level phase space and of using variables directly related to detector observables is to allow precision tests of QCD, avoiding large model-dependent extrapolation corrections to the parton-level top-quark and to a phase space region outside the detector sensitivity. However, full phase-space measurements represent a valid test of higher-order calculations for which event generation with subsequent parton showering and hadronization is not yet available. A subset of the observables under consideration has been measured by CMS [5].
In addition to the variables measured at √s = 7 TeV [2-4], a set of new measurements is presented. These variables, similar to those used in dijet measurements at large jet transverse momentum [7,8], are sensitive to effects of initial- and final-state radiation, to the different parton distribution functions (PDF), and to non-resonant processes including particles beyond the Standard Model [9]. Finally, observables constructed as a function of the transverse momenta of the W boson and the b-quark originating from the top quark have been found to be sensitive to non-resonant effects (when one or both top quarks are off-shell) [10] and non-factorizable higher-order corrections [11].
The paper is organized as follows: Section 2 briefly describes the ATLAS detector, while Section 3 describes the data and simulation samples used in the measurements. The reconstruction of physics objects and the event selection are explained in Section 4. Section 5 describes the kinematic reconstruction of the tt pairs using the pseudo-top algorithm. Section 6 discusses the background processes affecting these measurements. Event yields for both the signal and background samples, as well as distributions of measured quantities before unfolding, are shown in Section 7. The measurements of the cross-sections are described in Section 8. Statistical and systematic uncertainties are discussed in Section 9. The results are presented in Section 10, where the comparison with theoretical predictions is also discussed. Finally, a summary is presented in Section 11.

The ATLAS detector
ATLAS is a multi-purpose detector [12] that provides nearly full solid angle coverage around the interaction point. This analysis exploits all major components of the detector. Charged-particle trajectories with pseudorapidity |η| < 2.5 are reconstructed in the inner detector, which comprises a silicon pixel detector, a silicon microstrip detector and a transition radiation tracker (TRT). The inner detector is embedded in a 2 T axial magnetic field. Sampling calorimeters with several different designs span the pseudorapidity range up to |η| = 4.9. High-granularity liquid argon (LAr) electromagnetic (EM) calorimeters are available up to |η| = 3.2. Hadronic calorimeters based on scintillator-tile active material cover |η| < 1.7 while LAr technology is used for hadronic calorimetry from |η| = 1.5 to |η| = 4.9. The calorimeters are surrounded by a muon spectrometer within a magnetic field provided by air-core toroid magnets with a bending integral of about 2.5 Tm in the barrel and up to 6 Tm in the endcaps. Three stations of precision drift tubes and cathode-strip chambers provide an accurate measurement of the muon track curvature in the region |η| < 2.7. Resistive-plate and thin-gap chambers provide muon triggering capability up to |η| = 2.4. Data are selected from inclusive pp interactions using a three-level trigger system. A hardware-based trigger (L1) uses custom-made hardware and low-granularity detector data to initially reduce the trigger rate to approximately 75 kHz. The detector readout is then available for two stages of software-based triggers. In the second level (L2), the trigger has access to the full detector granularity, but only retrieves data for regions of the detector identified by L1 as containing interesting objects. Finally, the Event Filter (EF) system makes use of the full detector readout to finalize the event selection.
During the 2012 run period, the selected event rate for all triggers following the event filter was approximately 400 Hz. the luminosity scale derived from beam-separation scans. The average number of interactions per bunch crossing in 2012 was 21. Data events are considered only if they are acquired under stable beam conditions and with all sub-detectors operational. The data sample is collected using single-lepton triggers; for each lepton type the logical OR of two triggers is used in order to increase the efficiency for isolated leptons at low transverse momentum. The triggers with the lower p T thresholds include isolation requirements on the candidate lepton, resulting in inefficiencies at high p T that are recovered by the triggers with higher p T thresholds. For electrons the two transverse momentum thresholds are 24 GeV and 60 GeV while for muons the thresholds are 24 GeV and 36 GeV.
Simulated samples are used to characterize the detector response and efficiency to reconstruct tt events, estimate systematic uncertainties and predict the background contributions from various processes. The response of the detector is simulated [14] using a detailed model implemented in GEANT4 [15]. For the evaluation of some systematic uncertainties, generated samples are passed through a fast simulation using a parameterization of the performance of the ATLAS electromagnetic and hadronic calorimeters [16]. Simulated events include the effect of multiple pp collisions from the same and previous bunch-crossings (in-time and out-of-time pile-up) and are re-weighted to match the same number of collisions as observed in data. All simulated samples are normalized to the integrated luminosity of the data sample; in the normalization procedure the most precise cross-section calculations available are used.
The nominal signal tt sample is generated using the Powheg-Box [17] generator, based on next-to-leading-order QCD matrix elements. The CT10 [18] parton distribution functions are employed and the top-quark mass ($m_t$) is set to 172.5 GeV. The $h_{\mathrm{damp}}$ parameter, which effectively regulates the high-p T radiation in Powheg, is set to the top-quark mass. Parton showering and hadronization are simulated with Pythia [19] (version 6.427) using the Perugia 2011C set of tuned parameters [20]. The effect of the systematic uncertainties related to the PDF for the signal simulation is evaluated using samples generated with MC@NLO [21] (version 4.01) using the CT10nlo PDF set, interfaced to Herwig [22] (version 6.520) for parton showering and hadronization, and Jimmy [23] (version 4.31) for the modelling of multiple parton scattering. For the evaluation of systematic uncertainties due to the parton showering model, a Powheg+Herwig sample is compared to a Powheg+Pythia sample. The $h_{\mathrm{damp}}$ parameter in the Powheg+Herwig sample is set to infinity. The uncertainties due to QCD initial- and final-state radiation (ISR/FSR) modelling are estimated with samples generated with Powheg-Box interfaced to Pythia for which the parameters of the generation ($\Lambda_{\mathrm{QCD}}$, $Q^2_{\mathrm{max}}$ scale, transverse momentum scale for space-like parton-shower evolution and the $h_{\mathrm{damp}}$ parameter) are varied to span the ranges compatible with the results of measurements of tt production in association with jets [24-26]. Finally, two additional tt samples are used only in the comparison against data. The first one is a sample of Powheg matrix elements generated with the nominal settings interfaced to Pythia8 [27] (version 8.186 and Main31 user hook) and the AU14 [28] set of tuned parameters.
In the second sample, MadGraph [29] tt matrix elements with up to three additional partons are interfaced to Pythia using the matrix-element to parton-shower MLM matching scheme [30] and the Perugia 2011C set of tuned parameters [20].
The tt samples are normalized to the NNLO+NNLL cross-section of $\sigma_{t\bar{t}} = 253^{+13}_{-15}$ pb (scale, PDF and $\alpha_S$), evaluated using the Top++2.0 program [31], which includes the next-to-next-to-leading-order QCD corrections and resums next-to-next-to-leading logarithmic soft gluon terms [32-37]. The quoted cross-section corresponds to a top-quark mass of 172.5 GeV. Each tt sample is produced requiring at least one semileptonic decay in the tt pair.
Single-top-quark processes for the s-channel, t-channel and Wt associated production constitute the largest background in this analysis. These processes are simulated with Powheg-Box using the PDF set CT10 and showered with Pythia (version 6.427) calibrated with the P2011C tune [20] and the PDF set CTEQ6L1 [38]. All possible production channels containing one lepton in the final state are considered. All samples are generated requiring the presence of a leptonically decaying W boson. The cross-sections multiplied by the branching ratios for the leptonic W decay employed for these processes are normalized to NLO+NNLL calculations [39-41].
Leptonic decays of vector bosons produced in association with high-p T jets, referred to as W+jets and Z+jets, constitute the second largest background in this analysis. Samples of simulated W/Z+jets events with up to five additional partons in the LO matrix elements are produced with the Alpgen generator (version 2.13) [42] using the PDF set CTEQ6L1 [38] and interfaced to Pythia (version 6.427) for parton showering; the overlap between samples is dealt with by using the MLM matching scheme [30]. Heavy-flavour quarks are included in the matrix-element calculations to produce the Wbb, Wcc, Wc, Zbb and Zcc samples. The overlap between the heavy-flavour quarks produced by the matrix element and by the parton shower is removed. The W+jets samples are normalized to the inclusive W boson NNLO cross-section [43,44] and corrected by applying additional scale factors derived from data, as described in Section 6.
Diboson production is modelled using Herwig and Jimmy with the CTEQ6L1 PDF set [38] and the yields are normalized using the NLO cross-sections [45]. All possible production channels containing at least one lepton in the final states are considered.

Object definition and event selection
The lepton+jets tt decay mode is characterized by the presence of a high-p T lepton, missing transverse momentum due to the neutrino, two jets originating from b-quarks, and two jets from the hadronic W boson decay.
The following sections describe the detector-level, particle-level and parton-level objects used to characterize the final-state event topology and to define a fiducial phase-space region for the measurements. E T > 25 GeV. The associated track must have a longitudinal impact parameter |z 0 | < 2 mm with respect to the primary vertex. Isolation requirements, on calorimeter and tracking variables, are used to reduce the background from non-prompt electrons. The calorimeter isolation variable is based on the energy sum of cells within a cone of size ∆R < 0.2 around the direction of each electron candidate. This energy sum excludes cells associated with the electron cluster and is corrected for leakage from the electron cluster itself and for energy deposits from pile-up. The tracking isolation variable is based on the track p T sum around the electron in a cone of size ∆R < 0.3, excluding the electron track. In every p T bin both requirements are chosen to result separately in a 90% electron selection efficiency for prompt electrons from Z boson decays.
Muon candidates are defined by matching tracks in the muon spectrometer with tracks in the inner detector. The track p T is determined through a global fit of the hits which takes into account the energy loss in the calorimeters. The track is required to have |z 0 | < 2 mm and a transverse impact parameter significance, |d 0 /σ(d 0 )| < 3, consistent with originating in the hard interaction. Muons are required to have p T > 25 GeV and be within |η| < 2.5. To reduce the background from muons originating from heavy-flavour decays inside jets, muons are required to be separated by ∆R > 0.4 from the nearest jet, and to be isolated. They are required to satisfy the isolation requirement I < 0.05, where the isolation variable is the ratio of the sum of p T of tracks, excluding the muon, in a cone of variable size ∆R = 10 GeV/p T (µ) to the p T of the muon [46]. The isolation requirement has an efficiency of about 97% for prompt muons from Z boson decays.
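As an illustration of the variable-cone isolation described above, the following Python sketch computes I for a muon from a list of tracks. The dictionary layout, the function names and the strict treatment of the cone boundary are assumptions made for this example and are not taken from the ATLAS software.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with the phi difference wrapped."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def muon_isolation(muon, tracks):
    """Summed track pT inside a cone of size 10 GeV / pT(mu), divided by pT(mu).

    `muon` and every track are dicts with keys 'pt' (GeV), 'eta' and 'phi'
    (hypothetical layout); `tracks` must not contain the muon's own track.
    """
    cone = 10.0 / muon["pt"]
    pt_sum = sum(
        t["pt"]
        for t in tracks
        if delta_r(muon["eta"], muon["phi"], t["eta"], t["phi"]) < cone
    )
    return pt_sum / muon["pt"]

def is_isolated(muon, tracks, max_iso=0.05):
    """Apply the requirement I < 0.05 quoted in the text."""
    return muon_isolation(muon, tracks) < max_iso
```

For a 50 GeV muon the cone size is 0.2, so a track at ∆R = 0.3 does not enter the sum. The cone shrinks for harder muons, which keeps nearby activity from boosted topologies out of the isolation sum.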
Jets are reconstructed using the anti-k t algorithm [47] implemented in the FastJet package [48] with radius parameter R = 0.4. The jet reconstruction starts from topological clusters calibrated and corrected for pile-up effects using the jet area method [49]. A residual correction dependent on the instantaneous luminosity and the number of reconstructed primary vertices in the event [50] is then applied. Jets are calibrated using an energy- and η-dependent simulation-based calibration scheme, with in situ corrections based on data [51], and are accepted if p T > 25 GeV and |η| < 2.5. To reduce the contribution from jets associated with pile-up, jets with p T < 50 GeV are required to satisfy |JVF| > 0.5, where JVF is the ratio of the sum of the p T of tracks associated with both the jet and the primary vertex, to the sum of p T of all tracks associated with the jet. Jets with no associated tracks or with |η| > 2.4 at the edge of the tracker acceptance are always accepted.
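The pile-up suppression requirement can be sketched in a few lines of Python. The representation of a jet's tracks as (p T, from-primary-vertex) pairs and the function names are assumptions made for illustration only.

```python
def jvf(jet_tracks):
    """Jet vertex fraction for a list of (pt, from_primary_vertex) pairs.

    Returns None when the jet has no associated tracks (hypothetical
    convention chosen for this sketch).
    """
    total = sum(pt for pt, _ in jet_tracks)
    if total == 0:
        return None
    from_pv = sum(pt for pt, is_pv in jet_tracks if is_pv)
    return from_pv / total

def passes_jvf(jet_pt, jet_eta, jet_tracks, cut=0.5):
    """|JVF| > 0.5 is required only for jets with pT < 50 GeV and |eta| <= 2.4."""
    if jet_pt >= 50.0 or abs(jet_eta) > 2.4:
        return True
    value = jvf(jet_tracks)
    if value is None:  # jets without tracks are always accepted
        return True
    return abs(value) > cut
```

A 30 GeV central jet with 10 GeV of track p T from the primary vertex and 5 GeV from other vertices has JVF = 2/3 and is kept.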
To prevent double-counting of electron energy deposits as jets, the closest jet lying within ∆R < 0.2 from a reconstructed electron is removed. To remove leptons from heavy-flavour decays, the lepton is discarded if the lepton is found to lie within ∆R < 0.4 from a selected jet axis.
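The two overlap-removal steps can be written as a short Python sketch. Objects are represented here as dicts with 'eta' and 'phi' keys, and the steps are applied in the order in which they are stated in the text; both choices are assumptions of this example.

```python
import math

def delta_r(a, b):
    """Angular distance between two objects with 'eta' and 'phi' keys."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)

def remove_overlaps(electrons, muons, jets):
    """Return (electrons, muons, jets) after the two overlap-removal steps."""
    kept_jets = list(jets)
    # Step 1: for each electron, drop the closest jet within dR < 0.2.
    for el in electrons:
        close = [j for j in kept_jets if delta_r(el, j) < 0.2]
        if close:
            nearest = min(close, key=lambda j: delta_r(el, j))
            kept_jets = [j for j in kept_jets if j is not nearest]

    # Step 2: discard leptons lying within dR < 0.4 of a remaining jet.
    def far_from_jets(lepton):
        return all(delta_r(lepton, j) >= 0.4 for j in kept_jets)

    kept_electrons = [e for e in electrons if far_from_jets(e)]
    kept_muons = [m for m in muons if far_from_jets(m)]
    return kept_electrons, kept_muons, kept_jets
```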
The purity of the selected tt sample is improved by tagging jets containing b-hadrons, exploiting their long lifetime and large mass. Information from the track impact parameters, secondary vertex location and decay topology is combined in a neural-network-based algorithm (MV1) [52]. The operating point used corresponds to an overall 70% b-tagging efficiency in tt events, and to a probability to mis-identify light-flavour jets of approximately 1%.
The missing transverse momentum $E_{\mathrm{T}}^{\mathrm{miss}}$ is computed from the vector sum of the transverse momenta of the reconstructed calibrated physics objects (electrons, photons, hadronically decaying τ leptons, jets and muons) as well as the transverse energy deposited in the calorimeter cells not associated with these objects [53]. Calorimeter cells not associated with any physics object are calibrated using tracking information before being included in the $E_{\mathrm{T}}^{\mathrm{miss}}$ calculation. The contribution from muons is added using their momentum. To avoid double counting of energy, the parameterized muon energy loss in the calorimeters is subtracted in the $E_{\mathrm{T}}^{\mathrm{miss}}$ calculation.

Event selection at detector level
The event selection consists of a set of requirements based on the general event quality and on the reconstructed objects, defined above, that characterize the final-state event topology. Each event must have a reconstructed primary vertex with five or more associated tracks. The events are required to contain exactly one reconstructed lepton candidate with p T > 25 GeV geometrically matched to a corresponding object at the trigger level and at least four jets with p T > 25 GeV and |η| < 2.5. At least two of the jets have to be tagged as b-jets. The event selection is summarized in Table 1. The event yields are displayed in Table 2 for data, simulated signal, and backgrounds (the background determination is described in Section 6). Figure 1 shows, for some key distributions, the comparison between data and predictions normalized to the data integrated luminosity. The selection produces a quite clean tt sample, the total background being at the 10% level. The difference between data and predicted event yield is ∼ 7%, in fair agreement with the theoretical uncertainty on the tt total cross-section used to normalize the signal MC simulation (see Section 3).

Particle-level objects and fiducial phase-space definition
Particle-level objects are defined for simulated events in analogy to the detector-level objects described above. Only stable final-state particles, i.e. particles that are not decayed further by the generator, and unstable particles 2 that are to be decayed later by the detector simulation, are considered.
The fiducial phase space for the measurements presented in this paper is defined using a series of requirements applied to particle-level objects close to those used in the selection of the detector-level objects. The procedure explained in this section is applied to the tt signal only, since the background subtraction is performed before unfolding the data.
Electrons and muons must not originate, either directly or through a τ decay, from a hadron in the MC particle record. This ensures that the lepton is from an electroweak decay without requiring a direct match to a W boson. The four-momenta of leptons are modified by adding the four-momenta of all photons within ∆R = 0.1 that do not originate from hadron decays to take into account final-state QED radiation. Such leptons are then required to have p T > 25 GeV and |η| < 2.5. Electrons in the transition region (1.37 < |η| < 1.52) are rejected at the detector level but accepted in the fiducial selection. This difference is accounted for by the efficiency correction described in Section 8.1.
The particle-level missing transverse momentum is calculated from the four-vector sum of the neutrinos, discarding neutrinos from hadron decays, either directly or through a τ decay. Particle-level jets are clustered using the anti-k t algorithm with radius parameter R = 0.4, starting from all stable particles, except for selected leptons (e, µ, ν) and the photons radiated from the leptons. Particle-level jets are required to have p T > 25 GeV and |η| < 2.5. Hadrons containing a b-quark with p T > 5 GeV are associated with jets through a ghost matching technique as described in Ref. [49]. Particle b-tagged jets have p T > 25 GeV and |η| < 2.5. The events are required to contain exactly one reconstructed lepton candidate with p T > 25 GeV and at least four jets with p T > 25 GeV and |η| < 2.5. At least two of the jets have to be b-tagged. Dilepton events where only one lepton passes the fiducial selection are by definition included in the fiducial measurement.

Parton-level objects and full phase-space definition
Parton-level objects are defined for simulated events. Only top quarks decaying directly to a W boson and a b-quark in the simulation are considered 3 . The full phase space for the measurements presented in this paper is defined by the set of tt pairs in which one top quark decays semileptonically (including τ leptons) and the other decays hadronically. Events in which both top quarks decay semileptonically define the dilepton background, and are thus removed from the signal simulation.
2 Particles with a mean lifetime τ > 300 ps.
3 These particles are labelled by a status code 155 in Herwig, 3 in Pythia and 22 in Pythia8, respectively.

Kinematic reconstruction
The pseudo-top algorithm [4] reconstructs the kinematics of the top quarks and their complete decay chain from final-state objects, namely the charged lepton (electron or muon), missing transverse momentum, and four jets, two of which are b-tagged. By running the same algorithm on detector- and particle-level objects, the degree of dependency on the details of the simulation is strongly reduced compared to correcting to parton-level top quarks.
In the following, when more convenient, the leptonically (hadronically) decaying W boson is referred to as the leptonic (hadronic) W boson, and the semileptonically (hadronically) decaying top quark is referred to as the leptonic (hadronic) top quark.
The algorithm starts with the reconstruction of the neutrino four-momentum. The z-component of the neutrino momentum is calculated using the W boson mass constraint imposed on the invariant mass of the system of the charged lepton and the neutrino. If the resulting quadratic equation has two real solutions, the one with the smaller value of $|p_z|$ is chosen. If the discriminant is negative, only the real part of the solution is considered. The leptonic W boson is reconstructed from the charged lepton and the neutrino and the leptonic top quark is reconstructed from the leptonic W and the b-tagged jet closest in ∆R to the charged lepton. The hadronic W boson is reconstructed from the two non-b-tagged jets whose invariant mass is closest to the mass of the W boson. This choice yields the best performance of the algorithm in terms of the correlation between detector, particle and parton levels. Finally, the hadronic top quark is reconstructed from the hadronic W boson and the other b-jet. In events with more than two b-tagged jets, only the two with the highest transverse momentum are considered.
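The neutrino reconstruction step can be illustrated with a short Python sketch. It assumes massless lepton and neutrino, identifies the neutrino transverse momentum with the missing transverse momentum, and uses m W = 80.4 GeV; the mass value and the function name are assumptions of this example, not taken from the text. With k = m W²/2 + p T(lepton) · p T(ν), the mass constraint gives p z(ν) = [k p z(lepton) ± E(lepton) √(k² − p T²(lepton) p T²(ν))] / p T²(lepton).

```python
import math

M_W = 80.4  # GeV, assumed W boson mass used for the constraint

def neutrino_pz(lep_px, lep_py, lep_pz, met_x, met_y, m_w=M_W):
    """Solve (p_lepton + p_neutrino)^2 = m_W^2 for the neutrino pz."""
    lep_pt2 = lep_px**2 + lep_py**2
    lep_e = math.sqrt(lep_pt2 + lep_pz**2)  # massless lepton
    met2 = met_x**2 + met_y**2
    k = 0.5 * m_w**2 + lep_px * met_x + lep_py * met_y
    disc = k**2 - lep_pt2 * met2            # proportional to the discriminant
    centre = k * lep_pz / lep_pt2
    if disc < 0:
        # Complex solutions: keep only the real part.
        return centre
    shift = lep_e * math.sqrt(disc) / lep_pt2
    # Two real solutions: choose the one with the smaller |pz|.
    return min(centre - shift, centre + shift, key=abs)
```

When the transverse mass of the lepton and missing-momentum system exceeds m W, for example through resolution effects, the discriminant becomes negative and the real part is returned.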

Background determination
The single-top-quark background is the largest background contribution, amounting to approximately 4% of the total event yield and 40% of the total background estimate.
The shape of the distributions of the kinematical variables of this background is evaluated with a Monte Carlo simulation, and the event yields are normalized to the most recent calculations of their cross-sections, as described in Section 3. The overlap between the Wt and tt samples is handled using the diagram removal scheme [54].
The W+jets background represents the second largest background. After the event selection, approximately 3-4% of the total event yield and 35% of the total background estimate is due to W+jets events. The estimation of this background is performed using a combination of MC simulation and data-driven techniques. The Alpgen+Pythia W+jets samples, normalized to the inclusive W boson NNLO cross-section, are used as a starting point while the absolute normalization and the heavy-flavour fractions of this process, which are affected by large theoretical uncertainties, are determined from data.
The corrections for generator mis-modelling in the fractions of W boson production associated with jets of different flavour components (W + bb, W + cc, W + c) are estimated in a sample with the same lepton and E miss T selections as the signal selection, but with only two jets and no b-tagging requirements. The b-jet multiplicity, in conjunction with knowledge of the b-tagging and mis-tag efficiency, is used to extract the heavy-flavour fraction. This information is extrapolated to the signal region using MC simulation, assuming constant relative rates for the signal and control regions.
The overall W+jets normalization is then obtained by exploiting the expected charge asymmetry in the production of $W^+$ and $W^-$ bosons in pp collisions. This asymmetry is predicted by theory [55] and evaluated using MC simulation, while other processes in the tt sample are symmetric in charge except for a small contamination from single-top and WZ events, which is subtracted using MC simulation. The total number of W+jets events in the sample can thus be estimated with the following equation:
$$N_{W^+} + N_{W^-} = \left(\frac{r_{\mathrm{MC}} + 1}{r_{\mathrm{MC}} - 1}\right)\left(D_+ - D_-\right), \qquad (1)$$
where $r_{\mathrm{MC}}$ is the ratio of the number of events with positive leptons to the number of events with negative leptons in the MC simulation, and $D_+$ and $D_-$ are the number of events with positive and negative leptons in the data, respectively.
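A minimal numerical sketch of the charge-asymmetry estimate follows. It assumes the standard relation N(W+jets) = (r MC + 1)/(r MC − 1) × (D + − D −), with the charge-asymmetric single-top and WZ contributions already subtracted from the data counts; the function name and the example numbers are hypothetical.

```python
def wjets_yield(d_plus, d_minus, r_mc):
    """Estimate the total W+jets yield from the lepton charge asymmetry.

    d_plus, d_minus: data events with positive / negative leptons, after
                     subtracting the charge-asymmetric non-W contributions.
    r_mc:            ratio of W+ to W- events taken from simulation.
    """
    if r_mc == 1.0:
        raise ValueError("no charge asymmetry: the estimate is undefined")
    return (r_mc + 1.0) / (r_mc - 1.0) * (d_plus - d_minus)
```

As a consistency check, 600 W + and 400 W − events (r MC = 1.5) on top of 600 charge-symmetric background events per charge give D + = 1200 and D − = 1000, and the function returns the 1000 W+jets events that were put in. Charge-symmetric processes cancel in the difference D + − D −.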
Multi-jet production processes have a large cross-section and mimic the lepton+jets signature due to jets misidentified as prompt leptons (fake leptons) or semileptonic decays of heavy-flavour hadrons (non-prompt real leptons). This background is estimated directly from data by using the matrix-method technique [56]. The number of background events in the signal region is evaluated by applying efficiency factors to the number of events passing the tight (signal) and loose selection. The fake-lepton efficiency is measured using data in control regions dominated by the multi-jet background with the real-lepton contribution subtracted using MC simulation. The real-lepton efficiency is extracted from a tag-and-probe technique using leptons from Z boson decays. Fake-lepton events contribute to the total event yield at approximately the 1-2% level.
Z+jets and diboson events are simulated with MC generators, and the event yields are normalized to the most recent theoretical calculation of their cross-sections. The total contribution of these processes is less than 1% of the total event yield or approximately 10% of the total background.
Top-quark pair events in which both the top quark and the anti-top quark decay semileptonically (including decays to τ) can sometimes pass the event selection, contributing approximately 5% to the total event yield. The fraction of dileptonic tt events in each p T bin is estimated with the same MC sample used for the signal modelling. In the fiducial phase-space definition, semileptonic top-quark decays to τ leptons in lepton+jets tt events are considered as signal only if the τ lepton decays leptonically.
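The idea behind the matrix method can be sketched for a single bin. The example assumes the commonly used two-equation form, in which the loose sample contains the tight one, N loose = N real + N fake and N tight = ε real N real + ε fake N fake; the exact implementation of Ref. [56] may differ, and the function name is hypothetical.

```python
def fake_lepton_yield(n_loose, n_tight, eff_real, eff_fake):
    """Number of fake-lepton events passing the tight (signal) selection.

    Solves  n_loose = n_real + n_fake
            n_tight = eff_real * n_real + eff_fake * n_fake
    for n_fake and returns eff_fake * n_fake.
    """
    if eff_real <= eff_fake:
        raise ValueError("the method needs eff_real > eff_fake")
    n_fake_loose = (eff_real * n_loose - n_tight) / (eff_real - eff_fake)
    return eff_fake * n_fake_loose
```

With 900 real and 100 fake leptons in the loose sample, ε real = 0.9 and ε fake = 0.2, the tight sample holds 830 events and the function returns the 20 fake-lepton events among them.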

Observables
A set of measurements of the tt production cross-sections is presented as a function of kinematic observables. In the following, the indices had and lep refer to the hadronically and semileptonically decaying top quarks, respectively. The indices 1 and 2 refer respectively to the leading and sub-leading top quark, ordered by transverse momentum.

First, a set of baseline observables is presented: transverse momentum ($p_{\mathrm{T}}^{t,\mathrm{had}}$) and absolute value of the rapidity ($|y^{t,\mathrm{had}}|$) of the hadronically decaying top quark (which was chosen over the leptonic top quark due to better resolution), and the transverse momentum ($p_{\mathrm{T}}^{t\bar{t}}$), absolute value of the rapidity ($|y^{t\bar{t}}|$) and invariant mass ($m^{t\bar{t}}$) of the tt system. These observables, shown in Figure 2, have been previously measured by the ATLAS experiment using the 7 TeV dataset [3, 4] except for $|y^{t,\mathrm{had}}|$ which has not been measured in the full phase-space. The level of agreement between data and prediction is within the quoted uncertainties for $|y^{t,\mathrm{had}}|$, $m^{t\bar{t}}$ and $p_{\mathrm{T}}^{t\bar{t}}$. A trend is observed in the $p_{\mathrm{T}}^{t,\mathrm{had}}$ distribution, which is not well modelled at high values. A fair agreement between data and simulation is observed for large absolute values of the tt rapidity.
Furthermore, angular variables sensitive to a p T imbalance in the transverse plane, i.e. to the emission of radiation associated with the production of the top-quark pair, are employed to emphasize the central production region [8]. The angle between the two top quarks has been found to be sensitive to non-resonant contributions due to hypothetical new particles exchanged in the t-channel [7]. The rapidities of the two top quarks in the laboratory frame are denoted by $y^{t,1}$ and $y^{t,2}$, while their rapidities in the tt centre-of-mass frame are $y^{\star} = \frac{1}{2}\left(y^{t,1} - y^{t,2}\right)$ and $-y^{\star}$. The longitudinal motion of the tt system in the laboratory frame is described by the rapidity boost $y^{t\bar{t}}_{\mathrm{boost}} = \frac{1}{2}\left|y^{t,1} + y^{t,2}\right|$, while the variable $\chi^{t\bar{t}} = e^{2|y^{\star}|}$ is closely related to the production angle. In particular, many signals due to processes not included in the Standard Model are predicted to peak at low values of $\chi^{t\bar{t}}$ [7]. Finally, observables depending on the transverse momentum of the decay products of the top quark have been found to be sensitive to higher-order corrections [10,11].
The following additional observables are measured:
• the absolute value of the azimuthal angle between the two top quarks ($\Delta\phi^{t\bar{t}}$);
• the absolute value of the out-of-plane momentum ($|p^{t\bar{t}}_{\mathrm{out}}|$), i.e. the projection of the three-momentum of one top quark onto the direction perpendicular to the plane defined by the other top quark and the beam axis (z) in the laboratory frame [8]:
$$\left|p^{t\bar{t}}_{\mathrm{out}}\right| = \left|\vec{p}^{\,t,\mathrm{had}} \cdot \frac{\vec{p}^{\,t,\mathrm{lep}} \times \hat{e}_z}{\left|\vec{p}^{\,t,\mathrm{lep}} \times \hat{e}_z\right|}\right|; \qquad (2)$$
• the longitudinal boost of the tt system in the laboratory frame ($y^{t\bar{t}}_{\mathrm{boost}}$) [7];
• the production angle between the two top quarks ($\chi^{t\bar{t}}$) [7];
• the scalar sum of the transverse momenta of the two top quarks ($H^{t\bar{t}}_{\mathrm{T}}$) [10,11];
• the ratio of the transverse momenta of the hadronic W boson and the top quark from which it originates ($R_{Wt}$) [10,11].
These observables are shown in Figure 3 at detector level. For all these variables the predictions show only modest agreement with data. In particular, at high values of $H^{t\bar{t}}_{\mathrm{T}}$, fewer events are observed with respect to the prediction. The longitudinal boost $y^{t\bar{t}}_{\mathrm{boost}}$ is predicted to be less central than the data. Finally, $R_{Wt}$ is predicted to be lower than observed in the range 1.5-3.0.
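The observables listed above can be computed from the two reconstructed top-quark four-momenta, as in the following Python sketch. The (px, py, pz, E) tuple layout and the function names are assumptions of this example. The out-of-plane momentum is taken as the projection of the hadronic top-quark momentum onto the normal of the plane spanned by the leptonic top quark and the beam axis, and the boost is taken in absolute value; both choices are assumptions where the text leaves the convention open.

```python
import math

def _pt(p):
    return math.hypot(p[0], p[1])

def _rapidity(p):
    _, _, pz, e = p
    return 0.5 * math.log((e + pz) / (e - pz))

def tt_observables(top_had, top_lep, w_had_pt):
    """Event-level observables from (px, py, pz, E) tuples in GeV."""
    pt_had, pt_lep = _pt(top_had), _pt(top_lep)
    dphi = abs(math.atan2(top_had[1], top_had[0])
               - math.atan2(top_lep[1], top_lep[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    # p_lep x e_z = (py_lep, -px_lep, 0), whose length is pT of the leptonic top.
    p_out = abs(top_had[0] * top_lep[1] - top_had[1] * top_lep[0]) / pt_lep
    y_had, y_lep = _rapidity(top_had), _rapidity(top_lep)
    # chi and y_boost do not depend on which top quark is labelled 1 or 2.
    y_star = 0.5 * (y_had - y_lep)
    return {
        "dphi": dphi,
        "p_out": p_out,
        "y_boost": 0.5 * abs(y_had + y_lep),
        "chi": math.exp(2.0 * abs(y_star)),
        "ht": pt_had + pt_lep,
        "r_wt": w_had_pt / pt_had,
    }
```

For exactly back-to-back top quarks the sketch gives ∆φ = π and a vanishing out-of-plane momentum, so both variables measure additional radiation recoiling against the tt system.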

Unfolding procedure
The underlying differential cross-section distributions are obtained from the detector-level events using an unfolding technique that corrects for detector effects. The iterative Bayesian method [57] as implemented in RooUnfold [58] is used. The individual e+jets and µ+jets channels give consistent results and are therefore combined by summing the event yields before the unfolding procedure.

Fiducial phase space
The unfolding starts from the detector-level event distribution (N reco ), from which the backgrounds (N bg ) are subtracted first. Next, the acceptance correction f acc corrects for events that are generated outside the fiducial phase-space but pass the detector-level selection.
In order to separate resolution and combinatorial effects, distributions evaluated using a Monte Carlo simulation are corrected to the level where detector- and particle-level objects forming the pseudo-top quarks are angularly well matched. The matching correction $f_{\mathrm{match}}$ accounts for the corresponding efficiency. The matching is performed using geometrical criteria based on the distance $\Delta R$. Each particle-level $e$ ($\mu$) is matched to the closest detector-level $e$ ($\mu$) within $\Delta R < 0.02$. Particle-level jets are geometrically matched to the closest detector-level jet within $\Delta R < 0.4$. If a detector-level jet is not matched to a particle-level jet, it is assumed to originate either from pile-up or from matching inefficiency and is ignored. If two jets are reconstructed within $\Delta R < 0.4$ of a single particle-level jet, the detector-level jet with the smaller $\Delta R$ is matched to the particle-level jet and the other detector-level jet is left unmatched.
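The geometrical jet matching described above can be sketched as follows. This is a minimal illustration, assuming that each object is represented by a hypothetical `(eta, phi)` pair and that $\Delta R$ is built from pseudorapidity and azimuthal angle; the function names are invented for the example.

```python
import math

def delta_r(a, b):
    """Angular distance between two (eta, phi) pairs, with phi wrapped."""
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)

def match_jets(particle_jets, detector_jets, max_dr=0.4):
    """Map each particle-level jet index to its closest detector-level jet.

    Detector-level jets that are not the closest candidate within max_dr
    stay unmatched, as do particle-level jets with no candidate at all.
    """
    matches = {}
    for i, pjet in enumerate(particle_jets):
        best = None
        for k, djet in enumerate(detector_jets):
            dr = delta_r(pjet, djet)
            if dr < max_dr and (best is None or dr < best[0]):
                best = (dr, k)
        if best is not None:
            matches[i] = best[1]
    return matches
```

The same function with `max_dr=0.02` would cover the lepton matching. The sketch does not prevent one detector-level jet from being assigned to two particle-level jets, which a full implementation would need to handle.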
The unfolding step uses a migration matrix (M) derived from simulated tt events which maps the binned generated particle-level events to the binned detector-level events. The probability for particle-level events to remain in the same bin is therefore represented by the elements on the diagonal, and the off-diagonal elements describe the fraction of particle-level events that migrate into other bins. Therefore, the elements of each row add up to unity as shown in Figure 4(d). The binning is chosen such that the fraction of events in the diagonal bins is always greater than 50%. The unfolding is performed using four iterations to balance the goodness of fit and the statistical uncertainty. The effect of varying the number of iterations by one was tested and proved to be negligible. Finally, the efficiency correction f eff corrects for events which pass the particle-level selection but are not reconstructed at the detector level.
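To make the unfolding step more concrete, the sketch below builds a row-normalized migration matrix and applies a simple iterative Bayesian update with four iterations. It is a standalone NumPy illustration, not the RooUnfold implementation used in the analysis; the function names, the choice of prior and the assumption that every bin is populated are made for this example only.

```python
import numpy as np

def migration_matrix(truth_bins, reco_bins, n_bins):
    """M[i, j] = fraction of events from particle-level bin i found in
    detector-level bin j, so that every row adds up to unity."""
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (truth_bins, reco_bins), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)

def bayes_unfold(M, data, prior, n_iter=4):
    """Iterative Bayesian unfolding for a row-normalized migration matrix.

    Efficiency and acceptance are assumed to be handled by separate
    correction factors, so the total yield is conserved.
    """
    est = prior / prior.sum() * data.sum()
    for _ in range(n_iter):
        folded = est @ M                              # expected detector-level yields
        posterior = M * est[:, None] / folded[None, :]  # P(particle bin i | detector bin j)
        est = posterior @ data                        # updated particle-level estimate
    return est
```

Each iteration moves the estimate away from the prior and towards the spectrum preferred by the data, at the cost of larger statistical fluctuations, which is the trade-off behind the choice of the number of iterations.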
All corrections are evaluated with simulation and are presented in Figure 4 for the case of the $p_{\mathrm{T}}$ of the top quark decaying hadronically. This variable is particularly representative since the kinematics of the decay products of the top quark change substantially in the observed range. The decrease of the efficiency at high values is primarily due to the increasingly large fraction of non-isolated leptons and close or merged jets in events with high top-quark $p_{\mathrm{T}}$; improving the selection efficiency in this boosted kinematic region requires jets with a larger radius parameter $R$ than the one used in this study [59]. A similar effect is observed in the tail of the $t\bar{t}$ transverse momentum and rapidity, small $\Delta\phi^{t\bar{t}}$ angle and high $H_{\mathrm{T}}^{t\bar{t}}$ distributions. The matching corrections reach the highest values, of the order of $f_{\mathrm{match}} = 0.6$-$0.7$, at low $t\bar{t}$ transverse momentum and large $t\bar{t}$ rapidity. Generally, the acceptance corrections are constant and close to unity, indicating very good correlation between the detector- and the particle-level reconstruction. This is also apparent from the high level of diagonality of the migration matrices, with correlations between particle and detector levels of 85-95%.
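The unfolding procedure for an observable $X$ at particle level is summarized by the expression
$$\frac{\mathrm{d}\sigma^{\mathrm{fid}}}{\mathrm{d}X^{i}} \equiv \frac{1}{\mathcal{L}\cdot\Delta X^{i}}\cdot f_{\mathrm{eff}}^{\,i}\cdot\sum_{j} M_{ij}^{-1}\cdot f_{\mathrm{match}}^{\,j}\cdot f_{\mathrm{acc}}^{\,j}\cdot\left(N_{\mathrm{reco}}^{j}-N_{\mathrm{bg}}^{j}\right),$$
where the index $j$ iterates over bins of $X$ at detector level while the index $i$ labels bins at particle level; $\Delta X^{i}$ is the bin width, $\mathcal{L}$ is the integrated luminosity and the Bayesian unfolding is symbolized by $M_{ij}^{-1}$. The integrated cross-section is obtained by integrating the unfolded cross-section over the kinematic bins, and its value is used to compute the normalized differential cross-section $1/\sigma^{\mathrm{fid}}\cdot\mathrm{d}\sigma^{\mathrm{fid}}/\mathrm{d}X^{i}$.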
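The last step, going from fully corrected yields to a normalized differential cross-section, can be sketched as below. The function name and arguments are hypothetical; the input is assumed to be the unfolded and efficiency-corrected event yield per bin.

```python
import numpy as np

def normalized_xsec(corrected_yields, bin_edges, lumi):
    """Return (1/sigma * dsigma/dX per bin, integrated sigma).

    corrected_yields: unfolded, efficiency-corrected events per bin
    bin_edges:        bin boundaries of the observable X
    lumi:             integrated luminosity (sigma comes out in its inverse units)
    """
    widths = np.diff(bin_edges)
    dsigma = np.asarray(corrected_yields, dtype=float) / (lumi * widths)
    sigma = np.sum(dsigma * widths)   # integral over the kinematic bins
    return dsigma / sigma, sigma
```

By construction the normalized spectrum integrates to unity, so the luminosity and any other overall normalization factor cancel in the ratio.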

Full phase space
The measurements are extrapolated to the full phase space of the $t\bar{t}$ system using a procedure similar to the one described in Section 8.1. The only difference is in the binning. The binning used by the CMS experiment in Ref. [5] is used for the observables measured by both experiments to facilitate future combinations. This binning is found to be compatible with the resolution of each observable. The fiducial phase-space binning is used for all the other observables. In order to unambiguously define leptonic and hadronic top quarks, the contribution of $t\bar{t}$ pairs decaying dileptonically is removed by applying a correction factor $\hat{f}_{\mathrm{ljets}}$, which represents the fraction of $t\bar{t}$ single-lepton events in the nominal sample. The $\tau$ leptons from the leptonically decaying $W$ bosons are considered as signal regardless of the $\tau$ decay mode. The cross-section measurements are defined with respect to the top quarks before the decay (parton level) and after QCD radiation. Observables related to top quarks are extrapolated to the full phase space starting from top quarks decaying hadronically at the detector level.
The acceptance correction $\hat{f}_{\mathrm{acc}}$ corrects for detector-level events which are reconstructed outside the parton-level bin range for a given variable. The migration matrix ($\hat{M}$) is derived from simulated $t\bar{t}$ events decaying in the single-lepton channel and the efficiency correction $\hat{f}_{\mathrm{eff}}$ corrects for events which did not pass the detector-level selection.
The unfolding procedure is summarized by the expression
$$\frac{\mathrm{d}\sigma^{\mathrm{full}}}{\mathrm{d}X^{i}} \equiv \frac{1}{\mathcal{L}\cdot\mathcal{B}\cdot\Delta X^{i}}\cdot\hat{f}_{\mathrm{eff}}^{\,i}\cdot\sum_{j}\hat{M}_{ij}^{-1}\cdot\hat{f}_{\mathrm{acc}}^{\,j}\cdot\hat{f}_{\mathrm{ljets}}^{\,j}\cdot\left(N_{\mathrm{reco}}^{j}-N_{\mathrm{bg}}^{j}\right),$$
where the index $j$ iterates over bins of the observable $X$ at the detector level while the index $i$ labels bins at the parton level; $\Delta X^{i}$ is the bin width, $\mathcal{B} = 0.438$ is the single-lepton branching ratio, $\mathcal{L}$ is the integrated luminosity and the Bayesian unfolding is symbolized by $\hat{M}_{ij}^{-1}$. The integrated cross-section is obtained by integrating the unfolded cross-section over the kinematic bins, and its value is used to compute the normalized differential cross-section $1/\sigma^{\mathrm{full}}\cdot\mathrm{d}\sigma^{\mathrm{full}}/\mathrm{d}X^{i}$.
To ensure that the results are not biased by the MC generator used for the unfolding procedure, a study is performed in which the particle- and parton-level spectra in simulation are altered by changing the shape of the distributions using continuous functions chosen depending on the observable. The studies confirm that these altered shapes are recovered within statistical uncertainties by the unfolding based on the nominal migration matrices.

Uncertainties
This section describes the estimation of systematic uncertainties related to object reconstruction and calibration, MC generator modelling and background estimation.
To evaluate the impact of each uncertainty after the unfolding, the reconstructed distribution expected from simulation is varied. Corrections based on the nominal Powheg-Box signal sample are used to correct for detector effects and the unfolded distribution is compared to the known particle- or parton-level distribution. All detector- and background-related systematic uncertainties have been evaluated using the same generator, while alternative generators have been employed to assess modelling systematic uncertainties (e.g. different parton showers). In these cases the corrections, derived from the nominal generator, are used to unfold the detector-level spectra of the alternative generator. The relative difference between the unfolded spectra and the corresponding particle- or parton-level spectra of the alternative generator is taken as the uncertainty related to the generator modelling. After the unfolding, each distribution is normalized to unit area.
The covariance matrices for the normalized unfolded spectra due to the statistical and systematic uncertainties are obtained by evaluating the covariance between the kinematic bins using pseudo-experiments. In particular, the correlations due to statistical fluctuations for both data and the signal are evaluated by varying the event counts independently in every bin before unfolding, and then propagating the resulting variations through the unfolding.
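The pseudo-experiment procedure for the statistical component can be sketched as follows. This is a minimal illustration under stated assumptions: the bin contents are fluctuated with independent Poisson draws, `unfold` is any hypothetical callable that maps detector-level counts to unfolded yields, and the normalization ignores bin widths for brevity.

```python
import numpy as np

def covariance_from_toys(counts, unfold, n_toys=1000, seed=0):
    """Covariance of the normalized unfolded spectrum from pseudo-experiments."""
    rng = np.random.default_rng(seed)
    toys = []
    for _ in range(n_toys):
        fluctuated = rng.poisson(counts).astype(float)  # vary every bin independently
        unfolded = unfold(fluctuated)                   # propagate through the unfolding
        toys.append(unfolded / unfolded.sum())          # normalize to unit area
    return np.cov(np.array(toys), rowvar=False)
```

Because every pseudo-experiment is normalized, the rows of the resulting matrix sum to zero within rounding, so the full matrix is singular. This is the rank reduction that has to be taken into account when computing a $\chi^{2}$ for normalized spectra.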

Object reconstruction and calibration
The jet energy scale uncertainty is derived using a combination of simulations, test beam data and in situ measurements

Signal modelling
The uncertainties of the signal modelling affect the kinematic properties of simulated tt events and reconstruction efficiencies.
To assess the uncertainty related to the generator, events simulated with MC@NLO+Herwig are unfolded using the migration matrix and correction factors derived from the Powheg+Herwig sample. The difference between the unfolded distribution and the known particle- or parton-level distribution of the MC@NLO+Herwig sample is assigned as the relative uncertainty for the fiducial or full phase-space distributions, respectively. This uncertainty is found to be in the range 2-5%, depending on the variable, increasing up to 10% at large $p_{\mathrm{T}}^{t}$, $m^{t\bar{t}}$, $p_{\mathrm{T}}^{t\bar{t}}$ and $y^{t\bar{t}}$. The observable that is most affected by these uncertainties is $m^{t\bar{t}}$ in the full phase space.
To assess the impact of different parton-shower models, unfolded results using events simulated with Powheg interfaced to Pythia are compared to events simulated with Powheg interfaced to Herwig, using the same procedure described above to evaluate the uncertainty related to the tt generator. The resulting systematic uncertainties, taken as the symmetrized difference, are found to be typically at the 1-3% level.
In order to evaluate the uncertainty related to the modelling of the ISR/FSR, tt MC samples with modified ISR/FSR modelling are used. The MC samples used for the evaluation of this uncertainty are generated using the Powheg generator interfaced to Pythia, where the parameters are varied as described in Section 3. This uncertainty is found to be in the range 2-5%, depending on the variable of the tt system considered, and reaching the largest values at high y t and small p tt T . The impact of the uncertainty related to the PDF is assessed by means of tt samples generated with MC@NLO interfaced to Herwig. An envelope of spectra is evaluated by reweighting the central prediction of the CT10nlo PDF set, using the full set of 52 eigenvectors at 68% CL. This uncertainty is found to be less than 1%.
As a check, the effect of the uncertainty on the top-quark mass was evaluated and found to affect only the efficiency correction by less than 1%, consistent with what was observed by ATLAS for the analogous measurement with the 7 TeV data [4].

Background modelling
Systematics affecting the background are modelled by adding to the signal spectrum the difference of the systematics-varied and nominal backgrounds.
The single-top background is assigned an uncertainty associated with the theoretical calculations used for its normalization [39][40][41]. The overall impact of this systematic uncertainty on the signal is around 0.5%.
The systematic uncertainties due to the overall normalization and the heavy-flavour fraction of W+jets events are obtained by varying the data-driven scale factors within the statistical uncertainty of the W+jets MC sample. The W+jets shape uncertainty is extracted by varying the renormalization and matching scales in Alpgen. The W+jets MC statistical uncertainty is also taken into account. The overall impact of this uncertainty is less than 1%.
The uncertainty on the background from non-prompt and fake-leptons is evaluated by varying the definition of loose leptons, changing the selection used to form the control region and propagating the statistical uncertainty of parameterizations of the efficiency to pass the tighter lepton requirements for real and fake leptons. The combination of all these components also affects the shape of the background. The overall impact of this systematic uncertainty is less than 1%.
A 50% uncertainty is applied to the normalization of the Z+jets background, including the uncertainty on the cross-section and a further 48% due to uncertainties related to the requirement of the presence of at least four jets. A 40% uncertainty is applied to the diboson background, including the uncertainty on the cross-section and a further 34% due to the presence of two additional jets. The overall impact of these uncertainties is less than 1%, and the largest contribution is due to the Z+jets background.

Results
In this section, comparisons between unfolded data distributions and several SM predictions are presented for the different observables discussed in Section 7. Events are selected by requiring exactly one lepton and at least four jets with at least two of the jets tagged as originating from a b-quark. Normalized differential cross-sections are shown in order to remove systematic uncertainties on the normalization.
The SM predictions are obtained using different MC generators. The Powheg-Box generator [17], denoted "PWG" in the figures, is employed with three different sets of parton shower models, namely Pythia [19], Pythia8 [27] and Herwig [22]. The other NLO generator is MC@NLO [21] interfaced with the Herwig parton shower. Generators at the LO accuracy are represented by MadGraph [29] interfaced with Pythia for parton showering, which calculates tt matrix elements with up to three additional partons and implements the matrix-element to parton-shower MLM matching scheme [30].
The level of agreement between the measured distributions and simulations with different theoretical predictions is quantified by calculating $\chi^{2}$ values, employing the full covariance matrices, and inferring p-values (probabilities that the $\chi^{2}$ is larger than or equal to the observed value) from the $\chi^{2}$ and the number of degrees of freedom (NDF). Uncertainties on the predictions are not included. The normalization constraint used to derive the normalized differential cross-sections lowers by one unit the NDF and the rank of the $N_{b} \times N_{b}$ covariance matrix, where $N_{b}$ is the number of bins of the spectrum under consideration [68]. In order to evaluate the $\chi^{2}$ the following relation is used:
$$\chi^{2} = V_{N_{b}-1}^{\mathrm{T}} \cdot \mathrm{Cov}_{N_{b}-1}^{-1} \cdot V_{N_{b}-1},$$
where $V_{N_{b}-1}$ is the vector of differences between data and prediction obtained by discarding one of the $N_{b}$ elements, and $\mathrm{Cov}_{N_{b}-1}$ is the corresponding $(N_{b}-1) \times (N_{b}-1)$ sub-matrix of the full covariance matrix.
None of the predictions is able to correctly describe all the distributions, as also witnessed by the $\chi^{2}$ values and the p-values listed in Table 3. In particular, a certain tension between data and all predictions is observed in the case of the hadronic top-quark transverse momentum distribution for values higher than about 400 GeV. No electroweak corrections [69] are included in these predictions, as these have been shown to have a measurable impact only at very high values of the top-quark transverse momentum, leading to a slightly softer $p_{\mathrm{T}}^{t,\mathrm{had}}$ spectrum, as confirmed by the recent ATLAS measurement of the $t\bar{t}$ differential distribution of the hadronic top-quark $p_{\mathrm{T}}$ for boosted top quarks [59]. The effect of electroweak corrections alone is not large enough to solve this discrepancy completely [59,74]. The shape of the $y^{t,\mathrm{had}}$ distribution shows only a modest agreement for all the generators, with larger discrepancies observed in the forward region for Powheg+Pythia and Powheg+Pythia8.
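The $\chi^{2}$ evaluation for a normalized spectrum can be illustrated with the sketch below, which discards one bin, inverts the remaining sub-matrix of the covariance, and converts the result into a p-value for NDF $= N_{b} - 1$. The function names are hypothetical, and the closed-form tail probability for integer NDF is used here only to keep the example free of extra dependencies.

```python
import math
import numpy as np

def chi2_normalized(data, pred, cov, drop=-1):
    """chi2 and NDF after discarding bin `drop` and its covariance row/column."""
    keep = np.delete(np.arange(len(data)), drop)
    v = (np.asarray(data, dtype=float) - np.asarray(pred, dtype=float))[keep]
    sub = np.asarray(cov, dtype=float)[np.ix_(keep, keep)]
    chi2 = float(v @ np.linalg.solve(sub, v))
    return chi2, len(keep)

def chi2_p_value(chi2, ndf):
    """Probability that a chi2 with integer ndf is >= the observed value."""
    x = 0.5 * chi2
    if ndf % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, ndf // 2):
            term *= x / i                      # x^i / i!
            total += term
        return math.exp(-x) * total
    total = 0.0
    term = math.sqrt(x) / math.gamma(1.5)      # x^(1/2) / Gamma(3/2)
    for i in range(1, (ndf - 1) // 2 + 1):
        total += term
        term *= x / (i + 0.5)                  # next half-integer power
    return math.erfc(math.sqrt(x)) + math.exp(-x) * total
```

When data and prediction are both normalized and the covariance matrix carries the corresponding constraint, the value returned by `chi2_normalized` does not depend on which bin is discarded.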
For the $m^{t\bar{t}}$ distribution, the Powheg+Pythia, Powheg+Pythia8 and Powheg+Herwig generators are in better agreement with the data. All generators are in good agreement in the $p_{\mathrm{T}}^{t\bar{t}}$ spectrum except for Powheg+Herwig in the last bin. This observation suggests that setting $h_{\mathrm{damp}} = m_{t}$ in the Powheg samples improves the agreement at high values of the $t\bar{t}$ transverse momentum. The data at high values of $t\bar{t}$ rapidity are not adequately described by any of the generators considered. The same conclusions hold for the analogous distributions for the absolute spectra, although the overall agreement estimated with the $\chi^{2}$ values and the p-values is better due to the larger uncertainties.
For the variables describing the hard-scattering interaction, the production angle $\chi^{t\bar{t}}$ is well described in the central region. The forward region, described by the tail of this observable and by the tail of the longitudinal boost $y_{\mathrm{boost}}^{t\bar{t}}$, is not described correctly by any of the generators under consideration. For the variables describing the radiation along the $t\bar{t}$ pair momentum direction, both $|p_{\mathrm{out}}^{t\bar{t}}|$ and $\Delta\phi^{t\bar{t}}$ indicate that the kinematics of top quarks produced in the collinear region ($\Delta\phi^{t\bar{t}} \lesssim \pi/2$) are described with fair agreement by all the generators, but the uncertainty is particularly large in this region. The tension observed in the $p_{\mathrm{T}}^{t,\mathrm{had}}$ spectrum is reflected in the tail of the $H_{\mathrm{T}}^{t\bar{t}}$ distribution. Finally, the ratio of the hadronic $W$ boson and top-quark transverse momenta shows a mis-modelling in the range 1.5-3 for all the generators.
The difficulty in correctly predicting the data in the forward region was further investigated by studying the dependence of the predictions on different PDF sets. The study was performed for the rapidity observables $y^{t,\mathrm{had}}$, $y^{t\bar{t}}$ and $y_{\mathrm{boost}}^{t\bar{t}}$, shown in Figure 10, by comparing the data with the predictions of MC@NLO+Herwig for more recent sets of parton distribution functions. The results exhibit a general improvement in the description of the forward region for the most recent PDF sets (CT14nlo [75], CJ12mid [76], MMHT2014nlo [77], NNPDF 3.0 NLO [78], METAv10LHC [79], HERAPDF 2.0 NLO [80]). The improvement with respect to CT10nlo is also clearly shown in Table 5, which lists the $\chi^{2}$ and corresponding p-values for the different sets. The only exception is the $y^{t,\mathrm{had}}$ distribution using HERAPDF 2.0 NLO, for which a disagreement in the forward region is observed.

Figure 6: Fiducial phase-space normalized differential cross-sections as a function of the (a) invariant mass ($m^{t\bar{t}}$), (b) transverse momentum ($p_{\mathrm{T}}^{t\bar{t}}$) and (c) absolute value of the rapidity ($|y^{t\bar{t}}|$) of the $t\bar{t}$ system. The yellow bands indicate the total uncertainty on the data in each bin. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.

Figure 9: Fiducial phase-space normalized differential cross-sections as a function of the (a) scalar sum of the transverse momenta of the hadronic and leptonic top quarks ($H_{\mathrm{T}}^{t\bar{t}}$) and (b) the ratio of the hadronic $W$ and the hadronic top transverse momenta ($R_{Wt}$). The yellow bands indicate the total uncertainty on the data in each bin. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.

Figure 10: Fiducial phase-space normalized differential cross-sections as a function of the (a) absolute value of the rapidity of the hadronic top quark ($|y^{t,\mathrm{had}}|$), (b) absolute value of the rapidity ($|y^{t\bar{t}}|$) of the $t\bar{t}$ system and (c) longitudinal boost ($y_{\mathrm{boost}}^{t\bar{t}}$). The yellow bands indicate the total uncertainty on the data in each bin. The MC@NLO+Herwig generator is reweighted using the new PDF sets to produce the different predictions. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.
Figures 11-14 present the normalized $t\bar{t}$ full phase-space differential cross-sections as a function of the different observables. In particular, Figures 11(a) and 11(b) show the top-quark transverse momentum and the absolute value of the rapidity; Figures 12(a), 12(b) and 12(c) present the $t\bar{t}$ system invariant mass, transverse momentum and absolute value of the rapidity, while the additional observables related to the $t\bar{t}$ system are shown in Figures 13 and 14. Regarding the comparison between data and predictions, the general picture already outlined for the fiducial phase-space measurements is still valid, even though the uncertainties are much larger due to the full phase-space extrapolation. In particular, the predictions for the top-quark $p_{\mathrm{T}}$ and $H_{\mathrm{T}}^{t\bar{t}}$ tend to be in better agreement with the data than what is observed in the fiducial phase space. The $\chi^{2}$ and corresponding p-values for the different observables and predictions are shown in Table 4.
In Figures 15-18 the normalized $t\bar{t}$ full phase-space differential cross-sections as a function of $p_{\mathrm{T}}^{t}$, $y^{t}$, $m^{t\bar{t}}$ and $y^{t\bar{t}}$ are compared with theoretical higher-order QCD calculations.
The measurements are compared to four calculations that offer beyond-NLO accuracy:
• an approximate next-to-next-to-leading-order (aNNLO) calculation based on QCD threshold expansions beyond the leading logarithmic approximation [81] using the CT14nnlo PDF [75];
• an approximate next-to-next-to-next-to-leading-order (aN$^{3}$LO) calculation based on the resummation of soft-gluon contributions in the double-differential cross-section at next-to-next-to-leading-logarithm (NNLL) accuracy in the moment-space approach in perturbative QCD [82] using the MSTW2008nnlo PDF [83];
• an approximate NLO+NNLL calculation [84] using the MSTW2008nnlo PDF [83];
• a full NNLO calculation [85] using the MSTW2008nnlo PDF [83]. The NNLO prediction does not cover the highest bins in $p_{\mathrm{T}}^{t}$ and $m^{t\bar{t}}$.
These predictions have been interpolated in order to match the binning of the presented measurements. Table 6 shows the $\chi^{2}$ and p-values for these higher-order QCD calculations. Figures 15 and 16 show a comparison of the $p_{\mathrm{T}}^{t}$ and $y^{t}$ distributions to the aNNLO and aN$^{3}$LO calculations, and to the NNLO calculation, respectively. The aN$^{3}$LO calculation is seen to improve the agreement compared to the Powheg+Pythia generator in $y^{t}$, but not in $p_{\mathrm{T}}^{t}$. The aNNLO prediction produces a $p_{\mathrm{T}}^{t}$ distribution that is softer than the data at high transverse momentum and does not improve the description of $y^{t}$. The NNLO calculation is in good agreement with both the $p_{\mathrm{T}}^{t}$ and $y^{t}$ distributions; in particular, the disagreement seen at high $p_{\mathrm{T}}^{t}$ for the NLO generators is resolved by the NNLO calculation. The measurement of the invariant mass and transverse momentum of the $t\bar{t}$ system is compared to the NLO+NNLL prediction in Figure 17. The NLO+NNLL calculation shows a good agreement in the $m^{t\bar{t}}$ spectrum and a very large discrepancy for high values of the $t\bar{t}$ transverse momentum. Figure 18 shows a comparison of the NNLO calculation to the $m^{t\bar{t}}$ and $y^{t\bar{t}}$ measurements. For the rapidity of the $t\bar{t}$ system, the NNLO calculation improves the agreement slightly compared to the Powheg+Pythia prediction, but some shape difference can be seen between data and prediction.

Figure 13: Full phase-space normalized differential cross-sections as a function of the (a) production angle ($\chi^{t\bar{t}}$) and (b) longitudinal boost ($y_{\mathrm{boost}}^{t\bar{t}}$) of the $t\bar{t}$ system. The grey bands indicate the total uncertainty on the data in each bin. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.

Figure 15: Full phase-space normalized differential cross-section as a function of the (a) transverse momentum ($p_{\mathrm{T}}^{t}$) and (b) absolute value of the rapidity of the top quark ($|y^{t}|$) compared to higher-order theoretical calculations. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.

Figure 16: Full phase-space normalized differential cross-section as a function of the (a) transverse momentum ($p_{\mathrm{T}}^{t}$) and (b) absolute value of the rapidity of the top quark ($|y^{t}|$) compared to NNLO theoretical calculations [85] using the MSTW2008nnlo PDF set. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.

Figure 17: Full phase-space normalized differential cross-section as a function of the (a) invariant mass ($m^{t\bar{t}}$) and (b) transverse momentum ($p_{\mathrm{T}}^{t\bar{t}}$) of the $t\bar{t}$ system compared to higher-order theoretical calculations. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.

Figure 18: Full phase-space normalized differential cross-section as a function of the (a) invariant mass ($m^{t\bar{t}}$) and (b) absolute value of the rapidity ($|y^{t\bar{t}}|$) of the $t\bar{t}$ system compared to NNLO theoretical calculations [85] using the MSTW2008nnlo PDF set. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with $h_{\mathrm{damp}} = m_{t}$ and the CT10nlo PDF is used as the nominal prediction to correct for detector effects.

Table 6: Comparison between the measured full phase-space normalized differential cross-sections and higher-order QCD calculations. For each variable and prediction a $\chi^{2}$ and a p-value are calculated using the covariance matrix of each measured spectrum. The number of degrees of freedom (NDF) is equal to $N_{b} - 1$, where $N_{b}$ is the number of bins in the distribution.

Conclusions
Kinematic distributions of the top quarks in $t\bar{t}$ events, selected in the lepton+jets channel, are measured in the fiducial and full phase space using data from 8 TeV proton-proton collisions collected by the ATLAS detector at the Large Hadron Collider, corresponding to an integrated luminosity of 20.3 fb$^{-1}$. Normalized differential cross-sections are measured as a function of the hadronic top-quark transverse momentum and rapidity, and as a function of the mass, transverse momentum, and rapidity of the $t\bar{t}$ system. In addition, a new set of observables describing the hard-scattering interaction ($\chi^{t\bar{t}}$, $y_{\mathrm{boost}}^{t\bar{t}}$) and sensitive to the emission of radiation along with the $t\bar{t}$ pair ($\Delta\phi^{t\bar{t}}$, $|p_{\mathrm{out}}^{t\bar{t}}|$, $H_{\mathrm{T}}^{t\bar{t}}$, $R_{Wt}$) is presented. The measurements presented here exhibit, for most distributions and in a large part of the phase space, a precision of the order of 5% or better and an overall agreement with the Monte Carlo predictions of the order of 10%.
The $y^{t\bar{t}}$ and $y_{\mathrm{boost}}^{t\bar{t}}$ distributions are not well modelled by any generator under consideration in the fiducial phase space; however, the agreement improves when new parton distribution functions are used with the MC@NLO+Herwig generator.
All the generators under consideration consistently predict a ratio of the hadronic W boson and top-quark transverse momenta (R Wt ) with a mis-modelling of up to 10% in the range 1.5-3.
The tail of the $p_{\mathrm{T}}^{t,\mathrm{had}}$ distribution is harder in all predictions than what is observed in data, an effect previously observed in measurements by ATLAS and CMS. The agreement improves when using the Herwig parton shower with respect to Pythia. The tension observed for Powheg+Pythia, Powheg+Pythia8 and MadGraph+Pythia in the $p_{\mathrm{T}}^{t}$ spectrum is reflected in the tail of the $H_{\mathrm{T}}^{t\bar{t}}$ distribution. Similarly, both the aN$^{3}$LO and aNNLO predictions have a poor agreement in the $p_{\mathrm{T}}^{t}$ spectrum in the full phase space. However, the full NNLO calculation, which has just become available, is in good agreement with the $p_{\mathrm{T}}^{t}$ distribution, indicating that the disagreement seen with the generators and other calculations is due to missing higher-order terms. The NNLO calculation also shows good agreement in the $y^{t}$ and $m^{t\bar{t}}$ distributions.

[3] ATLAS Collaboration, Measurements of normalized differential cross-sections for ttbar production in pp collisions at $\sqrt{s} = 7$ TeV using the ATLAS detector, Phys. Rev. D 90 (2014) 072004, arXiv:1407.0371 [hep-ex].
[6] CMS Collaboration, Measurement of the differential cross section for top quark pair production in pp collisions at
[59] ATLAS Collaboration, Measurement of the differential cross-section of highly boosted top quarks as a function of their transverse momentum in $\sqrt{s} = 8$ TeV proton-proton collisions using the ATLAS detector, (2015), arXiv:1510.03818 [hep-ex].