1 Introduction

The large top-quark pair production cross-section at the LHC allows detailed studies of the characteristics of \(t\bar{t}{}\) production to be performed with respect to different kinematic variables, providing a unique opportunity to test the Standard Model (SM) at the \({\mathrm {TeV}}\) scale. Furthermore, effects beyond the SM can appear as modifications of \(t\bar{t}{}\) differential distributions with respect to the SM predictions [1] which may not be detectable with an inclusive cross-section measurement. A precise measurement of the \(t\bar{t}{}\) differential cross-section therefore has the potential to enhance the sensitivity to possible effects beyond the SM, as well as to clarify how well the theoretical calculations describe the cross-section.

The ATLAS [2–4] and CMS [5] experiments have published measurements of the \(t\bar{t}{}\) differential cross-sections at a centre-of-mass energy \(\sqrt{s}=7\) \({\mathrm {TeV}}\) in pp collisions, both in the full phase space using parton-level variables and in fiducial phase-space regions using observables constructed from final-state particles (particle level); the CMS experiment also published measurements of the \(t\bar{t}{}\) differential cross-sections with data taken at \(\sqrt{s}=8\) \({\mathrm {TeV}}\) [6]. The results presented here represent the natural extension of the previous ATLAS measurements of the \(t\bar{t}{}\) differential cross-sections to the \(\sqrt{s}=8\) \({\mathrm {TeV}}\) dataset, and benefit from higher statistics and reduced detector uncertainties.

In the SM, the top quark decays almost exclusively into a \(W\) boson and a b-quark. The signature of a \(t\bar{t}\) decay is therefore determined by the \(W\) boson decay modes. This analysis makes use of the lepton \(+\) jets \(t\bar{t}\) decay mode, where one \(W\) boson decays into an electron or a muon and a neutrino and the other \(W\) boson decays into a pair of quarks, with the two decay modes referred to as the e+jets and \(\mu \)+jets channel, respectively. Events in which the \(W\) boson decays to an electron or muon through a \(\tau \) lepton decay are also included.

This paper presents a set of measurements of the \(t\bar{t}\) production cross-section as a function of different properties of the reconstructed top quark and of the \(t\bar{t}\) system. The results, unfolded both to a fiducial particle-level phase space and to the full phase space, are compared to the predictions of Monte Carlo (MC) generators and to perturbative QCD calculations beyond the next-to-leading-order (NLO) approximation. The goal of unfolding to a fiducial particle-level phase space and of using variables directly related to detector observables is to allow precision tests of QCD, avoiding large model-dependent extrapolation corrections to the parton-level top-quark and to a phase space region outside the detector sensitivity. However, full phase-space measurements represent a valid test of higher-order calculations for which event generation with subsequent parton showering and hadronization is not yet available. A subset of the observables under consideration has been measured by CMS [5].

In addition to the variables measured at \(\sqrt{s}=7\) \({\mathrm {TeV}}\) [2–4], a set of new measurements is presented. These variables, similar to those used in dijet measurements at large jet transverse momentum [7, 8], are sensitive to effects of initial- and final-state radiation, to the different parton distribution functions (PDF), and to non-resonant processes including particles beyond the Standard Model [9]. Finally, observables constructed as a function of the transverse momenta of the W boson and the b-quark originating from the top quark have been found to be sensitive to non-resonant effects (when one or both top quarks are off-shell) [10] and non-factorizable higher-order corrections [11].

The paper is organized as follows: Sect. 2 briefly describes the ATLAS detector, while Sect. 3 describes the data and simulation samples used in the measurements. The reconstruction of physics objects and the event selection are explained in Sect. 4. Section 5 describes the kinematic reconstruction of the \({t\bar{t}}\) pairs using the pseudo-top algorithm. Section 6 discusses the background processes affecting these measurements. Event yields for both the signal and background samples, as well as distributions of measured quantities before unfolding, are shown in Sect. 7. The measurements of the cross-sections are described in Sect. 8. Statistical and systematic uncertainties are discussed in Sect. 9. The results are presented in Sect. 10, where the comparison with theoretical predictions is also discussed. Finally, a summary is presented in Sect. 11.

2 The ATLAS detector

ATLAS is a multi-purpose detector [12] that provides nearly full solid angle coverage around the interaction point. This analysis exploits all major components of the detector. Charged-particle trajectories with pseudorapidity \(|\eta | <2.5\) are reconstructed in the inner detector, which comprises a silicon pixel detector, a silicon microstrip detector and a transition radiation tracker (TRT). The inner detector is embedded in a 2 T axial magnetic field. Sampling calorimeters with several different designs span the pseudorapidity range up to \(|\eta | = 4.9\). High-granularity liquid argon (LAr) electromagnetic (EM) calorimeters are available up to \(|\eta | = 3.2\). Hadronic calorimeters based on scintillator-tile active material cover \(|\eta | < 1.7\) while LAr technology is used for hadronic calorimetry from \(|\eta | = 1.5\) to \(|\eta | = 4.9\). The calorimeters are surrounded by a muon spectrometer within a magnetic field provided by air-core toroid magnets with a bending integral of about 2.5 Tm in the barrel and up to 6 Tm in the endcaps. Three stations of precision drift tubes and cathode-strip chambers provide an accurate measurement of the muon track curvature in the region \(|\eta | < 2.7\). Resistive-plate and thin-gap chambers provide muon triggering capability up to \(|\eta | = 2.4\).

Data are selected from inclusive pp interactions using a three-level trigger system. A hardware-based trigger (L1) uses custom-made hardware and low-granularity detector data to initially reduce the trigger rate to approximately 75 kHz. The detector readout is then available for two stages of software-based triggers. In the second level (L2), the trigger has access to the full detector granularity, but only retrieves data for regions of the detector identified by L1 as containing interesting objects. Finally, the Event Filter (EF) system makes use of the full detector readout to finalize the event selection. During the 2012 run period, the selected event rate for all triggers following the event filter was approximately 400 Hz.

3 Data and simulation samples

The differential cross-sections are measured using a dataset collected by the ATLAS detector during the 2012 LHC pp run at \(\sqrt{s}=8\) TeV, which corresponds to an integrated luminosity of \(20.3~\pm ~0.6\) fb\({^{-1}}\). The luminosity is measured using techniques similar to those described in Ref. [13] with a calibration of the luminosity scale derived from beam-separation scans. The average number of interactions per bunch crossing in 2012 was 21. Data events are considered only if they are acquired under stable beam conditions and with all sub-detectors operational. The data sample is collected using single-lepton triggers; for each lepton type the logical OR of two triggers is used in order to increase the efficiency for isolated leptons at low transverse momentum. The triggers with the lower \(p_{\mathrm {T}}\) thresholds include isolation requirements on the candidate lepton, resulting in inefficiencies at high \(p_{\mathrm {T}}\) that are recovered by the triggers with higher \(p_{\mathrm {T}}\) thresholds. For electrons the two transverse momentum thresholds are 24 and 60 \({\mathrm {GeV}}\) while for muons the thresholds are 24 and 36 \({\mathrm {GeV}}\).

Simulated samples are used to characterize the detector response and efficiency to reconstruct \(t\bar{t}\) events, estimate systematic uncertainties and predict the background contributions from various processes. The response of the detector is simulated [14] using a detailed model implemented in GEANT4 [15]. For the evaluation of some systematic uncertainties, generated samples are passed through a fast simulation using a parameterization of the performance of the ATLAS electromagnetic and hadronic calorimeters [16]. Simulated events include the effect of multiple pp collisions from the same and previous bunch-crossings (in-time and out-of-time pile-up) and are re-weighted to match the same number of collisions as observed in data. All simulated samples are normalized to the integrated luminosity of the data sample; in the normalization procedure the most precise cross-section calculations available are used.

The nominal signal \(t\bar{t}\) sample is generated using the Powheg-Box [17] generator, based on next-to-leading-order QCD matrix elements. The CT10 [18] parton distribution functions are employed and the top-quark mass (\(m_{t}\)) is set to 172.5 \({\mathrm {GeV}}\). The \(h_\mathrm{damp}\) parameter, which effectively regulates the high-\(p_{\mathrm {T}}\) radiation in Powheg, is set to the top-quark mass. Parton showering and hadronization are simulated with Pythia [19] (version 6.427) using the Perugia 2011C set of tuned parameters [20]. The effect of the systematic uncertainties related to the PDF for the signal simulation is evaluated using samples generated with MC@NLO [21] (version 4.01) using the CT10nlo PDF set, interfaced to Herwig [22] (version 6.520) for parton showering and hadronization, and Jimmy [23] (version 4.31) for the modelling of multiple parton scattering. For the evaluation of systematic uncertainties due to the parton showering model, a Powheg+Herwig sample is compared to a Powheg+Pythia sample. The \(h_\mathrm{damp}\) parameter in the Powheg+Herwig sample is set to infinity. The uncertainties due to QCD initial- and final-state radiation (ISR/FSR) modelling are estimated with samples generated with Powheg-Box interfaced to Pythia for which the parameters of the generation (\(\Lambda _\mathrm{QCD}\), \(Q^2_\mathrm{max}\) scale, transverse momentum scale for space-like parton-shower evolution and the \(h_\mathrm{damp}\) parameter) are varied to span the ranges compatible with the results of measurements of \(t\bar{t}\) production in association with jets [24–26]. Finally, two additional \(t\bar{t}\) samples are used only in the comparison against data. The first one is a sample of Powheg matrix elements generated with the nominal settings interfaced to Pythia8 [27] (version 8.186 and Main31 user hook) and the AU14 [28] set of tuned parameters. In the second sample, MadGraph [29] \({t\bar{t}}\) matrix elements with up to three additional partons are interfaced to Pythia using the matrix-element to parton-shower MLM matching scheme [30] and the Perugia 2011C set of tuned parameters [20].

The \(t\bar{t}\) samples are normalized to the NNLO+NNLL cross-section of \(\sigma _{t\bar{t}}=253^{+13}_{-15}\) pb (scale, PDF and \(\alpha _S\)), evaluated using the Top++2.0 program [31], which includes the next-to-next-to-leading-order QCD corrections and resums next-to-next-to-leading logarithmic soft gluon terms [32–37]. The quoted cross-section corresponds to a top-quark mass of 172.5 \({\mathrm {GeV}}\). Each \(t\bar{t}\) sample is produced requiring at least one semileptonic decay in the \({t\bar{t}}\) pair.

Single-top-quark processes for the s-channel, t-channel and Wt associated production constitute the largest background in this analysis. These processes are simulated with Powheg-Box using the PDF set CT10 and showered with Pythia (version 6.427) calibrated with the P2011C tune [20] and the PDF set CTEQ6L1 [38]. All possible production channels containing one lepton in the final state are considered. All samples are generated requiring the presence of a leptonically decaying W boson. The cross-sections multiplied by the branching ratios for the leptonic W decay employed for these processes are normalized to NLO+NNLL calculations [39–41].

Leptonic decays of vector bosons produced in association with high-\(p_{\mathrm {T}}\) jets, referred to as W+jets and Z+jets, constitute the second largest background in this analysis. Samples of simulated W / Z+jets events with up to five additional partons in the LO matrix elements are produced with the Alpgen generator (version 2.13) [42] using the PDF set CTEQ6L1 [38] and interfaced to Pythia (version 6.427) for parton showering; the overlap between samples is dealt with by using the MLM matching scheme [30]. Heavy-flavour quarks are included in the matrix-element calculations to produce the \(Wb\bar{b}\), \(Wc\bar{c}\), Wc, \(Zb\bar{b}\) and \(Zc\bar{c}\) samples. The overlap between the heavy-flavour quarks produced by the matrix element and by the parton shower is removed. The W+jets samples are normalized to the inclusive W boson NNLO cross-section [43, 44] and corrected by applying additional scale factors derived from data, as described in Sect. 6.

Diboson production is modelled using Herwig and Jimmy with the CTEQ6L1 PDF set [38] and the yields are normalized using the NLO cross-sections [45]. All possible production channels containing at least one lepton in the final states are considered.

4 Object definition and event selection

The lepton+jets \(t\bar{t}\) decay mode is characterized by the presence of a high-\(p_{\mathrm {T}}\) lepton, missing transverse momentum due to the neutrino, two jets originating from b-quarks, and two jets from the hadronic \(W\) boson decay.

The following sections describe the detector-level, particle-level and parton-level objects used to characterize the final-state event topology and to define a fiducial phase-space region for the measurements.

4.1 Detector-level objects

Primary vertices in the event are formed from reconstructed tracks such that they are spatially compatible with the luminous interaction region. The hard-scatter primary vertex is chosen to be the vertex with the highest \(\sum p_{\mathrm {T}}^2\) where the sum extends over all associated tracks with \(p_{\mathrm {T}}> 0.4\,\mathrm {{\mathrm {GeV}}{}}\).
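The hard-scatter vertex choice described above can be illustrated with a minimal sketch. It assumes each vertex is represented simply as a list of associated track transverse momenta in GeV; the function name and data layout are hypothetical and are not taken from the ATLAS software.

```python
def select_hard_scatter_vertex(vertices, min_track_pt=0.4):
    """Return the index of the vertex with the highest sum of track pT^2.

    vertices: list of lists; each inner list holds the pT (GeV) of the
              tracks associated with one vertex (illustrative layout).
    min_track_pt: only tracks above this pT (GeV) enter the sum.
    """
    if not vertices:
        return None

    def sum_pt2(track_pts):
        return sum(pt * pt for pt in track_pts if pt > min_track_pt)

    return max(range(len(vertices)), key=lambda i: sum_pt2(vertices[i]))


# Example: the second vertex has one hard track and wins the comparison.
example_vertices = [[1.0, 2.0, 0.3], [10.0], [0.2, 0.1]]
best = select_hard_scatter_vertex(example_vertices)
```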

Electron candidates are reconstructed by associating tracks in the inner detector with energy deposits in the EM calorimeter. They must satisfy identification criteria based on the shower shape in the EM calorimeter, on the track quality, and on the detection of the transition radiation produced in the TRT detector. The EM clusters are required to be in the pseudorapidity region \(|\eta | < 2.47\), excluding the transition region between the barrel and the endcap calorimeters (\(1.37< |\eta | < 1.52\)). They must have a transverse energy \(E_{\mathrm {T}}>25 \,\) \({\mathrm {GeV}}\). The associated track must have a longitudinal impact parameter \(|z_0|<2\) mm with respect to the primary vertex. Isolation requirements, on calorimeter and tracking variables, are used to reduce the background from non-prompt electrons. The calorimeter isolation variable is based on the energy sum of cells within a cone of size \(\Delta R < 0.2\) around the direction of each electron candidate. This energy sum excludes cells associated with the electron cluster and is corrected for leakage from the electron cluster itself and for energy deposits from pile-up. The tracking isolation variable is based on the track \(p_{\mathrm {T}}\) sum around the electron in a cone of size \(\Delta R < 0.3\), excluding the electron track. In every \(p_{\mathrm {T}}\) bin both requirements are chosen to result separately in a 90 % electron selection efficiency for prompt electrons from Z boson decays.

Muon candidates are defined by matching tracks in the muon spectrometer with tracks in the inner detector. The track \(p_{\mathrm {T}}\) is determined through a global fit of the hits which takes into account the energy loss in the calorimeters. The track is required to have \(|z_0|<2\) mm and a transverse impact parameter significance, \(|d_0/\sigma (d_0)|<3\), consistent with originating in the hard interaction. Muons are required to have \(p_{\mathrm {T}}>25\,\mathrm {{\mathrm {GeV}}{}}\) and be within \(|\eta |<2.5\). To reduce the background from muons originating from heavy-flavour decays inside jets, muons are required to be separated by \(\Delta R>0.4\) from the nearest jet, and to be isolated. They are required to satisfy the isolation requirement \(I^{\ell }< 0.05\), where the isolation variable is the ratio of the sum of \(p_{\mathrm {T}}\) of tracks, excluding the muon, in a cone of variable size \(\Delta R = 10\,\mathrm {{\mathrm {GeV}}{}}/p_{\mathrm {T}}(\mu )\) to the \(p_{\mathrm {T}}\) of the muon [46]. The isolation requirement has an efficiency of about 97 % for prompt muons from Z boson decays.

Jets are reconstructed using the anti-\(k_{t}\) algorithm [47] implemented in the FastJet package [48] with radius parameter \(R = 0.4\). The jet reconstruction starts from topological clusters calibrated and corrected for pile-up effects using the jet area method [49]. A residual correction dependent on the instantaneous luminosity and the number of reconstructed primary vertices in the event [50] is then applied. They are calibrated using an energy- and \(\eta \)-dependent simulation-based calibration scheme, with in situ corrections based on data [51] and are accepted if \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) and \(|\eta | < 2.5\). To reduce the contribution from jets associated with pile-up, jets with \(p_{\mathrm {T}}< 50\,\mathrm {{\mathrm {GeV}}{}}\) are required to satisfy \(|\mathrm {JVF}| > 0.5\), where JVF is the ratio of the sum of the \(p_{\mathrm {T}}\) of tracks associated with both the jet and the primary vertex, to the sum of \(p_{\mathrm {T}}\) of all tracks associated with the jet. Jets with no associated tracks or with \(|\eta | > 2.4\) at the edge of the tracker acceptance are always accepted.
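The JVF requirement can be sketched as follows. This is an illustration under stated assumptions: tracks are given as a list of transverse momenta together with a flag saying whether each track is associated with the primary vertex, and the function names are hypothetical. Only the thresholds quoted in the text are used.

```python
def jet_vertex_fraction(track_pts, from_primary_vertex):
    """JVF: summed pT of jet tracks from the primary vertex divided by the
    summed pT of all tracks associated with the jet.

    Returns None when the jet has no associated tracks.
    """
    total = sum(track_pts)
    if total == 0:
        return None
    from_pv = sum(pt for pt, is_pv in zip(track_pts, from_primary_vertex) if is_pv)
    return from_pv / total


def passes_jvf(jet_pt, jet_eta, jvf):
    """Apply the pile-up suppression requirement described in the text."""
    if jet_pt >= 50.0:                      # requirement only applies below 50 GeV
        return True
    if jvf is None or abs(jet_eta) > 2.4:   # no tracks, or edge of tracker acceptance
        return True
    return abs(jvf) > 0.5
```

For instance, a 30 GeV central jet whose tracks carry 10, 5 and 5 GeV, with the first and third from the primary vertex, has JVF = 0.75 and is kept.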

To prevent double-counting of electron energy deposits as jets, the closest jet lying within \(\Delta R < 0.2\) of a reconstructed electron is removed. To remove leptons from heavy-flavour decays, a lepton is discarded if it lies within \(\Delta R < 0.4\) of a selected jet axis.
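A minimal sketch of this overlap removal is given below. It assumes \(\Delta R = \sqrt{(\Delta \eta )^2 + (\Delta \phi )^2}\), that objects are plain dictionaries with `eta` and `phi` keys, and that the electron-jet step is applied before the lepton-jet step; the ordering and all names are illustrative assumptions.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def remove_overlaps(electrons, muons, jets):
    """Return (electrons, muons, jets) after the two overlap-removal steps."""
    kept_jets = list(jets)

    # Step 1: drop the jet closest to each electron if it lies within dR < 0.2.
    for el in electrons:
        if not kept_jets:
            break
        distances = [delta_r(el["eta"], el["phi"], j["eta"], j["phi"]) for j in kept_jets]
        nearest = min(range(len(kept_jets)), key=distances.__getitem__)
        if distances[nearest] < 0.2:
            del kept_jets[nearest]

    # Step 2: drop leptons lying within dR < 0.4 of any remaining jet.
    def is_separated(lep):
        return all(
            delta_r(lep["eta"], lep["phi"], j["eta"], j["phi"]) >= 0.4 for j in kept_jets
        )

    kept_electrons = [e for e in electrons if is_separated(e)]
    kept_muons = [m for m in muons if is_separated(m)]
    return kept_electrons, kept_muons, kept_jets
```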

The purity of the selected \(t\bar{t}\) sample is improved by tagging jets containing b-hadrons, exploiting their long decay time and large mass. Information from the track impact parameters, secondary vertex location and decay topology is combined in a neural-network-based algorithm (MV1) [52]. The operating point used corresponds to an overall 70 % b-tagging efficiency in \(t\bar{t}\) events, and to a probability to mis-identify light-flavour jets of approximately 1 %.

The missing transverse momentum \(E_{\mathrm {T}}^\mathrm{miss}\) is computed from the vector sum of the transverse momenta of the reconstructed calibrated physics objects (electrons, photons, hadronically decaying \(\tau \) leptons, jets and muons) as well as the transverse energy deposited in the calorimeter cells not associated with these objects [53]. Calorimeter cells not associated with any physics object are calibrated using tracking information before being included in the \(E_{\mathrm {T}}^\mathrm{miss}\) calculation. The contribution from muons is added using their momentum. To avoid double counting of energy, the parameterized muon energy loss in the calorimeters is subtracted in the \(E_{\mathrm {T}}^\mathrm{miss}\) calculation.

Fig. 1 Kinematic distributions of the combined electron and muon selections at the detector level: (a) lepton transverse momentum, (b) missing transverse momentum \(E_{\mathrm {T}}^\mathrm{miss}\), (c) jet multiplicity, (d) jet transverse momentum, (e) b-tagged jet multiplicity and (f) leading b-tagged jet \(p_{\mathrm {T}}\). Data distributions are compared to predictions using Powheg+Pythia as the \(t\bar{t}\) signal model. The hashed area indicates the combined statistical and systematic uncertainties on the total prediction, excluding systematic uncertainties related to the modelling of the \(t\bar{t}\) system

4.2 Event selection at detector level

The event selection consists of a set of requirements based on the general event quality and on the reconstructed objects, defined above, that characterize the final-state event topology. Each event must have a reconstructed primary vertex with five or more associated tracks. The events are required to contain exactly one reconstructed lepton candidate with \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) geometrically matched to a corresponding object at the trigger level and at least four jets with \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) and \(|\eta | < 2.5\). At least two of the jets have to be tagged as b-jets. The event selection is summarized in Table 1. The event yields are displayed in Table 2 for data, simulated signal, and backgrounds (the background determination is described in Sect. 6). Figure 1 shows, for some key distributions, the comparison between data and predictions normalized to the data integrated luminosity. The selection yields a fairly clean \(t\bar{t}\) sample, with the total background at the 10 % level. The difference between data and predicted event yield is \(\sim \)7 %, in fair agreement with the theoretical uncertainty on the total \(t\bar{t}\) cross-section used to normalize the signal MC simulation (see Sect. 3).
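The detector-level requirements listed above can be condensed into a short sketch. It assumes that leptons and jets have already passed the object definitions of Sect. 4.1 and are represented as dictionaries; the keys `pt`, `eta`, `trigger_matched` and `btagged` and the function name are hypothetical.

```python
def passes_event_selection(n_pv_tracks, leptons, jets):
    """Apply the lepton+jets selection: one lepton, >= 4 jets, >= 2 b-tags.

    n_pv_tracks: number of tracks associated with the primary vertex.
    leptons: list of dicts with 'pt' (GeV) and 'trigger_matched' (bool).
    jets: list of dicts with 'pt' (GeV), 'eta' and 'btagged' (bool).
    """
    # Event quality: primary vertex with five or more associated tracks.
    if n_pv_tracks < 5:
        return False

    # Exactly one lepton above threshold, matched to the trigger object.
    good_leptons = [lep for lep in leptons if lep["pt"] > 25.0]
    if len(good_leptons) != 1 or not good_leptons[0]["trigger_matched"]:
        return False

    # At least four jets within acceptance, at least two of them b-tagged.
    good_jets = [j for j in jets if j["pt"] > 25.0 and abs(j["eta"]) < 2.5]
    if len(good_jets) < 4:
        return False
    return sum(1 for j in good_jets if j["btagged"]) >= 2
```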

Table 1 Summary of all requirements included in the event selection
Table 2 Event yields in the \(e+\)jets and \(\mu +\)jets channels after the selection. The signal model, denoted \(t\bar{t}\) in the table, is generated using Powheg+Pythia. The quoted uncertainties represent the sum in quadrature of the statistical and systematic uncertainties on each subsample. Neither modelling uncertainties nor uncertainties on the inclusive \({t\bar{t}}\) cross-section are included in the systematic uncertainties

4.3 Particle-level objects and fiducial phase-space definition

Particle-level objects are defined for simulated events in analogy to the detector-level objects described above. Only stable final-state particles, i.e. particles that are not decayed further by the generator, and unstable particles that are to be decayed later by the detector simulation, are considered.

The fiducial phase space for the measurements presented in this paper is defined using a series of requirements applied to particle-level objects close to those used in the selection of the detector-level objects. The procedure explained in this section is applied to the \(t\bar{t}\) signal only, since the background subtraction is performed before unfolding the data.

Electrons and muons must not originate, either directly or through a \(\tau \) decay, from a hadron in the MC particle record. This ensures that the lepton is from an electroweak decay without requiring a direct match to a W boson. The four-momenta of leptons are modified by adding the four-momenta of all photons within \(\Delta R=0.1\) that do not originate from hadron decays to take into account final-state QED radiation. Such leptons are then required to have \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) and \(|\eta | < 2.5\). Electrons in the transition region (\(1.37< |\eta | < 1.52\)) are rejected at the detector level but accepted in the fiducial selection. This difference is accounted for by the efficiency correction described in Sect. 8.1.

The particle-level missing transverse momentum is calculated from the four-vector sum of the neutrinos, discarding neutrinos from hadron decays, either directly or through a \(\tau \) decay. Particle-level jets are clustered using the anti-\(k_{t}\) algorithm with radius parameter \(R = 0.4\), starting from all stable particles, except for selected leptons (e, \(\mu \), \(\nu \)) and the photons radiated from the leptons. Particle-level jets are required to have \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) and \(|\eta | < 2.5\). Hadrons containing a b-quark with \(p_{\mathrm {T}}> 5\,\mathrm {{\mathrm {GeV}}{}}\) are associated with jets through a ghost matching technique as described in Ref. [49]. Particle b-tagged jets have \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) and \(|\eta | < 2.5\). The events are required to contain exactly one reconstructed lepton candidate with \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) and at least four jets with \(p_{\mathrm {T}}> 25\,\mathrm {{\mathrm {GeV}}{}}\) and \(|\eta | < 2.5\). At least two of the jets have to be b-tagged. Dilepton events where only one lepton passes the fiducial selection are by definition included in the fiducial measurement.

4.4 Parton-level objects and full phase-space definition

Parton-level objects are defined for simulated events. Only top quarks decaying directly to a W boson and a b-quark in the simulation are considered. The full phase space for the measurements presented in this paper is defined by the set of \({t\bar{t}}\) pairs in which one top quark decays semileptonically (including \(\tau \) leptons) and the other decays hadronically. Events in which both top quarks decay semileptonically define the dilepton background, and are thus removed from the signal simulation.

5 Kinematic reconstruction

The pseudo-top algorithm [4] reconstructs the kinematics of the top quarks and their complete decay chain from final-state objects, namely the charged lepton (electron or muon), missing transverse momentum, and four jets, two of which are b-tagged. By running the same algorithm on detector- and particle-level objects, the degree of dependency on the details of the simulation is strongly reduced compared to correcting to parton-level top quarks.

In the following, when more convenient, the leptonically (hadronically) decaying W boson is referred to as the leptonic (hadronic) W boson, and the semileptonically (hadronically) decaying top quark is referred to as the leptonic (hadronic) top quark.

The algorithm starts with the reconstruction of the neutrino four-momentum. The z-component of the neutrino momentum is calculated using the W boson mass constraint imposed on the invariant mass of the system of the charged lepton and the neutrino. If the resulting quadratic equation has two real solutions, the one with the smallest \(|p_z|\) is chosen. If the discriminant is negative, only the real part of the solution is considered. The leptonic W boson is reconstructed from the charged lepton and the neutrino and the leptonic top quark is reconstructed from the leptonic W and the \(b\text{-tagged }\) jet closest in \(\Delta R\) to the charged lepton. The hadronic W boson is reconstructed from the two non-b-tagged jets whose invariant mass is closest to the mass of the W boson. This choice yields the best performance of the algorithm in terms of the correlation between detector, particle and parton levels. Finally, the hadronic top quark is reconstructed from the hadronic W boson and the other \(b\text{-jet }\). In events with more than two b-tagged jets, only the two with the highest transverse momentum are considered.
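The neutrino \(p_z\) step can be made concrete with a short sketch. It assumes a massless neutrino whose transverse momentum is identified with the missing transverse momentum, and a W boson mass of 80.4 GeV; both the numerical value and the function name are assumptions for illustration, not the analysis code. Writing \(\mu = (m_W^2 - m_\ell ^2)/2 + \vec{p}_{\mathrm {T}}^{\,\ell }\cdot \vec{p}_{\mathrm {T}}^{\,\nu }\) and \(a = E_\ell ^2 - p_{z,\ell }^2\), the mass constraint gives \(p_{z,\nu } = \left( \mu \, p_{z,\ell } \pm E_\ell \sqrt{\mu ^2 - a\,(p_{\mathrm {T}}^{\nu })^2}\right) /a\).

```python
import math

M_W = 80.4  # GeV; assumed value of the W boson mass used in the constraint


def neutrino_pz(lep_px, lep_py, lep_pz, lep_e, met_x, met_y, m_w=M_W):
    """Solve the W-mass constraint for the neutrino longitudinal momentum.

    Two real solutions: return the one with the smaller |pz|.
    Negative discriminant: return the real part of the complex solution.
    """
    met2 = met_x ** 2 + met_y ** 2
    # Lepton mass squared from its four-momentum (clipped at zero for rounding).
    m_lep2 = max(lep_e ** 2 - lep_px ** 2 - lep_py ** 2 - lep_pz ** 2, 0.0)
    mu = 0.5 * (m_w ** 2 - m_lep2) + lep_px * met_x + lep_py * met_y
    a = lep_e ** 2 - lep_pz ** 2

    real_part = mu * lep_pz / a
    discriminant = mu ** 2 - a * met2
    if discriminant < 0.0:
        return real_part

    shift = lep_e * math.sqrt(discriminant) / a
    return min(real_part - shift, real_part + shift, key=abs)
```

Because the constraint is squared in the derivation, this sketch does not check whether a root is a spurious solution of the squared equation; a fuller implementation may want that check.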

6 Background determination

The single-top-quark background is the largest background contribution, amounting to approximately 4 % of the total event yield and 40 % of the total background estimate.

The shape of the distributions of the kinematical variables of this background is evaluated with a Monte Carlo simulation, and the event yields are normalized to the most recent calculations of their cross-sections, as described in Sect. 3. The overlap between the Wt and \({t\bar{t}}\) samples is handled using the diagram removal scheme [54].

The W+jets background represents the second largest background. After the event selection, approximately 3–4 % of the total event yield and 35 % of the total background estimate is due to W+jets events. The estimation of this background is performed using a combination of MC simulation and data-driven techniques. The Alpgen+Pythia W+jets samples, normalized to the inclusive W boson NNLO cross-section, are used as a starting point while the absolute normalization and the heavy-flavour fractions of this process, which are affected by large theoretical uncertainties, are determined from data.

The corrections for generator mis-modelling in the fractions of W boson production associated with jets of different flavour components (\(W+b\bar{b}\), \(W+c\bar{c}\), \(W+c\)) are estimated in a sample with the same lepton and \(E_{\mathrm {T}}^{\mathrm {miss}}\) selections as the signal selection, but with only two jets and no b-tagging requirements. The b-jet multiplicity, in conjunction with knowledge of the b-tagging and mis-tag efficiency, is used to extract the heavy-flavour fraction. This information is extrapolated to the signal region using MC simulation, assuming constant relative rates for the signal and control regions.

The overall W+jets normalization is then obtained by exploiting the expected charge asymmetry in the production of \(W^+\) and \(W^-\) bosons in pp collisions. This asymmetry is predicted by theory [55] and evaluated using MC simulation, while other processes in the \(t\bar{t}\) sample are symmetric in charge except for a small contamination from single-top and WZ events, which is subtracted using MC simulation. The total number of W+jets events in the sample can thus be estimated with the following equation:

$$\begin{aligned} N_{W^+} + N_{W^-} = \left( \frac{r_\mathrm{MC} + 1}{r_\mathrm{MC} - 1}\right) (D_\mathrm{+} - D_\mathrm{-}), \end{aligned} \tag{1}$$

where \(r_\mathrm{MC}\) is the ratio of the number of events with positive leptons to the number of events with negative leptons in the MC simulation, and \(D_\mathrm{+}\) and \(D_\mathrm{-}\) are the number of events with positive and negative leptons in the data, respectively.
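As a worked illustration of Eq. (1), the following sketch evaluates the W+jets yield from the MC charge ratio and the charge-separated data counts. The function name and the numbers in the example are purely illustrative; the data counts are assumed to have already had the small charge-asymmetric contamination subtracted, as described above.

```python
def wjets_yield(r_mc, d_plus, d_minus):
    """Total W+jets yield N_W+ + N_W- from the charge-asymmetry relation.

    r_mc: ratio of positive-lepton to negative-lepton W+jets events in MC.
    d_plus, d_minus: data events with a positive / negative lepton.
    """
    if r_mc == 1.0:
        raise ValueError("r_MC = 1 means no charge asymmetry; the yield is undefined")
    return (r_mc + 1.0) / (r_mc - 1.0) * (d_plus - d_minus)


# Illustrative numbers only: r_MC = 1.5, D+ = 1200, D- = 1000
# gives (2.5 / 0.5) * 200 = 1000 W+jets events.
n_w = wjets_yield(1.5, 1200, 1000)
```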

Multi-jet production processes have a large cross-section and mimic the lepton+jets signature due to jets misidentified as prompt leptons (fake leptons) or semileptonic decays of heavy-flavour hadrons (non-prompt real leptons). This background is estimated directly from data by using the matrix-method technique [56]. The number of background events in the signal region is evaluated by applying efficiency factors to the number of events passing the tight (signal) and loose selection. The fake-lepton efficiency is measured using data in control regions dominated by the multi-jet background with the real-lepton contribution subtracted using MC simulation. The real-lepton efficiency is extracted from a tag-and-probe technique using leptons from Z boson decays. Fake-lepton events contribute to the total event yield at approximately the 1–2 % level.
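The text does not spell out the matrix-method formula, so the sketch below uses its common textbook form as an assumption: the loose sample is split into real and fake leptons, \(N^\mathrm{loose} = N_\mathrm{real} + N_\mathrm{fake}\) and \(N^\mathrm{tight} = \epsilon _\mathrm{real} N_\mathrm{real} + \epsilon _\mathrm{fake} N_\mathrm{fake}\), and the system is solved for the fake contribution to the tight sample. The function name and the inclusive (unbinned) treatment are illustrative; the implementation in Ref. [56] may differ, for example by applying the efficiencies per event.

```python
def fake_yield_in_tight(n_loose, n_tight, eff_real, eff_fake):
    """Estimated number of fake-lepton events passing the tight selection.

    n_loose, n_tight: events passing the loose and tight lepton selections
                      (tight is a subset of loose).
    eff_real, eff_fake: probabilities for a loose real / fake lepton to
                        also pass the tight selection.
    """
    if eff_real == eff_fake:
        raise ValueError("equal efficiencies: real and fake leptons cannot be separated")
    n_fake_loose = (eff_real * n_loose - n_tight) / (eff_real - eff_fake)
    return eff_fake * n_fake_loose
```

As a closure check with invented numbers, 900 real and 100 fake loose leptons with efficiencies 0.8 and 0.2 give 1000 loose and 740 tight events, from which the function recovers 20 fake events in the tight sample.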

Z+jets and diboson events are simulated with MC generators, and the event yields are normalized to the most recent theoretical calculation of their cross-sections. The total contribution of these processes is less than 1 % of the total event yield or approximately 10 % of the total background.

Top-quark pair events in which both the top quark and the anti-top quark decay semileptonically (including decays to \(\tau \)) can sometimes pass the event selection, contributing approximately 5 % to the total event yield. The fraction of dileptonic \(t\bar{t}\) events in each \(p_{\mathrm {T}}\) bin is estimated with the same MC sample used for the signal modelling. In the fiducial phase-space definition, semileptonic top-quark decays to \(\tau \) leptons in lepton+jets \(t\bar{t}\) events are considered as signal only if the \(\tau \) lepton decays leptonically.

7 Observables

A set of measurements of the \(t\bar{t}\)  production cross-sections is presented as a function of kinematic observables. In the following, the indices had and lep refer to the hadronically and semileptonically decaying top quarks, respectively. The indices 1 and 2 refer respectively to the leading and sub-leading top quark, ordered by transverse momentum.

First, a set of baseline observables is presented: transverse momentum (\(p_{\mathrm {T}}^{t,\mathrm{had}}\)) and absolute value of the rapidity (\(|y^{t,\mathrm{had}}|\)) of the hadronically decaying top quark (which was chosen over the leptonic top quark due to better resolution), and the transverse momentum (\(p_{\mathrm {T}}^{t\bar{t}}\)), absolute value of the rapidity (\(|y^{t\bar{t}}|\)) and invariant mass (\(m^{t\bar{t}}\)) of the \(t\bar{t}\) system. These observables, shown in Fig. 2, have been previously measured by the ATLAS experiment using the 7  \({\mathrm {TeV}}\) dataset [3, 4] except for \(|y^{t,\mathrm{had}}|\) which has not been measured in the full phase-space. The level of agreement between data and prediction is within the quoted uncertainties for \(|y^{t,\mathrm{had}}|\), \(m^{t\bar{t}}\) and \(p_{\mathrm {T}}^{t\bar{t}}\). A trend is observed in the \(p_{\mathrm {T}}^{t,\mathrm{had}}\) distribution, which is not well modelled at high values. A fair agreement between data and simulation is observed for large absolute values of the \({t\bar{t}}\) rapidity.

Furthermore, angular variables sensitive to a \(p_{\mathrm {T}}\) imbalance in the transverse plane, i.e. to the emission of radiation associated with the production of the top-quark pair, are employed to emphasize the central production region [8]. The angle between the two top quarks has been found to be sensitive to non-resonant contributions due to hypothetical new particles exchanged in the t-channel [7]. The rapidities of the two top quarks in the laboratory frame are denoted by \(y^{t,\mathrm{1}}\) and \(y^{t,\mathrm{2}}\), while their rapidities in the \(t\bar{t}{}\) centre-of-mass frame are \({y^{\star }}= \frac{1}{2}\left( y^{t,\mathrm{1}}-y^{t,\mathrm{2}} \right) \) and \(-{y^{\star }}\). The longitudinal motion of the \(t\bar{t}\)  system in the laboratory frame is described by the rapidity boost \(y_\mathrm{boost}^{{t\bar{t}}}=\frac{1}{2} \left[ y^{t,\mathrm{1}} + y^{t,\mathrm{2}} \right] \) and \(\chi ^{{t\bar{t}}}=e^{2|{y^{\star }}|}\), which is closely related to the production angle. In particular, many signals due to processes not included in the Standard Model are predicted to peak at low values of \(\chi ^{{t\bar{t}}}\) [7]. Finally, observables depending on the transverse momentum of the decay products of the top quark have been found to be sensitive to higher-order corrections [10, 11].

The following additional observables are measured:

  • the absolute value of the azimuthal angle between the two top quarks (\(\Delta \phi ^{{t\bar{t}}}\));

  • the absolute value of the out-of-plane momentum (\(|p_\mathrm{out}^{{t\bar{t}}}|\)), i.e. the projection of the three-momentum of one top quark onto the direction perpendicular to the plane defined by the other top quark and the beam axis (z) in the laboratory frame [8]:

    $$\begin{aligned} |p_\mathrm{out}^{{t\bar{t}}}|= \left| \vec {p}^{~t, \mathrm{had}} \cdot \frac{\vec {p}^{~t,\mathrm{lep}} \times \hat{z}}{|\vec {p}^{~t,\mathrm{lep}}\times \hat{z}|} \right| ; \end{aligned}$$
    (2)
  • the longitudinal boost of the \({t\bar{t}}\) system in the laboratory frame (\(y_\mathrm{boost}^{{t\bar{t}}}\)) [7];

  • the production angle between the two top quarks (\(\chi ^{{t\bar{t}}}\)) [7];

  • the scalar sum of the transverse momenta of the two top quarks (\(H_\mathrm{T}^{{t\bar{t}}}\)) [10, 11];

  • and the ratio of the transverse momenta of the hadronic W boson and the top quark from which it originates (\(R_{Wt}\)) [10, 11]

    $$\begin{aligned} R_{Wt}= p_{\mathrm {T}}^{W,\mathrm{had}} / p_{\mathrm {T}}^{t,\mathrm{had}}. \end{aligned}$$
    (3)
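
The definitions above can be collected in a short sketch that computes the additional observables from the two top-quark four-momenta. The function names, the (px, py, pz, E) tuple layout and the example kinematics are assumptions made for illustration; \(y_\mathrm{boost}^{{t\bar{t}}}\) and \(\chi ^{{t\bar{t}}}\) are symmetric under exchange of the two top quarks, so the hadronic/leptonic labels are used in place of the indices 1 and 2.

```python
import math


def rapidity(p):
    """Rapidity of a four-momentum p = (px, py, pz, E)."""
    px, py, pz, e = p
    return 0.5 * math.log((e + pz) / (e - pz))


def pt(p):
    """Transverse momentum of p = (px, py, pz, E)."""
    return math.hypot(p[0], p[1])


def ttbar_observables(top_had, top_lep, pt_w_had):
    # Azimuthal separation of the two top quarks, folded into [0, pi].
    dphi = abs(math.atan2(top_had[1], top_had[0])
               - math.atan2(top_lep[1], top_lep[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    # Eq. (2): p_lep x z = (py, -px, 0), whose length is the leptonic-top pT.
    nx, ny = top_lep[1] / pt(top_lep), -top_lep[0] / pt(top_lep)
    p_out = abs(top_had[0] * nx + top_had[1] * ny)
    y_had, y_lep = rapidity(top_had), rapidity(top_lep)
    y_star = 0.5 * (y_had - y_lep)
    return {
        "dphi": dphi,
        "p_out": p_out,
        "y_boost": 0.5 * (y_had + y_lep),
        "chi": math.exp(2.0 * abs(y_star)),
        "h_t": pt(top_had) + pt(top_lep),
        "r_wt": pt_w_had / pt(top_had),  # Eq. (3)
    }


def four_vector(px, py, pz, mass):
    return (px, py, pz, math.sqrt(px * px + py * py + pz * pz + mass * mass))


# Invented back-to-back configuration, momenta in GeV, top mass 172.5 GeV.
top_had = four_vector(100.0, 0.0, 50.0, 172.5)
top_lep = four_vector(-100.0, 0.0, -50.0, 172.5)
obs = ttbar_observables(top_had, top_lep, pt_w_had=60.0)
print(obs)
```

For this back-to-back configuration there is no extra radiation, so \(\Delta \phi ^{{t\bar{t}}} = \pi \) and \(|p_\mathrm{out}^{{t\bar{t}}}|\) vanishes, which is the limit these observables are designed to probe deviations from.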

These observables are shown in Fig. 3 at detector level. For all of these variables the prediction shows only modest agreement with the data. In particular, at high values of \(H_\mathrm{T}^{{t\bar{t}}}\), fewer events are observed than predicted. The longitudinal boost \(y_\mathrm{boost}^{{t\bar{t}}}\) is predicted to be less central than the data. Finally, \(R_{Wt}\) is predicted to be lower than observed in the range 1.5–3.0.

Fig. 2

Distributions of observables of the combined electron and muon selections at detector level: a hadronic top-quark transverse momentum \(p_{\mathrm {T}}^{t,\mathrm{had}}\) and b absolute value of the rapidity \(|y^{t,\mathrm{had}}|\) , c \({t\bar{t}}\)  invariant mass \(m^{t\bar{t}}\), d transverse momentum \(p_{\mathrm {T}}^{t\bar{t}}\) and e absolute value of the rapidity \(|y^{t\bar{t}}|\) . Data distributions are compared to predictions, using Powheg +Pythia as the \(t\bar{t}\) signal model. The hashed area indicates the combined statistical and systematic uncertainties (described in Sect. 9) on the total prediction, excluding systematic uncertainties related to the modelling of the \(t\bar{t}\) system

Fig. 3

Distributions of observables of the combined electron and muon selections at the detector level: a absolute value of the out-of-plane momentum \(p_\mathrm{out}^{{t\bar{t}}}\), b azimuthal angle between the two top quarks \(\Delta \phi ^{{t\bar{t}}}\), c production angle \(\chi ^{{t\bar{t}}}\), d longitudinal boost \(y_\mathrm{boost}^{{t\bar{t}}}\), e scalar sum of hadronic and leptonic top-quarks transverse momenta and f ratio of the hadronic W boson and the hadronic top-quark transverse momenta. Data distributions are compared to predictions, using Powheg +Pythia as the \(t\bar{t}\) signal model. The hashed area indicates the combined statistical and systematic uncertainties (described in Sect. 9) on the total prediction, excluding systematic uncertainties related to the modelling of the \(t\bar{t}\) system

8 Unfolding procedure

The underlying differential cross-section distributions are obtained from the detector-level events using an unfolding technique that corrects for detector effects. The iterative Bayesian method [57] as implemented in RooUnfold [58] is used. The individual \(e+\)jets and \(\mu +\)jets channels give consistent results and are therefore combined by summing the event yields before the unfolding procedure.

8.1 Fiducial phase space

The unfolding starts from the detector-level event distribution (\(N_\mathrm{reco}\)), from which the backgrounds (\(N_\mathrm{bg}\)) are subtracted first. Next, the acceptance correction \(f_\mathrm{acc}\) corrects for events that are generated outside the fiducial phase-space but pass the detector-level selection.

In order to separate resolution and combinatorial effects, distributions evaluated using a Monte Carlo simulation are corrected to the level where detector- and particle-level objects forming the pseudo-top quarks are angularly well matched. The matching correction \(f_\mathrm{match}\) accounts for the corresponding efficiency. The matching is performed using geometrical criteria based on the distance \(\Delta R\). Each particle-level e (\(\mu \)) is matched to the closest detector-level e (\(\mu \)) within \(\Delta R < 0.02\). Particle-level jets are geometrically matched to the closest detector-level jet within \(\Delta R < 0.4\). If a detector-level jet is not matched to a particle-level jet, it is assumed to be either from pile-up or matching inefficiency and is ignored. If two detector-level jets are reconstructed within \(\Delta R< 0.4\) of a single particle-level jet, the detector-level jet with the smaller \(\Delta R\) is matched to the particle-level jet and the other detector-level jet is left unmatched.
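
The geometrical matching can be sketched as a closest-object search within a \(\Delta R\) cone. In the example below objects are represented as (eta, phi) pairs, and a detector-level object that has already been matched is not reused; both choices, like the function names and coordinates, are assumptions of this sketch rather than details given in the text.

```python
import math


def delta_r(a, b):
    """Angular distance between two objects given as (eta, phi)."""
    deta = a[0] - b[0]
    # Wrap the azimuthal difference into [-pi, pi).
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(deta, dphi)


def match_objects(particle_objs, detector_objs, max_dr):
    """Map each particle-level index to the closest detector-level index.

    Only detector-level objects within max_dr are considered; a detector-level
    object is used at most once (assumption of this sketch).
    """
    matches, used = {}, set()
    for i, p in enumerate(particle_objs):
        best, best_dr = None, max_dr
        for j, d in enumerate(detector_objs):
            if j in used:
                continue
            dr = delta_r(p, d)
            if dr < best_dr:
                best, best_dr = j, dr
        if best is not None:
            matches[i] = best
            used.add(best)
    return matches


# Invented jets: two detector-level jets lie within 0.4 of the first particle jet.
particle_jets = [(0.0, 0.0), (1.5, 3.0)]
detector_jets = [(0.3, 0.1), (0.05, -0.02), (1.45, -3.1), (-2.0, 1.0)]
matches = match_objects(particle_jets, detector_jets, max_dr=0.4)
unmatched = [j for j in range(len(detector_jets)) if j not in matches.values()]
print(matches, unmatched)
```

The same function would serve for leptons by passing `max_dr=0.02`. Detector-level jets left in `unmatched` correspond to those the text attributes to pile-up or matching inefficiency.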

The unfolding step uses a migration matrix (\(\mathcal {M}\)) derived from simulated \(t\bar{t}\) events which maps the binned generated particle-level events to the binned detector-level events. The probability for particle-level events to remain in the same bin is therefore represented by the elements on the diagonal, and the off-diagonal elements describe the fraction of particle-level events that migrate into other bins. Therefore, the elements of each row add up to unity as shown in Fig. 4d. The binning is chosen such that the fraction of events in the diagonal bins is always greater than 50 %. The unfolding is performed using four iterations to balance the goodness of fit and the statistical uncertainty. The effect of varying the number of iterations by one was tested and proved to be negligible. Finally, the efficiency correction \(f_\mathrm{eff}\) corrects for events which pass the particle-level selection but are not reconstructed at the detector level.
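
A minimal sketch of how such a migration matrix could be built from matched simulated events is given below, with rows labelling particle-level bins and columns labelling detector-level bins so that each row sums to unity. The input format (a list of bin-index pairs) and the toy events are assumptions made for illustration.

```python
def migration_matrix(bin_pairs, n_bins):
    """Row-normalised migration matrix from (particle_bin, detector_bin) pairs."""
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for particle_bin, detector_bin in bin_pairs:
        counts[particle_bin][detector_bin] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        # Each row holds the fractions of one particle-level bin.
        matrix.append([c / total if total > 0 else 0.0 for c in row])
    return matrix


# Invented matched events for a three-bin observable.
pairs = ([(0, 0)] * 8 + [(0, 1)] * 2
         + [(1, 0)] * 1 + [(1, 1)] * 7 + [(1, 2)] * 2
         + [(2, 1)] * 3 + [(2, 2)] * 7)
matrix = migration_matrix(pairs, n_bins=3)
diagonal = [matrix[i][i] for i in range(3)]
for row in matrix:
    print(row)
# Binning criterion quoted in the text: diagonal fraction above 50 %.
print(all(d > 0.5 for d in diagonal))
```

If the last check failed for some bin, the corresponding bins would be candidates for merging or widening.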

All corrections are evaluated with simulation and are presented in Fig. 4 for the case of the \(p_{\mathrm {T}}\) of the top quark decaying hadronically. This variable is particularly representative since the kinematics of the decay products of the top quark change substantially in the observed range. The decrease of the efficiency at high values is primarily due to the increasingly large fraction of non-isolated leptons and close or merged jets in events with high top-quark \(p_{\mathrm {T}}\); in order to improve the selection efficiency in this boosted kinematic region, jets with a larger radius parameter R than the one used in this study are required [59]. A similar effect is observed in the tail of the \(t\bar{t}\) transverse momentum and rapidity, small \(\Delta \phi ^{{t\bar{t}}}\) angle and high \(H_\mathrm{T}^{{t\bar{t}}}\)  distributions. The matching corrections reach the highest values, of the order of \(f_\mathrm{match} = 0.6{-}0.7\), at low \(t\bar{t}\) transverse momentum and large \(t\bar{t}\) rapidity. Generally, the acceptance corrections are constant and close to unity, indicating very good correlation between the detector- and the particle-level reconstruction. This is also apparent from the high level of diagonality of the migration matrices, with correlations between particle and detector levels of 85–95 %.

The unfolding procedure for an observable X at particle level is summarized by the expression

$$\begin{aligned} \frac{\mathrm{d}\sigma ^\mathrm{fid}}{\mathrm{d}X^i} \equiv \frac{1}{\mathcal {L} \cdot \Delta X^i} \cdot f_\mathrm{eff}^i \cdot \sum _j \mathcal {M}_{ij}^{-1} \cdot f_\mathrm{match}^j \cdot f_\mathrm{acc}^j \cdot \left( N_\mathrm{reco}^j - N_\mathrm{bg}^j\right) \hbox {,} \end{aligned}$$
(4)

where the index j iterates over bins of X at detector level while the i index labels bins at particle level; \(\Delta X^i\) is the bin width while \(\mathcal {L}\) is the integrated luminosity and the Bayesian unfolding is symbolized by \(\mathcal {M}_{ij}^{-1}\).

The integrated cross-section is obtained by integrating the unfolded cross-section over the kinematic bins, and its value is used to compute the normalized differential cross-section \(1/\sigma ^\mathrm{fid}\cdot \mathrm{d}\sigma ^\mathrm{fid}/\mathrm{d}X^i\).
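
The chain of Eq. (4) can be sketched as follows. The unfolding step is a bare-bones iterative Bayesian update written for this example; it is not the RooUnfold implementation and omits uncertainty propagation. All function names and numerical inputs (correction factors, luminosity, bin widths) are invented, and \(f_\mathrm{eff}\) is applied as the multiplicative factor that appears in Eq. (4).

```python
def bayes_unfold(reco, migration, prior, n_iter=4):
    """Iterative Bayesian unfolding with migration[i][j] = P(detector j | particle i)."""
    n_truth, n_reco = len(migration), len(reco)
    truth = list(prior)
    for _ in range(n_iter):
        total = sum(truth)
        p = [t / total for t in truth]
        # Expected detector-level shape for the current prior.
        folded = [sum(migration[i][j] * p[i] for i in range(n_truth))
                  for j in range(n_reco)]
        # Bayes' theorem: share each detector-level bin among particle-level bins.
        truth = [sum(migration[i][j] * p[i] / folded[j] * reco[j]
                     for j in range(n_reco) if folded[j] > 0)
                 for i in range(n_truth)]
    return truth


def fiducial_xsec(n_reco, n_bg, f_acc, f_match, f_eff, migration, prior,
                  lumi, widths, n_iter=4):
    """Differential cross-section per bin, following the structure of Eq. (4)."""
    corrected = [fm * fa * (n - b)
                 for n, b, fa, fm in zip(n_reco, n_bg, f_acc, f_match)]
    unfolded = bayes_unfold(corrected, migration, prior, n_iter)
    return [fe * u / (lumi * w) for fe, u, w in zip(f_eff, unfolded, widths)]


def normalise(dsigma, widths):
    """Integrated cross-section and normalised differential cross-section."""
    total = sum(d * w for d, w in zip(dsigma, widths))
    return [d / total for d in dsigma], total


# Toy closure test: fold a known spectrum, add background, then unfold.
migration = [[0.8, 0.2, 0.0], [0.1, 0.7, 0.2], [0.0, 0.3, 0.7]]
truth = [1000.0, 500.0, 200.0]
signal_reco = [sum(migration[i][j] * truth[i] for i in range(3)) for j in range(3)]
n_bg = [50.0, 40.0, 10.0]
n_reco = [s + b for s, b in zip(signal_reco, n_bg)]
ones = [1.0, 1.0, 1.0]
widths = [50.0, 50.0, 100.0]
unfolded = bayes_unfold(signal_reco, migration, prior=truth)
dsigma = fiducial_xsec(n_reco, n_bg, ones, ones, [4.0, 3.0, 5.0],
                       migration, truth, lumi=20.0, widths=widths)
normalised, sigma_fid = normalise(dsigma, widths)
print(unfolded, sigma_fid, normalised)
```

With the prior equal to the true spectrum the iteration is at its fixed point, so the toy recovers the input exactly; a flat or mismodelled prior would instead converge towards it over the iterations.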

Fig. 4

The a acceptance, b matching and c efficiency corrections, and the d detector-to-particle level migration matrix for the hadronic top-quark transverse momentum evaluated with the Powheg +Pythia simulation sample with \(h_\mathrm{damp}\!=\!m_{t}\) and using CT10nlo PDF. In Fig. a–c the dashed lines illustrate the corrections evaluated on alternative ISR/FSR-varied samples. In Fig. d, the empty bins contain either no events or the number of events is less than 0.5 %

8.2 Full phase space

The measurements are extrapolated to the full phase space of the \({t\bar{t}}\)  system using a procedure similar to the one described in Sect. 8.1. The only difference is in the value used for the binning. The binning used by the CMS experiment in Ref. [5] is used for the observables measured by both experiments to facilitate future combinations. This binning is found to be compatible with the resolution of each observable. The fiducial phase-space binning is used for all the other observables. In order to unambiguously define leptonic and hadronic top quarks, the contribution of \(t\bar{t}\) pairs decaying dileptonically is removed by applying a correction factor \(\hat{f}_\mathrm{ljets}\) which represents the fraction of \(t\bar{t}\) single-lepton events in the nominal sample. The \(\tau \) leptons from the leptonically decaying W bosons are considered as signal regardless of the \(\tau \) decay mode. The cross-section measurements are defined with respect to the top quarks before the decay (parton level) and after QCD radiation. Observables related to top quarks are extrapolated to the full phase-space starting from top quarks decaying hadronically at the detector level.

The acceptance correction \(\hat{f}_\mathrm{acc}\) corrects for detector-level events which are reconstructed outside the parton-level bin range for a given variable. The migration matrix (\(\hat{\mathcal {M}}\)) is derived from simulated \(t\bar{t}\) events decaying in the single-lepton channel and the efficiency correction \(\hat{f}_\mathrm{eff}\) corrects for events which did not pass the detector-level selection.

The unfolding procedure is summarized by the expression

$$\begin{aligned} \frac{\mathrm{d}\sigma ^\mathrm{full}}{\mathrm{d}X^i} \equiv \frac{1}{\mathcal {L} \cdot \mathcal {B} \cdot \Delta X^i} \cdot \hat{f}_\mathrm{eff}^i \cdot \sum _j \hat{\mathcal {M}}_{ij}^{-1} \cdot \hat{f}_\mathrm{acc}^j \cdot \hat{f}_\mathrm{ljets}^i \cdot \left( N_\mathrm{reco}^j - N_\mathrm{bg}^j\right) \hbox {,} \end{aligned}$$
(5)

where the index j iterates over bins of observable X at the detector level while the i index labels bins at the parton level; \(\Delta X^i\) is the bin width, \(\mathcal {B}=0.438\) is the single-lepton branching ratio, \(\mathcal {L}\) is the integrated luminosity and the Bayesian unfolding is symbolized by \(\hat{\mathcal {M}}_{ij}^{-1}\).

The integrated cross-section is obtained by integrating the unfolded cross-section over the kinematic bins, and its value is used to compute the normalized differential cross-section \(1/\sigma ^\mathrm{full}\cdot \mathrm{d}\sigma ^\mathrm{full} / \mathrm{d}X^i\).

To ensure that the results are not biased by the MC generator used for the unfolding procedure, a study is performed in which the particle- and parton-level spectra in simulation are altered by changing the shape of the distributions using continuous functions chosen depending on the observable. The studies confirm that these altered shapes are recovered within statistical uncertainties by the unfolding based on the nominal migration matrices.

9 Uncertainties

This section describes the estimation of systematic uncertainties related to object reconstruction and calibration, MC generator modelling and background estimation.

To evaluate the impact of each uncertainty after the unfolding, the reconstructed distribution expected from simulation is varied. Corrections based on the nominal Powheg-Box signal sample are used to correct for detector effects and the unfolded distribution is compared to the known particle- or parton-level distribution. All detector- and background-related systematic uncertainties have been evaluated using the same generator, while alternative generators have been employed to assess modelling systematic uncertainties (e.g. different parton showers). In these cases the corrections, derived from the nominal generator, are used to unfold the detector-level spectra of the alternative generator. The relative difference between the unfolded spectra and the corresponding particle- or parton-level spectra of the alternative generator is taken as the uncertainty related to the generator modelling. After the unfolding, each distribution is normalized to unit area.

The covariance matrices for the normalized unfolded spectra due to the statistical and systematic uncertainties are obtained by evaluating the covariance between the kinematic bins using pseudo-experiments. In particular, the correlations due to statistical fluctuations for both data and the signal are evaluated by varying the event counts independently in every bin before unfolding, and then propagating the resulting variations through the unfolding.
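
The pseudo-experiment procedure for the statistical component can be sketched as below: bin contents are fluctuated independently, passed through an unfolding function, normalized, and the sample covariance between bins is computed. The `unfold` argument is a placeholder for whatever unfolding is used (the identity in the example), the Gaussian approximation for large means is a shortcut of this sketch, and the normalization is to unit sum, which corresponds to unit area only for equal-width bins.

```python
import math
import random


def poisson(rng, mean):
    """Poisson random number; Gaussian approximation for large means (shortcut)."""
    if mean > 50:
        return max(0, round(rng.gauss(mean, math.sqrt(mean))))
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:  # Knuth's multiplication method
        k += 1
        p *= rng.random()
    return k


def covariance_from_toys(counts, unfold, n_toys=1000, seed=1):
    """Covariance of the normalised unfolded spectrum from pseudo-experiments."""
    rng = random.Random(seed)
    toys = []
    for _ in range(n_toys):
        fluctuated = [poisson(rng, c) for c in counts]  # independent per bin
        unfolded = unfold(fluctuated)
        total = sum(unfolded)
        toys.append([u / total for u in unfolded])
    n = len(counts)
    mean = [sum(t[i] for t in toys) / n_toys for i in range(n)]
    return [[sum((t[i] - mean[i]) * (t[j] - mean[j]) for t in toys) / (n_toys - 1)
             for j in range(n)] for i in range(n)]


# Invented detector-level counts; identity stands in for the real unfolding.
cov = covariance_from_toys([400, 300, 200, 100], unfold=lambda x: list(x),
                           n_toys=500)
for row in cov:
    print(row)
```

Because every toy is normalized, each row of the resulting matrix sums to zero up to rounding, so the matrix is singular. Negative off-diagonal elements therefore appear even though the input fluctuations are independent.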

9.1 Object reconstruction and calibration

The jet energy scale uncertainty is derived using a combination of simulations, test beam data and in situ measurements [60–62]. Additional contributions from the jet flavour composition, calorimeter response to different jet flavours, and pile-up are taken into account. Uncertainties in the jet energy resolution are obtained with an in situ measurement of the jet response asymmetry in dijet events [63].

The efficiency to tag jets containing b-hadrons is corrected in simulation events by applying b-tagging scale factors, extracted in \(t\bar{t}\) and dijet samples, in order to account for the residual difference between data and simulation. Scale factors are also applied for jets originating from light quarks that are mis-identified as b-jets. The associated systematic uncertainties are computed by varying the scale factors within their uncertainties [52, 64, 65].

The lepton reconstruction efficiency in simulation is corrected by scale factors derived from measurements of these efficiencies in data using a \(Z \rightarrow \ell ^+ \ell ^-\) enriched control region. The lepton trigger and reconstruction efficiency scale factors, energy scale and resolution are varied within their uncertainties [66, 67].

The uncertainty associated with \(E_{\mathrm {T}}^\mathrm{miss}\) is calculated by propagating the energy scale and resolution systematic uncertainties to all jets and leptons in the \(E_{\mathrm {T}}^\mathrm{miss}\) calculation. Additional \(E_{\mathrm {T}}^\mathrm{miss}\) uncertainties arising from energy deposits not associated with any reconstructed objects are also included [53].

9.2 Signal modelling

The uncertainties of the signal modelling affect the kinematic properties of simulated \(t\bar{t}\) events and reconstruction efficiencies.

To assess the uncertainty related to the generator, events simulated with MC@NLO +Herwig are unfolded using the migration matrix and correction factors derived from the Powheg +Herwig sample. The difference between the unfolded distribution and the known particle- or parton-level distribution of the MC@NLO +Herwig sample is assigned as the relative uncertainty for the fiducial or full phase-space distributions, respectively. This uncertainty is found to be in the range 2–5 %, depending on the variable, increasing up to 10 % at large \(p_{\mathrm {T}}^{t}\), \(m^{t\bar{t}}\), \(p_{\mathrm {T}}^{t\bar{t}}\) and \(|y^{t\bar{t}}|\). The observable that is most affected by these uncertainties is \(m^{t\bar{t}}\) in the full phase space.

To assess the impact of different parton-shower models, unfolded results using events simulated with Powheg interfaced to Pythia are compared to events simulated with Powheg interfaced to Herwig, using the same procedure described above to evaluate the uncertainty related to the \(t\bar{t}\) generator. The resulting systematic uncertainties, taken as the symmetrized difference, are found to be typically at the 1–3 % level.
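
The modelling uncertainties described above reduce, per bin, to a relative difference between an unfolded alternative sample and its own particle- or parton-level spectrum. A minimal sketch follows; the reading of "symmetrized" as \(\pm |\delta |\), the function names and the numbers are assumptions of this example.

```python
def relative_difference(unfolded_alt, truth_alt):
    """Per-bin relative non-closure of the alternative sample."""
    return [(u - t) / t for u, t in zip(unfolded_alt, truth_alt)]


def symmetrised(rel):
    """Two-sided uncertainty of equal size in both directions (assumed meaning)."""
    return [(-abs(r), abs(r)) for r in rel]


# Invented normalised spectra for a three-bin observable.
unfolded_alt = [0.0102, 0.0049, 0.0011]
truth_alt = [0.0100, 0.0050, 0.0010]
rel = relative_difference(unfolded_alt, truth_alt)
bands = symmetrised(rel)
print(rel, bands)
```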

In order to evaluate the uncertainty related to the modelling of the ISR/FSR, \(t\bar{t}\) MC samples with modified ISR/FSR modelling are used. The MC samples used for the evaluation of this uncertainty are generated using the Powheg generator interfaced to Pythia, where the parameters are varied as described in Sect. 3. This uncertainty is found to be in the range 2–5 %, depending on the variable of the \(t\bar{t}\) system considered, and reaching the largest values at high \(|y^{t}|\) and small \(p_{\mathrm {T}}^{t\bar{t}}\).

The impact of the uncertainty related to the PDF is assessed by means of \(t\bar{t}\) samples generated with MC@NLO interfaced to Herwig. An envelope of spectra is evaluated by reweighting the central prediction of the CT10nlo PDF set, using the full set of 52 eigenvectors at 68 % CL. This uncertainty is found to be less than 1 %.
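
One common way to combine Hessian eigenvector variations into a per-bin uncertainty is the symmetric formula \(\Delta X = \frac{1}{2}\sqrt{\sum _k (X_k^{+} - X_k^{-})^2}\). The sketch below applies it under the assumption that the 52 variations are organized as 26 plus/minus pairs, as is usual for CT10; the text does not state which prescription defines the envelope, so this is only one possible reading, shown with invented spectra.

```python
import math


def hessian_symmetric(variations):
    """Symmetric Hessian uncertainty per bin.

    variations: list of (plus, minus) pairs, each a list of bin contents
    obtained by reweighting to one eigenvector direction.
    """
    n_bins = len(variations[0][0])
    return [0.5 * math.sqrt(sum((plus[i] - minus[i]) ** 2
                                for plus, minus in variations))
            for i in range(n_bins)]


# Two invented eigenvector pairs for a two-bin spectrum.
pairs = [([1.02, 0.98], [0.98, 1.02]),
         ([1.03, 1.00], [0.97, 1.00])]
pdf_unc = hessian_symmetric(pairs)
print(pdf_unc)
```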

As a check, the effect of the uncertainty on the top-quark mass was evaluated and found to affect only the efficiency correction by less than 1 %, consistent with what was observed by ATLAS for the analogous measurement with the 7  \({\mathrm {TeV}}\) data [4].

9.3 Background modelling

Systematics affecting the background are modelled by adding to the signal spectrum the difference of the systematics-varied and nominal backgrounds.

The single-top background is assigned an uncertainty associated with the theoretical calculations used for its normalization [39–41]. The overall impact of this systematic uncertainty on the signal is around 0.5 %.

The systematic uncertainties due to the overall normalization and the heavy-flavour fraction of W+jets events are obtained by varying the data-driven scale factors within the statistical uncertainty of the W+jets MC sample. The W+jets shape uncertainty is extracted by varying the renormalization and matching scales in Alpgen. The W+jets MC statistical uncertainty is also taken into account. The overall impact of this uncertainty is less than 1 %.

The uncertainty on the background from non-prompt and fake leptons is evaluated by varying the definition of loose leptons, changing the selection used to form the control region and propagating the statistical uncertainty of parameterizations of the efficiency to pass the tighter lepton requirements for real and fake leptons. The combination of all these components also affects the shape of the background. The overall impact of this systematic uncertainty is less than 1 %.

A 50 % uncertainty is applied to the normalization of the Z+jets background, including the uncertainty on the cross-section and a further 48 % due to uncertainties related to the requirement of the presence of at least four jets. A 40 % uncertainty is applied to the diboson background, including the uncertainty on the cross-section and a further 34 % due to the presence of two additional jets. The overall impact of these uncertainties is less than 1 %, and the largest contribution is due to the Z+jets background.

10 Results

In this section, comparisons between unfolded data distributions and several SM predictions are presented for the different observables discussed in Sect. 7. Events are selected by requiring exactly one lepton and at least four jets with at least two of the jets tagged as originating from a b-quark. Normalized differential cross-sections are shown in order to remove systematic uncertainties on the normalization.

The SM predictions are obtained using different MC generators. The Powheg-Box generator [17], denoted “PWG” in the figures, is employed with three different sets of parton shower models, namely Pythia [19], Pythia8  [27] and Herwig [22]. The other NLO generator is MC@NLO [21] interfaced with the Herwig parton shower. Generators at the LO accuracy are represented by MadGraph  [29] interfaced with Pythia for parton showering, which calculates \({t\bar{t}}\) matrix elements with up to three additional partons and implements the matrix-element to parton-shower MLM matching scheme [30].

The level of agreement between the measured distributions and simulations with different theoretical predictions is quantified by calculating \(\chi ^2\) values, employing the full covariance matrices, and inferring p-values (probabilities that the \(\chi ^2\) is larger than or equal to the observed value) from the \(\chi ^2\) and the number of degrees of freedom (NDF). Uncertainties on the predictions are not included. The normalization constraint used to derive the normalized differential cross-sections lowers by one unit the NDF and the rank of the \(N_\mathrm{b} \times N_\mathrm{b}\) covariance matrix, where \(N_\mathrm{b}\) is the number of bins of the spectrum under consideration [68]. In order to evaluate the \(\chi ^2\) the following relation is used

$$\begin{aligned} \chi ^2 = V_{N_\mathrm{b}-1}^\mathrm{T} \cdot \mathrm{Cov}_{N_\mathrm{b}-1}^{-1} \cdot V_{N_\mathrm{b}-1}, \end{aligned}$$
(6)

where \(V_{N_\mathrm{b}-1}\) is the vector of differences between data and prediction obtained by discarding one of the \(N_\mathrm{b}\) elements and \(\mathrm{Cov}_{N_\mathrm{b}-1}\) is the \((N_\mathrm{b}-1) \times (N_\mathrm{b}-1)\) sub-matrix derived from the full covariance matrix discarding the corresponding row and column. The sub-matrix obtained in this way is invertible and allows the \(\chi ^2\) to be computed. The \(\chi ^2\) value does not depend on the choice of the element discarded for the vector \(V_{N_\mathrm{b}-1}\) and the corresponding sub-matrix \(\mathrm{Cov}_{N_\mathrm{b}-1}\).
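
Equation (6) and the p-value can be sketched with the standard library alone. The linear solver, the closed-form survival function of the \(\chi ^2\) distribution for integer NDF, the function names and the toy inputs are choices made for this example; the covariance matrix used here is invented, with rows summing to zero as expected for a normalized spectrum.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x


def chi2_reduced(data, pred, cov, drop=-1):
    """Eq. (6): chi2 after discarding one bin and its covariance row and column."""
    n = len(data)
    keep = [i for i in range(n) if i != drop % n]
    v = [data[i] - pred[i] for i in keep]
    sub = [[cov[i][j] for j in keep] for i in keep]
    return sum(vi * xi for vi, xi in zip(v, solve(sub, v)))


def chi2_pvalue(chi2, ndf):
    """P(chi2' >= chi2) for integer ndf, using the closed-form series."""
    half = chi2 / 2.0
    if ndf % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, ndf // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    x = math.sqrt(chi2)
    term, total = x, 0.0
    for r in range(1, (ndf - 1) // 2 + 1):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return math.erfc(x / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# Invented normalised three-bin spectrum and singular covariance matrix.
data = [0.50, 0.30, 0.20]
pred = [0.48, 0.31, 0.21]
cov = [[2e-4, -1e-4, -1e-4], [-1e-4, 2e-4, -1e-4], [-1e-4, -1e-4, 2e-4]]
chi2_values = [chi2_reduced(data, pred, cov, drop=k) for k in range(3)]
p_value = chi2_pvalue(chi2_values[0], ndf=len(data) - 1)
print(chi2_values, p_value)
```

In this toy the three choices of discarded bin give the same \(\chi ^2\), illustrating the independence stated in the text.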

The set of Figs. 5–9 presents the normalized \(t\bar{t}\) fiducial phase-space differential cross-sections as a function of the different observables. In particular, Fig. 5a, b show the distributions of the hadronic top-quark transverse momentum and the absolute value of the rapidity; Fig. 6a–c present the \({t\bar{t}}\) system invariant mass, transverse momentum, and absolute value of the rapidity, while the additional observables related to the \(t\bar{t}\) system and the ratio of the transverse momenta of the hadronically decaying W boson and top quark are shown in Figs. 7, 8 and 9.

None of the predictions is able to correctly describe all the distributions, as also witnessed by the \(\chi ^2\) values and the p-values listed in Table 3. In particular, a certain tension between data and all predictions is observed in the case of the hadronic top-quark transverse momentum distribution for values higher than about 400 \({\mathrm {GeV}}\). No electroweak corrections [69–73] are included in these predictions, as these have been shown to have a measurable impact only at very high values of the top-quark transverse momentum, leading to a slightly softer \(p_{\mathrm {T}}^{t,\mathrm{had}}\) spectrum as confirmed by the recent ATLAS measurement of the \(t\bar{t}\) differential distribution of the hadronic top-quark \(p_{\mathrm {T}}\) for boosted top quarks [59]. The effect of electroweak corrections alone is not large enough to solve this discrepancy completely [59, 74]. The shape of the \(|y^{t,\mathrm{had}}|\) distribution shows only a modest agreement for all the generators, with larger discrepancies observed in the forward region for Powheg +Pythia and Powheg +Pythia8.

For the \(m^{t\bar{t}}\) distribution, the Powheg +Pythia, Powheg +Pythia8 and Powheg +Herwig generators are in better agreement with the data. All generators are in good agreement in the \(p_{\mathrm {T}}^{t\bar{t}}\) spectrum except for Powheg +Herwig in the last bin. This observation suggests that setting \(h_\mathrm{damp}\!=\!m_{t}\) in the Powheg samples improves the agreement at high values of the \(t\bar{t}\)  transverse momentum. The data at high values of \({t\bar{t}}\) rapidity is not adequately described by any of the generators considered. The same conclusions hold for the analogous distribution for the absolute spectra, although the overall agreement estimated with the \(\chi ^2\) values and the p-values is better due to the larger uncertainties.

For the variables describing the hard-scattering interaction, the production angle \(\chi ^{{t\bar{t}}}\) is well described in the central region. The forward region, described by the tail of this observable and by the tail of the longitudinal boost \(y_\mathrm{boost}^{{t\bar{t}}}\), is not described correctly by any of the generators under consideration. For the variables describing the radiation along the \(t\bar{t}\) pair momentum direction, both \(|p_\mathrm{out}^{{t\bar{t}}}|\) and \(\Delta \phi ^{{t\bar{t}}}\) indicate that the kinematics of top quarks produced in the collinear region (\(\Delta \phi ^{{t\bar{t}}}\) \(\lesssim \pi /2\)) are described with fair agreement by all the generators, but the uncertainty is particularly large in this region. The tension observed in the \(p_{\mathrm {T}}^{t,\mathrm{had}}\) spectrum is reflected in the tail of the \(H_\mathrm{T}^{{t\bar{t}}}\) distribution. Finally, the ratio of the hadronic W boson and top-quark transverse momenta shows a mis-modelling in the range 1.5–3 for all the generators.

The difficulty in correctly predicting the data in the forward region was further investigated by studying the dependence of the predictions on different PDF sets. The study was performed for the rapidity observables \(|y^{t,\mathrm{had}}|\), \(|y^{t\bar{t}}|\) and \(y_\mathrm{boost}^{{t\bar{t}}}\), shown in Fig. 10, by comparing the data with the predictions of MC@NLO +Herwig for more recent sets of parton distribution functions. The results exhibit a general improvement in the description of the forward region for the most recent PDF sets (CT14nlo [75], CJ12mid [76], MMHT2014nlo [77], NNPDF 3.0 NLO [78], METAv10LHC [79], HERAPDF 2.0 NLO [80]). The improvement with respect to CT10nlo is also clearly shown in Table 5 which lists the \(\chi ^2\) and corresponding p-values for the different sets. The only exception is represented by the \(|y^{t,\mathrm{had}}|\)  distribution using HERAPDF 2.0 NLO, for which a disagreement in the forward region is observed.

Fig. 5

Fiducial phase-space normalized differential cross-sections as a function of the a transverse momentum (\(p_{\mathrm {T}}^{t,\mathrm{had}}\)) and b absolute value of the rapidity (\(|y^{t,\mathrm{had}}|\)) of the hadronic top quark. The yellow bands indicate the total uncertainty on the data in each bin. The Powheg +Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 6

Fiducial phase-space normalized differential cross-sections as a function of the a invariant mass (\(m^{t\bar{t}}\)), b transverse momentum (\(p_{\mathrm {T}}^{t\bar{t}}\)) and c absolute value of the rapidity (\(|y^{t\bar{t}}|\) ) of the \({t\bar{t}}\)  system. The yellow bands indicate the total uncertainty on the data in each bin. The Powheg +Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 7

Fiducial phase-space normalized differential cross-sections as a function of the \({t\bar{t}}\)  a production angle (\(\chi ^{{t\bar{t}}}\)) and b longitudinal boost (\(y_\mathrm{boost}^{{t\bar{t}}}\)). The yellow bands indicate the total uncertainty on the data in each bin. The Powheg +Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 8

Fiducial phase-space normalized differential cross-sections as a function of the \({t\bar{t}}\)  a out-of-plane momentum (\(|p_\mathrm{out}^{{t\bar{t}}}|\)) and b azimuthal angle (\(\Delta \phi ^{{t\bar{t}}}\)). The yellow bands indicate the total uncertainty on the data in each bin. The Powheg +Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 9

Fiducial phase-space normalized differential cross-sections as a function of the a  scalar sum of the transverse momenta of the hadronic and leptonic top quarks (\(H_\mathrm{T}^{{t\bar{t}}}\)) and b  the ratio of the hadronic W and the hadronic top transverse momenta (\(R_{Wt}\)). The yellow bands indicate the total uncertainty on the data in each bin. The Powheg +Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 10

Fiducial phase-space normalized differential cross-sections as a function of the a absolute value of the rapidity of the hadronic top quark (\(|y^{t,\mathrm{had}}|\)), b absolute value of the rapidity (\(|y^{t\bar{t}}|\) ) of the \({t\bar{t}}\)  system and c longitudinal boost (\(y_\mathrm{boost}^{{t\bar{t}}}\)). The yellow bands indicate the total uncertainty on the data in each bin. The MC@NLO +Herwig generator is reweighted using the new PDF sets to produce the different predictions. The Powheg +Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

The set of Figs. 11–14 presents the normalized \(t\bar{t}\) full phase-space differential cross-sections as a function of the different observables. In particular, Fig. 11a, b show the top-quark transverse momentum and the absolute value of the rapidity; Fig. 12a–c present the \({t\bar{t}}\) system invariant mass, transverse momentum and absolute value of the rapidity while the additional observables related to the \(t\bar{t}\) system are shown in Figs. 13 and 14. Regarding the comparison between data and predictions, the general picture, already outlined for the fiducial phase-space measurements, is still valid even though the uncertainties are much larger due to the full phase-space extrapolation. In particular, the predictions for the top-quark \(p_{\mathrm {T}}\) and \(H_\mathrm{T}^{{t\bar{t}}}\)  tend to be in better agreement with the data than what is observed in the fiducial phase-space. The \(\chi ^2\) and corresponding p-values for the different observables and predictions are shown in Table 4.

In Figs. 15–18 the normalized \(t\bar{t}\) full phase-space differential cross-sections as a function of \(p_{\mathrm {T}}^{t}\), \(|y^{t}|\), \(m^{t\bar{t}}\) and \(|y^{t\bar{t}}|\) are compared with theoretical higher-order QCD calculations.

The measurements are compared to four calculations that offer beyond-NLO accuracy:

  • an approximate next-to-next-to-leading-order (aNNLO) calculation based on QCD threshold expansions beyond the leading logarithmic approximation [81] using the CT14nnlo PDF [75];

  • an approximate next-to-next-to-next-to-leading-order (aN\(^3\)LO) calculation based on the resummation of soft-gluon contributions in the double-differential cross-section at next-to-next-to-leading-logarithm (NNLL) accuracy in the moment-space approach in perturbative QCD [82] using the MSTW2008nnlo PDF [83];

  • an approximate NLO+NNLL calculation [84] using the MSTW2008nnlo PDF [83];

  • a full NNLO calculation [85] using the MSTW2008nnlo PDF [83]. The NNLO prediction does not cover the highest bins in \(p_{\mathrm {T}}^{t}\) and \(m^{t\bar{t}}\).

These predictions have been interpolated in order to match the binning of the presented measurements. Table 6 shows the \(\chi ^2\) and p-values for these higher-order QCD calculations.

Figures 15 and 16 compare the \(p_{\mathrm {T}}^{t}\) and \(|y^{t}|\) distributions to the aNNLO and aN\(^3\)LO calculations, and to the NNLO calculation, respectively. The aN\(^3\)LO calculation is seen to improve the agreement compared to the Powheg+Pythia generator in \(|y^{t}|\), but not in \(p_{\mathrm {T}}^{t}\). The aNNLO prediction produces a \(p_{\mathrm {T}}^{t}\) distribution that is softer than the data at high transverse momentum and does not improve the description of \(|y^{t}|\). The NNLO calculation is in good agreement with both the \(p_{\mathrm {T}}^{t}\) and \(|y^{t}|\) distributions; in particular, the disagreement seen at high \(p_{\mathrm {T}}^{t}\) for the NLO generators is resolved by the NNLO calculation.

The measurements of the invariant mass and transverse momentum of the \(t\bar{t}\) system are compared to the NLO+NNLL prediction in Fig. 17. The NLO+NNLL calculation shows good agreement in the \(m^{t\bar{t}}\) spectrum and a very large discrepancy for high values of the \(t\bar{t}\) transverse momentum. Figure 18 shows a comparison of the NNLO calculation to the \(m^{t\bar{t}}\) and \(|y^{t\bar{t}}|\) measurements. For the rapidity of the \(t\bar{t}\) system, the NNLO calculation improves the agreement slightly compared to the Powheg+Pythia prediction, but some shape difference can be seen between data and prediction.

Fig. 11

Full phase-space normalized differential cross-sections as a function of the a transverse momentum (\(p_{\mathrm {T}}^{t}\)) and b absolute value of the rapidity (\(|y^{t}|\)) of the top quark. The grey bands indicate the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 12

Full phase-space normalized differential cross-sections as a function of the a invariant mass (\(m^{t\bar{t}}\)), b transverse momentum (\(p_{\mathrm {T}}^{t\bar{t}}\)) and c absolute value of the rapidity (\(|y^{t\bar{t}}|\)) of the \({t\bar{t}}\) system. The grey bands indicate the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 13

Full phase-space normalized differential cross-sections as a function of the a production angle (\(\chi ^{{t\bar{t}}}\)) and b longitudinal boost (\(y_\mathrm{boost}^{{t\bar{t}}}\)) of the \({t\bar{t}}\) system. The grey bands indicate the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 14

Full phase-space normalized differential cross-sections as a function of the a out-of-plane momentum (\(|p_\mathrm{out}^{{t\bar{t}}}|\)), b azimuthal angle (\(\Delta \phi ^{{t\bar{t}}}\)), and c scalar sum of the transverse momenta of the hadronic and leptonic top quarks (\(H_\mathrm{T}^{{t\bar{t}}}\)) of the \({t\bar{t}}\) system. The grey bands indicate the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 15

Full phase-space normalized differential cross-section as a function of the a transverse momentum (\(p_{\mathrm {T}}^{t}\)) and b absolute value of the rapidity of the top quark (\(|y^{t}|\)) compared to higher-order theoretical calculations. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 16

Full phase-space normalized differential cross-section as a function of the a transverse momentum (\(p_{\mathrm {T}}^{t}\)) and b absolute value of the rapidity of the top quark (\(|y^{t}|\)) compared to NNLO theoretical calculations [85] using the MSTW2008nnlo PDF set. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 17

Full phase-space normalized differential cross-section as a function of the a invariant mass (\(m^{t\bar{t}}\)) and b transverse momentum (\(p_{\mathrm {T}}^{t\bar{t}}\)) of the \(t\bar{t}\) system compared to higher-order theoretical calculations. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Fig. 18

Full phase-space normalized differential cross-section as a function of the a invariant mass (\(m^{t\bar{t}}\)) and b absolute value of the rapidity (\(|y^{t\bar{t}}|\)) of the \(t\bar{t}\) system compared to NNLO theoretical calculations [85] using the MSTW2008nnlo PDF set. The grey band indicates the total uncertainty on the data in each bin. The Powheg+Pythia generator with \(h_\mathrm{damp}\!=\!m_{t}\) and the CT10nlo PDF is used as the nominal prediction to correct for detector effects

Table 3 Comparison between the measured fiducial phase-space normalized differential cross-sections and the predictions from several MC generators. For each variable and prediction a \(\chi ^2\) and a p-value are calculated using the covariance matrix of each measured spectrum. The number of degrees of freedom (NDF) is equal to \(N_\mathrm{b}-1\) where \(N_\mathrm{b}\) is the number of bins in the distribution
Table 4 Comparison between the measured full phase-space normalized differential cross-sections and the predictions from several MC generators. For each variable and prediction a \(\chi ^2\) and a p-value are calculated using the covariance matrix of each measured spectrum. The number of degrees of freedom (NDF) is equal to \(N_\mathrm{b}-1\) where \(N_\mathrm{b}\) is the number of bins in the distribution
Table 5 Comparison between the measured fiducial phase-space normalized differential cross-sections and the predictions from new PDF sets using the MC@NLO+Herwig generator. For each variable and prediction a \(\chi ^2\) and a p-value are calculated using the covariance matrix of each measured spectrum. The number of degrees of freedom (NDF) is equal to \(N_\mathrm{b}-1\) where \(N_\mathrm{b}\) is the number of bins in the distribution
Table 6 Comparison between the measured full phase-space normalized differential cross-sections and higher-order QCD calculations. For each variable and prediction a \(\chi ^2\) and a p-value are calculated using the covariance matrix of each measured spectrum. The number of degrees of freedom (NDF) is equal to \(N_\mathrm{b}-1\) where \(N_\mathrm{b}\) is the number of bins in the distribution
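
The \(\chi ^2\) and p-value evaluation described in the captions of Tables 3–6 can be illustrated with a minimal sketch. It assumes the usual treatment of a normalized spectrum, in which one bin is discarded so that \(N_\mathrm{b}-1\) elements of the difference vector and the corresponding sub-matrix of the covariance enter \(\chi ^2 = V^\mathrm{T}\,\mathrm{Cov}^{-1}\,V\). The choice of dropping the last bin, the function names and the pure-Python linear solver are assumptions made for illustration and do not reproduce the analysis code.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's inputs are left untouched.
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("covariance sub-matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def chi2_survival(chi2, ndf):
    """P(X >= chi2) for a chi-square variable with integer ndf >= 1."""
    z = chi2 / 2.0
    if ndf % 2 == 0:
        a, q = 1.0, math.exp(-z)               # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # Q(1/2, z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < ndf / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_test(data, prediction, covariance):
    """Return (chi2, ndf, p_value) using the first N_b - 1 bins."""
    n_keep = len(data) - 1  # assumption: the last bin is the one discarded
    diff = [data[i] - prediction[i] for i in range(n_keep)]
    sub_cov = [row[:n_keep] for row in covariance[:n_keep]]
    weighted = solve_linear(sub_cov, diff)     # Cov^{-1} V without inverting
    chi2 = sum(d * w for d, w in zip(diff, weighted))
    return chi2, n_keep, chi2_survival(chi2, n_keep)


# Illustrative three-bin spectrum with made-up numbers, not measured values.
chi2, ndf, p_value = chi2_test(
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]],
)
```

Solving the linear system rather than inverting the covariance matrix explicitly is a design choice of this sketch; for the small bin counts of a differential spectrum either approach would be adequate.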

11 Conclusions

Kinematic distributions of the top quarks in \(t\bar{t}\) events, selected in the lepton+jets channel, are measured in the fiducial and full phase space using data from 8  \({\mathrm {TeV}}\) proton–proton collisions collected by the ATLAS detector at the Large Hadron Collider, corresponding to an integrated luminosity of 20.3 fb\(^{-1}\). Normalized differential cross-sections are measured as a function of the hadronic top-quark transverse momentum and rapidity, and as a function of the mass, transverse momentum, and rapidity of the \(t\bar{t}\) system. In addition, a new set of observables describing the hard-scattering interaction (\(\chi ^{{t\bar{t}}}\), \(y_\mathrm{boost}^{{t\bar{t}}}\)) and sensitive to the emission of radiation along with the \({t\bar{t}}\) pair (\(\Delta \phi ^{{t\bar{t}}}\), \(|p_\mathrm{out}^{{t\bar{t}}}|\), \(H_\mathrm{T}^{{t\bar{t}}}\), \(R_{Wt}\)) are presented.

The measurements presented here exhibit, for most distributions and in a large part of the phase space, a precision of the order of 5 % or better and an overall agreement with the Monte Carlo predictions of the order of 10 %.

The \(|y^{t\bar{t}}|\) and \(y_\mathrm{boost}^{{t\bar{t}}}\) distributions are not well modelled by any generator under consideration in the fiducial phase space; however, the agreement improves when new parton distribution functions are used with the MC@NLO+Herwig generator.

All the generators under consideration consistently mis-model the ratio of the hadronic W boson and top-quark transverse momenta (\(R_{Wt}\)) by up to 10 % in the range 1.5–3.

The tail of the \(p_{\mathrm {T}}^{t,\mathrm{had}}\) distribution is harder in all predictions than what is observed in data, an effect previously observed in measurements by ATLAS and CMS. The agreement improves when using the Herwig parton shower with respect to Pythia. The tension observed for Powheg+Pythia, Powheg+Pythia8 and MadGraph+Pythia in the \(p_{\mathrm {T}}^{t}\) spectrum is reflected in the tail of the \(H_\mathrm{T}^{{t\bar{t}}}\) distribution.

Similarly, both the aN\(^3\)LO and aNNLO predictions agree poorly with the data in the \(p_{\mathrm {T}}^{t}\) spectrum in the full phase space. However, the full NNLO calculation, which has just become available, is in good agreement with the \(p_{\mathrm {T}}^{t}\) distribution, indicating that the disagreement seen with the generators and other calculations is due to missing higher-order terms. The NNLO calculation also shows good agreement in the \(|y^{t}|\) and \(m^{t\bar{t}}\) distributions.