Measurement of four-jet differential cross sections in $\sqrt{s}=8$ TeV proton-proton collisions using the ATLAS detector

Differential cross sections for the production of at least four jets have been measured in proton-proton collisions at $\sqrt{s} = 8$ TeV at the Large Hadron Collider using the ATLAS detector. Events are selected if the four anti-$k_{t}$ R=0.4 jets with the largest transverse momentum ($p_{T}$) within the rapidity range $|y|<2.8$ are well separated ($dR^{\rm min}_{4j}>0.65$), all have $p_{T}>64$ GeV, and include at least one jet with $p_{T}>100$ GeV. The dataset corresponds to an integrated luminosity of 20.3 $fb^{-1}$. The cross sections, corrected for detector effects, are compared to leading-order and next-to-leading-order calculations as a function of the jet momenta, invariant masses, minimum and maximum opening angles and other kinematic variables.


Introduction
The production of particle jets at hadron colliders such as the Large Hadron Collider (LHC) [1] provides a fertile testing ground for the theory describing strong interactions, Quantum Chromodynamics (QCD). In QCD, jet production is interpreted as the fragmentation of quarks and gluons produced in the scattering process followed by their subsequent hadronisation. At high transverse momenta (p T ) the scattering of partons can be calculated using perturbative QCD (pQCD) and experimental jet measurements are directly related to the scattering of quarks and gluons. The large cross sections for such processes allow for differential measurements in a wide kinematic range and stringent testing of the underlying theory.
This analysis studies events where at least four jets are produced in a hard-scatter process. These events are of particular interest as the corresponding Feynman diagrams require several vertices even at leadingorder (LO) in the strong coupling constant α S . The current state-of-the-art theoretical predictions for such processes are at next-to-leading-order in α S (next-to-leading-order perturbative QCD, NLO pQCD) [2,3], and they have recently been combined with parton shower (PS) simulations [4]. An alternative approach is taken by generators which provide a matrix element (ME) for the hardest 2 → 2 process while the rest of the jets are provided by a PS model, which implements a resummation of the leading-logarithmic terms (e.g. Pythia 8 [5] and Herwig++ [6]). It is also interesting to test multi-leg (i.e., 2 → n) LO pQCD generators (e.g. Sherpa [7] or MadGraph [8]), since they may provide adequate descriptions of the data in specific kinematic regions and have the advantage of being less computationally expensive than NLO calculations.
It is interesting to note that the previous ATLAS measurement of multi-jet production at √ s = 7 TeV [9] indicates that predictions may differ from data by ∼ 30% even at NLO [10]. This work explores a variety of kinematic regimes and topological distributions to test the validity of QCD calculations, including the PS approximation and the necessity of higher-order ME in Monte Carlo (MC) generators.
Additionally, four-jet events represent a background to many other processes at hadron colliders. Hence, the predictive power of the QCD calculations, in particular their ability to reproduce the shapes of the distributions studied in this analysis, is of general interest. While searches for new phenomena in multi-jet events use data-driven techniques to estimate the contribution from QCD events, as was done for example in ref. [11], these methods are tested in MC simulations. The accuracy of the theoretical predictions remains therefore important.
Three-jet events have been measured differentially by many experiments. Indeed it was observations of such events that heralded the discovery of the gluon [12][13][14][15]. More recently, at the LHC, ATLAS has measured the three-jet cross section differentially [16] and CMS has used the ratio of three to two jet events to measure α S [17]. Event shape variables have also been measured, showing sensitivity to higher-order pQCD effects [18,19]. Multi-jet cross sections have been measured previously at CMS [20], ATLAS [9], CDF [21,22] and D0 [23,24], although with smaller datasets and/or lower energy, and generally focussed on different observables. This paper presents the differential cross sections for events with at least four jets, studied as a function of a variety of kinematic and topological variables which include momenta, masses and angles. Events are selected if the four anti-k t R = 0.4 jets with the largest transverse momentum within the rapidity range |y| < 2.8 are well separated, all have p T > 64 GeV, and include at least one jet with p T > 100 GeV. The measurements are corrected for detector effects. The variables are binned in the leading jet p T and the total invariant mass, such that different regimes and configurations can be tested. The measurements are sensitive to the various mass scales in an event, the presence of forward jets, or the azimuthal configuration of the jets -that is, one jet recoiling against three, or two recoiling against two.
The structure of the paper is as follows. Section 2 introduces the ATLAS detector. The observables and phase space of interest are defined in section 3. The MC simulation samples studied in this work are summarised in section 4, while the theory predictions and their uncertainties are described in section 5. The trigger, jet calibration and data cleaning are presented in section 6. The unfolding of detector effects is detailed in section 7. Section 8 provides the experimental uncertainties included in the final distributions. Finally, the results are shown in section 9 and the conclusions are drawn in section 10.

The ATLAS detector
The ATLAS experiment [25] is a multi-purpose particle physics detector with a forward-backward symmetric cylindrical geometry and nearly 4π coverage in solid angle, with instrumentation up to |η| = 4.9. 1 The layout of the detector is based on four superconducting magnet systems, which comprise a thin solenoid surrounding the inner tracking detectors (ID) and a barrel and two end-cap toroids generating the magnetic field for a large muon spectrometer. The calorimeters are located between the ID and the muon system. The lead/liquid-argon (LAr) electromagnetic (EM) calorimeter is split into two regions: the barrel (|η| < 1.475) and the end-cap (1.375 < |η| < 3.2). The hadronic calorimeter is divided into four regions: the barrel (|η| < 0.8) and the extended barrel (0.8 < |η| < 1.7) made of scintillator/steel, the end-cap (1.5 < |η| < 3.2) with LAr/copper modules, and the forward calorimeter (3.1 < |η| < 4.9) composed of LAr/copper and LAr/tungsten modules.
A three-level trigger system [26] is used to select events for further analysis. The first level (L1) of the trigger reduces the event rate to less than 75 kHz using hardware-based trigger algorithms acting on a subset of detector information. The second level (L2) uses fast online algorithms, while the final trigger stage, called the Event Filter (EF), uses reconstruction software with algorithms similar to the offline versions. The last two software-based trigger levels, referred to collectively as the High-Level Trigger (HLT), further reduce the event rate to about 400 Hz.

Cross-section definition
This measurement uses jets reconstructed with the anti-k t algorithm [27] with four-momentum recombination as implemented in the FastJet package [28]. The radius parameter is R = 0.4.
Cross sections are calculated for events with at least four jets within the rapidity range |y| < 2.8. Out of those four jets, the leading one must have p T > 100 GeV, while the next three must have p T > 64 GeV. In addition, these four jets must be well separated from one another by ∆R min 4j > 0.65, where ∆R min 4j = min i, j∈ [1,4] i j (∆R i j ), and ∆R i j = (|y i − y j | 2 + |φ i − φ j | 2 ) 1/2 . This set of criteria is also referred to as the 'inclusive analysis cuts' to differentiate them from the cases where additional requirements are made, for example on the invariant mass of the four leading jets. The inclusive analysis cuts are mainly motivated by the triggers used to select events, described in section 6.1.
Cross sections are measured differentially as a function of the kinematic variables defined in table 1; the list includes momentum variables, mass variables and angular variables. The only jets used in all cases are the four leading ones in p T . The observables were selected for their sensitivity to differences between different Monte Carlo models of QCD processes and their ability to describe the dynamics of the events. For example, the H T variable is often used to set the scale of multi-jet processes. The four-jet invariant mass m 4j is representative of the largest energy scale in the event whereas m min 2j , the minimum dijet invariant mass, probes the smallest jet-splitting scale. The ratio m min 2j /m 4j therefore provides information about the range of energy scales relevant to the QCD calculation. The ∆φ min 2j and ∆y min 2j variables quantify the minimum angular separation between any two jets. The azimuthal variable ∆φ min 3j distinguishes events with pairs of nearby jets (which have large ∆φ min 3j ) from the recoil of three jets against one (leading to small ∆φ min 3j values). The rapidity variable ∆y min 3j works in a similar way. The ∆y max 2j and Σp central T variables are designed to be sensitive to events with forward jets. In order to build Σp central T , first the two jets with the largest rapidity interval in the event are identified, and then the scalar sum of the p T of the remaining two jets is calculated. the transverse plane, φ being the azimuthal angle around the beam pipe. The rapidity y is defined by 1 2 ln E+pz E−pz , the pseudorapidity in terms of the polar angle θ as η = − ln tan(θ/2).
Invariant mass of the four jets m min 2j /m 4j min i, j∈ [1,4] i j / m 4j Minimum invariant mass of two jets relative to invariant mass of four jets ∆φ min 2j min i, j∈ [1,4] i j |φ i − φ j | Minimum azimuthal separation of two jets ∆y min 2j min i, j∈ [1,4] i j |y i − y j | Minimum rapidity separation of two jets ∆φ min 3j min i, j,k∈ [1,4] i j k |φ i − φ j | + |φ j − φ k | Minimum azimuthal separation between any three jets ∆y min 3j min i, j,k∈ [1,4] i j k |y i − y j | + |y j − y k | Minimum rapidity separation between any three jets ∆y max 2j ∆y max i j = max i, j∈ [1,4] |y i − y j | Maximum rapidity difference between two jets Σp central T |p c T | + |p d T | If ∆y max 2j is defined by jets a and b, this is the scalar sum of the p T of the other two jets, c and d ('central' jets) Table 1: Definitions of the various kinematic variables measured. Only the four jets with the largest p T are considered in all cases. Different phase-space regions are probed by binning the variables in regions defined by a lower bound on p (1) T and m 4j . This allows one to distinguish between the two types of topologies characterised by ∆φ min 3j , or to track the position of the leading jet with respect to the forward-backward pair in the Σp central T variables. Table 2 summarises all the phase-space regions considered in the analysis for each of the variables. The resulting differential cross-section distributions are corrected for detector effects (unfolding) and taken to the so-called particle-jet level, or simply 'particle level'. In the MC simulations used in the unfolding procedure, particle jets are built from particles with a proper lifetime τ satisfying cτ > 10 mm, including muons and neutrinos from hadron decays. The event selection described above is applied to particle jets to define the phase space of the unfolded results.
Double parton interactions have not been investigated independently, so the measurement is inclusive in this respect. They are expected to contribute 1% or less to the results.

Monte Carlo samples
Monte Carlo samples are used to estimate experimental systematic uncertainties, deconvolve detector effects, and provide predictions to be compared with the data. Leading-order Monte Carlo samples are used for all three purposes. A set of theoretical calculations at higher orders, described in section 5, are also compared to the data. The full list of generators is shown in table 3.
The samples used in the experimental studies comprise two LO 2 → 2 generators, Pythia 8.160 [5] and Herwig++ 2.5.2 [6], and the LO multi-leg generator MadGraph5 v1.5.12 [8]. As described in the introduction, LO generators are still widely used in searches for new physics, which motivates the comparison of their predictions to the data.

Theoretical predictions
The results of the measurement are compared to NLO predictions, in addition to the LO samples described in section 4. These are calculated using BlackHat/Sherpa [2,3] and NJet/Sherpa [39,40], and have been provided by their authors. They are both fixed-order calculations with no PS and no hadronisation. Therefore, the results are presented at the parton-jet level, that is, using jets built from partons instead of hadrons. For the high-p T phase space covered in this analysis, non-perturbative corrections are expected to be small [41,42]. BlackHat performs one-loop virtual corrections using the unitarity method and onshell recursion. The remaining terms of the full NLO computation are obtained with AMEGIC++ [43,44], part of Sherpa. NJet makes a numerical evaluation of the one-loop virtual corrections to multi-jet production in massless QCD. The Born matrix elements are evaluated with the Comix generator [45,46] within Sherpa. Sherpa also performs the phase-space integration and infra-red subtraction via the Catani-Seymour dipole formalism. Both the BlackHat/Sherpa and NJet/Sherpa predictions use the CT10 PDFs.
The results are also compared to predictions provided by HEJ [47][48][49]. HEJ is a fully exclusive Monte Carlo event generator based on a perturbative cross-section calculation which approximates the hardscattering ME to all orders in the strong coupling constant α S for jet multiplicities of two or greater. The approximation is exact in the limit of large separation in rapidity between partons. The calculation uses the CT10 PDFs. As in the case of the NLO predictions, no PS or hadronisation are included.
The different predictions tested are expected to display various levels of agreement in different kinematic configurations. The generators which combine 2 → 2 parton matrix elements (MEs) with parton showers (PSs) are in principle not expected to provide a good description of the data, particularly in regions where the additional jets are neither soft nor collinear. A previous measurement of multi-jet cross sections at 7 TeV by the ATLAS Collaboration [9] found that the cross section predicted by MC models typically disagreed with the data by O(40%). It also found disagreements of up to 50% in the shape of the differential cross section measured as a function of p (1) T or H T . Nevertheless, there are also examples of exceptional cases where these calculations perform well, which adds interest to the measurement; for example, the same 7 TeV ATLAS paper observed that the shape of the p (4) T distribution was described by Pythia within just 10%. It is also interesting to test whether PSs based on an angular ordering perform better in angular variables such as ∆φ min 2j or ∆φ min 3j than those using momentum ordering. In contrast to PS predictions, multi-leg matrix element calculations matched to parton showers (ME+PS) were seen at 7 TeV to significantly improve the accuracy of the cross-section calculation and the shapes of the momentum observables. In the present analysis, such calculations are expected to perform better in events with additional high-p T jets and/or large combined invariant masses of jets. This is also the type of scenario where HEJ is expected to perform well, since it provides an all-order description of processes with more than two hard jets, and it is designed to capture the hard, wide-angle emissions which a standalone PS approach would miss. Variables such as ∆y max 2j , ∆y min 3j or Σp central T were included in the analysis with this purpose in mind. Finally, the fixed-order, four-jet NLO predictions are expected to provide a better estimation of the cross sections than the LO calculations. Interestingly, studies at 7 TeV found that the NLO cross section for four-jet events was ∼ 30% higher than the data [10].

Normalisation
To facilitate comparison with the data, the cross sections predicted by the LO generators as well as HEJ are multiplied by a scale factor. The factor is such that the integrated number of events in the region 500 GeV < p (1) T < 1.5 TeV which satisfy the inclusive analysis cuts in section 3 is equal to the corresponding number in data. The full set of normalisation factors is shown in table 3. No scale factor is ascribed to BlackHat/Sherpa and NJet/Sherpa such that the level of agreement with data can be assessed in light of the theoretical uncertainties, as discussed in section section 5.2.

Theoretical uncertainties
Theoretical uncertainties have been computed for HEJ and the NLO predictions. The sensitivity of the HEJ calculation to higher-order corrections was determined by the authors of the calculation by varying independently the renormalisation and factorisation scales by factors of √ 2, 2, 1/ √ 2 and 1/2 around the central value of H T /2. The total uncertainty is the result of taking the envelope of all the variations. The typical size of the uncertainty is +50% −30% , and it is not drawn on the figures for clarity. The central value of the renormalisation and factorisation scales used in the NJet/Sherpa and Black-Hat/Sherpa samples is also H T /2. Scale uncertainties are evaluated for NJet/Sherpa by simultaneously varying both scales by factors of 1/2 and 2. PDF uncertainties are obtained by reweighting the distributions for all the PDF error sets using LHAPDF [50], following the recommendations from ref. [51]. The additional PDF sets include variations in the value of α S . The sum in quadrature of the resulting scale and PDF variations defines the NLO theoretical uncertainty included in the result figures in section 9. The uncertainty is dominated by the scale component due to the rapid drop of the cross section with decreasing values of the renormalisation and factorisation scales. As a result, the uncertainty is significantly asymmetric.

Data selection and calibration
The data sample used was taken during the period from March to December 2012 with the LHC operating at a pp centre-of-mass energy of √ s = 8 TeV. The application of data-quality requirements results in an integrated luminosity of 20.3 fb −1 .  Figure 1: Schematic of the kinematic regions in which the four different jet triggers are used, including the total luminosity that each of them recorded. The term 4j45 (4j65) refers to a trigger requiring at least four jets with p T > 45 GeV (65 GeV), where the p T is measured at the EF level of the triggering system. The term j280 (j360) refers to a trigger requiring at least one jet with p T > 280 GeV (360 GeV) at the EF level. The horizontal and vertical axes correspond to p (1) T and p (4) T respectively, both calculated at the offline level (i.e., including the full object calibration).

Trigger
The events used in this analysis are selected by a combination of four jet triggers, consisting of the three usual levels and defined in terms of the jets produced in the event. The hardware-based L1 trigger provides a fast decision based on the energy measured by the calorimeter. The L2 trigger performs a simple jet reconstruction procedure in the geometric regions identified by the L1 trigger. The final decision taken by the EF trigger is made using jets from the region of |η| < 3.2, and reconstructed from topological clusters [52] using the anti-k t algorithm with R = 0.4.
The four different triggers used in this paper are shown in figure 1. Two of the triggers select events with at least four jets, while the remaining two select events with at least one jet at a higher p T threshold. Events are split into the four non-overlapping kinematic regions shown in figure 1, requiring at least four well-separated jets with varying p T thresholds in order to apply the corresponding trigger. This ensures trigger efficiencies greater than 99% for any event passing the inclusive analysis cuts. The small residual loss of data due to trigger inefficiency is corrected as a function of jet p T using the techniques described in section 7.
As noted in figure 1, three out of the four triggers only recorded a fraction of the total dataset. The contributions from the events selected by those three triggers are scaled by the inverse of the corresponding fraction.

Jet reconstruction and calibration
Jets are reconstructed using the anti-k t jet algorithm [27] with four-momentum recombination and radius parameter R = 0.4. The inputs to the jet algorithm are locally-calibrated topological clusters of calori-meter cells [52], which reconstruct the three-dimensional shower topology of each particle entering the calorimeter.
ATLAS has developed several jet calibration schemes [53] with different levels of complexity and different sensitivities to systematic effects. In this analysis the local cluster weighting (LCW) calibration [52] method is used, which classifies topological clusters as either being of electromagnetic or hadronic origin. Based on this classification, specific energy corrections are applied, improving the jet energy resolution. The final jet energy calibration, generally referred to as the jet energy scale, corrects the average calorimeter response to reproduce the energy of the true particle jet.
The jet energy scale and resolution have been measured in pp collision data using techniques described in references [54][55][56]. The effects of pile-up on jet energies are accounted for by a jet-area-based correction [57] prior to the final calibration, where the area of the jet is defined in η-φ space. Jets are then calibrated to the hadronic energy scale using p T -and η-dependent calibration factors based on MC simulations, and their response is corrected based on several observables that are sensitive to fragmentation effects. A residual calibration is applied to take into account differences between data and MC simulation based on a combination of several in-situ techniques [54].

Data quality criteria
Before applying the selection that defines the kinematic region of interest, events are required to pass the trigger, as described in section 6.1, and to contain a primary vertex with at least two tracks. Events which contain energy deposits in the calorimeter consistent with noise, or with incomplete event data, are rejected. In addition, events containing jets pointing to problematic calorimeter regions, or originating from non-collision background, cosmic rays or detector effects, are vetoed. These cleaning procedures are emulated in the MC simulation used to correct for experimental effects, as is discussed in detail in section 7.
No attempt is made to exclude jets that result from photons or leptons impacting the calorimeter, nor are the contributions from such signatures corrected for. Events containing photons or τ leptons are expected to contribute less than 0.1% to the cross sections under study.
Distributions of two example variables (p (1) T and p (4) T ) can be seen at the detector level (i.e. prior to unfolding detector effects) in figure 2. Different sets of points correspond to the data and the different MC generators, which are normalised to data with the scale factors indicated in table 3. These are constant factors used to facilitate the comparison with data, as described in section 5.1. Given that the generators have only LO or even only leading-logarithmic accuracy, the observed agreement is reasonable.

Data unfolding
Cross sections are measured differentially in several variables, each of which is binned in p (1) T or m 4j . Each of the corresponding distributions is individually unfolded to deconvolve detector effects such as inefficiencies and resolutions. The unfolding is performed using the Bayesian Iterative method [58, 59], as implemented in the RooUnfold package [60]. The algorithm builds an unfolding matrix starting with an initial prior probability distribution taken from MC simulation, and improves it iteratively. The method takes into account migrations between bins. It also corrects the results for the presence of events which pass the selection at reconstructed-level but not at the particle level; and for detector inefficiencies, which have the opposite effect. The number of iterations is optimised in order to minimise the size of the statistical and systematic uncertainties. A lower number of iterations results in a higher dependence on the MC simulation, whereas higher values give larger statistical uncertainties. For the analysis presented in this paper, two iterations are used.
The data are unfolded to the particle-jet level using the Pythia MC simulation to build the unfolding matrix. In order to construct the matrix, events are required to pass the inclusive analysis cuts at both the reconstructed and particle levels. The cuts require that events have at least four jets within |y| < 2.8, with p (1) T >100 GeV and p (2) T , p (3) T , p (4) T > 64 GeV. The four leading jets must in addition be separated by ∆R min 4j > 0.65. For observables requiring additional kinematic cuts, these are also applied both at the reconstructed and particle levels. No spatial matching is performed between reconstructed-level and particle-level jets.
The correlation between the observables before and after the incorporation of experimental effects tends to be higher for p T -based variables, such as H T . In the case of angular variables, such as ∆φ min 2j , the correlation is weakened due to cases where energy resolution effects lead to re-ordering of the jet p T . Nevertheless, even in the case of such angular variables the entries far from the diagonal of the correlation matrix are significantly smaller than the diagonal elements. The binning is derived from an optimisation procedure such that the purity of the bins is between 70% and 90%, and the statistical uncertainty of the measurement is 10%. The purity is defined as the fractional number of events per bin which do not migrate to other bins after the detector simulation, calculated with respect to the number of events which pass the particle-level cuts.
The possible presence of biases in the unfolded spectra due to MC mismodelling of the reconstructed-level spectrum is evaluated using a data-driven closure test. In this study, the MC distributions are reweighted to match the shape of those obtained from the data, and then unfolded using the same unfolding matrix as for the data. A data-driven systematic uncertainty is computed by comparing the result obtained from this procedure and the original reweighted particle-level MC distributions. With two iterations of the unfolding algorithm, this systematic uncertainty is found to be negligible.
A second unfolding uncertainty is evaluated to account for the model dependence of the efficiency with which both the reconstructed-and particle-level cuts are satisfied in each MC event. The systematic uncertainty is derived from the differences between the efficiencies calculated with Herwig++ and those calculated using Pythia. The resulting uncertainty is found to be subdominant in most cases, with typical sizes of 2-10%. The uncertainty is rebinned and smoothed, such that its statistical uncertainty is smaller than 40%.
The statistical uncertainties are calculated with pseudo-experiments. For each pseudo-experiment, the data and MC distributions are reweighted event by event following a Poisson distribution centred at one. Each resulting Poisson replica of the data is unfolded using the corresponding fluctuated unfolding matrix. The random numbers for the pseudo-experiments are generated using unique seeds, following the same scheme used by the inclusive jet [42], dijet [61] and three-jet [16] measurements at √ s = 7 TeV, to allow for possible future combination of results with the same dataset used for this analysis.
The integral of the unfolded distributions, corresponding to the cross section in the fiducial range determined by the inclusive analysis cuts, was compared for all the variables defined in the same region of phase space and found to agree with each other within 0.5%.

Experimental uncertainties
Several sources of experimental uncertainty are considered in this analysis. Those arising from the unfolding procedure are described in section 7. This section presents the uncertainties which arise from the jet energy scale (JES), jet energy resolution (JER), jet angular resolution and integrated luminosity. The dominant source of uncertainty in this measurement is the JES.
The uncertainty in the JES calibration is determined in the central detector region by exploiting the transverse momentum balance in Z+jet, γ+jet or multi-jet events, which are measured in situ. The uncertainties in the energy of the reference object are propagated to the jet whose energy scale is being probed. The uncertainty in the central region is propagated to the forward region using dijet systems balanced in transverse momentum. The procedure is described in detail in ref. [54].
The total JES uncertainty is decomposed into eighteen components, which account for the uncertainty in the jet energy scale calibration itself, as well as uncertainties due to the pile-up subtraction procedure, parton flavour differences between samples, b-jet energy scale and punch-through. Each of these uncertainties is incorporated as a coherent shift of the scale of the jets in the MC simulation. The energies and transverse momenta of all jets with p T > 20 GeV and |y| < 2.8 are varied up and down by one standard deviation of each uncertainty component; these components are asymmetric, i.e. the values of the upwards and downwards variations are different. The shifts are then propagated through the unfolding. The unfolded distributions corresponding to the systematically varied spectra are compared one by one to the nominal ones, and the difference taken as the unfolded-level uncertainty due to that JES uncertainty component. The total JES uncertainty is obtained by summing all such contributions quadratically, respecting the sign of the variations in the event yields; that is, positive and negative event yield variations are added independently.
Statistical uncertainties on each of the JES uncertainty components are obtained by creating Poisson replicas of the systematically varied spectra, obtained as explained in section 7. Such statistical uncertainties are used to evaluate the significance of the uncertainty for each component and for each bin of all the differential distributions. As in the case of the unfolding uncertainty, the unfolded-level uncertainty due to each JES component is then rebinned and smoothed using a Gaussian kernel regression in order to get statistical uncertainties smaller than 40% in all bins. The typical size of the JES uncertainty is 4-15%.
Jets may be affected by additional energy originating from pile-up interactions. This effect is corrected for as part of the jet energy calibration. The distributions were binned in different ranges of the average number of interactions per bunch crossing in order to test the possible presence of residual effects. No significant deviations were observed, therefore no uncertainty associated with pile-up mismodelling was considered beyond the pile-up uncertainty already included in the jet calibration procedure.
The JER has been measured in data using dijet events [62], and an uncertainty was derived from the differences seen between data and MC prediction. In general, the energy resolution observed in data is somewhat worse than that in MC simulations. The uncertainty on the observables can therefore be evaluated by smearing the energy of the reconstructed jets in the MC simulation. After applying this smearing to the jets, an alternative unfolding matrix is derived and used to unfold the nominal MC prediction. Then the MC distribution is unfolded using both the nominal and the smeared matrices, and the difference between the two is symmetrised and taken as the JER systematic uncertainty. The typical size of this uncertainty is 1-10% of the cross section.
The jet angular resolution was estimated in MC simulation for the pseudorapidity and φ by matching spatially jets at the reconstructed and particle level, and found to be ∼ < 2%. This is in agreement with in-situ measurements, so no systematic uncertainty is assigned.
Finally, the uncertainty on the integrated luminosity is ±2.8%. It is derived following the same methodology as that detailed in ref. [63].
Two examples of the values of the total experimental systematic uncertainty are shown in figure 3 for two representative variables, namely H T and ∆φ min 2j . The jet energy scale and resolution uncertainties dominate in the majority of bins, being larger at the high and low ends of the H T spectrum. The unfolding uncertainty is nearly as large at low values of the jet momenta, and it is therefore an important contribution in most of the ∆φ min 2j bins.

Results
The various differential cross sections measured in events with at least four jets are shown in figures 4 to 19 for jets reconstructed with the anti-k t algorithm with R = 0.4. The observables used for the measurements are defined in table 1. The measurements are performed for a wide range of jet transverse momenta from 64 GeV to several TeV, spanning two orders of magnitude in p T and over seven orders of magnitude in cross section. The measured cross sections are corrected for all detector effects using the unfolding procedure described in section 7. The theoretical predictions described in sections 4 and 5 are compared to the unfolded results. T > 64 GeV, p (1) T > 100 GeV and ∆R min 4j > 0.65. Separate bands show the jet energy scale (JES) and resolution (JER), and the unfolding uncertainty, as well as the combined total systematic uncertainty resulting from adding in quadrature all the components. The total statistical uncertainty of the unfolded data spectrum is also shown. The luminosity uncertainty is not shown separately but is included in the total uncertainty band.

Summary of the results
The scale factors applied to LO generators (see section 5.1) are found to vary between 0.6 and 1.4, as shown previously in table 3. Not all generators describe the shape of p (1) T correctly, so these scale factors should not be seen as a measure of the level of agreement between MC simulation and data, which may vary as a function of the cuts in p (1) T and m 4j . The cross section predicted by BlackHat/Sherpa and NJet/Sherpa is larger than that measured in data, but overall the difference is covered by the scale and PDF uncertainties evaluated using NJet/Sherpa, with only a few exceptions. BlackHat/Sherpa and NJet/Sherpa give identical results within statistical uncertainties; therefore only one of the two (NJet/Sherpa) is discussed in the following, for simplicity. It is nevertheless interesting to compare experimental results with two different implementations of the same NLO pQCD calculations as an additional cross-check.
In general, an excellent description of both the shape and the normalisation of the variables is given by NJet/Sherpa. The small differences found are covered by theoretical and statistical uncertainties in almost all cases; only the tails of p (4) T and ∆y max 2j hint at deviations from the measured distribution. Mad-Graph+Pythia describes the data very well in most regions of phase space, the most significant discrepancy being in the slopes of p (1) T and p (2) T and derived variables. HEJ also provides a good description of most variables; the most significant discrepancy occurs for the angular variables ∆y min 2j and ∆y max 2j when p (1) T is small. However when p (1) T is large, HEJ describes ∆y max 2j better than NJet/Sherpa, which highlights one of the strengths of this calculation. The 2 → 2 ME calculations matched to parton showers provide different levels of agreement depending on the variable studied; the only variable whose shape is reasonably well described by both Pythia and Herwig++ is H T .
The following discussion is based on the results obtained after applying the particular choice of normalisation of the theoretical predictions as explained at the beginning of this section. NJet/Sherpa, which generally gives very good agreement with the data, is only discussed for those cases where some deviations are present.

Momentum variables
The momentum variables comprise the p T of the four leading jets and H T . Part of the importance of these variables lies in their wide use in analyses, alone or as inputs to more complex observables. They are also interesting in themselves: it has been shown that the ratio of the NLO to the LO predictions is relatively flat across the p (1) T spectrum with a maximum variation of approximately 25% [10]. Perhaps surprisingly, the PS description of p (4) T was found to be better than that of p (1) T in the 7 TeV multi-jet measurement published by ATLAS [9]. ). The ratios of Herwig++ and HEJ to data are remarkably flat above ∼ 500 GeV and ∼ 300 GeV respectively. MadGraph+Pythia is within the experimental uncertainties above ∼ 300 GeV, and it is the only one with a positive slope in the ratio to data.
The subleading jet p T (figure 5) is well described by HEJ, while the LO generators show similar trends to those in p (1) T . MadGraph+Pythia describes both p (3) T and p (4) T well, as shown in figures 6 and 7. As the 7 TeV results suggested, Pythia gives a good description of the distribution of p (4) T . HEJ and Herwig++ overestimate the number of events with high p (4) T . NJet/Sherpa shows a similar trend at high p (4) T , but the discrepancy is mostly covered by the theoretical uncertainties. H T , shown in figure 8, exhibits features similar to those in p (1) T . In summary, Pythia and Herwig++ tend to describe the p T spectrum of the leading jets with similar levels of agreement, whereas Pythia is better at describing p (4) T . MadGraph+Pythia does a reasonable job for all of them, while HEJ and NJet/Sherpa are very good for the leading jets and less so for p (4) T . This could perhaps be improved by matching the calculations to PSs.
Mass variables Mass variables are widely used in physics searches, and they are also sensitive to events with large separations between jets, which puts the HEJ and MadGraph+Pythia predictions to the test, as they are expected to be especially accurate in this regime.
The distribution of the total invariant mass m 4j is studied in figure 9. Pythia and MadGraph+Pythia describe the data very well. Herwig++ describes the shape of the data between 1 TeV and 3-6 TeV. HEJ is mostly compatible with the measurement, but the ratio to data has a bump structure in the region of approximately 1 to 2 TeV. This feature is also shared by NJet/Sherpa, but the differences with respect to the data are covered by the NLO uncertainties.
The description of different splitting scales is tested in figure 10 through the variable m min 2j /m 4j . This distribution is well described by Pythia, whereas Herwig++ gets worse with increasing m 4j , consistently overestimating the two ends of the m min 2j /m 4j spectrum. MadGraph+Pythia provides a very good description, with a flat ratio for all the m 4j cuts. The HEJ prediction shows trends similar to those of Herwig++ at higher values of m 4j . These differences are covered in all cases by the large associated scale uncertainty. NJet/Sherpa overestimates the number of events in the very first bin, possibly due to the lack of a PS, but otherwise agrees with the data within the theoretical uncertainties.
Overall, MadGraph+Pythia provides the best description of mass variables.
Angular variables Similarly to mass variables, angular variables are able to test the description of events with small-and wide-angle radiation. In addition, they can also provide information on the global spatial distribution of the jets. High-p T , large-angle radiation should be well captured by the ME+PS description of MadGraph+Pythia, or the all-orders approximation of HEJ -particularly the rapidity variables ∆y min 2j , ∆y max 2j and ∆y min 3j . PS generators are expected to perform poorly at large angles, given that they only contain two hard jets, and the rest is left to the soft-and collinear-enhanced PS. The fixed-order NLO prediction of NJet/Sherpa should provide a very good description of these variables too, as long as they are far from the infrared limit. This is indeed the case, and therefore no detailed comments about its performance are given here. Figure 11 compares the distributions of ∆φ min 2j for different cuts in p (1) T . Pythia has a small downwards slope with respect to the data in all the p (1) T ranges. MadGraph+Pythia also shows a small slope. The other generators, both LO and NLO, reproduce the data very well. Herwig++, in particular, provides a very good description of the data.
The ∆φ min 3j spectrum is shown in figure 12. The different p (1) T cuts change the spatial distribution of the events, such that at low p (1) T most events contain two jets recoiling against two, while at high p (1) T the events where one jet recoils against three dominate. In general, the description of the data improves as p (1) T increases. For Pythia, the number of events where one jet recoils against three (low ∆φ min 3j ) is significantly overestimated when p (1) T is low; as p (1) T increases, the agreement improves such that the p (1) T > 1000 GeV region is very well described. MadGraph+Pythia, Herwig++ and HEJ are mostly in good agreement with data. Figure 13 compares the distributions of ∆y min 2j with data. This variable is remarkably well described by Pythia, showing no significant trend. MadGraph+Pythia mostly underestimates high ∆y min 2j values, while Herwig++ has a tendency to underestimate the low values. HEJ overestimates the number of events with high ∆y min 2j values at low p (1) T , but describes the data very well at larger values of p (1) T .
For the variable ∆y min 3j , presented in figure 14, the predictions provided by Pythia and Herwig++ show in general a positive slope with respect to the data. MadGraph+Pythia reproduces the shape of the data well, as does HEJ for p (1) T > 400 GeV. However, for smaller values of p (1) T HEJ overestimates the number of events at the end of the spectrum, as was the case for ∆y min 2j .
The variable ∆y max 2j , shown in figure 15, is very well described by HEJ in events with p (1) T > 400 GeV. The ratios to data in both Pythia and Herwig++ have upwards slopes in all p (1) T bins. MadGraph+Pythia provides mostly a good description of the data, with a tendency to underestimate the extremes of the distribution. Interestingly, NJet/Sherpa seems to overestimate the number of events in the tail, although it is a statistically limited region and the comparison with BlackHat/Sherpa is not conclusive.
In summary: NJet/Sherpa mostly agrees with the data within the uncertainties, but its ratio to data has an upwards trend in the tail of ∆y max 2j . HEJ provides a very good description of all angular variables for the region p (1) T > 400 GeV, as expected, but shows significant discrepancies with respect to the data in all the rapidity variables for lower p (1) T values. It is important to keep in mind, though, that the associated scale uncertainties are large. MadGraph+Pythia describes all the data well, apart from the tail of ∆y min 2j and the extreme values of ∆y max 2j , which it underestimates. Herwig++ gives good descriptions of the φ variables, but fails at describing the rapidity variables. Pythia has some problems describing both threejet variables, as well as ∆y max 2j . These variables highlight the need to combine four-jet ME calculations with parton showers.
Σ p central T variables The variables setting a minimum forward-backward rapidity interval and measuring the total p T of the central jets (Σp central T ) were defined to test the framework of HEJ. HEJ has been designed to describe events with two jets significantly separated in rapidity with additional, central, high-p T radiation. These variables are also useful to describe the spatial configuration of the events, as they represent the forward-backward rapidity span of the jets, and whether the leading jet is among the two central ones or not. The NLO predictions and MadGraph+Pythia are also expected to be successful in this regime, whereas the 2 → 2 generators with PSs are expected to be less suitable.
The variable Σp central T is studied for values of ∆y max 2j larger than 1, 2, 3 or 4, and for different cuts in p (1) T . In most cases, the description of the observable worsens significantly with increasing ∆y max 2j and p (1) T . Figures 16 to 19 correspond to the results for ∆y max 2j > 1, 2, 3, 4. The generators with 2 → 2 MEs have problems describing the data around the threshold values where the contribution from different jets changes, which results in kinks in the ratio distributions. One such transition occurs at the Σp central T value for which the leading jet is first allowed to be central. For p (1) T > 400 GeV, this happens at Σp central T > 464 GeV, at which point there is a major jump in Pythia in the second ratio plot of figure 16. Pythia gives in general the most discrepant prediction, with kinks in the ratio to data at the transition points that reach differences of 70% at high p (1) T , as well as global slopes. Herwig++ describes the data very well at lower ∆y max 2j values, but as ∆y max 2j grows its normalisation worsens, as well as the shape -particularly at high p (1) T .
MadGraph+Pythia provides an excellent description of the Σp central T variables, especially at low p (1) T . The agreement deteriorates at high p (1) T , but it is not very much affected by the changes between different jet configurations, providing overall a very good description of the shapes. Most distributions are well described by HEJ, especially the high Σp central T region; the low Σp central T region shows more shape differences, which get worse at large ∆y max 2j . This is compatible with similar observations made in previous ATLAS measurements performed with 7 TeV data [64], where it was also shown that the agreement was significantly improved after interfacing HEJ with a PS generator. NJet/Sherpa has a tendency to overestimate the number of events with very low Σp central T , which may be correlated with the p (4) T discrepancy discussed earlier. It provides a very good description of the data otherwise.
Tables 4 to 49 in appendix A contain the numerical values of the measured differential cross sections and their corresponding uncertainties. The quoted values correspond to the average differential cross sections over the bin ranges given.            Figure 11: Unfolded four-jet differential cross section as a function of ∆φ min 2j , compared to different theoretical predictions. The other details are as for figure 10, but here the multiple ratio plots correspond to different selection criteria applied to p (1) T . Some points in the ratio curves for NJet/Sherpa fall outside the y-axis range, and thus the NLO uncertainty is shown partially, or not shown, in these particular bins.  Figure 12: Unfolded four-jet differential cross section as a function of ∆φ min 3j , compared to different theoretical predictions. The other details are as for figure 11. Some points in the ratio curves for NJet/Sherpa fall outside the y-axis range, and thus the NLO uncertainty is shown partially, or not shown, in these particular bins.  Figure 13: Unfolded four-jet differential cross section as a function of ∆y min 2j , compared to different theoretical predictions. The other details are as for figure 11. Some points in the ratio curves for NJet/Sherpa fall outside the y-axis range, and thus the NLO uncertainty is shown partially, or not shown, in these particular bins.  Figure 14: Unfolded four-jet differential cross section as a function of ∆y min 3j , compared to different theoretical predictions. The other details are as for figure 11. Some points in the ratio curves for NJet/Sherpa fall outside the y-axis range, and thus the NLO uncertainty is shown partially, or not shown, in these particular bins.  Figure 15: Unfolded four-jet differential cross section as a function of ∆y max 2j , compared to different theoretical predictions. The other details are as for figure 11. Some points in the ratio curves for NJet/Sherpa fall outside the y-axis range, and thus the NLO uncertainty is shown partially, or not shown, in these particular bins.     Figure 19: Unfolded four-jet differential cross section as a function of Σp central T with ∆y max 2j > 4, compared to different theoretical predictions. The other details are as for figure 11. Some points in the ratio curves for NJet/Sherpa fall outside the y-axis range, and thus the NLO uncertainty is shown partially, or not shown, in these particular bins.

Conclusion
This paper presents unfolded differential cross sections of events with at least four jets in pp collisions at 8 TeV centre-of-mass energy. The cross sections are studied as a function of a variety of kinematic and topological variables which include momenta, masses and angles. Events are selected if the four anti-k t R = 0.4 jets with the largest transverse momentum within the rapidity range |y| < 2.8 are well separated (∆R min 4j > 0.65), all have p T > 64 GeV, and include at least one jet with p T > 100 GeV. The results are obtained from the analysis of the full dataset collected by the ATLAS detector at the LHC in 2012, which corresponds to a total integrated luminosity of 20.3 fb −1 . The total experimental systematic uncertainty is typically of the order of 10%, and it is dominated by the jet energy scale calibration uncertainty.
The measurements are compared to NLO pQCD predictions provided by BlackHat/Sherpa and NJet/Sherpa, as well as the all-orders calculation provided by HEJ. Three leading-order calculations are also considered, including two 2 → 2 PS samples (Pythia and Herwig++) and a multi-leg calculation with up to four partons in the ME matched to a PS generated by Pythia (MadGraph+Pythia).
The LO cross sections and HEJ are normalised by fixed factors to facilitate the comparison of the spectra in the kinematic regions of interest; these factors vary between 0.6 and 1.4 for the different samples, where the MadGraph+Pythia and HEJ samples are the ones that need the smallest corrections. The NLO predictions, BlackHat/Sherpa and NJet/Sherpa, are almost always compatible with the data within their theoretical uncertainties, which are found to be large (O(30%) at low momenta) and asymmetric. Within the normalisation scheme used, MadGraph+Pythia also provides a good description of the data, as does HEJ, especially at high leading jet p T . The 2 → 2 PS calculations generally describe the data relatively poorly, although they are found to provide good predictions in some particular cases: Pythia gives a very good prediction of p (4) T and ∆y min 2j , while Herwig++ performs well in the azimuthal angle variables. Looking at the individual distributions of the differential cross section, the description of the jet momenta is compatible with previous measurements of the multi-jet cross sections. It should be noted that HEJ, NJet/Sherpa and BlackHat/Sherpa give a very good description of the distributions of the leading jets but show some discrepancy with the data for p (4) T . For variables that are particularly sensitive to wide-angle configurations and high-p T radiation, such as masses or angles, BlackHat/Sherpa, NJet/Sherpa and Mad-Graph+Pythia do a remarkable job overall. HEJ also provides a good description of the data, the main exception being that it disagrees with the rapidity measurements in events with low p (1) T . At high p (1) T the prediction is very good. These measurements expose the shortcomings of 2 → 2 parton ME+PS predictions in a variety of scenarios and highlight the importance of the more sophisticated calculations.                                                 [58] G. D'Agostini, A Multidimensional unfolding method based on Bayes' theorem, Nucl. Instrum. Meth. A362 (1995) 487-498.