Jet energy scale and resolution measured in proton-proton collisions at $\sqrt{s}=13$ TeV with the ATLAS detector

Jet energy scale and resolution measurements with their associated uncertainties are reported for jets using 36-81 fb$^{-1}$ of proton-proton collision data with a centre-of-mass energy of $\sqrt{s}=13$ TeV collected by the ATLAS detector at the LHC. Jets are reconstructed using two different input types: topo-clusters formed from energy deposits in calorimeter cells, as well as an algorithmic combination of charged-particle tracks with those topo-clusters, referred to as the ATLAS particle-flow reconstruction method. The anti-$k_t$ jet algorithm with radius parameter $R=0.4$ is the primary jet definition used for both jet types. Jets are initially calibrated using a sequence of simulation-based corrections. Next, several $\textit{in situ}$ techniques are employed to correct for differences between data and simulation and to measure the resolution of jets. The systematic uncertainties in the jet energy scale for central jets ($|\eta|<1.2$) vary from 1% for a wide range of high-$p_{\text{T}}$ jets ($250<p_{\text{T}}<2000$ GeV), to 5% at very low $p_{\text{T}}$ (20 GeV) and 3.5% at very high $p_{\text{T}}$ ($>2.5$ TeV). The relative jet energy resolution is measured and ranges from ($24 \pm 1.5$)% at 20 GeV to ($6 \pm 0.5$)% at 300 GeV.


Introduction
The energetic proton-proton ($pp$) collisions produced by the Large Hadron Collider (LHC) yield final states that are predominantly characterized by jets, or collimated sprays of charged and neutral hadrons. Jets constitute an essential piece of the physics programme carried out using the ATLAS detector due to their presence in the signal processes being measured and searched for, the various background processes that hide those signals, and the additional activity due to simultaneous $pp$ collisions. Measurements of the energy scale and resolution of these complex objects, as well as their associated systematic uncertainties, are therefore essential both for precision measurements of the Standard Model (SM) and for sensitive searches for new physics beyond it. This paper presents the strategy used for the determination of the jet energy scale (JES) and resolution (JER) by the ATLAS experiment and its implementation as it pertains to the analysis of data from Run 2 of the LHC. Results for the JES and JER are presented using data collected during 2015-2017, corresponding to integrated luminosities in the range 36-81 fb$^{-1}$, depending on the analysis method and its goals. This publication focuses on calibrating jets reconstructed with the anti-$k_t$ [1] algorithm with radius parameter $R = 0.4$.
The ATLAS Collaboration has published previous calibrations and uncertainties of the energy scale and resolution for this jet definition with data taken in 2010 [2][3][4], 2011 [5], 2012 [6], and 2015 [7]. Additionally, some ATLAS publications have targeted different jet definitions. In particular, the Run 1 papers include dedicated calibrations$^{1}$ of jets reconstructed with the anti-$k_t$ algorithm with $R = 0.6$ and $R = 1.0$, and a dedicated in situ calibration of large-radius jets has also been completed in Run 2 data [9]. This publication extends and improves on previous calibrations of anti-$k_t$ $R = 0.4$ jets, taking full advantage of the larger dataset recorded over the period of 2015-2017. The significant increase in the number of proton collisions per bunch crossing in 2016 and 2017 data-taking leads to a correspondingly more difficult environment for jet reconstruction, and this result presents new jet energy scale and resolution measurements in these unique high pile-up conditions. Section 2 describes the ATLAS detector, and Section 3 describes the recorded data and the Monte Carlo (MC) simulation samples used in this paper. Section 4 presents the inputs and algorithms used to reconstruct the jets. Section 5 and Section 6 present the methods used and the result of both the calibration and the resulting systematic uncertainties of the JES and the JER, respectively.

The ATLAS detector
The ATLAS detector [10] at the LHC covers nearly the entire solid angle around the collision point.$^{2}$ It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroidal magnets. The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the range $|\eta| < 2.5$.

$^{1}$ Comparisons in Run 1 between $R = 0.4$ and $R = 0.6$ jets confirm the need for dedicated calibrations for different jet radii [8].

$^{2}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point in the centre of the detector. The positive $x$-axis is defined by the direction from the interaction point to the centre of the LHC ring, with the positive $y$-axis pointing upwards, while the beam direction defines the $z$-axis. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$-axis. The pseudorapidity $\eta$ is defined in terms of the polar angle $\theta$ by $\eta = -\ln \tan(\theta/2)$. Rapidity is defined as $y = 0.5 \ln[(E + p_z)/(E - p_z)]$, where $E$ denotes the energy and $p_z$ is the component of the momentum along the beam direction. The angular distance $\Delta R$ is defined as $\sqrt{(\Delta y)^2 + (\Delta \phi)^2}$.
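The coordinate definitions in the footnote above can be illustrated with a minimal Python sketch. The function names are illustrative only and are not taken from any ATLAS software; the angular distance is assumed here to be built from the rapidity and azimuthal differences, with the azimuthal difference wrapped into $[-\pi, \pi)$.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta / 2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))

def rapidity(energy, pz):
    """y = 0.5 * ln[(E + pz) / (E - pz)]."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(y1, phi1, y2, phi2):
    """Angular distance sqrt((dy)^2 + (dphi)^2); assumed form, see lead-in."""
    return math.hypot(y1 - y2, delta_phi(phi1, phi2))
```

A particle emitted perpendicular to the beam ($\theta = \pi/2$) has $\eta = 0$, and the wrapping ensures that two directions on either side of $\phi = \pm\pi$ are treated as close to each other.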
The silicon pixel detector covers the vertex region and typically provides four measurements per track, with the innermost space-point provided by the insertable B-layer that was installed before Run 2 [11,12]. The pixel detector is followed by the silicon microstrip tracker, which usually yields eight measurements per track. The silicon-based detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to $|\eta| = 2.0$. The TRT also provides electron identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.
The calorimeter system covers the pseudorapidity range $|\eta| < 4.9$. Within the region $|\eta| < 3.2$, high-granularity lead/liquid-argon (LAr) calorimeters with both barrel and endcap sections provide electromagnetic calorimetry. An additional thin LAr presampler covers $|\eta| < 1.8$, and is used to correct for energy loss in materials traversed by particles prior to reaching the calorimeters. Hadronic calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within $|\eta| < 1.7$, and two copper/LAr hadronic endcap calorimeters cover the range $1.5 < |\eta| < 3.2$. The solid angle coverage in the range $3.2 < |\eta| < 4.9$ is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimized for electromagnetic and hadronic measurements respectively. Interfaces between each of these components, in particular between the barrel and endcap regions, provide space to route various services and infrastructure, such as electrical and fiber-optic cabling, cooling, and support structures. However, these so-called transition regions also create discontinuities in the response of the calorimeter to both charged and neutral particles due to energy absorption in the inactive materials and changes in the geometry of the active materials of the calorimeters. The calibrated response and resolution of the calorimeter must therefore either correct for these features, or account for them when establishing systematic uncertainties. Figure 1 shows the many components of the calorimeter system, with reference pseudorapidities and various relevant transition regions marked as well [10,13,14].
The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by superconducting air-core toroids. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. A set of precision chambers covers the region $|\eta| < 2.7$ with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range $|\eta| < 2.4$ with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.
Interesting events are selected to be recorded by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [15]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger reduces in order to record events to disk at about 1 kHz.

Data and Monte Carlo simulated samples
The data used for the measurements presented here were collected in $pp$ collisions at the LHC with a centre-of-mass energy of 13 TeV and a 25 ns proton bunch crossing interval during 2015-2017. The integrated luminosities of the datasets used are in the range 36-81 fb$^{-1}$ after requiring that all detector subsystems were operational during data recording.
Additional $pp$ collisions in the same and nearby bunch crossings are referred to as pile-up. The number of reconstructed primary vertices ($N_{\text{PV}}$) and the mean number of interactions per bunch crossing ($\mu$) are optimal observables to quantify the level of pile-up activity. The average value of $\mu$ is 13.7, 24.9, and 37.8 in the 2015, 2016, and 2017 datasets, respectively [16]. As described below, these conditions are accounted for in the production and reconstruction of simulated data.
Simulated dijet, multijet, $Z$+jet, and $\gamma$+jet samples are used in determining the jet energy scale and its uncertainties. Table 1 summarizes the MC generators, adjustable sets of parameters (tunes), and parton distribution function (PDF) sets used for all nominal and alternative samples of the various simulated processes. The nominal samples for the majority of analyses were generated with Pythia 8.186 [17] (from now on referred to as Pythia 8) or Powheg+Pythia 8.186 [17,20,21]. The multijet balance analysis uses Sherpa 2.1.1 [22] as the nominal generator since it incorporates up to three jets in the matrix element and is thus more suitable for multijet processes that have more than two jets in the final state. The dijet, multijet, and $\gamma$+jet nominal samples use the NNPDF2.3LO PDF set [19] and the A14 set of tuned parameters [18]. For the $Z$+jet analysis, the dedicated AZNLO tune [26] is used instead. Alternative samples for defining systematic variations use various generators and tunes.
Stable particles, defined as those with $c\tau > 10$ mm, output by the generators were passed through the Geant4-based simulation of the ATLAS detector [27,28]. This step simulates the interactions of the particles with matter in the detector and generates outputs which can be reconstructed in the same way as data. Hadronic showers were simulated using the FTFP_BERT model as described in Ref. [29]. A set of simulated dijet events produced with the less detailed Atlfast-II (AFII) simulation is also studied to determine the difference in performance between full and fast simulation and to provide appropriate calibrations for AFII samples in analyses [27].

Figure 1 (caption): The EM barrel has consistent performance throughout, but has a seam in the construction at $\eta = 0$ which can impact jet energy resolution. The EM endcap has a precision region marked in darker green and an extended region in light green, and the transition from one to the other around $|\eta| \sim 2.5$ involves a dramatic change in the material layers. The hadronic Tile calorimeter is shown in light blue while the hadronic endcap calorimeters based on liquid argon are illustrated in light orange. The forward calorimeters are shown in dark orange. Pink filled regions represent the tile plug calorimeter, often referred to as TileGap1 and TileGap2. The thin hot pink line marks the location of the very narrow gap and cryostat scintillators (TileGap3). The regions corresponding to the transition from barrel to endcap ($|\eta| \sim 1.4$) and from endcap to forward calorimeter ($|\eta| \sim 3.1$) are given for reference.

Table 1: List of generators used for various processes. Information is given regarding the underlying-event tunes, the PDF parameter sets, and the perturbative QCD highest-order accuracy used in the matrix element. Abbreviations in the PDF names and matrix element orders are LO (leading order), NLO (next-to-leading order), and NNLO (next-to-next-to-leading order).

Process | Generator + fragmentation/hadronization | Tune | PDF set | Matrix element order
Dijet | Pythia 8.186 [17] | A14 [18] | NNPDF2.3LO |
Pile-up is incorporated in the MC samples by overlaying simulated inelastic $pp$ interactions on the generated hard-scatter interaction. The inelastic interactions were simulated in Pythia 8.210 using the A3 tune and the NNPDF2.3LO PDF set [19,30]. To determine the number of simulated collisions to overlay onto a particular hard-scattering process, a random value is drawn from a Poisson distribution of the number of collisions per bunch crossing with a mean given by the desired average number of collisions per crossing for a particular data period. Events simulated with a particular pile-up profile are then compared with data from the corresponding data period. One set of MC samples was created using the pile-up profile of 2015+2016 data (average number of collisions 23.7) while a second independent set of samples used the profile of 2017 data. When data and simulation are compared in this paper, both sets of MC samples are used unless otherwise specified and are normalized to the luminosity of 2015+2016 data and 2017 data separately.
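The Poisson draw that sets the number of overlaid collisions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the fixed seed, and the use of Knuth's multiplication method are choices made here, not a description of the ATLAS simulation software, and a single fixed mean of 23.7 stands in for the full pile-up profile.

```python
import math
import random

def draw_pileup_count(mean_mu, rng):
    """Draw the number of overlaid collisions from Poisson(mean_mu).

    Uses Knuth's method: multiply uniform random numbers until the
    running product drops to exp(-mean_mu) or below.
    """
    limit = math.exp(-mean_mu)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

# Illustrative usage with the 2015+2016 average quoted in the text.
rng = random.Random(12345)  # fixed seed, chosen only for reproducibility
counts = [draw_pileup_count(23.7, rng) for _ in range(20000)]
sample_mean = sum(counts) / len(counts)
```

With enough simulated events the sample mean approaches the requested average number of collisions per crossing.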

Jet reconstruction
The primary jet definition used in the majority of physics analyses by the ATLAS Collaboration and in the studies presented here is the anti-$k_t$ [1] algorithm with a radius parameter $R = 0.4$ as implemented in the FastJet 3.2.2 [31,32] software package. Four-vector objects are used as inputs to the algorithm, and may be stable particles defined by MC generators, charged-particle tracks, calorimeter energy deposits, or algorithmic combinations of the latter two, as in the case of the particle-flow reconstruction technique [33].
For use in jet reconstruction, calorimeter cells are first clustered into three-dimensional, massless, topological clusters (topo-clusters) using a nearest-neighbour algorithm [34]. Cells are added to a topo-cluster according to the ratio of the cell energy to the expected noise in each cell using thresholds that control the growth of each topo-cluster. The resulting energy of the topo-cluster is defined at the electromagnetic (EM) scale, which is the baseline calorimeter scale that correctly measures energy depositions from electromagnetic showers. Only positive-energy topo-clusters are used as inputs to the jet reconstruction. A jet produced in the hard-scatter process is expected to originate from the primary vertex, defined as the reconstructed vertex with at least two associated tracks and the largest sum of squared track momentum. Therefore, an event-by-event correction to account for the position of the primary vertex in each event, referred to as an origin correction, is applied to every topo-cluster, based on its depth within the calorimeter and pseudorapidity. This method is to be contrasted with earlier approaches [7] that applied this correction only to the jet four-momentum rather than to its constituents.
Jets reconstructed using only calorimeter-based energy information use the origin-corrected EM scale topo-clusters and are referred to as EMtopo jets. This was the primary jet definition used in ATLAS physics results prior to the end of Run 2. EMtopo jets exhibit robust energy scale and resolution characteristics across a wide kinematic range, and are independent of other reconstruction algorithms such as tracking at the jet-building stage.
Hadronic final-state measurements can be improved by making more complete use of the information from both the tracking and calorimeter systems. The particle flow (PFlow) algorithm is based on Ref. [33] and updated as described below. Particle flow directly combines measurements from both the tracker and the calorimeter to form the input signals for jet reconstruction, which are intended to approximate individual particles. Specifically, energy deposited in the calorimeter by charged particles is subtracted from the observed topo-clusters and replaced by the momenta of tracks that are matched to those topo-clusters. These resulting PFlow jets exhibit improved energy and angular resolution, reconstruction efficiency, and pile-up stability compared to calorimeter jets [33]. EMtopo and PFlow jets are retained for the analyses discussed in this paper only if they have an uncalibrated $p_{\text{T}} > 7$ GeV and $|\eta| < 4.5$.
The updates to the PFlow algorithm since its description in Ref. [33] are as follows. The expected mean value of the energy deposited by pions, $\langle E_{\text{dep}} \rangle$, and its expected standard deviation, $\sigma(E_{\text{dep}})$, were recomputed using the updated simulation, geometry, and topo-cluster noise thresholds for Run 2 [7]. The shower profiles were similarly updated. The only algorithmic change was an improvement in the transition between using track energy and cluster energy in high-$p_{\text{T}}$ jets. Since energetic particles are often in the core of jets and thus poorly isolated from nearby activity, accurate removal of the calorimeter energy associated with the track can be difficult. Therefore, the PFlow algorithm prevents energy subtraction in these cases. Formerly this was managed by applying a simple $p_{\text{T}}^{\text{trk}} < 40$ GeV cut in the track selection. In the updated algorithm, a more sophisticated procedure is used to prevent the subtraction in cases where the advantages of the tracker are smaller and where the particle shower falls in a region with significant energy depositions from other particles. For all tracks up to $p_{\text{T}}^{\text{trk}} = 100$ GeV, if the energy $E^{\text{clus}}$ in a cone of size $\Delta R = 0.15$ around the extrapolated particle satisfies

$$\frac{E^{\text{clus}} - \langle E_{\text{dep}} \rangle}{\sigma(E_{\text{dep}})} > 33.2 \times \log_{10}\left(40~\text{GeV} / p_{\text{T}}^{\text{trk}}\right), \qquad (1)$$

then the subtraction is not performed. With this parameterization, the subtraction is performed at lower track momenta unless the calorimeter activity measured by $E^{\text{clus}}$ is very high, such as in very dense environments where the accuracy of the subtraction is degraded. Since the calorimeter provides a good energy measurement at high $p_{\text{T}}^{\text{trk}}$, this parameterization effectively slowly truncates the algorithm, yet allows the subtraction to continue to be performed for a small range above this cut-off even when the calorimeter energy deposition is low or near the expected value, $\langle E_{\text{dep}} \rangle$. The momentum range up to which the subtraction is still allowed to be performed is driven by the coefficient of 33.2 in Eq. (1) and is typically about 20-50% above the 40 GeV cut-off previously used.
Above $p_{\text{T}}^{\text{trk}} = 100$ GeV no track information is used and the PFlow algorithm becomes equivalent to EMtopo, benefitting from excellent calorimeter performance at high energies. The result of the improved subtraction method detailed here is that the energy resolution of PFlow jets becomes compatible with that of EMtopo jets at high energies while remaining superior at low energies.
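The subtraction decision described above can be sketched as a small Python function. It assumes that the criterion of Eq. (1) compares the significance of the excess cluster energy, $(E^{\text{clus}} - \langle E_{\text{dep}} \rangle)/\sigma(E_{\text{dep}})$, with $33.2 \times \log_{10}(40~\text{GeV}/p_{\text{T}}^{\text{trk}})$; this form is an assumption consistent with the 33.2 coefficient and the 40 GeV cut-off quoted in the text. The function and argument names are hypothetical, and all energies are in GeV.

```python
import math

def subtraction_is_performed(pt_trk, e_clus, e_dep_mean, e_dep_sigma):
    """Return True if the calorimeter energy subtraction would be applied.

    pt_trk      : track transverse momentum (GeV)
    e_clus      : energy in a cone of Delta R = 0.15 around the track (GeV)
    e_dep_mean  : expected mean deposited energy <E_dep> (GeV)
    e_dep_sigma : expected standard deviation sigma(E_dep) (GeV)
    """
    if pt_trk > 100.0:
        return False  # above 100 GeV no track information is used
    significance = (e_clus - e_dep_mean) / e_dep_sigma
    threshold = 33.2 * math.log10(40.0 / pt_trk)  # assumed form of Eq. (1)
    # The subtraction is vetoed when the significance exceeds the threshold.
    return not (significance > threshold)
```

In this sketch a 10 GeV track has a threshold of about 20 standard deviations, so the subtraction is vetoed only in very dense environments, whereas at 50 GeV the threshold is about $-3.2$ and the subtraction survives only if the cluster energy is well below the expected deposit.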
After the subtraction, two scalings are applied. These account for the difference in response, here defined as the ratio of measured to true particle energy, between topo-clusters at the EM scale and tracks for which the energy scale is closer to the true particle energy. The first scale factor applies only when no subtraction has been performed for a selected track. In this case the PFlow object includes both the full topo-cluster energy and the track momentum. To avoid double-counting the energy while maintaining the contribution from the calorimeter measurement, the track momentum is scaled by a factor $(1 - \langle E_{\text{dep}} \rangle / p^{\text{trk}})$. The resulting PFlow object uses the desired information and has a final energy of approximately $p^{\text{trk}}$, matching the response for the subtracted case. The second scale factor is applied in both the subtracted and non-subtracted cases for all PFlow objects created from selected tracks below 100 GeV. It smooths the transition between the lower-energy PFlow objects which are at the scale of the tracks and the higher-energy objects at the electromagnetic scale of the clusters. The energy of these PFlow objects is scaled by unity for $p_{\text{T}}^{\text{trk}}$ below 30 GeV, by $(1 - \langle E_{\text{dep}} \rangle / p^{\text{trk}})$ for objects with 60 GeV $< p_{\text{T}}^{\text{trk}} <$ 100 GeV, and by a linearly descending scale factor in between. This ensures that all objects are at the electromagnetic scale by 60 GeV.
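The second scale factor can be written as a piecewise function of the track transverse momentum. The sketch below assumes that the "linearly descending" region is a linear interpolation in $p_{\text{T}}^{\text{trk}}$ between unity at 30 GeV and $(1 - \langle E_{\text{dep}} \rangle / p^{\text{trk}})$ at 60 GeV; the text does not spell out the interpolation variable, and the function name is hypothetical.

```python
def second_scale_factor(pt_trk, e_dep_mean, p_trk):
    """Scale factor for a PFlow object built from a selected track below 100 GeV.

    pt_trk     : track transverse momentum (GeV)
    e_dep_mean : expected mean deposited energy <E_dep> (GeV)
    p_trk      : track momentum (GeV)
    """
    em_scale = 1.0 - e_dep_mean / p_trk
    if pt_trk < 30.0:
        return 1.0          # object stays at the scale of the track
    if pt_trk >= 60.0:
        return em_scale     # object is at the electromagnetic scale
    # Assumed linear interpolation between 30 GeV and 60 GeV.
    fraction = (pt_trk - 30.0) / 30.0
    return 1.0 + fraction * (em_scale - 1.0)
```

For example, with $\langle E_{\text{dep}} \rangle / p^{\text{trk}} = 0.2$ the factor falls from 1.0 at 30 GeV through 0.9 at 45 GeV to 0.8 at 60 GeV.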
Tracks used in PFlow objects and in deriving calibrations for both EMtopo and PFlow jets are reconstructed within the full acceptance of the inner detector ($|\eta| < 2.5$), required to have $p_{\text{T}} > 500$ MeV, and must satisfy quality criteria based on the number of hits in the ID subdetectors. To suppress the effects of pile-up, tracks must satisfy $|z_0 \sin\theta| < 2$ mm, where $z_0$ is the distance of closest approach of the track to the hard-scatter primary vertex along the $z$-axis and $\theta$ is the polar angle. Tracks are matched to jets using ghost association [35], a procedure that treats them as four-vectors of infinitesimal magnitude during the jet reconstruction and assigns them to the jet with which they are clustered.
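The kinematic and pile-up-suppression requirements on tracks translate directly into a selection function. The sketch below is illustrative only: the `Track` class and its field names are hypothetical, and the hit-based quality criteria are omitted because the text does not specify them.

```python
import math
from dataclasses import dataclass

@dataclass
class Track:
    pt: float      # transverse momentum in GeV
    eta: float     # pseudorapidity
    z0: float      # mm, closest approach to the hard-scatter vertex along z
    theta: float   # polar angle in radians

def passes_selection(trk):
    """Apply |eta| < 2.5, pT > 500 MeV, and |z0 sin(theta)| < 2 mm."""
    in_acceptance = abs(trk.eta) < 2.5
    above_threshold = trk.pt > 0.5
    from_primary_vertex = abs(trk.z0 * math.sin(trk.theta)) < 2.0
    return in_acceptance and above_threshold and from_primary_vertex
```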
MC simulation is used to determine the energy scale and resolution of jets by comparing PFlow and EMtopo jets with particle-level truth jets. Truth jets are reconstructed using stable final-state particles and exclude muons, neutrinos, and particles from pile-up interactions. Truth jets are selected with the same $p_{\text{T}} > 7$ GeV and $|\eta| < 4.5$ thresholds as EMtopo and PFlow jets, and are geometrically matched to those jets using the angular distance $\Delta R$ with the requirement $\Delta R < 0.3$.
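A geometric match of this kind can be sketched as follows. Choosing the closest truth jet when several lie within the cone is an assumption made for this illustration, as is the dictionary layout with keys `"y"` and `"phi"`.

```python
import math

def delta_r(jet_a, jet_b):
    """Angular distance between two jets given as dicts with 'y' and 'phi'."""
    dphi = (jet_a["phi"] - jet_b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(jet_a["y"] - jet_b["y"], dphi)

def match_to_truth(reco_jet, truth_jets, max_dr=0.3):
    """Return the closest truth jet within max_dr, or None if there is none."""
    best = min(truth_jets, key=lambda t: delta_r(reco_jet, t), default=None)
    if best is not None and delta_r(reco_jet, best) < max_dr:
        return best
    return None
```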

Jet energy scale calibration
The jet energy scale calibration restores the jet energy to that of jets reconstructed at the particle level. The full chain of corrections is illustrated in Figure 2. All stages correct the four-momentum, scaling the jet $p_{\text{T}}$, energy, and mass.
At the beginning of the chain, the pile-up corrections remove the excess energy due to additional proton-proton interactions within the same (in-time) or nearby (out-of-time) bunch crossings. These corrections consist of two components: a correction based on the jet area and transverse momentum density of the event, and a residual correction derived from MC simulation and parameterized as a function of the mean number of interactions per bunch crossing ($\mu$) and the number of reconstructed primary vertices in the event ($N_{\text{PV}}$). These corrections are discussed in Section 5.1.1. The absolute JES calibration corrects the jet so that it agrees in energy and direction with truth jets from dijet MC events, and is detailed in Section 5.1.2. Furthermore, the global sequential calibration (derived from dijet MC events) improves the jet $p_{\text{T}}$ resolution and associated uncertainties by removing the dependence of the reconstructed jet response on observables constructed using information from the tracking, calorimeter, and muon chamber detector systems, as introduced in Section 5.1.3. All these calibrations are applied to both data and MC simulation. Finally, a residual in situ calibration is applied to data only to correct for remaining differences between data and MC simulation. It is derived using well-measured reference objects, including photons, $Z$ bosons, and calibrated jets, and for the first time benefits from a low-$p_{\text{T}}$ measurement using the missing-$E_{\text{T}}$ projection fraction method for better pile-up robustness. It is described in Section 5.2. The full treatment and reduction of the systematic uncertainties is discussed in Section 5.3.

Simulation-based jet calibrations
The calibrations derived exclusively from MC simulation samples are described below.

Pile-up corrections
As a result of the increase of the topo-clustering $p_{\text{T}}$ thresholds (to suppress electronic and pile-up noise) and in the instantaneous luminosity, the contribution from pile-up to the JES in the 2015-2017 data-taking period differs from the one observed in 2015. The pile-up corrections are therefore evaluated using updated MC simulations of the software reconstruction and pile-up conditions. These corrections are derived using the same methods employed in 2015 [7] and are summarized in the following paragraphs.
First, a jet $p_{\text{T}}$-density-based subtraction of the per-event pile-up contribution to the jet $p_{\text{T}}$ is performed. The jet area $A$ is a measure of the susceptibility of the jet to pile-up and is calculated by determining the relative number of ghost particles associated with a jet after clustering. Next, the pile-up contribution is estimated from the median $p_{\text{T}}$ density, $\rho$, of jets in the $\eta$-$\phi$ plane, with the density of each jet taken as $p_{\text{T}}/A$. The calculation of $\rho$ uses jets reconstructed using the $k_t$ algorithm [36] with radius parameter $R = 0.4$ from positive-energy topo-clusters with $|\eta| < 2$. The computation of $\rho$ in the central region of the detector gives a more meaningful measure of the pile-up activity than the median over the entire $\eta$ range, because $\rho$ drops to nearly zero beyond $|\eta| \sim 2$. This drop is due to the lower occupancy in the forward region relative to the central region, which is a result of a coarser segmentation in the forward region. The $k_t$ algorithm is chosen due to its tendency to naturally reconstruct jets including a uniform soft background [35], while the median $p_{\text{T}}$ density is used to reduce the bias from hard-scatter jets which populate the high-$p_{\text{T}}$ tails of the distribution. The distribution of $\rho$ in MC simulation for representative $N_{\text{PV}}$ values is shown in Figure 3. The ratio of the $\rho$-subtracted jet $p_{\text{T}}$ to the uncorrected jet $p_{\text{T}}$ is applied as a scale factor to the jet four-momentum and hence does not affect its direction. The $\rho$ calculation is derived from the central, lower-occupancy regions of the calorimeter and does not fully describe the pile-up sensitivity in the forward calorimeter region or in the higher-occupancy core of high-$p_{\text{T}}$ jets. It is therefore observed that after this correction some dependence of the anti-$k_t$ jet $p_{\text{T}}$ on the pile-up activity remains, and consequently, a residual correction is derived. This residual dependence is defined as the difference between the reconstructed jet $p_{\text{T}}$ and truth jet $p_{\text{T}}$ and it is observed as a function of both $N_{\text{PV}}$ and $\mu$, which are sensitive to in-time and out-of-time pile-up respectively.

Figure 2 (panel text): Reconstructed jets: jet finding applied to tracking- and/or calorimeter-based inputs. $p_{\text{T}}$-density-based pile-up correction: applied as a function of event pile-up $p_{\text{T}}$ density and jet area. Residual pile-up correction: removes residual pile-up dependence, as a function of $\mu$ and $N_{\text{PV}}$.
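The robustness of the median $p_{\text{T}}$ density against hard-scatter jets can be seen in a short Python sketch. The input format, a list of (pT, area) pairs for the jets used in the density estimate, and the numerical values are invented for illustration.

```python
from statistics import median

def median_pt_density(kt_jets):
    """Median of pT / A over jets given as (pt, area) pairs."""
    return median(pt / area for pt, area in kt_jets)

# Three soft jets from the diffuse background and one hard-scatter jet.
example_jets = [(2.0, 0.5), (3.0, 0.5), (60.0, 0.5), (2.5, 0.5)]
rho = median_pt_density(example_jets)                        # 5.5 GeV per unit area
mean_density = sum(pt / a for pt, a in example_jets) / 4.0   # 33.75, pulled up by the hard jet
```

The single hard jet shifts the mean density by a factor of about six while leaving the median close to the soft-background level.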
The jet $p_{\text{T}}$ after all pile-up ($p_{\text{T}}$-density-based and residual) corrections is given by

$$p_{\text{T}}^{\text{corr}} = p_{\text{T}}^{\text{reco}} - \rho \times A - \alpha \times (N_{\text{PV}} - 1) - \beta \times \mu,$$

where $p_{\text{T}}^{\text{reco}}$ refers to the $p_{\text{T}}$ of the reconstructed jet before any pile-up correction is applied. Reconstructed jets with $p_{\text{T}} > 7$ GeV are geometrically matched to truth jets within $\Delta R = 0.3$. The residual $p_{\text{T}}$ dependences on $N_{\text{PV}}$ ($\alpha$) and $\mu$ ($\beta$) are observed to be fairly linear and independent of one another. Independent linear fits are used to derive $\alpha$ and $\beta$ coefficients in bins of $p_{\text{T}}^{\text{true}}$ and $|\eta_{\text{det}}|$, where $p_{\text{T}}^{\text{true}}$ is the $p_{\text{T}}$ of the truth jet that matches the reconstructed jet. The jet $\eta$ pointing from the geometric centre of the detector, $\eta_{\text{det}}$, is used to remove any ambiguity as to which region of the detector is measuring the jet. Both the $\alpha$ and $\beta$ coefficients are seen to have a logarithmic dependence on $p_{\text{T}}^{\text{true}}$, and logarithmic fits are performed in the range 20 GeV $< p_{\text{T}}^{\text{true}} <$ 200 GeV for each bin of $|\eta_{\text{det}}|$. In each $|\eta_{\text{det}}|$ bin, the fitted values of the $\alpha$ and $\beta$ coefficients at $p_{\text{T}}^{\text{true}} = 25$ GeV are taken as their nominal values, reflecting their behaviour in the $p_{\text{T}}$ region where pile-up effects are most relevant. The differences between the logarithmic fits over the full $p_{\text{T}}^{\text{true}}$ range and the nominal fits are used for a $p_{\text{T}}$-dependent systematic uncertainty in the residual pile-up dependence. Finally, linear fits are performed to the binned coefficients as a function of $|\eta_{\text{det}}|$. This reduces the effects of statistical fluctuations and allows the $\alpha$ and $\beta$ coefficients to be smoothly sampled in $|\eta_{\text{det}}|$, particularly in regions of varying dependence.
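The full pile-up correction can be sketched in Python under the assumption that it takes the form $p_{\text{T}}^{\text{corr}} = p_{\text{T}}^{\text{reco}} - \rho A - \alpha (N_{\text{PV}} - 1) - \beta \mu$, with $\alpha$ and $\beta$ the fitted residual coefficients. The numerical values below are invented for illustration and are not the fitted coefficients; rescaling the whole four-momentum by the ratio of corrected to uncorrected $p_{\text{T}}$ follows the statement that all stages scale the jet four-momentum.

```python
def pileup_corrected_pt(pt_reco, rho, area, n_pv, mu, alpha, beta):
    """Jet pT after the density-based and residual pile-up corrections (GeV)."""
    return pt_reco - rho * area - alpha * (n_pv - 1) - beta * mu

def scale_four_momentum(p4, pt_before, pt_after):
    """Rescale (px, py, pz, E) by pt_after / pt_before, leaving the direction unchanged."""
    factor = pt_after / pt_before
    return tuple(factor * component for component in p4)

# Illustrative inputs only; alpha and beta are NOT values from the paper.
pt_corr = pileup_corrected_pt(pt_reco=40.0, rho=20.0, area=0.5,
                              n_pv=21, mu=30.0, alpha=0.05, beta=-0.02)
p4_corr = scale_four_momentum((40.0, 0.0, 30.0, 50.0), 40.0, pt_corr)
```

In this example the density term removes 10 GeV, the $N_{\text{PV}}$ term removes a further 1 GeV, and the negative $\beta$ adds back 0.6 GeV, giving a corrected $p_{\text{T}}$ of 29.6 GeV.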
The dependences of the $p_{\text{T}}$-density-based and residual corrections on $N_{\text{PV}}$ and $\mu$ as a function of $|\eta_{\text{det}}|$ for PFlow jets are shown in Figure 4. The negative dependence on $\mu$ for out-of-time pile-up is a result of the liquid-argon calorimeter's pulse shape, which is negative during the period shortly after registering a signal [37]. These corrections are similar to those derived for EMtopo jets, although the $N_{\text{PV}}$-dependent corrections for PFlow jets in the $|\eta_{\text{det}}| < 2.5$ region are reduced by about 60% relative to EMtopo due to the usage of tracks in the PFlow algorithm. For EMtopo jets, the shape of the residual corrections is comparable to that found in 2015 MC simulation, except in the forward region ($|\eta_{\text{det}}| > 2.5$), where it is found to be smaller by 0.1 GeV. This difference is primarily caused by higher topo-cluster noise thresholds used in the full Run 2 data.
Four systematic uncertainties are introduced to account for MC mis-modelling of $N_{\text{PV}}$, $\mu$, the $\rho$ topology, and the $p_{\text{T}}$ dependence of the residual pile-up corrections. The last of these is derived from the full logarithmic fits to $\alpha$ and $\beta$, as discussed previously. Two in situ methods are used to estimate uncertainties in the modelling of $N_{\text{PV}}$ and $\mu$. The first method uses jets reconstructed from tracks to provide a measure of the jet $p_{\text{T}}$ independent of pile-up. This is only used for $|\eta| < 2.1$. The second method exploits the $p_{\text{T}}$ balance between a reconstructed jet and a $Z$ boson and is used for $2.1 < |\eta| < 4.5$. These systematic uncertainties are described in more detail in Ref. [38]. Finally, the $\rho$ topology uncertainty accounts for the uncertainty in the underlying event's contribution to $\rho$, and is discussed in detail in Section 5.2.4.

Jet energy scale and $\eta$ calibration
The absolute jet energy scale and $\eta$ calibrations correct the reconstructed jet four-momentum to the particle-level energy scale, accounting for non-compensating calorimeter response, energy losses in passive material, out-of-cone effects, and biases in the jet $\eta$ reconstruction. Such biases are primarily caused by the transition between different calorimeter technologies and sudden changes in calorimeter granularity. The calibration is derived for $R = 0.4$ anti-$k_t$ jets from a Pythia 8 MC simulation of dijet events after the application of the pile-up corrections. Reconstructed jets are geometrically matched to truth jets within $\Delta R = 0.3$. In addition, reconstructed (truth) jets are required to have no other reconstructed (truth) jet of $p_{\text{T}} > 7$ GeV within $\Delta R = 0.6$ ($\Delta R = 1.0$).
The average jet energy response $\mathcal{R}$, defined as the mean of a Gaussian fit to the core of the $E^{\text{reco}}/E^{\text{true}}$ distribution, is measured in $E^{\text{true}}$ and $\eta_{\text{det}}$ bins. The decision to calculate the response as a function of $E^{\text{true}}$ instead of $E^{\text{reco}}$ is motivated by the fact that for fixed $E^{\text{true}}$ ($E^{\text{reco}}$) bins the response distribution is (not) Gaussian. The average response is parameterized as a function of $E^{\text{reco}}$ using a numerical inversion procedure, as detailed in Ref. [2], and the jet calibration factor is taken as the inverse of the average energy response. The response is higher for PFlow jets than for EMtopo jets at low energies since tracking information is used. The response for PFlow jets as a function of $E^{\text{reco}}$ ($\eta_{\text{det}}$) for representative $\eta_{\text{det}}$ ($E^{\text{reco}}$) bins is shown in Figure 5. After the JES calibration based on the results in Figure 5 is applied, the response deviates from 1 by a maximum of about 5% (3%, 1%) at $p_{\text{T}}^{\text{true}} = 20$ (30, 50) GeV. This level of non-closure is observed across the entire $\eta_{\text{det}}$ range. These small non-closures are seen for low-$p_{\text{T}}$ jets due to a slightly non-Gaussian energy response and jet reconstruction threshold effects, both of which impact the response fits. The closure in this result is an improvement with respect to the 2015 calibration, owing to advances in the fitting method and parameters.
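The idea behind numerical inversion can be sketched with a toy response function. The response is known as a function of the true energy, so the mean reconstructed energy is $E^{\text{reco}} = \mathcal{R}(E^{\text{true}}) \times E^{\text{true}}$; inverting this relation gives the response, and hence the calibration factor $1/\mathcal{R}$, as a function of $E^{\text{reco}}$. The toy response below and the bisection solver are assumptions made for illustration; the procedure of Ref. [2] works with binned fits rather than an analytic function.

```python
import math

def toy_response(e_true):
    """Hypothetical average response rising with energy (illustration only)."""
    return 0.9 - 0.3 / math.sqrt(e_true)

def calibration_factor(e_reco, response, lo=1.0, hi=1.0e4, iterations=100):
    """Invert e_reco = response(e_true) * e_true by bisection; return 1 / response.

    Assumes response(e) * e increases monotonically between lo and hi (GeV).
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if response(mid) * mid < e_reco:
            lo = mid
        else:
            hi = mid
    return 1.0 / response(0.5 * (lo + hi))

# A jet with true energy 100 GeV is reconstructed on average at 87 GeV.
calibrated_energy = 87.0 * calibration_factor(87.0, toy_response)
```

Applying the factor to the 87 GeV reconstructed jet recovers the 100 GeV true energy, which is the closure property the calibration is designed to achieve on average.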
A bias in the reconstructed jet , defined as a significant deviation from zero in the signed difference between the reconstructed and truth jet , denoted by reco and true respectively, is observed and shown in Figure 6 as a function of | det | for PFlow jets. The bias for EMtopo jets is similar, showing the same features. It is largest in jets that encompass two calorimeter regions with different energy responses caused by changes in calorimeter geometry or technology. This artificially increases the energy of one side of the jet relative to the other, altering the reconstructed four-momentum. The barrel-endcap (| det | ∼ 1.4) and endcap-forward (| det | ∼ 3.1) transition regions can be clearly seen in Figure 5(a) as susceptible to this effect. A second correction is therefore derived as the difference between the reconstructed and truth ( reco and true respectively) parameterized as a function of true and det to remove such bias. A numerical inversion procedure is again used to derive corrections in reco from true . This calibration only alters the jet T and , not the full four-momentum. EMtopo and PFlow jets calibrated with the full jet energy scale and calibration are considered to be at the EM+JES scale and PFlow+JES scale, respectively.
The absolute JES and $\eta$ calibrations are also derived for a Pythia 8 MC sample using AFII. An additional systematic uncertainty is considered for these samples to account for a small non-closure in the calibration beyond $|\eta_{\text{det}}| \sim 3.2$, due to the approximate treatment of hadronic showers in the forward calorimeters. This uncertainty is below 0.5% for all central jets and is about 3% for a forward jet of $p_{\text{T}}$ = 20 GeV, falling rapidly with increasing $p_{\text{T}}$.

Global sequential calibration
Even after the application of the previous jet calibrations (from now on referred to as MCJES), for a given ($p_{\text{T}}^{\text{true}}$, $\eta_{\text{det}}$) bin, the response can vary from jet to jet depending on the flavour and energy distribution of the constituent particles, their transverse distribution, and the fluctuations of the jet development in the calorimeter. Furthermore, the average particle composition and shower shape of a jet vary between initiating particles, most notably between quark- and gluon-initiated jets. A quark-initiated jet will often include hadrons with a higher fraction of the jet $p_{\text{T}}$ that penetrate further into the calorimeter, while a gluon-initiated jet will typically contain more particles of softer $p_{\text{T}}$, leading to a lower calorimeter response and a wider transverse profile. The global sequential calibration (GSC), a procedure used in the 2012 [6] and 2015 [7] calibrations, is a series of multiplicative corrections to reduce the effects from these fluctuations and improve the jet resolution without changing the average jet energy response. The jet resolution $\sigma_{\mathcal{R}}$ is given by the standard deviation of a Gaussian fit to the jet $p_{\text{T}}$ response distribution, where the $p_{\text{T}}$ response is defined similarly to jet energy response as the ratio of $p_{\text{T}}^{\text{reco}}$ to $p_{\text{T}}^{\text{true}}$. The GSC is based on global jet observables such as the longitudinal structure of the energy depositions within the calorimeters, tracking information associated with the jet, and information related to the activity in the muon chambers behind a jet. For these studies, reconstructed jets are geometrically matched to truth jets and a numerical inversion procedure is used, as explained in Section 5.1.2. Six observables are identified that improve the resolution of the JES through the GSC. For each observable, an independent jet four-momentum correction is derived as a function of $p_{\text{T}}^{\text{true}}$ and $|\eta_{\text{det}}|$ by inverting the reconstructed jet response in Pythia 8 MC simulation events.
Corrections for each observable are applied independently and sequentially to the jet four-momentum for jets with $|\eta_{\text{det}}| < 3.5$ (unless stated otherwise). No improvement in resolution was found from altering the sequence of the corrections.
The six stages of the GSC account for the dependence of the jet response on (in the order in which they are applied):
• $f_{\text{charged}}$, the fraction of the jet $p_{\text{T}}$ measured from ghost-associated tracks with $p_{\text{T}}$ > 500 MeV ($|\eta_{\text{det}}| < 2.5$);
• $f_{\text{Tile0}}$, the fraction of jet energy measured in the first layer of the hadronic Tile calorimeter ($|\eta_{\text{det}}| < 1.7$);
• $f_{\text{LAr3}}$, the fraction of jet energy measured in the third layer of the electromagnetic LAr calorimeter ($|\eta_{\text{det}}| < 3.5$);
• $n_{\text{trk}}$, the number of tracks with $p_{\text{T}}$ > 1 GeV ghost-associated with the jet ($|\eta_{\text{det}}| < 2.5$);
• $w_{\text{trk}}$, also known as track width, the average $p_{\text{T}}$-weighted transverse distance in the $\eta$-$\phi$ plane between the jet axis and all tracks of $p_{\text{T}}$ > 1 GeV ghost-associated with the jet ($|\eta_{\text{det}}| < 2.5$);
• $n_{\text{segments}}$, the number of muon track segments ghost-associated with the jet ($|\eta_{\text{det}}| < 2.7$).
The first correction is only applied to PFlow jets. The $n_{\text{segments}}$ correction, also known as the punch-through correction, reduces the tails of the response distribution caused by high-$p_{\text{T}}$ jets that are not fully contained in the calorimeter. All corrections are derived as a function of jet $p_{\text{T}}$, except for the punch-through correction, which is derived as a function of jet energy since this effect is more correlated with the energy escaping the calorimeters.
The underlying distributions of these observables are shown for PFlow jets in MC simulation and bins of equal statistics in Figure 7. Each observable has been studied in data and simulation and is found to be well modelled [6,7,33]. The spike at zero in the $f_{\text{Tile0}}$ distribution at low $p_{\text{T}}^{\text{true}}$, shown in Figure 7(b), corresponds to jets that are fully contained in the electromagnetic calorimeter and do not deposit energy in the Tile calorimeter. The tail towards negative values in the $f_{\text{Tile0}}$ and $f_{\text{LAr3}}$ distributions at low $p_{\text{T}}^{\text{true}}$, shown in Figures 7(b) and 7(c), respectively, reflects calorimeter noise fluctuations. Slight differences with respect to data have a negligible impact on the GSC since the dependence of the average jet response on the observables is well modelled in MC simulation, as observed by an in situ dijet tag-and-probe method described in Ref. [2]. In this method, the average $p_{\text{T}}$ asymmetry between back-to-back jets is measured as a function of each observable.
The average jet $p_{\text{T}}$ response for PFlow jets in MC simulation as a function of each of the GSC observables is shown in Figure 7 for representative $p_{\text{T}}^{\text{true}}$ ranges. The dependence of the jet response on each observable is reduced to less than 2% after the full GSC is applied, with small deviations from unity reflecting the correlations between observables that are unaccounted for in the corrections.
The fractional jet resolution, defined as $\sigma_{\mathcal{R}}/\mathcal{R}$, is used to determine the size of the fluctuations in the jet energy reconstruction and is shown for PFlow jets with $0.2 < |\eta_{\text{det}}| < 0.3$ in MC simulation in Figure 8. As more corrections are applied, the fractional jet resolution improves and the jet response dependence on the jet flavour is reduced. No improvement is observed in Figure 8 from the punch-through correction since only a small fraction of jets receive this calibration, but some analyses target regions of interest in which a large fraction of jets would receive this correction [39, 40].

Figure 8: Resolution of jets at the PFlow+JES scale with $0.2 < |\eta_{\text{det}}| < 0.3$ measured in Pythia 8 dijet MC simulation after each stage of the global sequential calibration (GSC). All jet flavours, including $b$-jets, are considered. The lower panel shows the difference in quadrature between the resolution before any GSC correction is applied and the resolution after the corresponding GSC step is applied.

In situ jet calibrations
Once jets are corrected to the particle level using the MCJES and GSC, they require one final calibration step to account for differences between the jet response in data and simulation. These differences are caused by imperfect simulation of both the detector materials and the physics processes involved: the hard scatter and underlying event, jet formation, pile-up, and particle interactions with the detector. The final in situ calibration measures the jet response in data and MC simulation separately and uses the ratio as an additional correction in data.
Jet response is calculated by balancing the $p_{\text{T}}$ of a jet against that of a well-calibrated reference object or system. The response $\mathcal{R}_{\text{in situ}}$ is defined as the average ratio of the jet $p_{\text{T}}$ to the reference object $p_{\text{T}}$ in bins of reference object $p_{\text{T}}$, where that average is taken from the peak location found by fitting the distribution with a Gaussian function. $\mathcal{R}_{\text{in situ}}$ is sensitive to effects such as the presence of additional radiative jets or the transition of energy into or out of the jet cone, although these effects can be mitigated through careful event selection.$^{3}$ A better method is to form the double ratio from the response in data and MC simulation:

$$c = \frac{\mathcal{R}^{\text{data}}_{\text{in situ}}}{\mathcal{R}^{\text{MC}}_{\text{in situ}}}, \tag{1}$$

which is robust to secondary effects so long as they are well modelled in simulation and is therefore a reliable measure of the jet energy scale difference between data and MC simulation. The double ratio is transformed via numerical inversion from a function of reference object $p_{\text{T}}$ to a function of jet $p_{\text{T}}$ (and jet $\eta$ where applicable). This is the final in situ calibration.
There are three stages of in situ analyses. First, the $\eta$-intercalibration analysis corrects the energy scale of forward ($0.8 \le |\eta_{\text{det}}| < 4.5$) jets to match those of central ($|\eta_{\text{det}}| < 0.8$) jets using the $p_{\text{T}}$ balance in dijet events. Second, the $Z$+jet and $\gamma$+jet analyses balance the hadronic recoil in an event against the $p_{\text{T}}$ of a calibrated $Z$ boson or photon. The missing-$E_{\text{T}}$ projection fraction (MPF) method uses the full hadronic recoil instead of a jet to compute the balance to help mitigate effects of pile-up and jet reconstruction threshold which otherwise make low-$p_{\text{T}}$ measurements challenging [41]. Finally, the multijet balance (MJB) analysis uses a system of well-calibrated low-$p_{\text{T}}$ jets to calibrate a single high-$p_{\text{T}}$ jet [42]. The $Z/\gamma$+jet and MJB analyses are computed only for central jets, but are also applicable to forward jets due to the effect of the $\eta$-intercalibration. Each measurement is translated from a function of reference object $p_{\text{T}}$ into jet $p_{\text{T}}$. A statistical combination of the $Z/\gamma$+jet and MJB analyses provides a single smooth calibration applicable across the full momentum range.
Since the three in situ analyses ($\eta$-intercalibration, $Z/\gamma$+jet MPF, and MJB) are performed sequentially, systematic uncertainties are propagated from each to the next. Within each analysis, systematic uncertainties arise from three sources: modelling of physics processes in simulation, uncertainties in the measurement of the reference object, and uncertainties in the expected $p_{\text{T}}$ balance due to the event's topology. Mis-modelling is accounted for by comparing the predictions of two MC generators and taking their difference as the uncertainty. Systematic uncertainties in the measurement of the reference object are taken from the $\pm 1\sigma$ uncertainties in each object's calibration and propagated through the analysis. Event topology uncertainties are estimated by varying the event selections used and observing the impact on the final MC simulation to data ratio.
A rebinning procedure is applied to each systematic uncertainty to ensure that the features represented in the final result are statistically significant and not the result of fluctuations in small numbers of simulated or data events. This is only performed where the response does not vary sharply with $p_{\text{T}}$, ensuring it does not obscure real physics effects. The rebinning procedure follows a bootstrapping method: pseudo-experiment datasets are created by sampling from a Poisson distribution with a mean of one for each event in the data or MC simulation [43]. The pseudo-experiments are therefore statistically correlated yet unique, and the root mean square of the response distribution across the pseudo-experiments provides a measure of the statistical uncertainty of the analysis. The measured result for each systematic uncertainty is then rebinned as appropriate for each analysis to ensure that the final shapes are statistically significant.
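The bootstrapping idea can be sketched as follows. This is a simplified illustration, not the analysis code: it assumes the response in a bin is estimated by a weighted mean of per-event ratios rather than by a Gaussian fit, and the function names, replica count, and seed are hypothetical.

```python
import math
import random

def poisson_one(rng):
    """Draw from a Poisson distribution with mean 1 (Knuth's method)."""
    limit = math.exp(-1.0)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def bootstrap_rms(ratios, n_replicas=200, seed=1):
    """Spread of the mean response over Poisson(1)-weighted replicas."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_replicas):
        # Every event receives an integer weight drawn from Poisson(1).
        weights = [poisson_one(rng) for _ in ratios]
        total = sum(weights)
        if total == 0:
            continue  # skip the rare replica in which every event got weight 0
        means.append(sum(w * r for w, r in zip(weights, ratios)) / total)
    centre = sum(means) / len(means)
    return math.sqrt(sum((m - centre) ** 2 for m in means) / len(means))
```

Because each replica reweights the same events, the replicas are correlated with the nominal sample, and the spread of the replica results estimates the statistical uncertainty in that bin.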
The $Z/\gamma$+jet and MJB calibrations and uncertainties are derived from the full 2015-2017 combined datasets with a total luminosity of 80 fb$^{-1}$. The $\eta$-intercalibration analysis uses a dataset of total size 81 fb$^{-1}$, but since this analysis is more sensitive than the others to year-by-year fluctuations, the dataset is split into two blocks and a time-dependent result is computed instead. One $\eta$-intercalibration is derived from and applies to the 2015 + 2016 dataset while a second independent calibration is derived from the 2017 dataset and applies to 2017 + 2018 data. These two data periods are treated separately due to a change in LAr calorimeter read-out that occurred between 2016 and 2017 data taking and affected jet reconstruction in the endcap regions. With no changes of similar scale made between 2017 and 2018 data taking, the 2017 calibration can be reasonably applied to 2018 as well. The post-calibration jet performance is consistent between these two different data periods and therefore a single set of uncertainties based on the 2015 + 2016 dataset is used for the $\eta$-intercalibration in all years, with only a small localized additional uncertainty added for 2018 as described in Section 5.2.1.
Certain common selection criteria are applied to all three in situ analyses. Each event must have a reconstructed vertex with at least two associated tracks of $p_{\text{T}}$ > 500 MeV. All jets must satisfy quality criteria to reject non-collision background, calorimeter noise, and cosmic rays [44]. Furthermore, each jet with 20 GeV < $p_{\text{T}}$ < 60 GeV and $|\eta_{\text{det}}| < 2.4$ must pass jet vertex tagging, or JVT, requirements with selection criteria that are specific to the jet definition [45]. These requirements match jets to the primary vertex and are 92% efficient for EMtopo jets and 97% efficient for PFlow jets.

Relative calibration measurement in $\eta$ using dijet events
The $\eta$-intercalibration analysis produces a correction which is applied to forward jets ($0.8 \le |\eta_{\text{det}}| < 4.5$) to bring them to the same energy scale as central jets ($|\eta_{\text{det}}| < 0.8$). Jets in the central region of the detector are taken to be well-calibrated, while jets in the forward regions vary in response and must be corrected accordingly. Events are selected with exactly two jets in different regions of the detector. To maximize statistics, neither jet need be in the central region: instead, all regions will be calibrated relative to one another.
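For these dijet events, momentum balance requires that the transverse momenta of the two jets be equal and opposite. Therefore, the momentum asymmetry of the two jets is a metric for the response difference between the two detector regions (left and right for simplicity):

$$\mathcal{A} = \frac{p_{\text{T}}^{\text{left}} - p_{\text{T}}^{\text{right}}}{p_{\text{T}}^{\text{avg}}}.$$

The response ratio $\mathcal{R}$ of the two jets defines the calibration factor $c$ for each jet and is then:

$$\mathcal{R} = \frac{c^{\text{right}}}{c^{\text{left}}} = \frac{2 + \langle\mathcal{A}\rangle}{2 - \langle\mathcal{A}\rangle} \approx \frac{\langle p_{\text{T}}^{\text{left}}\rangle}{\langle p_{\text{T}}^{\text{right}}\rangle}.$$

The average response ratio $\langle\mathcal{R}_{ijk}\rangle$ is measured in each bin $i$ of $\eta^{\text{left}}$, $j$ of $\eta^{\text{right}}$, and $k$ of $p_{\text{T}}^{\text{avg}}$; $\Delta\langle\mathcal{R}_{ijk}\rangle$ is the statistical uncertainty in each bin. All $\eta$ values are in detector coordinates rather than corrected jet coordinates ($\eta_{\text{det}}$) since the properties of interest correlate to specific regions of detector hardware. The following function relates the correction factors and responses in each of the $N$ bins:

$$S(c_{1k},\ldots,c_{Nk}) = \sum_{j=1}^{N}\sum_{i=1}^{j-1}\left(\frac{1}{\Delta\langle\mathcal{R}_{ijk}\rangle}\left(c_{ik}\langle\mathcal{R}_{ijk}\rangle - c_{jk}\right)\right)^{2} + X(c_{1k},\ldots,c_{Nk}).$$

Here, the function $X(c)$ quadratically imposes a penalty on correction factors deviating from 1.$^{4}$ Minimizing this function produces the correction factors to be used in the calibration.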
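As a hedged numerical illustration of the step from asymmetry to response ratio, the sketch below assumes the standard dijet-balance definitions, $\mathcal{A} = (p_{\text{T}}^{\text{left}} - p_{\text{T}}^{\text{right}})/p_{\text{T}}^{\text{avg}}$ and $\mathcal{R} = (2 + \langle\mathcal{A}\rangle)/(2 - \langle\mathcal{A}\rangle)$. The function name and inputs are hypothetical, and a plain mean stands in for whatever averaging the analysis uses.

```python
def response_ratio(pt_left, pt_right):
    """Response ratio of two detector regions from the mean dijet asymmetry."""
    # Per-event asymmetry, normalised to the average pT of the two jets.
    asym = [(l - r) / (0.5 * (l + r)) for l, r in zip(pt_left, pt_right)]
    mean_asym = sum(asym) / len(asym)
    return (2.0 + mean_asym) / (2.0 - mean_asym)

# Toy events in which the left jet is always measured 25% higher than the right.
ratio = response_ratio([100.0, 50.0, 200.0], [80.0, 40.0, 160.0])
```

For perfectly correlated toy events like these, the ratio reduces to the plain ratio of the two momenta, which is a useful sanity check on the sign convention.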
Previous iterations of the jet energy scale have used a fit in Minuit to minimize $S(c)$. The current calibration instead minimizes the function analytically. Suppressing the indices for clarity, setting the derivative of $S$ with respect to each correction factor $c$ equal to zero yields Eq. (2), which defines the correction factor values that minimize $S$. The Lagrange multiplier in Eq. (2) arises from the penalty term; its value has no effect on the minimization result, but it prevents the trivial solution where all the $c$ are null.
Equation (2) can then be expressed as a matrix system of linear equations. This matrix system is solved independently for each $p_{\text{T}}^{\text{avg}}$ bin to obtain values for the correction factors $c$ for each $\eta_{\text{det}}$ bin in this momentum range. Solving analytically for the $c$ in this way allows the result to be found approximately a thousand times more quickly than using a fit. This large reduction in computational requirements in turn allows the analysis to use a finer binning in $\eta_{\text{det}}$, capturing more details of the detector structure. The two methods agree well and each shows good closure when tested in simulation. Finally, the full set of correction factors is normalized such that the average correction factor in the central region $|\eta_{\text{det}}| < 0.8$ is unity.
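The analytic minimization can be sketched as a small linear solve. This is an illustration under stated assumptions, not the paper's Eq. (2): the least-squares term is taken as $\sum_{i<j} \left((c_i \mathcal{R}_{ij} - c_j)/\Delta\mathcal{R}_{ij}\right)^2$, and the penalty is replaced by an exact Lagrange constraint fixing the mean of all correction factors to one (the paper instead normalizes to the central region afterwards). All names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def intercalibrate(ratios):
    """ratios: {(i, j): (R_ij, dR_ij)} for bin pairs i < j; returns c_0..c_{N-1}."""
    n = 1 + max(max(pair) for pair in ratios)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    for (i, j), (r, dr) in ratios.items():
        w = 1.0 / dr ** 2
        # Gradient of w * (c_i * r - c_j)^2 with respect to c_i and c_j.
        a[i][i] += 2 * w * r * r
        a[j][j] += 2 * w
        a[i][j] -= 2 * w * r
        a[j][i] -= 2 * w * r
    for k in range(n):
        a[k][n] = -1.0  # Lagrange-multiplier column
        a[n][k] = 1.0   # constraint row: sum of c_k equals n
    b = [0.0] * n + [float(n)]
    return solve_linear(a, b)[:n]

# Closure test on toy inputs: ratios built from known correction factors.
true_c = [1.0, 1.1, 0.9]
toy = {(i, j): (true_c[j] / true_c[i], 0.01)
       for i in range(3) for j in range(i + 1, 3)}
solved_c = intercalibrate(toy)
```

Without the constraint row the system is homogeneous and admits the all-zero solution, which is the role the text assigns to the Lagrange multiplier.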
Events are selected using a combination of single-jet triggers, with each trigger only considered in the jet $p_{\text{T}}$ range for which it is at least 99% efficient [15,46]. Events may pass either a central jet trigger or a forward jet trigger, or both. In the case that a trigger is prescaled, the passing event is weighted by the appropriate amount. Jets with $|\eta_{\text{det}}| < 2.4$ are also required to satisfy JVT criteria to minimize contributions from pile-up and must pass basic cleaning requirements [38,44]. Each selected event must have two jets with $p_{\text{T}}$ > 25 GeV and $|\eta| < 4.5$. To ensure a clean dijet topology, events are further required to have no third jet with significant $p_{\text{T}}$: $p_{\text{T}}^{\text{third}}/p_{\text{T}}^{\text{avg}} < 0.25$, where $p_{\text{T}}^{\text{avg}}$ is the average momentum of the two leading jets. The two leading jets are required to be back-to-back in the azimuthal plane such that $\Delta\phi$ > 2.5 rad.

$^{4}$ This penalty function introduces the Lagrange multiplier visible in Eq. (2). The purpose of the penalty function is to ensure that the appropriate minimum is selected by suppressing local minima with large values of $c$, and as such its exact form is somewhat arbitrary.
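The kinematic part of this dijet selection can be written as a short predicate. The sketch below covers only the cuts quoted in the text (two leading jets with $p_{\text{T}}$ > 25 GeV and $|\eta| < 4.5$, third-jet veto at 0.25 of the average $p_{\text{T}}$, and $\Delta\phi$ > 2.5 rad); trigger, JVT, and cleaning requirements are omitted, and the tuple layout is an assumption made for illustration.

```python
import math

def passes_dijet_selection(jets):
    """jets: list of (pt_GeV, eta, phi) tuples in any order."""
    jets = sorted(jets, key=lambda j: j[0], reverse=True)
    if len(jets) < 2:
        return False
    (pt1, eta1, phi1), (pt2, eta2, phi2) = jets[0], jets[1]
    if min(pt1, pt2) <= 25.0 or max(abs(eta1), abs(eta2)) >= 4.5:
        return False
    pt_avg = 0.5 * (pt1 + pt2)
    # Veto events with a third jet carrying a significant fraction of pt_avg.
    if len(jets) > 2 and jets[2][0] / pt_avg >= 0.25:
        return False
    # Azimuthal separation wrapped into [0, pi].
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return dphi > 2.5
```

Wrapping the azimuthal difference matters for jets on either side of the $\pm\pi$ boundary, which would otherwise look back-to-back when they are nearly collinear.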
Like the other in situ analyses, the goal of the $\eta$-intercalibration is to correct for data-simulation differences, so the quantity of interest is the ratio of the measured calorimeter response in MC simulation to the response in data. The nominal calibration is derived by comparison with Powheg+Pythia 8 simulated events. The analysis binning in $p_{\text{T}}^{\text{avg}}$ and $\eta_{\text{det}}$ is selected to balance the requirements of both sufficient statistics in sparse regions and resolution of narrow detector features. As such, it varies for different values of $\eta_{\text{det}}$. Remaining statistical fluctuations in the final calibration are smoothed using a two-dimensional Gaussian kernel with parameters selected to preserve significant structures. The relative jet response in data and simulation is shown in Figure 9. The simulation can be seen to approximately reproduce the $\eta_{\text{det}}$-dependent features of the response observed in data, although the response in data is consistently higher than the response in simulation. The simulation/data response ratio as directly measured is shown in discrete points in the bottom panel, while the calibration derived from smoothing the response ratio is overlaid as the solid curve. The dashed curve shows the extrapolation to $p_{\text{T}}$ ranges beyond the available data, taken from the Gaussian smoothing results. Since the smoothing is stronger in the $p_{\text{T}}$ direction and weaker in $\eta_{\text{det}}$ to preserve detector features, this sets each extrapolated value to approximately the value of the last populated bin at lower $p_{\text{T}}$. Above $p_{\text{T}}$ = 2 TeV the value is kept constant.
Uncertainties are derived as a function of $\eta_{\text{det}}$ and $p_{\text{T}}$ and account for mis-modelling of physics, detector, and event topology effects on the momentum balance of dijet events. The dominant uncertainty is in MC mis-modelling and is taken to be the difference between the smoothed calibration curves derived from the Powheg+Pythia 8 and Sherpa dijet samples. Additional uncertainties in the physics and topology modelling are assessed by varying the $p_{\text{T}}^{\text{third}}$, $\Delta\phi$, and pile-up suppression cuts and using a bootstrapping method to ensure observed shapes are statistically significant as discussed in Section 5.2. Similarly, the JVT uncertainty is determined by comparison with tighter and looser working points. These uncertainties can take positive or negative values. The statistical uncertainty is strictly positive and is taken from the data and MC simulation sample sizes. Finally, a non-closure uncertainty is assessed by comparing the response in data with that in Powheg+Pythia 8 after applying the derived $\eta$-intercalibration. This uncertainty is largest for $|\eta_{\text{det}}| \sim 2.1$-2.6, where detector transitions make modelling of the LAr pulse shape particularly difficult [34], and for jets near the kinematic limit, where they have the maximum possible $p_{\text{T}}$ for a given $\eta_{\text{det}}$ subject to the constraint of a 13 TeV centre-of-mass energy. The non-closure uncertainty is treated as three independent nuisance parameters, two covering the regions around $\pm 2.4$ in $\eta_{\text{det}}$ and one at the kinematic limit, since these two non-closure uncertainties are uncorrelated.
After each is corrected with its dedicated calibration, the 2015+2016 and 2017 datasets are in good agreement, and therefore a single set of uncertainties is sufficient to cover both cases. The uncertainties calculated with the 2015+2016 dataset are selected for this role. The only dataset-dependent uncertainty is an additional small non-closure uncertainty used for 2018 data only. It covers the region around $\eta = \pm 1.5$ to account for the difference in Tile calorimeter calibration during this year of data-taking and has a maximum size of 2%.
The method uncertainties are shown in Figure 10. Three illustrative $p_{\text{T}}$ values are selected. The uncertainties decrease slightly as a function of $p_{\text{T}}$ and increase significantly as a function of $\eta_{\text{det}}$ outside of the central detector region, while in the central region they are zero by construction. For practical use the various systematic uncertainty terms are summed in quadrature to produce one single systematic uncertainty dominated by the modelling term. In the cases where up and down variations differ, the larger absolute value of the two is used at each point. The total systematic uncertainty and the statistical uncertainty are both symmetrized in $\eta_{\text{det}}$. The non-closure uncertainties, not included in Figure 10 as they are not method uncertainties, are instead shown in Figure 22 where it can be seen that they are kept asymmetric to reflect real differences in the detector. The calibrations are similar in size and shape between PFlow and EMtopo jets. Systematic uncertainties are also similar in size and shape since the dominant MC modelling component does not differ meaningfully between the two jet collections.

Figure 9: Relative response of jets calibrated with PFlow+JES in data (black circles) and Powheg+Pythia 8 MC simulation (red squares). Response is shown as a function of $\eta_{\text{det}}$ for jets of (a) 40 GeV < $p_{\text{T}}^{\text{jet}}$ < 60 GeV, (b) 85 GeV < $p_{\text{T}}^{\text{jet}}$ < 115 GeV, and (c) 270 GeV < $p_{\text{T}}^{\text{jet}}$ < 330 GeV, and as a function of $p_{\text{T}}$ for jets of (d) $1.2 < \eta_{\text{det}} < 1.4$, (e) $2.6 < \eta_{\text{det}} < 2.8$, and (f) $3.0 < \eta_{\text{det}} < 3.2$. The lower panel shows the response ratio of simulation to data (red squares) as well as the smoothed in situ calibration factor derived from the ratio (solid curve) which is used to perform the $\eta$-intercalibration. Dotted lines show the extrapolation of the in situ calibration to the regions without data points. The dashed red and blue horizontal lines provide reference points for the viewer.
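The combination rule described above (take the larger of the up and down variations for each term, then sum in quadrature) is simple enough to state as code. The sketch below is illustrative only; the function name and the toy component values are hypothetical.

```python
import math

def total_systematic(components):
    """components: list of (up, down) variations evaluated at one point."""
    # Each term contributes the larger absolute value of its two variations.
    return math.sqrt(sum(max(abs(up), abs(down)) ** 2 for up, down in components))

# Toy example with two terms: a 3%/2% asymmetric term and a one-sided 4% term.
combined = total_systematic([(0.03, -0.02), (0.0, -0.04)])
```

Taking the larger variation is a conservative choice, so the combined value is never smaller than what either the up or the down variations alone would give.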

Calibration measurement using $Z$+jet and $\gamma$+jet events
The next stage of the in situ calibration corrects for the differences between data and MC simulation using the momentum balance between the measured hadronic activity in the event and the $p_{\text{T}}$ of a well-calibrated photon or $Z$ boson. Only the central region of the detector ($|\eta| < 0.8$) is used for this analysis: the $\eta$-intercalibration ensures that a correction derived centrally translates directly to forward jets as well.
The $Z/\gamma$+jet analyses rely on the energy scale of the photon or the electrons and muons from the $Z$ decay being well measured. All three objects are cleanly measured in the ATLAS detector and the uncertainties in their energy scales are small [47,48]. The response is calculated separately in $Z \to e^{+}e^{-}$ and $Z \to \mu^{+}\mu^{-}$ events since the sources of uncertainties propagated from $e$ and $\mu$ calibration are independent, and the three channels are combined at a later stage. The $Z$+jet response measurement is limited at moderate to high $p_{\text{T}}$ by low statistics and thus covers a range in jet $p_{\text{T}}$ from 17 GeV to 1 TeV with large uncertainties in the final bin. The $\gamma$+jet response measurement benefits from much higher statistics and extends to 1.2 TeV with little loss in sensitivity. However, it is limited at low jet $p_{\text{T}}$ by both the trigger prescales and the prevalence of soft jets misidentified as photons and so begins at 25 GeV.
The missing-$E_{\text{T}}$ projection fraction technique is used for both of the $Z/\gamma$+jet analyses and balances the reference object $p_{\text{T}}$ against the full hadronic recoil in an event. By doing so, it is possible to compute the calorimeter response to hadronic showers directly. This approach is robust to both pile-up and the underlying event, which each cancel out directionally on average over a large collection of events, and is not strongly affected by jet definitions since these become relevant only in the application of the calibration. The showering and topology effects in moving from a recoil-level quantity to a jet-level quantity are studied and found to be small, as discussed below. Taking $\vec{p}_{\text{T}}^{\,\text{recoil}}$ as the total transverse momentum of the hadronic activity in a clean $Z/\gamma$+jet event and $\vec{p}_{\text{T}}^{\,\text{ref}}$ as the transverse momentum of the photon or $Z$ boson, conservation of transverse momentum means that at the particle level:

$$\vec{p}_{\text{T}}^{\,\text{ref}} + \vec{p}_{\text{T}}^{\,\text{recoil}} = \vec{0}. \tag{3}$$

This balance could be altered by the presence of initial- or final-state radiation.$^{5}$ To suppress the effects of such additional radiation, a cut is placed on the azimuthal angle $\Delta\phi$ between the jet and the reconstructed photon or $Z$ boson in the event and an uncertainty due to the topology is evaluated by varying the event selection requirements. If the calorimeter response to the hadronic activity in this event is $r_{\text{MPF}}$ and the response for the calibrated reference object is 1, and assuming any missing energy in the event is due to the low response to the hadronic recoil ($r_{\text{MPF}} < 1$), then at the detector level Eq. (3) becomes:

$$\vec{p}_{\text{T}}^{\,\text{ref}} + r_{\text{MPF}}\,\vec{p}_{\text{T}}^{\,\text{recoil}} = -\vec{E}_{\text{T}}^{\text{miss}}.$$

After taking the projection of each term in the direction of the reference object, defined by a unit vector $\hat{n}_{\text{ref}}$, the response to the hadronic recoil is then seen to depend only on the missing energy in the event and the momentum of the reference object. The MPF response $\mathcal{R}_{\text{MPF}}$ is defined by measuring the average of $r_{\text{MPF}}$ across events binned in the reference object $p_{\text{T}}$. Thus,

$$\mathcal{R}_{\text{MPF}} = \left\langle 1 + \frac{\hat{n}_{\text{ref}} \cdot \vec{E}_{\text{T}}^{\text{miss}}}{p_{\text{T}}^{\text{ref}}} \right\rangle.$$

The peak location of the $r_{\text{MPF}}$ distribution is taken to be the average response in that bin, and the response is mapped from reference $p_{\text{T}}$ to jet $p_{\text{T}}$ by finding the average jet $p_{\text{T}}$ in the events in each bin after $\eta$-intercalibration but before the application of any other in situ steps.
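A minimal sketch of the MPF observable follows, assuming the standard per-event definition $r_{\text{MPF}} = 1 + \hat{n}_{\text{ref}} \cdot \vec{E}_{\text{T}}^{\text{miss}} / p_{\text{T}}^{\text{ref}}$. For brevity it takes an arithmetic mean within one reference-$p_{\text{T}}$ bin instead of the Gaussian peak used in the analysis; the function name and event layout are hypothetical.

```python
import math

def mpf_response(events):
    """events: list of (ref_px, ref_py, met_x, met_y) in GeV for one ref-pT bin."""
    values = []
    for ref_px, ref_py, met_x, met_y in events:
        ref_pt = math.hypot(ref_px, ref_py)
        # Projection of the missing transverse momentum on the reference direction.
        projection = (ref_px * met_x + ref_py * met_y) / ref_pt
        values.append(1.0 + projection / ref_pt)
    return sum(values) / len(values)

# Toy event: a 100 GeV reference object along +x and a recoil measured at 80 GeV
# along -x, so the missing transverse momentum is 20 GeV along -x.
toy_response = mpf_response([(100.0, 0.0, -20.0, 0.0)])
```

In the toy event the recoil was measured at 80% of its true momentum, and the estimator recovers that value from the reference object and the missing momentum alone, without using any jet.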
Missing energy in each event is reconstructed from calorimeter topo-clusters in the case of EMtopo jet calibration and from particle-flow objects in the case of PFlow jet calibration, ensuring that the energy scale is consistent. The $Z \to ee$ events are required to pass a dielectron trigger with $p_{\text{T}}^{e_1, e_2}$ > 15 GeV; $Z \to \mu\mu$ events must pass a similar dimuon trigger with $p_{\text{T}}^{\mu_1, \mu_2}$ > 14 GeV [49,50]. Electrons entering the analysis must have $p_{\text{T}}$ > 20 GeV, ensuring that the trigger is fully efficient, must be contained within the tracker such that $|\eta| < 2.47$, and must not fall in the calorimeter transition region ($1.37 < |\eta| < 1.52$). Muons entering the analysis are required to have $p_{\text{T}}$ > 20 GeV and to fall within $|\eta| < 2.4$. Both electron and muon candidates must also pass loose identification and isolation requirements [47,48]. All $Z$+jet events are selected such that the reconstructed mass calculated from the electron or muon pair must be close to the $Z$ boson mass: 66 GeV < $m_{ee/\mu\mu}$ < 116 GeV. A combination of single-photon triggers is used for the $\gamma$+jet analysis, with the lowest trigger threshold corresponding to $p_{\text{T}}$ > 15 GeV. Offline photons must have $p_{\text{T}}$ > 25 GeV and $|\eta| < 1.37$ and must satisfy tight identification and isolation criteria [47]. Both the $Z$+jet and $\gamma$+jet analyses have further selection requirements on the jets and event topology to suppress pile-up and initial- and final-state radiation. All jets within $\Delta R$ = 0.2 of a photon or $\Delta R$ = 0.35 of a lepton are removed. Jets must satisfy basic cleaning requirements and pass the JVT selection to suppress pile-up. Selected events must have one jet with $p_{\text{T}}$ > 10 GeV and $|\eta| < 0.8$. Additional event activity is suppressed by requiring that any subleading jet must have $p_{\text{T}}$ < max(0.3 × $p_{\text{T}}^{\text{ref}}$, 12) GeV and that the leading jet and reference object must be relatively back-to-back with $\Delta\phi(\text{ref, jet})$ > 2.9. The relatively loose $p_{\text{T}}$ cut on subleading jets is shown to be acceptable for the MPF analysis due to its intrinsic robustness to pile-up effects.
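The event-topology requirements at the end of this selection can be sketched as a small predicate. Only the jet and topology cuts quoted in the text are encoded (leading jet $p_{\text{T}}$ > 10 GeV and $|\eta| < 0.8$, subleading-jet veto at max(0.3 × reference $p_{\text{T}}$, 12 GeV), azimuthal separation above 2.9); lepton, photon, trigger, and JVT requirements are left out, and the argument names are hypothetical.

```python
def passes_topology(ref_pt, lead_jet_pt, lead_jet_abs_eta, sublead_jet_pt, dphi_ref_jet):
    """Jet and topology cuts for a Z/gamma+jet candidate; momenta in GeV."""
    if lead_jet_pt <= 10.0 or lead_jet_abs_eta >= 0.8:
        return False
    # The subleading-jet veto scales with the reference pT but never drops below 12 GeV.
    veto_threshold = max(0.3 * ref_pt, 12.0)
    if sublead_jet_pt is not None and sublead_jet_pt >= veto_threshold:
        return False
    return dphi_ref_jet > 2.9
```

The 12 GeV floor keeps the veto from becoming unreasonably tight for low reference momenta, where 0.3 × the reference $p_{\text{T}}$ would fall below the level of typical pile-up jets.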
Figures 11 and 12 show the MPF response calculated in $Z$+jet and $\gamma$+jet events for data and for two MC samples using different generators. The lower panels show the MC simulation to data ratio for both generators. The results using Powheg+Pythia 8 ($Z$+jet) and Pythia 8 ($\gamma$+jet) constitute the nominal calibration while Sherpa is used to define an uncertainty due to the generator choice. In the lowest $p_{\text{T}}$ bin of the $\gamma$+jet measurement, the discrepancy between the MC predictions is caused by a generator-level cut at 35 GeV present in the Sherpa sample. This point is included in the final in situ combination, but due to its large generator uncertainty it contributes very little to the overall weighted-average-based result (see Section 5.2.5 and Figure 19(a)) and the total effect is negligible. The $\gamma$+jet generator uncertainty at this point has therefore been set to its value in the second-lowest bin for display purposes in Figure 14 to better reflect its actual contribution to the total systematic uncertainties. The apparent dip near the lowest $p_{\text{T}}$ range of each measurement is due to the interplay of two factors: an asymmetry in the $\mathcal{R}_{\text{MPF}}$ distribution near the low $p_{\text{T}}$ reconstruction threshold which causes the measured response to increase for the lowest $p_{\text{T}}$ values, and the natural increase in response with higher jet $p_{\text{T}}$. One motivation for the use of the MPF technique is increased resilience to this threshold effect.
Two small correction factors are derived in simulation and use the true calorimeter response, defined as the ratio of measured energy in the calorimeter deposited by particles belonging to a particle-level jet to the total energy of the particle-level jet. The topology correction accounts for the differences in calorimeter response for sparse energy depositions versus those in the dense cores of jets, and is found by taking the average of the ratio of $\mathcal{R}_{\text{MPF}}$ to the true calorimeter response in each $p_{\text{T}}$ bin. The showering correction accounts for the flow of particles entering or exiting across the boundaries of the jet definition and is calculated from the ratio of the true calorimeter response to the measured response of the reconstructed jet, therefore varying with the jet algorithm and size. The total correction factor is the product of the two and is found to be less than 2% for jets of $p_{\text{T}}$ < 50 GeV and negligible above that. This correction factor would in principle be applied identically to $\mathcal{R}_{\text{MPF}}$ in both data and simulation to better estimate jet response, but since the ratio of $\mathcal{R}_{\text{MPF}}$ in data and simulation is the quantity of interest for the in situ calibration, the correction would cancel out in the ratio and only the uncertainty in its derivation is relevant. This uncertainty is taken from a comparison of two different physics lists (FTFP BERT [29] and QGSP BIC [51]) in the simulation of the particle/detector interactions and is found to be ∼ 2% for jets with $p_{\text{T}}$ < 20 GeV, ∼ 0.5% for jets with 20 GeV < $p_{\text{T}}$ < 40 GeV and zero for jets with $p_{\text{T}}$ > 40 GeV.
The full set of uncertainties is shown for the $Z \to ee$ + jet and $Z \to \mu\mu$ + jet analyses in Figure 13 and for the $\gamma$+jet analysis in Figure 14. The dominant systematic uncertainties are due to generator differences at lower $p_{\text{T}}$ and to the photon energy scale at higher $p_{\text{T}}$. Uncertainties in the $e$, $\mu$, and $\gamma$ energy scales and resolutions are taken from the calibrations provided for each physics object and are propagated through the analysis [47, 48]. The $\Delta\phi$ and second-jet veto uncertainties are estimated by varying the cuts and comparing the resulting response measurements. As in the $\eta$-intercalibration, the JVT uncertainty is determined by comparison with tighter and looser working points. A photon purity uncertainty is estimated for the $\gamma$+jet analysis using control regions dominated by dijet events where one of the jets can be misidentified as a photon. The uncertainty on the final state modelling is taken, as discussed, from the generator comparison. Limited data and MC statistics contribute to the statistical uncertainty, which is largest for the lowest and highest bins of each analysis. A bootstrapping procedure is applied to the uncertainties to suppress statistical fluctuations as previously described. Similar analyses in the $Z/\gamma$+jet final states but explicitly balancing the reference $p_{\text{T}}$ against the $p_{\text{T}}$ of a reconstructed jet (direct balance) are used to cross-check the jet energy scale calibration. The JES results computed using direct balance showed good agreement with those derived via MPF.
The innate difference in response between EMtopo and PFlow jets can be seen by comparing their measured MPF responses. Since the MPF method uses topo-clusters and PFlow objects in computing the missing energy, the measured responses are independent of the MCJES calibration and reflect the precalibration response for each jet input type. The MPF responses measured in the +jet analysis for EMtopo and PFlow jets are shown in Figure 15. The shape of the EMtopo measurement follows the form of Groom's function, which corresponds to the response expected from a hadronic calorimeter [52]. The PFlow measurement does not follow the same shape but instead shows an improvement over the baseline calorimeter response at low $p_{\text{T}}$ thanks to the inclusion of information from tracks.

High-$p_{\text{T}}$ jet calibration using multijet balance
The final stage of in situ calibration derives a correction for jets with $p_{\text{T}}$ above the range of the $Z/\gamma$+jet analyses using the multijet balance (MJB) technique. Events are selected with a single high-$p_{\text{T}}$ jet balanced against a system of lower-$p_{\text{T}}$ jets (the recoil system). The jets of the recoil system are selected to ensure they are well calibrated using a combination of the $Z/\gamma$+jet results (Section 5.2.2), while the leading jet is left at the scale of the intercalibration. The response of the system is defined as:

$$\mathcal{R}_{\text{MJB}} = \left\langle \frac{p_{\text{T}}^{\text{lead}}}{p_{\text{T}}^{\text{ref}}} \right\rangle ,$$

where $p_{\text{T}}^{\text{ref}}$ is taken from the vector sum of all jets in the recoil system. In a procedure parallel to that used for the $Z/\gamma$+jet analyses, the response is measured in bins of $p_{\text{T}}^{\text{ref}}$ and the correction is then mapped to the uncalibrated leading jet by finding the average $p_{\text{T}}^{\text{lead}}$ of the events in each bin. Since the MJB analysis can only include events where all jets of the recoil system are already well calibrated, events with very high $p_{\text{T}}^{\text{lead}}$ are often excluded, as their second and third leading jets can have momenta outside the range of calibration by the $Z/\gamma$+jet analyses. To address this, MJB proceeds via two iterations. In the first iteration, a combination of the $Z/\gamma$+jet results is used to calibrate the recoil system, so only events with subleading jets of $p_{\text{T}} < 1.2$ TeV are included. In the second iteration, events with subleading jets up to $p_{\text{T}} = 1.8$ TeV are included and calibrated using the MJB results from the first iteration. This extends the range of the calibration to $p_{\text{T}}^{\text{lead}} = 2.4$ TeV.

Events are selected for the MJB analysis using a variety of single-jet triggers, each corresponding to a unique range of $p_{\text{T}}^{\text{lead}}$. To suppress dijet topologies and ensure that only true multijet events are used, events must have at least three jets with $p_{\text{T}} > 25$ GeV and $|\eta| < 2.8$, and the subleading jet must not have a momentum above $0.8\,p_{\text{T}}^{\text{lead}}$. Jets are as usual required to pass JVT selections, limiting the effects of pile-up.
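The response definition and the dijet-suppression requirements above can be sketched in a few lines. This is a minimal illustration, not the analysis code: the function names, the `(pt, phi)` jet representation, and the example events are hypothetical, and the $|\eta|$, JVT, trigger and isolation requirements are omitted.

```python
import math

def recoil_pt(recoil_jets):
    """Magnitude of the vector sum of the recoil jets' transverse momenta.

    Each jet is a (pt, phi) pair; pt in GeV, phi in radians.
    """
    px = sum(pt * math.cos(phi) for pt, phi in recoil_jets)
    py = sum(pt * math.sin(phi) for pt, phi in recoil_jets)
    return math.hypot(px, py)

def passes_multijet_selection(jets, min_pt=25.0, max_subleading_fraction=0.8):
    """At least three jets above min_pt, subleading pt <= 0.8 * leading pt.

    `jets` must be sorted by descending pt.
    """
    selected = [jet for jet in jets if jet[0] > min_pt]
    if len(selected) < 3:
        return False
    return selected[1][0] <= max_subleading_fraction * selected[0][0]

def mjb_response(events):
    """Average of pt(lead) / pt(recoil) over the selected events."""
    ratios = []
    for jets in events:
        jets = sorted(jets, key=lambda jet: jet[0], reverse=True)
        if not passes_multijet_selection(jets):
            continue
        lead, recoil = jets[0], jets[1:]
        ratios.append(lead[0] / recoil_pt(recoil))
    return sum(ratios) / len(ratios) if ratios else None

# Hypothetical events: one balanced multijet event, one dijet-like event.
multijet_event = [(1000.0, 0.0), (600.0, math.pi), (400.0, math.pi)]
dijet_like_event = [(1000.0, 0.0), (950.0, math.pi), (30.0, 1.0)]
print(mjb_response([multijet_event, dijet_like_event]))
```

In the measurement itself the average is taken separately in each bin of $p_{\text{T}}^{\text{ref}}$; the sketch averages over all selected events for brevity.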
Isolation of the leading jet from contamination by the recoil system is ensured by requiring that the azimuthal angle $\Delta\phi$ between the leading jet and the direction of the recoil system is at least 0.3 radians and that the $\Delta\phi$ between the leading jet and any individual jet in the recoil system with $p_{\text{T}} > 0.05\,p_{\text{T}}^{\text{lead}}$ is at least 1.0 radians.

The MJB response in data and in four MC samples with different generators is shown in Figure 16(a). In both data and MC simulation, the response decreases at lower $p_{\text{T}}$ due to the intrinsic bias in $\mathcal{R}_{\text{MJB}}$ from the combined effects of the leading-jet isolation and $p_{\text{T}}$ asymmetry requirements. This bias is greater for lower-$p_{\text{T}}$ leading jets, but is well modelled in simulation, leaving the calibration unbiased. The lower panel shows the ratio of the response of each MC sample to data. Here, the ratio of the Sherpa sample to data defines the nominal correction while the ratio based on Pythia defines an uncertainty on the generator choice. This response ratio is constant and approximately 2% for jets above 1 TeV; below this point the calculated correction is slightly smaller.
All uncertainties in the MJB analysis are shown in Figure 16(b). The dominant term at low $p_{\text{T}}^{\text{lead}}$ is the uncertainty from jet flavour, derived in simulation and reflecting the difference in jet response for quark-initiated and gluon-initiated jets. Two terms contribute, one reflecting the uncertainty in the fraction of gluon-initiated jets in the sample, the other based on the difference in MC simulation-derived gluon response between generators. Other independently derived uncertainties correspond to pile-up and punch-through effects and are propagated through the MJB analysis via the recoil system. The $Z$+jet, $\gamma$+jet, and intercalibration uncertainties are propagated from the previous stages of in situ analysis. Event selection uncertainties are determined by varying each of the analysis cuts and determining the effects on the measured response ratio. Finally, the MC generator uncertainty is derived as described above by comparing the response ratio of Sherpa with that of Pythia as an alternative. Results using Herwig and Powheg+Pythia 8 are shown for reference but are not included in the uncertainty definition as they are less reliable for this measurement. All uncertainties are smoothed via the bootstrapping procedure to ensure statistical significance, and the total uncertainty is found to be below 1.5% for all considered values of $p_{\text{T}}^{\text{lead}}$. The MC generator uncertainty, which is defined in a one-sided fashion from the response ratios, is symmetrised by the in situ combination process along with the other uncertainties. However, its full one-sided size is shown in Figure 16(b).

Figure 16: (a) The MJB response in data and MC simulation and (b) the uncertainties in the MJB analysis. The difference between the nominal response ratio and the Pythia result defines the 'MC generator' uncertainty in (b). This uncertainty is defined in a single-sided way by the measured response difference and therefore it is not symmetrised for display in (b) but instead its full one-sided value is shown. Other uncertainties come from the event selection and MC simulation/data statistics or are propagated from the $Z$+jet, $\gamma$+jet, flavour, pile-up, intercalibration, and punch-through studies.

For EMtopo jets the intrinsic bias at low $p_{\text{T}}$ is slightly smaller and more closely tracked by simulation, leading in turn to slightly reduced systematic uncertainties for jets below $p_{\text{T}} \sim 700$ GeV. For $p_{\text{T}} > 1$ TeV, in situ uncertainties propagated from lower-$p_{\text{T}}$ jets have a greater impact, and therefore the uncertainty is smaller for PFlow jets than for EMtopo jets.

Pile-up and the in situ analyses
One of the primary changes in LHC run conditions over the course of Run 2 was an increase in pile-up. The average number of interactions per crossing ($\langle\mu\rangle$) during 2015+2016 data taking was 23.7, which increased to 37.8 in 2017. The data taken during 2018, to which the calibrations in this paper are also applied, have an average of 36.1 interactions per crossing [16]. The consistency of the calibrations for events with different pile-up conditions is therefore an important feature of the methods.

The in situ JES measurements can be used to calculate the dependence of the measured median $p_{\text{T}}$ density $\rho$ on the event topology in simulation and data and to derive an uncertainty, as mentioned in Section 5.1.1. The density $\rho$ is computed as a function of $\mu$ for each of the $Z$+jet, $\gamma$+jet, and dijet topologies as shown in the top panels of Figure 18. Taking $\rho_{2017}$ as the value of $\rho$ for the average pile-up conditions during 2017 data taking and t1 and t2 as any two in situ measurement topologies out of $Z$+jet, $\gamma$+jet, and dijet, the following metric of consistency can be defined:

$$\Delta^{\text{t1,t2}} = \left(\rho^{\text{t1}}_{2017} - \rho^{\text{t2}}_{2017}\right)_{\text{data}} - \left(\rho^{\text{t1}}_{2017} - \rho^{\text{t2}}_{2017}\right)_{\text{MC}} .$$

The quantity max(|Δ|) is then the largest value of Δ across the various topology comparisons. The total topology systematic uncertainty is given by

$$\max(|\Delta|) \times \pi R^{2} \times C^{\text{JES}} ,$$

where $C^{\text{JES}}$ is the size of the MCJES correction for a jet with the relevant $p_{\text{T}}$. The second panels in Figure 18 show $\rho^{\text{t1}}_{2017} - \rho^{\text{t2}}_{2017}$ for the comparisons ($Z$+jet, dijet) and ($\gamma$+jet, dijet) in both MC simulation and data. The lower panels show the difference of these two quantities between data and MC simulation, that is, $\Delta^{Z\text{+jet, dijet}}$ and $\Delta^{\gamma\text{+jet, dijet}}$. The input to the systematic uncertainty max(|Δ|) is the most discrepant of the two lines in the lower panel evaluated at $\langle\mu\rangle = 37.8$, the value in 2017 data. As Figure 18 illustrates, this uncertainty is larger for PFlow jets than for EMtopo jets. This is understood to be due to a greater sensitivity to the underlying event when tracking information is included, which leads to greater differences among the simulated samples.
Figure 18: Inputs to the $\rho$ topology uncertainty derived in the $Z$+jet, $\gamma$+jet, and dijet in situ analyses. The error bars show the statistical uncertainties. The top panels relate the $p_{\text{T}}$ density $\rho$ to the mean number of interactions per bunch crossing in data and MC simulation for the three input analyses. The second panels show the difference between the $Z$+jet and dijet and between the $\gamma$+jet and dijet measurements. The lowermost panels show the difference between the data and MC simulation lines in the second panels: this defines the size of the $\rho$ topology uncertainty. The two plots show (a) EMtopo and (b) PFlow jets, illustrating why this uncertainty is larger for PFlow jets than for EMtopo jets.
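The consistency metric for the density $\rho$ can be illustrated with a short sketch. It assumes that Δ is the data-minus-MC difference of the density difference between two topologies, which is what the lower panels of Figure 18 are described as showing; the sign convention, the dictionary layout, and all numerical values are hypothetical and chosen only to make the arithmetic visible.

```python
def delta(rho_data, rho_mc, t1, t2):
    """Data-minus-MC difference of (rho[t1] - rho[t2]).

    rho_data and rho_mc map a topology name to the density (GeV per unit
    area) evaluated at the average 2017 pile-up conditions.
    """
    return (rho_data[t1] - rho_data[t2]) - (rho_mc[t1] - rho_mc[t2])

def max_abs_delta(rho_data, rho_mc, pairs):
    """Largest |Delta| over the requested topology comparisons."""
    return max(abs(delta(rho_data, rho_mc, t1, t2)) for t1, t2 in pairs)

# Hypothetical densities, not taken from the measurement.
rho_data = {"Z+jet": 20.0, "gamma+jet": 19.6, "dijet": 19.0}
rho_mc = {"Z+jet": 19.5, "gamma+jet": 19.4, "dijet": 18.9}
pairs = [("Z+jet", "dijet"), ("gamma+jet", "dijet")]

print(max_abs_delta(rho_data, rho_mc, pairs))
```

Only the magnitude of Δ enters the uncertainty, so the choice of data-minus-MC over MC-minus-data does not affect the result.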

In situ combination
The data/MC simulation response ratios from the four different 'absolute' in situ measurements, ($Z \to ee$)+jet, ($Z \to \mu\mu$)+jet, $\gamma$+jet, and the multijet balance, must be combined to produce a single calibration covering the full range of jet $p_{\text{T}}$ from 17 GeV to 2.4 TeV. The four measurements overlap one another in various $p_{\text{T}}$ ranges, so this procedure must account for their relative statistical power as well as the tension between different response ratio measurements in the same $p_{\text{T}}$ range. The ($Z \to ee$)+jet and ($Z \to \mu\mu$)+jet channels, though compatible within uncertainties, are treated as separate measurements for the sake of the combination since they are affected by different systematic uncertainties.
The combination procedure is briefly summarized here; for a detailed description see Ref. [5]. Each of the absolute in situ measurements is converted from a parameterisation in terms of reference object $p_{\text{T}}$ into jet $p_{\text{T}}$ and divided into finer bins of 1 GeV using second-order polynomial splines. A $\chi^2$ minimisation is performed in each bin, taking as inputs the measurements available in that $p_{\text{T}}$ range and their uncertainties. This minimisation functions as a weighted average, with the weight given to each input measurement decreasing as its uncertainty grows. In this way, the measurement with the smallest statistical and systematic uncertainties dominates the estimate of the response ratio in that bin.
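The statement that the $\chi^2$ minimisation acts as a weighted average can be made concrete for the simplest case. The sketch below assumes uncorrelated inputs in a single bin, for which the $\chi^2$ minimum is the inverse-variance weighted mean; the actual combination also handles correlated uncertainties, and the function name and input values here are hypothetical.

```python
def combine(measurements):
    """Inverse-variance weighted average of (value, uncertainty) pairs.

    Returns (mean, uncertainty, weights, chi2). For uncorrelated inputs the
    mean minimises chi2 = sum(((value - mean) / uncertainty) ** 2).
    """
    inverse_variances = [1.0 / sigma ** 2 for _, sigma in measurements]
    total = sum(inverse_variances)
    weights = [w / total for w in inverse_variances]
    mean = sum(w * value for w, (value, _) in zip(weights, measurements))
    uncertainty = total ** -0.5
    chi2 = sum(((value - mean) / sigma) ** 2 for value, sigma in measurements)
    return mean, uncertainty, weights, chi2

# Two hypothetical response-ratio measurements in one pT bin.
mean, uncertainty, weights, chi2 = combine([(0.98, 0.01), (0.96, 0.02)])
print(mean, uncertainty, weights, chi2)
```

The more precise input receives four times the weight of the other, since its uncertainty is half as large.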
The weights of each input measurement in this combination are shown in Figure 19(a) as a function of jet $p_{\text{T}}$. The $Z$+jet measurements dominate for jet $p_{\text{T}}$ below ∼500 GeV, where the statistical uncertainties on these measurements grow dramatically; the ($Z \to \mu\mu$)+jet measurement is the more powerful of the two in the upper half of this range due to the size of the electron scale and resolution uncertainties affecting the ($Z \to ee$)+jet channel. The combination is then dominated by $\gamma$+jet until jet $p_{\text{T}}$ above 1 TeV, where the lower statistics in this channel and the decreased flavour uncertainties in the multijet balance analysis allow the latter to dominate. The final calibration curve is determined by smoothing the outputs from the minimization with a Gaussian kernel.

Figure 19: (a) The weight assigned to different techniques in the combination of in situ measurements of the relative $p_{\text{T}}$ response of anti-$k_t$ $R = 0.4$ particle-flow jets in data and simulation, as a function of the jet $p_{\text{T}}$. For each $p_{\text{T}}$ bin, the weights of the $Z$+jet, $\gamma$+jet, and multijet balance methods are shown. (b) The $\chi^2/\text{dof}$ metric, illustrating the compatibility of the in situ measurements being combined, as a function of jet $p_{\text{T}}$. In the low-$p_{\text{T}}$ range, the combination is between three measurements (($Z \to ee$)+jet, ($Z \to \mu\mu$)+jet, and $\gamma$+jet) of which the two $Z$+jet measurements have several correlated uncertainties, resulting in increased tension compared to previous calibrations.
The $\sqrt{\chi^2/\text{dof}}$ across the measurements, before any scaling is applied, is shown in Figure 19(b). This metric shows the degree of tension between the input measurements at each point: when they agree well within uncertainties the value is below 1, while when they differ by more than their uncertainties it is above 1. Following PDG guidelines, in bins where the tension between the input measurements, quantified by $\sqrt{\chi^2/\text{dof}}$, is found to be greater than 1, the uncertainties in the measurements in that bin are scaled by the same tension factor to ensure that the overall level of agreement between methods is acceptable within uncertainties for all $p_{\text{T}}$ values [53]. However, since the tensions visible at low $p_{\text{T}}$ are primarily between the two $Z$+jet measurements, and since the MC generator and showering and topology uncertainties are fully correlated between the two channels and therefore cannot contribute to this tension, these two components are excluded from the scaling procedure. The components which are not scaled are the dominant uncertainties.

Figure 20 shows the final in situ combination as a function of jet $p_{\text{T}}$. To complete the calibration, the inverse of the curve ($\mathcal{R}_{\text{MC}}/\mathcal{R}_{\text{data}}$) is taken as the scaling factor and applied to data. The combined measurement (solid line) for PFlow+JES jets is compared with each of the four absolute in situ analyses (empty shapes) in Figure 20(a). The total size of the correction is approximately 3% at low $p_{\text{T}}$ and decreases to around 2% for jets above 200 GeV. A comparison between the results for EM+JES and PFlow+JES jets is shown in Figure 20(b), where the overall size of both the correction and its uncertainty is seen to be slightly larger for EM+JES jets.

Figure 20: (a) Ratio of the PFlow+JES jet response in data to that in the nominal MC event generators as a function of jet $p_{\text{T}}$ for $Z$+jet, $\gamma$+jet, and multijet in situ calibrations. The inner horizontal ticks in the error bars give the size of the statistical uncertainty while the outer horizontal ticks indicate the total uncertainty (statistical and systematic uncertainties added in quadrature). The final correction and its statistical and total uncertainty bands are also shown, although the statistical uncertainty is too small to be visible in most regions. (b) A comparison of the combined correction and its uncertainty for PFlow+JES and EM+JES jets.
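The tension-based rescaling of uncertainties can be sketched as follows. This is a simplified illustration assuming uncorrelated inputs in one bin and a scale factor of $\sqrt{\chi^2/\text{dof}}$ applied only when it exceeds 1; the exclusion of the fully correlated components from the scaling is not modelled, and the function name and numbers are hypothetical.

```python
import math

def scaled_uncertainties(measurements):
    """Scale uncertainties by sqrt(chi2/dof) when that factor exceeds 1.

    `measurements` is a list of (value, uncertainty) pairs for one bin.
    Returns (scale_factor, list_of_scaled_uncertainties).
    """
    inverse_variances = [1.0 / sigma ** 2 for _, sigma in measurements]
    total = sum(inverse_variances)
    mean = sum(w * value for w, (value, _) in zip(inverse_variances, measurements)) / total
    dof = len(measurements) - 1
    if dof < 1:
        # A single measurement carries no information about tension.
        return 1.0, [sigma for _, sigma in measurements]
    chi2 = sum(((value - mean) / sigma) ** 2 for value, sigma in measurements)
    scale = max(1.0, math.sqrt(chi2 / dof))
    return scale, [scale * sigma for _, sigma in measurements]

# Hypothetical inputs: one pair in tension, one pair in agreement.
tense_scale, tense_sigmas = scaled_uncertainties([(0.98, 0.01), (0.94, 0.01)])
calm_scale, calm_sigmas = scaled_uncertainties([(0.98, 0.01), (0.98, 0.02)])
print(tense_scale, calm_scale)
```

Capping the factor at 1 from below means that measurements in good agreement keep their original uncertainties rather than having them reduced.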
Each uncertainty component from the in situ analyses is individually propagated through the combination procedure. First, the relevant measured response is varied by $1\sigma$ in the uncertainty component within its standard binning. The finer rebinning, $\chi^2$ minimization, and combination procedure is then repeated, using the weights determined for the nominal result to prevent the varied uncertainty from decreasing the contribution of the measurement. The difference between the combined calibration curve with the systematically shifted input and the nominal calibration curve is taken as $1\sigma$ in the varied uncertainty. Throughout this process, each individual uncertainty source is treated as fully correlated across $\eta$ and $p_{\text{T}}$ but entirely uncorrelated with all other uncertainty sources. After this step, the uncertainties from the +jet analyses are taken to be fully correlated with the same uncertainties propagated through the multijet balance. Other assumptions of correlation between components can similarly be made and altered after their propagation, allowing multiple different assumptions.
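The propagation of a single uncertainty component with the nominal weights held fixed can be illustrated in one bin. The sketch assumes a simple weighted average as the combination and omits the rebinning and smoothing steps; the function name, weights, and shifts are hypothetical.

```python
def propagate_component(values, weights, shifts):
    """Shift of the combined result for a 1-sigma variation of one component.

    values:  nominal response ratios of the input measurements in one bin
    weights: nominal combination weights (held fixed, summing to 1)
    shifts:  1-sigma shift of this component on each measurement
             (0.0 for measurements it does not affect)
    """
    nominal = sum(w * v for w, v in zip(weights, values))
    varied = sum(w * (v + d) for w, v, d in zip(weights, values, shifts))
    return varied - nominal

# Hypothetical bin: the component affects only the first measurement.
shift = propagate_component(values=[0.98, 0.96], weights=[0.8, 0.2], shifts=[0.01, 0.0])
print(shift)
```

Because the weights are not recomputed, the propagated shift is simply the input shift scaled by the weight of the affected measurement.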

Systematic uncertainties
The full uncertainty in the jet energy scale consists of 125 individual terms derived from the in situ measurements, pile-up effects, flavour dependence, and estimates of additional effects as summarized in Table 2. The majority of the individual terms stem from the in situ measurements and cover the effects of analysis selection cuts, event topology dependence, and MC mis-modelling and statistical limitations, as well as the uncertainties associated with the calibration of the electrons, muons, and photons.
The $\eta$ intercalibration analysis results in five nuisance parameters, with a sixth for 2018 data only, as discussed in Section 5.2.1: one covers systematic effects, one covers statistical uncertainty, and three (four in 2018) are used to parameterize the non-closure. Pile-up effects are described by four nuisance parameters which account for offsets and $p_{\text{T}}$ dependence in $\mu$ and $N_{\text{PV}}$ as well as event topology dependence of the density metric $\rho$. The offset and $p_{\text{T}}$ dependence terms are derived in data using a combination of +jet measurements and measurements comparing reconstructed jets with track-jets. The $\rho$ topology term is the largest of the pile-up uncertainties and is determined by the maximum deviation in measured density between different in situ measurements under the same pile-up conditions.
The two flavour dependence uncertainties are derived from simulation and account for relative flavour fractions and differing responses to quark- and gluon-initiated jets [5,6]. The flavour composition uncertainty accounts for the differing response of quark- and gluon-initiated jets in a sample with some uncertainty on the fraction of gluon-initiated jets $f_g$. Where $\mathcal{R}_q$ and $\mathcal{R}_g$ are the responses measured in Pythia and $\sigma_g^f$ is the uncertainty on $f_g$ in the sample, the flavour composition uncertainty is defined as:

$$\sigma_{\text{composition}} = \sigma_g^f \, \frac{|\mathcal{R}_q - \mathcal{R}_g|}{f_g \mathcal{R}_g + (1 - f_g)\mathcal{R}_q} .$$

The flavour response uncertainty accounts for the fact that, unlike quark-initiated jet response, gluon-initiated jet response is found to differ significantly between generators. This uncertainty is defined by comparison between the nominal Pythia sample and an alternative Herwig sample:

$$\sigma_{\text{response}} = f_g \left( \mathcal{R}_g^{\text{Pythia}} - \mathcal{R}_g^{\text{Herwig}} \right) .$$

Figure 21 shows the gluon-jet response and the difference between quark-jet and gluon-jet responses using both Pythia and Herwig for PFlow jets. The samples are the same as those used for the multijet balance analysis and are dominated by gluon jets at low $p_{\text{T}}$. For Herwig, $\mathcal{R}_q - \mathcal{R}_g$ becomes negative in the 90-600 GeV $p_{\text{T}}$ region (which appears as a bump in the $|\mathcal{R}_q - \mathcal{R}_g|$ curve).
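A numerical sketch of the two flavour terms follows. It assumes the forms commonly used for these uncertainties: the composition term as the gluon-fraction uncertainty times $|\mathcal{R}_q - \mathcal{R}_g|$ divided by the fraction-weighted average response, and the response term as the gluon fraction times the difference in gluon response between the nominal and alternative generators. The function names and every input value are hypothetical.

```python
def flavour_composition_uncertainty(f_g, sigma_f_g, r_q, r_g):
    """Uncertainty from imperfect knowledge of the gluon fraction f_g.

    sigma_f_g is the uncertainty on f_g; r_q and r_g are the quark- and
    gluon-jet responses from the nominal generator.
    """
    average_response = f_g * r_g + (1.0 - f_g) * r_q
    return sigma_f_g * abs(r_q - r_g) / average_response

def flavour_response_uncertainty(f_g, r_g_nominal, r_g_alternative):
    """Uncertainty from the generator dependence of the gluon-jet response."""
    return f_g * (r_g_nominal - r_g_alternative)

# Hypothetical gluon-dominated sample.
composition = flavour_composition_uncertainty(f_g=0.7, sigma_f_g=0.1, r_q=1.00, r_g=0.95)
response = flavour_response_uncertainty(f_g=0.7, r_g_nominal=0.95, r_g_alternative=0.93)
print(composition, response)
```

Both terms vanish for a pure quark sample with a perfectly known composition, which is why they matter most in gluon-dominated selections such as the low-$p_{\text{T}}$ multijet sample.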
An additional uncertainty applied only to $b$-initiated jets covers the difference in response between jets from light- versus heavy-flavour quarks and replaces the flavour composition and response uncertainties for these heavy-flavour jets. The punch-through uncertainty accounts for mis-modelling of the GSC correction to jets which pass through the calorimeter and into the muon system, taking the difference in jet response between data and MC simulation in bins of muon detector activity as the systematic uncertainty. Both are discussed in more detail in Ref. [6]. Finally, the high-$p_{\text{T}}$ 'single particle' uncertainty is derived from studies of the response to individual hadrons and is used to cover the region beyond 2.4 TeV, where the MJB analysis no longer has statistical power [29]. When calibrating MC samples simulated using AFII, an additional non-closure uncertainty is applied to account for the difference in jet response between these samples and those which used full detector simulation.

Table 2, final rows (component: description)
Flavour response: Uncertainty in the response of gluon-initiated jets
$b$-jets: Uncertainty in the response of $b$-quark-initiated jets
Punch-through: Uncertainty in GSC punch-through correction
Single-particle response: High-$p_{\text{T}}$ jet uncertainty from single-particle and test-beam measurements
AFII non-closure: Difference in the absolute JES calibration for simulations in AFII

The total jet energy scale uncertainty is shown in Figure 22(a) as a function of jet $p_{\text{T}}$ for fixed jet $\eta = 0$ and in Figure 22(b) as a function of jet $\eta$ for fixed jet $p_{\text{T}} = 60$ GeV. A dijet-like composition of the sample (that is, predominantly gluons) is assumed in computing the flavour uncertainties. The uncertainties in the $\eta$ intercalibration analysis are labelled 'relative in situ JES', with the non-closure uncertainty creating the asymmetric peaks around $\eta = \pm 2.5$. Uncertainties in all other in situ measurements are combined into the 'absolute in situ JES' term, which also includes the single-particle uncertainty.
Figure 22: Fractional jet energy scale systematic uncertainty components for anti-$k_t$ $R = 0.4$ jets (a) as a function of jet $p_{\text{T}}$ at $\eta = 0$ and (b) as a function of $\eta$ at $p_{\text{T}} = 60$ GeV, reconstructed from particle-flow objects. The total uncertainty, determined as the quadrature sum of all components, is shown as a filled region topped by a solid black line. Flavour-dependent components shown here assume a dijet flavour composition.

Uncertainty correlations and reductions
The detail contained in 125 independent nuisance parameters is far more than is required by most analyses, so it is necessary to reduce the uncertainty description to a smaller number of terms. One could imagine a single 'Jet energy scale' nuisance parameter constructed by adding in quadrature all of the independent components. However, a meaningful set of correlations exists between the jet energy scale uncertainties for two jets at different $\eta$ and $p_{\text{T}}$ as a result of the structures of the nuisance parameters. In the case of reduction to a single component, the entirety of this correlation information would be lost and an unrealistic assumption, that of full correlation between the jet energy scale uncertainties for any values of $\eta$ and $p_{\text{T}}$, would be enforced. In practice, a variety of reduced uncertainty schemes are provided to allow simplified descriptions with a minimum loss of correlation information.
The 98 uncertainty components stemming from the absolute in situ analyses are functions only of $p_{\text{T}}$ and thus their behaviour can be easily represented by a smaller number of orthogonal terms. An eigenvector decomposition is performed on a covariance matrix of these uncertainty components and the largest of the resulting orthogonal terms are kept separate as new effective nuisance parameters [5]. The remaining terms are combined into a single residual nuisance parameter. To determine how many components to keep independently and how many to combine in the residual term, the covariance matrix for the reduced set is also computed and the difference in correlation at each jet $\eta$ and $p_{\text{T}}$ between the reduced set and the full set is calculated. This difference is taken as a measure of the information loss and the number of combined terms is adjusted so that the difference is below an acceptable bound (usually 0.05). Two different reduction schemes are produced. The global reduction combines all $p_{\text{T}}$-dependent in situ uncertainty components regardless of their sources and results in 8 reduced components, for a total of 23 once the two-dimensional terms (not arising from the in situ analyses and not reduced) are included. The category reduction combines the $p_{\text{T}}$-dependent in situ uncertainty components in separate groups based on their origin (detector, statistical, modelling, or mixed) and results in 15 reduced components, for a total of 30. The JES correlation matrix for the full set of nuisance parameters is shown in Figure 23(a). The bin-by-bin correlation loss between the full set of nuisance parameters and the category reduction is shown in Figure 23(b) and is below 0.05 everywhere, as required.
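The correlation-loss check used to validate a reduction can be sketched without the eigenvector decomposition itself. The example below assumes each uncertainty component is a vector of shifts over $p_{\text{T}}$ bins, builds the correlation matrix as the normalised sum of outer products, and compares a full set with two reduced sets in which components are merged in quadrature. The component values and function names are hypothetical, and merging hand-picked components stands in for combining the smallest eigenvector terms.

```python
import math

def correlation_matrix(components):
    """Bin-to-bin correlation matrix from independent uncertainty components.

    Each component is a list with one shift per pT bin; the covariance is the
    sum over components of the outer product of the component with itself.
    """
    n = len(components[0])
    cov = [[sum(c[i] * c[j] for c in components) for j in range(n)] for i in range(n)]
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)] for i in range(n)]

def quadrature_sum(components):
    """Merge components into one term by adding them in quadrature per bin."""
    n = len(components[0])
    return [math.sqrt(sum(c[i] ** 2 for c in components)) for i in range(n)]

def max_correlation_loss(full, reduced):
    """Largest absolute difference between the two correlation matrices."""
    corr_full, corr_reduced = correlation_matrix(full), correlation_matrix(reduced)
    n = len(corr_full)
    return max(abs(corr_full[i][j] - corr_reduced[i][j]) for i in range(n) for j in range(n))

# Three hypothetical components over three pT bins.
full = [[0.010, 0.005, 0.001], [0.001, 0.005, 0.010], [0.001, 0.001, 0.001]]
single_term = [quadrature_sum(full)]                  # everything merged
keep_two = full[:2] + [quadrature_sum(full[2:])]      # only the smallest merged

print(max_correlation_loss(full, single_term), max_correlation_loss(full, keep_two))
```

Merging everything into one term forces full correlation between all bins and gives a loss far above a 0.05 bound, while keeping the two structured components separate reproduces the original correlations.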
While the same procedure could in principle be used for the components which depend on both $p_{\text{T}}$ and $\eta$, the complexity added by the second dimension means that nearly as many eigenvectors would be needed to adequately describe the correlations as there were original terms, and so the gain would be minimal. However, many analyses still require fewer than 25 nuisance parameters and are not affected by loss of correlation information. To provide suitable uncertainties for these, a strong reduction procedure is used to group the globally reduced versions of the absolute in situ uncertainties together with the two-dimensional uncertainties into three effective nuisance parameters, as detailed in Ref. [7]. The three terms of the $\eta$ intercalibration non-closure uncertainty are kept separate because their two-dimensional shapes are especially difficult to reduce and would cause an unacceptably large correlation loss.
Four different sets (scenarios) of the three effective nuisance parameters are created by varying the combinations of terms they contain. The varied sets are chosen such that the correlation loss in each is constrained to an $\eta$-$p_{\text{T}}$ range which is well described by a different set. The metric for assessing performance of the four scenarios is the uncovered correlation loss, defined as the maximum difference in correlation between any two reduced scenarios minus the minimum difference in correlation between any reduced scenario and the full set of nuisance parameters. The uncovered correlation loss is calculated for a fine grid of points in $\eta$ and $p_{\text{T}}$, ensuring no small-scale structures are missed. Contents of the effective nuisance parameters are varied, keeping systematic uncertainties with similar behaviours mostly grouped together, until a set of scenarios is found in which the maximum uncovered correlation loss is kept below 0.25 and confined to sufficiently small regions that the average correlation loss in an $\eta$-$p_{\text{T}}$ plane does not exceed 0.02. A detailed discussion of the application of strongly reduced uncertainties within physics analyses can be found in Ref. [7].

Uncertainties for EMtopo and PFlow jets
Although the scale of individual calibrations may vary between EMtopo and PFlow jets, the final uncertainties are similar in size. A slightly larger pile-up uncertainty contribution in PFlow jets due to the impact of the underlying event is offset by smaller in situ uncertainties, leading to a comparable total uncertainty. Figure 24 shows the total uncertainty in EMtopo and PFlow jets for a range of $p_{\text{T}}$ values at fixed $\eta = 0$ and for a range of $\eta$ values at fixed $p_{\text{T}} = 60$ GeV. The level of agreement is representative of other $p_{\text{T}}$ and $\eta$ ranges.

Jet energy resolution
Precise knowledge of the jet energy resolution (JER) is important for detailed measurements of SM jet production, measurements and studies of the properties of the SM particles that decay to jets (e.g. $W$/$Z$ bosons, top quarks), as well as searches for physics beyond the SM involving jets. The JER also affects the missing transverse momentum, which plays an indispensable role in many searches for new physics and in measurements involving particles that decay into neutrinos, which rely on well-reconstructed missing momentum.
The dependence of the relative JER on the transverse momentum of the jet may be parameterized using a functional form expected for calorimeter-based resolutions, with three independent contributions, namely the noise ($N$), stochastic ($S$) and constant ($C$) terms [54]:

$$\frac{\sigma(p_{\text{T}})}{p_{\text{T}}} = \frac{N}{p_{\text{T}}} \oplus \frac{S}{\sqrt{p_{\text{T}}}} \oplus C .$$

The noise ($N$) term is due to the contribution of electronic noise to the signal measured by the detector front-end electronics, as well as that due to pile-up. Since both contribute directly to the energy measured in the calorimeter but are approximately independent of the energy deposited by the showering particles, the contribution to the JER scales like $1/p_{\text{T}}$. The noise term is expected to be significant in the low-$p_{\text{T}}$ region, below ∼30 GeV. Statistical fluctuations in the amount of energy deposited are captured by the stochastic ($S$) term, which represents the limiting term in the resolution up to several hundred GeV in jet $p_{\text{T}}$. The $S$ term contribution to the JER scales like $1/\sqrt{p_{\text{T}}}$. The constant ($C$) term corresponds to fluctuations that are a constant fraction of the jet $p_{\text{T}}$, such as energy depositions in passive material (e.g. cryostats and solenoid coil), the starting point of the hadron showers, and non-uniformities of response across the calorimeter. The constant term is expected to dominate the high-$p_{\text{T}}$ region, above approximately 400 GeV.
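The noise, stochastic and constant terms combine in quadrature, which a short function makes explicit. The parameter values below are hypothetical and chosen only to show how each term scales with $p_{\text{T}}$; they are not the fitted values of this measurement.

```python
import math

def relative_jer(pt, noise, stochastic, constant):
    """Relative resolution sigma(pT)/pT from the N, S and C terms.

    pt and noise are in GeV, stochastic is in sqrt(GeV), constant is
    dimensionless. The three terms are added in quadrature.
    """
    noise_term = noise / pt
    stochastic_term = stochastic / math.sqrt(pt)
    return math.sqrt(noise_term ** 2 + stochastic_term ** 2 + constant ** 2)

# Hypothetical parameters: N = 3 GeV, S = 0.7 sqrt(GeV), C = 0.03.
for pt in (20.0, 100.0, 1000.0):
    print(pt, relative_jer(pt, noise=3.0, stochastic=0.7, constant=0.03))
```

With these illustrative parameters the resolution falls steadily with $p_{\text{T}}$ and approaches the constant term at the highest momenta, where the other two contributions have died away.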
In order to measure the JER, the jet momentum must be measured precisely. This implies that the jets must either recoil against a reference object whose momentum can be measured precisely, or be balanced against one another in a well-defined dijet system [5,6]. Measurements using the latter approach are presented here, as well as a method for measuring the contributions to the resolution from the noise term ($N$) due to both pile-up and electronics. The 2017 data, corresponding to an integrated luminosity of 44 fb$^{-1}$, are used for these measurements.

Resolution measurement using dijet events
Dijet events are both plentiful and produced via a set of 2 → 2 processes that are theoretically well understood. JER measurements using these events for the dijet balance method rely on the approximate scalar balance between the transverse momenta of the two leading jets. Deviations from exact balance, measured via the asymmetry, given by

$$\mathcal{A} \equiv \frac{p_{\text{T}}^{\text{probe}} - p_{\text{T}}^{\text{ref}}}{p_{\text{T}}^{\text{avg}}} ,$$

are due to a combination of experimental resolution, the presence of additional radiation in the event, and biases due to the event selection used in the measurement. In Eq. (5) ( Equation (6) is valid in the probe region as well, up to a correction factor that accounts for the potential overall imbalance between the reference jet and the probe jet in that region. This correction factor is found to be negligible (< 1%) for the measurements performed here. However, the $p_{\text{T}}$ balance of the measured jets, and thus the measured asymmetry distribution, is measurably affected on an event-by-event basis by physics effects such as additional radiation, non-perturbative processes including hadronization and multi-parton interactions, and others that may lead to particle losses and additions in the measured jets. Consequently, the measured dijet balance asymmetry distribution represents a convolution of the intrinsic detector resolution and the particle-level balance affected by the aforementioned effects. The determination of $\sigma_{\mathcal{A}}^{\text{probe}}$ must therefore account for such effects, for example by subtracting the particle-level quantity from the measured quantity in quadrature:

$$\sigma_{\mathcal{A},\,\text{det}}^{\text{probe}} = \sigma_{\mathcal{A},\,\text{meas}}^{\text{probe}} \ominus \sigma_{\mathcal{A},\,\text{truth}}^{\text{probe}} .$$

The results presented here use an iterative fitting procedure to extract the impact of these effects and to isolate the intrinsic detector resolution, $\sigma_{\mathcal{A},\,\text{det}}^{\text{probe}}$, by assuming a Gaussian convolution of detector effects with the particle-level balance. First, the asymmetry distribution measured at particle level in MC simulation is fitted with an ad hoc function $\mathcal{A}_{\text{truth}}$ based on exponential curves, which is found to describe it well.
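Second, the measured asymmetry distribution, $\mathcal{A}_{\text{meas}}$, is fitted by the function $\mathcal{A}_{\text{truth}} \otimes \mathcal{R}(\sigma_{\mathcal{A}}^{\text{det}})$, taking $\mathcal{A}_{\text{truth}}$ from the particle-level fit, where $\mathcal{R}(\sigma_{\mathcal{A}}^{\text{det}})$ is a Gaussian distribution with width $\sigma_{\mathcal{A}}^{\text{det}}$ representing the detector resolution for the probe jet and offset $\mu_{\mathcal{A}}^{\text{det}}$ accounting for any residual non-closure in the JES calibration.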
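The simpler quadrature-subtraction alternative mentioned above can be sketched numerically. The example assumes the asymmetry is the difference of the probe and reference jet $p_{\text{T}}$ divided by their average, and that the detector-level width is obtained by removing the particle-level width in quadrature; it does not implement the convolution fit used for the results, and the function names and inputs are hypothetical.

```python
import math

def asymmetry(pt_probe, pt_ref):
    """Dijet asymmetry: (probe - reference) divided by their average pT."""
    pt_avg = 0.5 * (pt_probe + pt_ref)
    return (pt_probe - pt_ref) / pt_avg

def detector_width(sigma_meas, sigma_truth):
    """Remove the particle-level width from the measured width in quadrature."""
    if sigma_meas < sigma_truth:
        raise ValueError("measured width is smaller than particle-level width")
    return math.sqrt(sigma_meas ** 2 - sigma_truth ** 2)

# Hypothetical jet pair and hypothetical asymmetry widths.
print(asymmetry(110.0, 90.0))
print(detector_width(sigma_meas=0.13, sigma_truth=0.05))
```

The subtraction treats both contributions as Gaussian; the fit described in the text relaxes this for the particle-level part by using an ad hoc shape.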
Collision data used for the dijet balance measurement are collected using specific combinations of central and forward jet triggers for each of the 11 $p_{\text{T}}^{\text{avg}}$ ranges used in the measurement. Trigger selections are required to be at least 99% efficient in the range of $p_{\text{T}}^{\text{avg}}$ in which a particular combination is used. Jets must also pass JVT selection requirements as described in Section 5.2.2.
Topology criteria are applied to select well-defined dijet production processes with minimal contributions due to additional radiation or higher-order processes. The azimuthal angle, $\Delta\phi$, between the two leading jets in the event and the maximum $p_{\text{T}}$ of a potential third jet, $p_{\text{T}}^{j_3}$, are constrained by the following two criteria: $\Delta\phi(j_1, j_2) \geq 2.7$ rad, together with an upper bound on $p_{\text{T}}^{j_3}$.

Example asymmetry distributions are shown in Figure 25 in two representative bins of $p_{\text{T}}^{\text{avg}}$ and probe jet $\eta_{\text{det}}^{\text{probe}}$. An iterative Gaussian fit to the core of the asymmetry distribution is used to extract the JER. The result of the measurement of the relative JER and its systematic uncertainty is shown in Figure 26 for a single narrow range of $\eta_{\text{det}}^{\text{probe}}$ and as a function of jet $p_{\text{T}}$. The JER is observed to be slightly underestimated by MC simulation in this central region of the detector. Systematic uncertainties are dominated by imprecise knowledge of the scale of the jets at low $p_{\text{T}}$, which results in an uncertainty of approximately 1.5% at 40 GeV, whereas the non-closure of the dijet balance method itself is largely dominant at higher $p_{\text{T}}$. The non-closure uncertainty is evaluated as the difference between the resolution measured using the in situ procedure applied to MC simulation and the particle-level resolution, $\sigma(\mathcal{R})/\mathcal{R}$, where $\mathcal{R} = p_{\text{T}}^{\text{reco}}/p_{\text{T}}^{\text{true}}$. Good agreement is found, resulting in an uncertainty in the relative resolution that is approximately 0.4% and generally increases with $p_{\text{T}}$ due to the non-Gaussian jet response. At lower $p_{\text{T}}$ the uncertainties propagated from the JES dominate.
The increase in JES uncertainty around 800 GeV is a result of the single-particle uncertainty (see Section 5.3). The jet energy scale calibration used for the dijet energy resolution measurement is necessarily based on a smaller dataset than the one presented in this paper, allowing the two measurements to converge simultaneously. As a result, the statistics were lower and the single-particle uncertainty became dominant at a lower jet $p_{\text{T}}$ value than in Figure 22.
Figure 25: Asymmetry distribution measured in data and particle-level Pythia 8 for PFlow jets in two example $p_{\text{T}}$ and $\eta$ ranges. Error bars represent the statistical uncertainty. (a) The measured asymmetry is shown for probe jets with 80 GeV < $p_{\text{T}}^{\text{avg}}$ < 110 GeV in the range $0.2 < |\eta^{\text{probe}}_{\text{det}}| < 0.7$, where the distributions are symmetric by construction. (b) The measured asymmetry is shown for probe jets with 300 GeV < $p_{\text{T}}^{\text{avg}}$ < 400 GeV in the range $1.3 < |\eta^{\text{probe}}_{\text{det}}| < 1.8$. In this $\eta^{\text{probe}}_{\text{det}}$ range the distributions can be asymmetric. Two fits are performed iteratively: the particle-level asymmetry is modelled with an ad hoc function which is subsequently convolved with a Gaussian function in order to describe the reconstructed asymmetry. The detector resolution is then extracted from the Gaussian fit parameter.
Figure 26: (a) Relative jet energy resolution and (b) absolute uncertainty in the relative resolution as a function of $p_{\text{T}}$ for PFlow jets in the central region of the detector, measured using the dijet balance method. The resolution in data is shown in black points with error bars indicating statistical uncertainties; the resolution in detector-level simulated events is shown by the blue curve with total systematic uncertainty given by the blue band. The systematic uncertainty is dominated by terms propagated from the JES uncertainty, while additional terms arise from the analysis selection, pile-up rejection (JVT), physics modelling (comparison with alternative generator), and non-closure effects. The bump in uncertainty around 800 GeV comes from the single-particle uncertainty.
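The idea of an iterative fit to the core of the asymmetry distribution can be illustrated with a short Python sketch. It is not the fitting code used for the measurement. As a stand-in for the iterative Gaussian fit, it re-estimates the mean and width from the entries lying within $\pm k$ widths of the current mean and corrects the width for the truncation, assuming a Gaussian core. The function names, the window `k = 2` and the toy sample are assumptions made purely for illustration.

```python
import math
import random


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_variance_factor(k):
    """Variance of a unit Gaussian restricted to the window [-k, k]."""
    return 1.0 - 2.0 * k * normal_pdf(k) / (2.0 * normal_cdf(k) - 1.0)


def iterative_core_width(values, k=2.0, max_iter=50, tol=1e-6):
    """Estimate mean and width of the Gaussian core of `values`.

    Hypothetical helper: each pass keeps entries within k*sigma of the
    current mean, recomputes mean and RMS, and divides the variance by
    the truncation factor so that a pure Gaussian is a fixed point.
    """
    mean = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    factor = truncated_variance_factor(k)
    for _ in range(max_iter):
        core = [v for v in values if abs(v - mean) <= k * sigma]
        new_mean = sum(core) / len(core)
        core_var = sum((v - new_mean) ** 2 for v in core) / len(core)
        new_sigma = math.sqrt(core_var / factor)
        converged = abs(new_sigma - sigma) < tol * sigma
        mean, sigma = new_mean, new_sigma
        if converged:
            break
    return mean, sigma


# Toy asymmetry sample (not ATLAS data): a Gaussian core of width 0.10
# plus a 5% admixture of a much wider, non-Gaussian-like tail.
rng = random.Random(1)
toy_asymmetry = [
    rng.gauss(0.0, 0.10) if rng.random() < 0.95 else rng.gauss(0.0, 0.40)
    for _ in range(20000)
]
core_mean, core_sigma = iterative_core_width(toy_asymmetry)
```

In this toy, the plain RMS of the full sample is inflated by the tail, while `core_sigma` stays close to the width of the core component.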

Noise measurement using random cones
Direct estimates of the noise term of Eq. (4) are obtained by measuring the fluctuations in the energy deposits due to pile-up using data samples that are collected by random unbiased triggers. These measurements are performed using the random cones method, in which energy deposits in the calorimeter are summed at the energy scale of the constituents in circular areas analogous to the jet area for anti-$k_t$ $R = 0.4$ jets. This approach is adopted due to its ability to account for any non-Gaussian behaviour of the noise contributions to the JER. Two such random cone sums, $p_{\text{T}}^{\text{c1}}$ and $p_{\text{T}}^{\text{c2}}$, are obtained at random $\phi$ values and within opposite $\pm\Delta\eta$ regions and the difference between them, $\Delta p_{\text{T}}^{\text{RC}}$, provides a measure of the random fluctuations of deposited energy. Multiple non-overlapping cones are selected within each event to maximize statistical power; this is demonstrated to cause no bias in the overall result. This random cone balance is given by
$$\Delta p_{\text{T}}^{\text{RC}} = p_{\text{T}}^{\text{c1}} - p_{\text{T}}^{\text{c2}},$$
and the estimated pile-up noise is determined by the central 68% confidence interval of the distribution of $\Delta p_{\text{T}}^{\text{RC}}$, $\sigma_{\text{RC}}$, sampled over many events as a function of both $\eta$ and pile-up levels, as indexed by $\mu$. Specifically, the noise term due to pile-up, $N^{\text{PU}}$, is determined as
$$N^{\text{PU}} = \frac{\sigma_{\text{RC}}}{2\sqrt{2}}, \tag{7}$$
where the width of the distribution is divided by 2 to obtain the half-width of the distribution, and by $\sqrt{2}$ to obtain the fluctuations corresponding to just a single random cone. The distribution of $\Delta p_{\text{T}}^{\text{RC}}$ is shown in Figure 27(a). Updates to the random cone method since its initial description in Ref. [6] include removing a restriction to only a pair of back-to-back cones, since this was found to have no effect on the result, and taking multiple non-overlapping random cone pairs per event to maximize statistics.
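The extraction of the pile-up noise term from the random cone balance can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code: the quantile convention used for the central 68% interval, the function names and the Gaussian toy input are all hypothetical choices. In the toy, each cone sum fluctuates with a width of 3 GeV, so the recovered noise term should come out close to 3 GeV.

```python
import math
import random


def central_interval_width(values, coverage=0.68):
    """Full width of the central interval containing `coverage` of the values."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (1.0 - coverage) / 2.0
    lower = ordered[int(math.floor(tail * (n - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return upper - lower


def pileup_noise_term(delta_pt_rc, coverage=0.68):
    """Noise term from the width of the random cone balance distribution.

    The full interval width is divided by 2 (half-width) and by sqrt(2)
    (two cones contribute to each difference).
    """
    sigma_rc = central_interval_width(delta_pt_rc, coverage)
    return sigma_rc / (2.0 * math.sqrt(2.0))


# Toy input (not ATLAS data): two cone sums per entry, each fluctuating
# around 10 GeV with a Gaussian width of 3 GeV.
rng = random.Random(42)
per_cone_sigma = 3.0
delta_pt_rc = [
    rng.gauss(10.0, per_cone_sigma) - rng.gauss(10.0, per_cone_sigma)
    for _ in range(20000)
]
n_pu = pileup_noise_term(delta_pt_rc)
```

Using an interval rather than a fitted Gaussian width keeps the estimate meaningful when the distribution has non-Gaussian tails, which is the motivation given in the text for the random cones approach.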
The energy scale of the noise estimated by $N^{\text{PU}}$ in Eq. (7) is the constituent energy scale and not that of the jets measured in Section 6.1. In order to compare the measurement of the noise term $N^{\text{PU}}$ using the random cone method with the JER measured at the fully calibrated scale (e.g. PFlow+JES), a conversion factor is required. The nominal JES calibration factor is used to perform this conversion to the appropriate energy scale. The result is an estimate of the noise due to pile-up that may be directly compared with the measured JER.
A closure test of the random cone measurements is performed by comparing the in situ measurement of the calibrated $N^{\text{PU}}$ with the expectation from MC simulation. Results are reported here for PFlow jets. To isolate the contribution to the JER from pile-up noise in the MC simulation, the JER is determined in simulated events both with and without pile-up and a subtraction in quadrature is performed between the extracted resolutions. The two JER determinations in MC simulation events with and without pile-up are shown in Figure 27(b) and their quadratic difference is compared directly to the in situ measurement from the random cones method. Each is fitted, as shown by the dotted lines in Figure 27(b): the random cone measurement is fitted with $N/p_{\text{T}}$ while the quadratic difference is fitted with $N/p_{\text{T}} \oplus S/\sqrt{p_{\text{T}}}$ to account for non-negligible stochastic contributions. The non-closure of the method is largely due to the differences in topo-cluster formation sensitivity to pile-up and electronic noise in the presence versus absence of hard-scatter particles, and is taken as a systematic uncertainty in the result. This non-closure uncertainty is the dominant uncertainty in the JER noise term, ranging from approximately 17% in the most central region to 75% in the endcap transition region ($2.5 < |\eta| < 3.2$).
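The subtraction in quadrature used to isolate the pile-up contribution in simulation can be illustrated as follows. This is a toy sketch: the resolution curves are generated from an assumed noise, stochastic and constant parameterization with invented parameter values, and the function names are hypothetical. In this idealized case the stochastic and constant terms cancel exactly, which need not hold in the real simulation.

```python
import math


def quadrature_difference(sigma_with_pu, sigma_without_pu):
    """Return sqrt(a^2 - b^2), or 0.0 when the difference is not positive."""
    diff_sq = sigma_with_pu ** 2 - sigma_without_pu ** 2
    return math.sqrt(diff_sq) if diff_sq > 0.0 else 0.0


def relative_jer(pt, noise, stochastic, constant):
    """Relative resolution N/pT (+) S/sqrt(pT) (+) C, with (+) a quadrature sum."""
    return math.sqrt((noise / pt) ** 2 + stochastic ** 2 / pt + constant ** 2)


# Toy curves (assumed values, in GeV where relevant): the sample with
# pile-up has a larger noise term than the sample without.
pt_values = [20.0, 40.0, 80.0, 160.0]
jer_with_pu = [relative_jer(pt, 3.0, 0.8, 0.03) for pt in pt_values]
jer_without_pu = [relative_jer(pt, 1.0, 0.8, 0.03) for pt in pt_values]

# Pile-up-only contribution to the relative resolution, bin by bin.
pileup_only = [
    quadrature_difference(with_pu, without_pu)
    for with_pu, without_pu in zip(jer_with_pu, jer_without_pu)
]

# In the toy, pileup_only * pT is constant: sqrt(3^2 - 1^2) GeV.
implied_noise = [rel * pt for rel, pt in zip(pileup_only, pt_values)]
```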
The total noise contribution to the JER includes not just pile-up but also electronic noise, to which the random cones are not sensitive due to the topo-clustering process. To estimate this electronic contribution, a fit is performed to the JER measured in a dedicated MC simulation sample with $\mu = 0$ and the electronic noise term is extracted as $N^{\mu=0}$. The total noise term used in the JER combination is therefore taken to be $N = N^{\text{PU}} \oplus N^{\mu=0}$ and is shown as a function of $\eta$ in Figure 28 along with its systematic uncertainties.
Figure 27: (a) The difference in the random cone sums, $\Delta p_{\text{T}}^{\text{RC}}$, measured in the central region ($|\eta_{\text{det}}| < 0.7$) in randomly triggered data using PFlow objects. (b) Comparison between the pile-up noise term $N^{\text{PU}}$ determined using the random cone method (black solid circles) and the expectation from MC simulation (orange squares) as extracted from the difference in quadrature of MC simulation with (red downward triangles) and without (blue upward triangles) pile-up. Results are shown at the PFlow+JES energy scale for jets in the central region of the detector ($|\eta_{\text{det}}| < 0.7$).
The dominant systematic uncertainty in the random cone measurement of $N^{\text{PU}}$ is the previously discussed non-closure uncertainty, but additional terms arise from varying the quantile of the confidence interval used to extract $\sigma_{\text{RC}}$ and from using a different estimate of the conversion factor to the calibrated jet energy scale. Two systematic uncertainties apply to $N^{\mu=0}$: a 20% relative uncertainty conservatively estimating the differences in JER between data and MC simulation and an uncertainty due to the fit parameterization and stability. The systematic uncertainties enter the combined JER fit unsymmetrized in $N$ but are symmetrized during the statistical combination, and so the one-sided components are symmetrized in Figure 28 to illustrate their final contribution to the total uncertainty.
Figure 28: Noise term due to pile-up estimated using the random cone method and its uncertainties as a function of $\eta$. The dominant uncertainty is due to non-closure in the method. Additional uncertainties address the $\sigma_{\text{RC}}$ definition, the JER conversion factor, the differences in JER between data and MC simulation, and the fit stability in extracting the $\mu = 0$ noise term. The $\sigma_{\text{RC}}$ definition uncertainty and $\mu = 0$ MC vs data terms are asymmetric in their upwards and downwards components, while all other uncertainties are symmetrized.

Combination of in situ jet energy resolution
A combined measurement of the JER is obtained by performing a fit to the dijet balance measurements (Section 6.1) using a constraint on the noise term ($N$) derived from the random cones measurement and $\mu = 0$ simulation sample (Section 6.2). The implementation of this statistical combination is performed in a manner nearly identical to that for the JES (Section 5.2.5), propagating uncertainties from the dijet measurement in the same way and using a similar eigenvalue decomposition to reduce the final number of nuisance parameters.
Instead of using polynomial splines to interpolate across jet $p_{\text{T}}$, the JER combination uses the functional form from Eq. (4). A fit to the dijet measurement data is performed, fixing the noise term to the value measured by the random cone analysis. Dijet measurement uncertainties are taken to be fully correlated between bins. Uncertainties due to the random cones measurements are determined by propagating the noise term uncertainties and repeating the fit with different values of $N$. These uncertainties are taken to be decorrelated between central ($|\eta| < 2.5$) and forward ($|\eta| > 2.5$) regions.
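A fit with the noise term held fixed can be sketched as below, assuming that Eq. (4) has the usual form $\sigma(p_{\text{T}})/p_{\text{T}} = N/p_{\text{T}} \oplus S/\sqrt{p_{\text{T}}} \oplus C$. With $N$ fixed, the squared resolution minus $(N/p_{\text{T}})^2$ is linear in $1/p_{\text{T}}$, so $S^2$ and $C^2$ follow from a straight-line fit. This sketch uses an unweighted least-squares fit on invented toy points and ignores the bin-to-bin correlations and nuisance parameters of the actual combination; all names are hypothetical.

```python
import math


def fit_stochastic_and_constant(pt_values, rel_jer_values, noise_term):
    """Fit S and C with the noise term N fixed.

    Uses y = sigma_rel^2 - (N/pT)^2 = S^2 * (1/pT) + C^2 and an
    unweighted straight-line least-squares fit in x = 1/pT.
    """
    xs = [1.0 / pt for pt in pt_values]
    ys = [
        jer ** 2 - (noise_term / pt) ** 2
        for pt, jer in zip(pt_values, rel_jer_values)
    ]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # estimate of S^2
    intercept = mean_y - slope * mean_x  # estimate of C^2
    # Negative estimates are clipped to zero before taking the root.
    return math.sqrt(max(slope, 0.0)), math.sqrt(max(intercept, 0.0))


# Toy "measurement" generated from assumed parameters (not ATLAS results).
true_n, true_s, true_c = 3.0, 0.7, 0.03
pt_values = [20.0, 40.0, 80.0, 160.0, 320.0, 640.0]
rel_jer = [
    math.sqrt((true_n / pt) ** 2 + true_s ** 2 / pt + true_c ** 2)
    for pt in pt_values
]

# Nominal fit with N fixed to its "measured" value.
fitted_s, fitted_c = fit_stochastic_and_constant(pt_values, rel_jer, true_n)

# Propagating a noise-term uncertainty by repeating the fit with shifted N.
shifted_fits = {
    n: fit_stochastic_and_constant(pt_values, rel_jer, n)
    for n in (2.5, 3.0, 3.5)
}
```

The spread of the fitted parameters across `shifted_fits` gives a feel for how an uncertainty in $N$ feeds into the fitted resolution curve.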
The resulting combined measurement of the JER for PFlow+JES jets is shown in Figure 29. The dijet measurement data points are shown along with the total in situ combination, while the constraint on the noise term derived from random cones and included in that combination is demonstrated by plotting $N/p_{\text{T}}$ and its uncertainties as a separate curve for illustrative purposes. Figure 29(b) shows the absolute uncertainties on the combined JER measurement. For each value of jet $p_{\text{T}}$ and $\eta_{\text{det}}$ a toy jet is created and the size of each JER nuisance parameter corresponding to it is retrieved and plotted.
Figure 29: (a) The relative jet energy resolution as a function of $p_{\text{T}}$ for fully calibrated PFlow+JES jets. The error bars on points indicate the total uncertainties on the derivation of the relative resolution in dijet events, adding in quadrature statistical and systematic components. The expectation from Monte Carlo simulation is compared with the relative resolution as evaluated in data through the combination of the dijet balance and random cone techniques. (b) Absolute uncertainty on the relative jet energy resolution as a function of jet $p_{\text{T}}$. Uncertainties from the two in situ measurements and from the data/MC simulation difference are shown separately.
Comparisons of the JER measurements for PFlow+JES and EM+JES jets, as a function of both jet $p_{\text{T}}$ and $\eta$, are provided in Figure 30. The fit to the resolution as a function of $p_{\text{T}}$ for the PFlow+JES jets shows an improvement in resolution over EM+JES jets at low $p_{\text{T}}$.

Application of JER and its systematic uncertainties
To ensure that the jet energy resolution in simulation matches that in data wherever possible, a smearing procedure is recommended. For regions of jet $p_{\text{T}}$ in which the resolution in data is larger (worse) than in MC simulation, the simulation sample should be smeared until its average resolution matches that of data. In regions of jet $p_{\text{T}}$ where the resolution is smaller in data than in MC simulation, no smearing is performed, since the data should remain unaltered.
JER systematic uncertainties are propagated through physics analyses by smearing jets according to a Gaussian function with width $\sigma_{\text{smear}}$. If $\sigma_{\text{nom}}$ is the nominal JER of the sample, after MC simulation smearing if necessary, and $\sigma_{\text{NP}}$ is the one-standard-deviation variation in the uncertainty component to be evaluated, then: $\sigma_{\text{smear}}^2 = (\sigma_{\text{nom}} + |\sigma_{\text{NP}}|)^2 - \sigma_{\text{nom}}^2$.
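The smearing width follows directly from the relation above and can be checked numerically. The Python sketch below is illustrative only: it assumes that the resolutions are relative quantities and that the smearing is applied as a multiplicative Gaussian factor with unit mean on the jet $p_{\text{T}}$, which the text does not specify. The function names and numerical values are hypothetical.

```python
import math
import random
import statistics


def smearing_width(sigma_nom, sigma_np):
    """Width of the extra Gaussian smearing for one uncertainty component.

    Implements sigma_smear^2 = (sigma_nom + |sigma_NP|)^2 - sigma_nom^2.
    """
    return math.sqrt((sigma_nom + abs(sigma_np)) ** 2 - sigma_nom ** 2)


def smear_jet_pt(pt, sigma_smear, rng):
    """Scale a jet pT by a Gaussian factor with mean 1 (assumed convention)."""
    return pt * rng.gauss(1.0, sigma_smear)


# Example: nominal relative JER of 10% and a 2% (absolute) variation.
sigma_nom, sigma_np = 0.10, 0.02
sigma_smear = smearing_width(sigma_nom, sigma_np)

# Adding the smearing in quadrature to the nominal resolution reproduces
# the varied resolution sigma_nom + |sigma_NP|.
varied_resolution = math.hypot(sigma_nom, sigma_smear)

# Toy check: smear many identical 100 GeV jets and look at the spread.
rng = random.Random(7)
smeared = [smear_jet_pt(100.0, sigma_smear, rng) for _ in range(20000)]
observed_relative_spread = statistics.pstdev(smeared) / 100.0
```

Because independent Gaussian contributions add in quadrature, a comparatively large smearing width is needed to shift the total resolution by a small amount.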
Application of JER systematic uncertainties must account for two factors: first, anti-correlations across a single uncertainty component, and second, differences in resolution between data and MC simulation.
Anti-correlation becomes an issue when a single JER component is positive in some regions of phase space and negative in others. To propagate such systematic uncertainties to analyses, smearing should be applied to the simulation when $\sigma_{\text{NP}} > 0$ and applied to the data when $\sigma_{\text{NP}} < 0$. It should be noted that the nominal data remain unchanged, as this applies only to the application of systematic uncertainties. In the case that data statistics are too low to safely smear, pseudo-data may be smeared instead.
Differences in resolution between data and MC simulation are already accounted for by the application of additional smearing to the simulation when the resolution in simulation is better than in data. When the JER is smaller in data, this difference is accounted for by applying its full value as an additional systematic uncertainty: $\sigma_{\text{NP}} = \sigma^{\text{MC}}_{\text{nom}} - \sigma^{\text{data}}_{\text{nom}}$. This term is defined by the dijet asymmetry measurements of Section 6.1 and is zero for the central slice shown in Figure 29(b), but for some $p_{\text{T}}$ ranges in more forward detector regions it can be significant. A large value of this uncertainty for PFlow jets at $|\eta| \sim 3.2$ is the source of the peaks visible in Figure 31(b).
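The decision logic of this section can be summarized in a short Python sketch. It is a schematic reading of the prescription, not the software used in ATLAS analyses. Two assumptions are made: that the nominal smearing of simulation matches the data resolution in quadrature, and that the additional data/MC term is the plain difference between the simulated and measured resolutions. All function names are hypothetical.

```python
import math


def nominal_mc_smearing(sigma_data, sigma_mc):
    """Extra smearing width for simulation so that it matches data.

    Assumes widths combine in quadrature. Returns 0.0 when the data
    resolution is already smaller, since nothing is smeared in that case.
    """
    if sigma_data > sigma_mc:
        return math.sqrt(sigma_data ** 2 - sigma_mc ** 2)
    return 0.0


def data_mc_difference_np(sigma_data, sigma_mc):
    """Additional uncertainty applied when the JER is smaller in data."""
    return max(sigma_mc - sigma_data, 0.0)


def sample_to_smear(sigma_np):
    """Which sample receives the systematic smearing for a given component."""
    if sigma_np > 0.0:
        return "simulation"
    if sigma_np < 0.0:
        return "data"  # or pseudo-data when the data sample is too small
    return None


# Region where data are worse than simulation: smear simulation, no extra term.
smear_a = nominal_mc_smearing(sigma_data=0.05, sigma_mc=0.03)
extra_a = data_mc_difference_np(sigma_data=0.05, sigma_mc=0.03)

# Region where data are better than simulation: no smearing, full difference
# taken as an additional systematic uncertainty.
smear_b = nominal_mc_smearing(sigma_data=0.03, sigma_mc=0.05)
extra_b = data_mc_difference_np(sigma_data=0.03, sigma_mc=0.05)
```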

Conclusions
The calibration of the jet energy scale and resolution for jets reconstructed with the anti-$k_t$ algorithm with radius parameter $R = 0.4$ is presented. Jets are built from either the energy deposits that form topological clusters of calorimeter cells or a combination of charged-particle tracks and topological clusters. The measurements discussed here use 36-81 fb$^{-1}$ of data recorded with the ATLAS detector during 2015-2017 in $pp$ collisions at a centre-of-mass energy of 13 TeV at the Large Hadron Collider. It is the first full calibration of PFlow jets performed by the ATLAS collaboration, the first jet energy scale measurement in the high pile-up conditions of late Run 2 data-taking, and the first jet energy resolution measurement in 13 TeV data.
A sequence of simulation-based corrections removes the contribution to the jet energy from additional proton-proton interactions in the same or nearby bunch crossings, corrects the jet so that it agrees in energy and direction with particle-level jets, and improves the jet energy resolution. Any remaining difference between simulation and data is removed with in situ techniques using well-measured reference objects, including photons, $Z$ bosons, and other jets, such that the energy scale of fully calibrated jets is unity within uncertainties. The jet energy resolution is measured in a dijet balance system, and the contribution to the resolution from the noise term due to pile-up and electronics is also measured. The relative jet energy resolution ranges from 0.25 (0.35) to 0.04 for PFlow (EMtopo) jets as a function of jet $p_{\text{T}}$.
Systematic uncertainties in the jet energy scale for central jets ($|\eta| < 1.2$) vary from 1% for a large range of high-$p_{\text{T}}$ jets ($250 < p_{\text{T}} < 2000$ GeV), to 5% at very low $p_{\text{T}}$ (20 GeV) and 3.5% at very high $p_{\text{T}}$ (> 2.5 TeV). The absolute uncertainty on the relative jet energy resolution is found to be 1.5% at 20 GeV, decreasing to 0.5% at 300 GeV.