Measurement of the energy response of the ATLAS calorimeter to charged pions from $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$ events in Run 2 data

The energy response of the ATLAS calorimeter is measured for single charged pions with transverse momentum in the range $10<p_\textrm{T}<300$ GeV. The measurement is performed using 139 $\textrm{fb}^{-1}$ of LHC proton-proton collision data at $\sqrt{s}=13$ TeV taken in Run 2 by the ATLAS detector. Charged pions originating from $\tau$-lepton decays are used to provide a sample of high-$p_\textrm{T}$ isolated particles, where the composition is known, to test an energy regime that has not previously been probed by in situ single-particle measurements. The calorimeter response to single pions is observed to be overestimated by ${\sim}2\%$ across a large part of the $p_{\textrm{T}}$ spectrum in the central region and underestimated by ${\sim}4\%$ in the endcaps in the ATLAS simulation. The uncertainties in the measurements are ${\lesssim}1\%$ for $15<p_\textrm{T}<185$ GeV in the central region. To investigate the source of the discrepancies, the width of the distribution of the ratio of calorimeter energy to track momentum, the energies per layer and response in the hadronic calorimeter are also compared between data and simulation.


Introduction
The energetic proton-proton ($pp$) collisions produced by the Large Hadron Collider (LHC) predominantly result in a large number of charged and neutral hadrons which form collimated sprays known as jets.
In the ATLAS Experiment [1] the energy of these particles is measured using the calorimeters, and a tracking system measures the momentum of the charged particles. A particle-flow algorithm [2] is used to take advantage of these two systems in the reconstruction of jets. The measurement of hadronic energy deposits is essential for the reconstruction of jets, and the jet energy scale calibration relies on the accurate simulation of hadrons interacting with the calorimeter [3]. A powerful method of understanding the calorimeter response to hadrons is to consider the ratio of the energy reconstructed in the calorimeter to the momentum measured in the well-aligned tracking detector [4]. Previously, this was studied using isolated single hadrons from inclusive $pp$ collisions [5,6]. In this paper, the large dataset accumulated in Run 2 is exploited to select events with isolated charged pions from $\tau$-lepton decays so that a much higher energy regime can be probed.
The calorimeter response to electromagnetic particles such as electrons and photons is very well known, both because their showers are easier to simulate accurately and because precise in situ measurements are available from $Z\rightarrow ee$ events. The energy scale of electrons is therefore known with an uncertainty of less than 0.1% for transverse momenta $25 < p_\textrm{T} < 70$ GeV in the central region [7]. The large variety of physics processes in hadronic interactions and the non-compensating nature of the ATLAS calorimeters make the response to hadrons harder to model in simulation, motivating in situ measurements using highly energetic single pions from $\tau$-lepton decays.
In this paper the calorimeter response across a wide range of transverse momenta is investigated. $W$ bosons are produced copiously at the LHC and can decay via $\tau$-leptons to single charged pions, providing a source of isolated pions: $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$. Using the Run 2 dataset recorded in 2015-2018, large samples of pions are selected in order to accurately measure the calorimeter response using the ratio of calorimeter energy to track momentum. The measured response is corrected for the contributions from the multiple interactions per bunch crossing and the residual signals remaining in the calorimeter from adjacent bunch crossings, contributions collectively referred to as pile-up. The charged-pion response is measured as a function of track pseudorapidity, $\eta^\textrm{trk}$, across the tracker volume and as a function of track transverse momentum, $p_\textrm{T}^\textrm{trk}$, in the barrel and endcaps. The average particle energy in the highest $p_\textrm{T}^\textrm{trk}$ bin in the endcaps is 680 (670) GeV in data (simulated data).
Previous measurements of the single-hadron response using this ratio were performed using 2010 and 2012 data [5,6]. These measurements focused on special runs with low instantaneous luminosity and identified isolated tracks from minimum-bias events. As high-momentum isolated tracks are not frequently produced in pure QCD interactions due to the formation of jets which consist of multiple closely spaced particles, they probed the low-momentum phase space with $p < 30$ GeV. Test-beam measurements using the Super Proton Synchrotron (SPS) probed the response to hadrons with energies, $E$, up to 350 GeV [8][9][10][11][12]. For the combined, electromagnetic and hadronic, barrel calorimeter, the precision of these measurements ranged from 2.8% at $E = 20$ GeV to 1.4% at $E = 350$ GeV [12]. The uncertainties were dominated by the non-uniformity of the calorimeter, imperfect knowledge of the effect of the material in front of it, and, for positive beams, the fraction of protons. In this paper the response is measured in situ with a pure $\pi^{\pm}$ sample. Additionally, the latest ATLAS detector simulation with updated geometry [13] and with the latest set of hadronic shower simulation models (called the physics list [14]), as used in all recent physics analyses, is used in the comparisons of data with simulation.
Precise knowledge of the calorimeter response at high $p_\textrm{T}$ is essential for many ATLAS physics analyses. The uncertainty in the energy scale calibration of hadronic jets, the jet energy scale (JES) [3], is one of the primary experimental uncertainties in many searches for new physics and in measurements of Standard Model processes [15][16][17][18]. In situ techniques are used to derive corrections to the transverse momentum of jets to account for differences between data and simulation, as well as uncertainties in these corrections. For the highest-$p_\textrm{T}$ jets in the TeV regime, where in situ methods exploiting $p_\textrm{T}$ balance cannot be used due to a lack of data because of the low cross-section, the determination of the uncertainties in the momentum relies on measurements of the calorimeter energy scale from single hadrons. These are based on the low-momentum in situ measurements, test-beam results and conservative extrapolations, further motivating this measurement. Also, the calibration of hadronically decaying $\tau$-leptons relies on these measurements at high $p_\textrm{T}$ [19]. Additionally, measurements of the internal structure of jets rely on knowledge of the calorimeter energy scale [20]. Likewise, when tagging hadronically decaying heavy particles [21] the jet substructure variables that are used cannot be corrected easily for calorimeter scale discrepancies, so they depend on good modelling of the underlying calorimeter energy scale.
The paper is organised as follows. Section 2 introduces the ATLAS detector, and Section 3 describes the dataset and samples of simulated events. The selection of events for the measurement and the variable of interest are defined in Section 4. The energy response is evaluated in Section 5 and uncertainties in the measurements are evaluated in Section 6. The width of the distribution of the variable of interest is measured in Section 7 and the longitudinal segmentation of the calorimeters is exploited in Section 8 to investigate discrepancies between simulation and data. Finally, the conclusions are presented in Section 9.
Samples of simulated events created using Monte Carlo techniques are used to model the Standard Model processes. The main signal processes are $W(\rightarrow\tau\nu)$+jets and top pair production ($t\bar{t}$), with smaller signal contributions from single top production and small background contributions from $W(\rightarrow e\nu, \mu\nu)$+jets and $Z(\rightarrow\nu\nu)$+jets. Other processes are cross-checked with simulated event samples and found to be negligible.
The production of $W$+jets and $Z$+jets was simulated with the Sherpa 2.2.1 [30, 31] generator using matrix elements (MEs) with next-to-leading-order (NLO) accuracy for up to two jets, and matrix elements with leading-order accuracy for up to four jets, calculated with the Comix [32] and OpenLoops [33,34] libraries. They were matched with the Sherpa parton shower [35] using the MEPS@NLO prescription [36][37][38][39] with the set of tuned parameters (tune) developed by the Sherpa authors. All polarisation effects in $\tau$-lepton production were retained by using a full matrix element calculation, and were propagated to its decay in the HADRONS++ module, which implements intermediate hadronic resonances and spin-correlation effects. The NNPDF3.0 set of parton distribution functions (PDFs) [40] was used and the samples are normalised to a next-to-next-to-leading-order (NNLO) prediction [41].
The production of $t\bar{t}$ and single-top-quark events was modelled using the Powheg Box v2 [42][43][44][45] generator at NLO with the NNPDF3.0 [40] set of PDFs. These events were processed with Pythia 8.230 [46] using the A14 tune [47] and the NNPDF2.3 set of PDFs [48]. The decays of bottom and charm hadrons were modelled using the EvtGen 1.6.0 [49] program. In $t\bar{t}$ events the $h_\textrm{damp}$ parameter was set to $1.5\,m_\textrm{top}$ [50]. The $t\bar{t}$ process is normalised to the inclusive cross-section calculation at NNLO in QCD including the resummation of next-to-next-to-leading logarithmic (NNLL) soft-gluon terms from Top++ 2.0 [51][52][53][54][55][56][57]. The single-top-quark production processes are normalised to the inclusive cross-sections calculated at NLO in QCD with NNLL soft-gluon corrections [58,59]. For single-top-quark production in the $tW$ channel, the diagram removal scheme [60] was used to remove overlap with $t\bar{t}$ production. In all samples the top quark mass is 172.5 GeV.
To study the generator dependence of the analysis, an alternative sample of QCD $W$+jets events was simulated by MadGraph5_aMC@NLO 2.2.2 [61] using LO-accurate MEs with up to four final-state partons and the NNPDF3.0 set of PDFs. The events were interfaced to Pythia 8.186 [62], with the A14 tune using the NNPDF2.3 PDF set, for the modelling of the parton shower, hadronisation, and underlying event. The overlap between matrix element and parton shower emissions was removed using the CKKW-L merging procedure [63,64]. The decays of bottom and charm hadrons were performed by EvtGen 1.2.0 [49].
All samples were passed through a detailed simulation of the ATLAS detector [65] based on Geant4 [66]. The modelling of hadron interactions was done by the FTFP_BERT_ATL [14] physics list. In this physics list the Bertini cascade model [67] is used for hadrons with energy below 12 GeV, and the Fritiof string model [68,69] for hadrons with energy above 9 GeV. The probability of using each model changes smoothly across the region of overlap.
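The hand-over between the two hadronic models in the 9-12 GeV overlap can be pictured with a minimal sketch. The text only states that the probability changes smoothly; the linear interpolation used here, and the function names, are assumptions made for illustration and are not taken from the simulation code.

```python
def fritiof_probability(energy_gev, low=9.0, high=12.0):
    """Probability of choosing the Fritiof string model for a hadron.

    Below `low` only the Bertini cascade is used and above `high` only
    Fritiof. Inside the overlap the probability is interpolated linearly,
    which is an assumption of this sketch.
    """
    if energy_gev <= low:
        return 0.0
    if energy_gev >= high:
        return 1.0
    return (energy_gev - low) / (high - low)


def bertini_probability(energy_gev, low=9.0, high=12.0):
    """Complementary probability of choosing the Bertini cascade model."""
    return 1.0 - fritiof_probability(energy_gev, low, high)
```

With these assumptions a 10.5 GeV hadron would be handled by either model with equal probability.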
Furthermore, simulated inclusive inelastic $pp$ collisions were overlaid to model additional pile-up collisions in the same and neighbouring bunch crossings. These were generated with Pythia 8.210 [46] with the A3 tune [70] and NNPDF2.3 PDF set.

Event selection, observables and response extraction
Events are selected using the properties of reconstructed objects to obtain a high-purity sample of $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$ decays which are used to measure the calorimeter response. The calorimeter energy associated with the track is summed and a fit to the energy divided by the track momentum is performed to extract the calorimeter response. The reconstructed objects used to form the variable of interest are described as follows:
• Charged-pion tracks are reconstructed by an iterative track-finding algorithm seeded by measurements in the silicon layers of the inner detector [71,72]. The precise alignment of the inner detector ensures that these tracks are accurately reconstructed with a residual sagitta bias and momentum scale bias of less than $0.1\ \textrm{TeV}^{-1}$ and $0.9\times10^{-3}$, respectively [4].
• Calorimeter topoclusters are clusters of connected calorimeter cells throughout both the EM and hadronic calorimeters. They are seeded from cells with reconstructed energy significantly above the noise [73]. Their energy is the sum of the energies of the constituent cells, which are calibrated at the electromagnetic scale. This scale is defined such that the response to electromagnetic showers is correctly calibrated in all calorimeters. Due to the non-compensating nature of the ATLAS calorimeters, the response to hadronic showers is expected to be lower such that they will be under-calibrated. Single particles can form multiple topoclusters [2], so when measuring the response all clusters in both the EM and hadronic calorimeters within a region are summed.
Additionally, various other reconstructed particles and observables are used in the event selection:
• Electrons are reconstructed from tracks and calorimeter energy deposits. They are identified through a log-likelihood discriminant based on the track properties and the shower shape in the calorimeter. Electrons are identified with the Loose working point and require $p_\textrm{T} > 10$ GeV and $|\eta| < 2.47$ [74].
• Jets are reconstructed for the purpose of building the missing transverse momentum, $E_\textrm{T}^\textrm{miss}$. Particle-flow objects [2] are clustered using the anti-$k_t$ algorithm [77] with radius parameter $R = 0.4$ using the FastJet package [78] and are calibrated to the scale of jets built from the momentum of the stable interacting particles entering the simulation using the same algorithm. During the calibration sequence, corrections are applied to mitigate the effects of pile-up, and jets in data are corrected for residual differences between data and simulation measured in situ using various $p_\textrm{T}$-balance techniques [3]. Calibrated jets are required to have $p_\textrm{T} > 20$ GeV. Jets originating from pile-up are rejected using inner-detector information [79].
• $E_\textrm{T}^\textrm{miss}$ is reconstructed from the vector sum of the momenta of the hard objects described above and a soft term formed from tracks matched to the primary vertex (PV) of interest but not associated with any hard object [81].
The event selection is based on the above reconstructed particles and observables, and is designed to obtain a high-purity sample of $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$ events with the following properties:
1. $E_\textrm{T}^\textrm{miss} > 150$ GeV in addition to being selected by the missing transverse momentum trigger.
2. A leading track with $p_\textrm{T} > 10$ GeV and $|\eta| < 2.5$ satisfying the following criteria:
• The sum of the $p_\textrm{T}$ of other tracks, matched to the PV, within a cone of size $\Delta R = 0.3$ around the track is less than 2 GeV.
• The track's longitudinal impact parameter, measured relative to the PV, is less than 1.5 mm.
• The track's transverse impact parameter, measured relative to the beamline, is less than 0.5 mm.
• The track satisfies the tight criteria for hits in the silicon pixel and microstrip detectors [82].
• The $\chi^2$ per degree of freedom of the track fit is less than 1.5.
• Several TRT hits are used in the track fit if the track is within $|\eta| < 2.0$.
The first criterion ensures that the selected events are in the well-understood region of the $E_\textrm{T}^\textrm{miss}$ trigger efficiency where it is not rapidly changing with respect to $E_\textrm{T}^\textrm{miss}$. The tracking criteria select isolated tracks associated with the primary vertex that have a large number of hits to measure the momentum. The TRT hit requirements and the criterion for the $\chi^2$ per degree of freedom of the track fit ensure that the track reconstruction uses the large lever arm within the TRT volume and that the track fit is of good quality. The third, fourth and fifth criteria reject tracks that are formed from electrons, muons and converted photons. The final two criteria reject events where the track does not originate from a $\tau$-lepton; such backgrounds are only significant at low $p_\textrm{T}^\textrm{trk}$. They utilise the expected displacement of the track owing to the $\tau$-lepton lifetime and the expected upper bound on the transverse mass at the $W$ boson mass.
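The leading-track criteria listed above can be collected into a single predicate. The following is a minimal sketch only: the `Track` container and its field names are hypothetical, the impact parameters are compared in absolute value as an assumption, and `min_trt_hits` is left as a required argument because the text only says "several" TRT hits.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical container for the track quantities used in the selection."""
    pt: float                   # transverse momentum [GeV]
    eta: float                  # pseudorapidity
    iso_sum_pt: float           # summed pT of other PV-matched tracks within dR = 0.3 [GeV]
    z0: float                   # longitudinal impact parameter relative to the PV [mm]
    d0: float                   # transverse impact parameter relative to the beamline [mm]
    chi2_per_ndf: float         # chi2 per degree of freedom of the track fit
    passes_tight_silicon: bool  # tight pixel and microstrip hit criteria
    n_trt_hits: int             # TRT hits used in the track fit


def passes_track_selection(trk: Track, min_trt_hits: int) -> bool:
    """Apply the listed track criteria; min_trt_hits is a placeholder value."""
    if trk.pt <= 10.0 or abs(trk.eta) >= 2.5:
        return False
    if trk.iso_sum_pt >= 2.0:
        return False
    if abs(trk.z0) >= 1.5 or abs(trk.d0) >= 0.5:
        return False
    if not trk.passes_tight_silicon or trk.chi2_per_ndf >= 1.5:
        return False
    # The TRT requirement only applies to tracks within |eta| < 2.0.
    if abs(trk.eta) < 2.0 and trk.n_trt_hits < min_trt_hits:
        return False
    return True
```

The remaining event-level requirements (the missing-transverse-momentum threshold and the lepton and background vetoes) are not modelled in this sketch.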
After this selection the sample has a high purity of $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$ events, with the most significant background being $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\pi^{0}\nu_{\tau})\nu_{\tau}$ events. It is not expected that the event selection will bias the response significantly, and the selection criteria are varied to check that the results of the analysis are not sensitive to the specific selection criteria.
The summed calorimeter energy deposit is corrected for pile-up contributions. Following the methodology used in the calibration of jets, the measured median $p_\textrm{T}$ density in the $\eta$-$\phi$ plane of the event, $\rho$ [3,83], scaled by the area of the cone within which clusters are summed, $A = \pi \times 0.15^2$, is subtracted. After this subtraction there is a small residual pile-up dependence, which is subtracted as the product of the average number of interactions per bunch crossing, PU, and a gradient determined from simulation as a function of $|\eta|$. Only one residual correction is applied, compared to two in the calibration of jets, as the pile-up contributions are much smaller in this case due to the smaller area considered. This methodology is different from the previous single-hadron response measurement in a low pile-up environment [5, 6], where pile-up and physics backgrounds were both subtracted. In this analysis the pile-up is subtracted but the physics background is included in the functional fit.
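The two-step pile-up subtraction described above amounts to simple arithmetic, sketched here under stated assumptions: the function and argument names are illustrative, the gradient is passed in already evaluated at the track's pseudorapidity, and all inputs are taken to be in GeV.

```python
import math

CONE_RADIUS = 0.15                       # cone size within which clusters are summed
CONE_AREA = math.pi * CONE_RADIUS ** 2   # area used to scale the median density


def pileup_corrected_sum(raw_sum, rho, n_pu, gradient):
    """Subtract the area-based and residual pile-up terms from the cluster sum.

    raw_sum  : summed cluster energy around the track [GeV]
    rho      : median pT density of the event [GeV per unit area]
    n_pu     : average number of interactions per bunch crossing
    gradient : residual slope from simulation at the track |eta| [GeV per interaction]
    """
    return raw_sum - rho * CONE_AREA - gradient * n_pu
```

For example, with a density of 20 GeV per unit area the area term removes about 1.4 GeV, since the cone area is only about 0.07.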
Events are selected according to their $p_\textrm{T}^\textrm{trk}$ and $|\eta^\textrm{trk}|$ values to probe the different regions of phase space. Figure 1 shows the distribution of the variable of interest, $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$, for six of the ($p_\textrm{T}^\textrm{trk}$, $|\eta^\textrm{trk}|$) bins. Due to differences in the $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ scale and resolution between simulation and data, some clear discrepancies are seen in the data-to-MC ratio.
The primary background is from events with $\tau$-leptons which decay to a charged pion, $\pi^{\pm}$, and at least one neutral pion, $\pi^{0}$. In the calorimeter, more energy will be reconstructed from this process for the same track momentum due to the electromagnetic shower from the $\pi^{0}$, which predominantly decays as $\pi^{0}\rightarrow\gamma\gamma$. This results in an upper tail in the $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ distribution. Other single-charged-particle $\tau$-lepton decays, such as $\tau^{\pm}\rightarrow K^{\pm}\nu_{\tau}$, also contribute at a lower level.
The aim of this analysis is to extract the mean and width of a Gaussian function fitted to the $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$ signal in the core of the $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ distribution to probe the calorimeter response and resolution. To extract this, the distribution is fitted with the sum of a Gaussian function and another function that describes the background from all processes other than $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$. The functional form of this second term is taken to be a Landau function [84]. This is chosen empirically as it is seen to accurately capture the shape of the background. The range of the fit extends from the mean, $\mu_{\pi^{\pm}}$, of a Gaussian function describing $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ for signal in simulation minus 1.5 times its width, $\sigma_{\pi^{\pm}}$, up to 2, i.e. $[\mu_{\pi^{\pm}} - 1.5\,\sigma_{\pi^{\pm}},\ 2]$. The combined Gaussian+Landau function is fitted to the distribution in data and simulation. In both cases the shape of the Landau function is fixed from a fit to the background processes in simulation. The three parameters of the Gaussian signal and the normalisation of the Landau function are allowed to float in the final fit. The mean of the Gaussian function gives the calorimeter response scale and the width of the Gaussian function measures the resolution of $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$, which contains components from both the calorimeter and track reconstruction. Figure 1 shows the functions fitted to data and simulation in six of the ($p_\textrm{T}^\textrm{trk}$, $|\eta^\textrm{trk}|$) bins, and the extracted mean and width of the Gaussian function are displayed.
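The structure of the fit model and of the fit window can be written down compactly. This is a sketch under stated assumptions rather than the analysis code: the Landau function has no closed form, so the Moyal density is substituted here as a closed-form approximation to its shape, the function and parameter names are illustrative, and no minimisation is performed.

```python
import math


def gaussian(x, norm, mean, width):
    """Gaussian signal term; norm, mean and width float in the final fit."""
    z = (x - mean) / width
    return norm * math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))


def moyal(x, norm, loc, scale):
    """Moyal density, used here as a stand-in for the Landau background shape."""
    z = (x - loc) / scale
    return norm * math.exp(-0.5 * (z + math.exp(-z))) / (scale * math.sqrt(2.0 * math.pi))


def signal_plus_background(x, sig_norm, sig_mean, sig_width,
                           bkg_norm, bkg_loc, bkg_scale):
    """Sum of the signal and background terms evaluated at x.

    In the procedure described above, bkg_loc and bkg_scale would be fixed
    from simulation and only bkg_norm would float alongside the Gaussian.
    """
    return (gaussian(x, sig_norm, sig_mean, sig_width)
            + moyal(x, bkg_norm, bkg_loc, bkg_scale))


def fit_range(sim_mean, sim_width, n_widths=1.5, upper=2.0):
    """Fit window: from the simulated signal mean minus 1.5 widths up to 2."""
    return (sim_mean - n_widths * sim_width, upper)
```

For a simulated signal with mean 0.7 and width 0.1 (illustrative numbers only), the window would start at 0.55, so the lower tail of the distribution is excluded while the upper tail populated by the background is kept.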

Calorimeter response measurement and comparison with simulation
The data are binned in both the track $p_\textrm{T}$ and track $|\eta|$ to probe the calorimeter response at different momenta and in different regions. The calorimeter response is expected to vary significantly across $|\eta|$ due to the geometry and different calorimeter technologies. In the central region there is a large region which is expected to have a uniform response due to similar calorimeter structures up to $|\eta| = 0.7$. As the rejection of muons is more difficult at $|\eta| \sim 0$, a barrel selection region is defined by $0.1 < |\eta| < 0.7$. In this region there is a high-granularity liquid-argon/lead electromagnetic calorimeter and a steel/scintillator hadronic calorimeter. Another uniform detector region exists in the endcap region of $1.8 < |\eta| < 2.4$, where there is liquid-argon/lead electromagnetic calorimetry and liquid-argon/copper hadronic calorimetry. Due to differences in technology and material in the two regions, differences in the modelling might be expected. To study the response across the detector, results are shown for three $p_\textrm{T}^\textrm{trk}$ bins: $30 < p_\textrm{T}^\textrm{trk} < 50$ GeV, $50 < p_\textrm{T}^\textrm{trk} < 70$ GeV and $70 < p_\textrm{T}^\textrm{trk} < 100$ GeV, as a function of $|\eta^\textrm{trk}|$. Figure 2 shows the fitted response for the barrel and endcap regions defined above, as a function of the track $p_\textrm{T}$. Clear differences are seen between the data and simulation in both regions, with a ∼2% overestimate of the response in simulation in the barrel and a ∼4% underestimate of the response in simulation in the endcaps relative to the data. The data/MC differences also follow the same trends as seen in the jet response in these two regions [3]. It is checked that the fits give a reasonable $\chi^2$ per degree of freedom and are stable when the fitting range and binning of the data are altered. These plots only show the uncertainties from the limited number of events in the samples, and systematic uncertainties are discussed in the next section.
At the highest energies the steeply falling momentum spectrum and the degrading track momentum resolution produce a bias in the measured track momentum compared to the true pion transverse momentum, $p_\textrm{T}^\textrm{true}$, resulting in the distribution flattening in both data and simulation. The biases in simulation from $p_\textrm{T}^\textrm{trk}/p_\textrm{T}^\textrm{true}$ in the central (endcap) region for the highest three $p_\textrm{T}^\textrm{trk}$ bins are [0.2%, 0.8%, 1.7%] ([1.7%, 4.7%, 10.8%]). Figure 3 shows the fitted response across the detector in fine $|\eta^\textrm{trk}|$ bins for the three different $p_\textrm{T}^\textrm{trk}$ bins which have the most events. The calorimeter structure is clearly seen and the simulation follows the data, but some differences are present. The largest disagreements are ∼5% in the $1.0 < |\eta^\textrm{trk}| < 1.2$ region. When the two sides of the detector are analysed independently the same trends are seen in the response. Additionally, when the analysis is performed separately for positively and negatively charged tracks consistent results are seen. In this energy regime the responses to $\pi^{+}$ and $\pi^{-}$ are expected to be very similar, and this check also tests the alignment of the inner detector as alignment errors would have different effects on positive and negative tracks.

Fit function (closure): A Gaussian+Landau fit to the signal and background is used to extract the calorimeter response scale. Any bias introduced by this functional form is quantified in simulation by the difference between the fitted mean obtained from a Gaussian fit to just the signal $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\nu_{\tau})\nu_{\tau}$ events and the fitted mean of the Gaussian function obtained from a combined Gaussian+Landau fit to the simulated signal and background. This closure test is seen to perform very well, showing that the Landau function choice is appropriate. The difference between the two values obtained is taken as a systematic uncertainty of the measurement. It is found to be subdominant and typically at the ∼0.2% level.
Bias in the fitted background shape: The shape of the background is taken from a fit to simulation, which could be biased relative to the shape in data. The background is primarily $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}\pi^{0}\nu_{\tau})\nu_{\tau}$ events. Imperfections in the simulation of the shape of this background can come from detector effects such as the energy response to the charged and neutral pions, or from the energy spectra of the charged and neutral pions as modelled by the MC generator. The energy scale and resolution for electromagnetic showers, such as those from $\pi^{0}\rightarrow\gamma\gamma$ decays, have been measured in ATLAS [74] and are known better than those for the hadronic showers probed here. The impact of detector mismodelling is tested by shifting or smearing the simulated response and evaluating the change in the fit when it is applied to data with the altered background shape. The hadronic response is shifted down (up) by 3% (5%) in the central (endcap) region, and separately smeared to increase the resolution by 10%. These values are determined from the discrepancies between data and simulation observed in this analysis. The smearing in the central region, and scale variation in the endcaps, produce significant uncertainties in the final results. To assess the generator modelling of the shape, an alternative generator is used for the $W$+jets background; no significant impact on the shape is seen, so this is neglected. Therefore, the uncertainty in the shape of the fitted background is dominated by the detector modelling.
Electron and muon contamination: Contamination from electrons and muons is expected to be small: less than 0.4% and 0.7% respectively. The residual contamination is tested by tightening the criteria to reject these events: requiring the fraction of energy in the hadronic calorimeters to be >20%, and that there are no associated muon segments. Changes in the measured response in data define the uncertainty for muons. For electrons the difference between the changes in the measured response in data and simulation defines the uncertainty because the hadronic energy fraction cut can also bias the signal. Both of these uncertainties are found to be smaller than the statistical uncertainties of the measurement.

Non-$\tau$-lepton backgrounds: There is a small amount of background where the track does not originate from a $\tau$-lepton. The $Z(\rightarrow\nu\nu)$+jets contamination can be reduced by a tighter selection on the transverse mass, $m_\textrm{T}$, of the track and the $E_\textrm{T}^\textrm{miss}$. The deviation from the nominal response when the upper bound on $m_\textrm{T}$ is reduced by 10 GeV is taken as an uncertainty in this background. Additionally, the deviation from the nominal response when events with $m_\textrm{T} < 10$ GeV are excluded is taken as an uncertainty in the potential contamination from events where the $E_\textrm{T}^\textrm{miss}$ comes from detector effects. These are also found to be smaller than the statistical uncertainties.
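The transverse mass of the track and the missing transverse momentum is not written out in the text. A minimal sketch using the standard two-body definition is given below; the function name and the massless-particle treatment are assumptions of this illustration.

```python
import math


def transverse_mass(track_pt, met, delta_phi):
    """Standard transverse mass of a track and the missing transverse momentum.

    track_pt  : track transverse momentum [GeV]
    met       : magnitude of the missing transverse momentum [GeV]
    delta_phi : azimuthal angle between the track and the missing momentum [rad]
    """
    return math.sqrt(2.0 * track_pt * met * (1.0 - math.cos(delta_phi)))
```

The quantity is largest when the two objects are back to back in azimuth and vanishes when they are collinear, which is why a window in this variable can separate the backgrounds discussed above.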
Pile-up uncertainties: Two uncertainties in the subtraction of the contributions from pile-up are considered. Non-linearities in the required corrections can lead to imperfections in the removal of the effects of pile-up. The closure of the pile-up corrections is tested by taking half the difference between the responses for two large single-particle samples with different pile-up, PU ∼ 20 and PU ∼ 50, after the corrections. Additionally, the pile-up residual correction is determined from simulation, where pile-up might not be well modelled. Studies of the PU dependence of the median $p_\textrm{T}$ density of the event, $\rho$, the number of clusters, and the cell energy and noise find that the modelling of the energy flow in the simulation is within 10%-20% of the data across the detector. Therefore, 25% of the residual correction is taken as an uncertainty in the modelling of pile-up, which forms a minor uncertainty in the analysis. In data it is checked that the fitted response is stable with respect to pile-up by splitting the dataset into two independent datasets via high pile-up and low pile-up selections based on PU, and also into three independent datasets; no significant differences, given the statistical uncertainties, are seen in the measured response.
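The two pile-up uncertainties reduce to simple arithmetic on quantities already determined. The sketch below uses illustrative function names; the inputs are the fitted responses in the two pile-up samples and the size of the residual correction.

```python
def pileup_closure_uncertainty(response_low_pu, response_high_pu):
    """Half the difference between the corrected responses of the two samples."""
    return 0.5 * abs(response_high_pu - response_low_pu)


def pileup_modelling_uncertainty(residual_correction, fraction=0.25):
    """Fraction (25% in the text) of the residual correction taken as uncertainty."""
    return fraction * abs(residual_correction)
```

As a purely illustrative example, responses of 0.700 and 0.704 in the two samples would give a closure uncertainty of 0.002.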
Tracking uncertainties: Any bias in the momentum scale of the reconstructed tracks directly affects the measurement. Muons are precisely calibrated using $Z\rightarrow\mu\mu$ and $J/\psi\rightarrow\mu\mu$ events. The analysis selection is modified by removing the $m_\textrm{T}$ and transverse impact parameter requirements and requiring that a reconstructed muon is matched to the isolated track. The $p_\textrm{T}$ of the track and that of the muon are correlated, but fits to $p_\textrm{T}^\textrm{trk}/p_\textrm{T}^{\mu}$ probe the inner-detector track scale relative to the calibrated muons. Two components are considered, the uncertainty in the calibration of muons and the statistical uncertainties of this cross-calibration, as none of the deviations of simulation from data are significant. Additionally, differences in the track resolution can bias the measurement because of the underlying falling $p_\textrm{T}^\textrm{trk}$ spectrum. The uncertainty in the momentum resolution of tracks is determined using the muons as described in Ref. [85]. Tracks are smeared to increase the resolution in simulation within the uncertainties and the impact on the parameter of interest is symmetrised and taken as an uncertainty.
In the central region the uncertainties are ${\lesssim}1\%$ for $15<p_\textrm{T}<185$ GeV, demonstrating the power of this in situ technique. In the endcaps the uncertainties are slightly larger but are still at the 1% level for $30 < p_\textrm{T} < 100$ GeV. Figure 5 shows the uncertainties in the measurements as a function of $|\eta^\textrm{trk}|$. The uncertainties are slightly larger due to the limited number of events in the smaller bins used to make fine calorimeter structures visible, but are otherwise dominated by the same sources as in the measurements as a function of $p_\textrm{T}$.
Having established the full uncertainties in the measurement, a calibration can be derived which can form an input to the high-$p_\textrm{T}$ jet uncertainties [3] and also to the scale uncertainties for hadronic $\tau$-lepton decays [19]. Figure 6 shows the measured data to simulation ratio of the response and its uncertainty in different $p_\textrm{T}^\textrm{trk}$ bins in the central and endcap regions. It is expected that such a calibration will be smooth across $p_\textrm{T}$. The same procedure as used for the jet energy scale is used [3,87] to translate the binned measurements and uncertainties into smooth uncertainty eigenvectors and a smooth calibration curve. Each of the measurements is divided into finer bins of 0.1 GeV using second-order polynomial splines. The final calibration curve is determined by smoothing the measurements with a Gaussian kernel of varying width. Each individual uncertainty component is propagated through the same procedure after varying it by $1\sigma$. The difference between the calibration curve with the shifted systematic uncertainty input and the nominal calibration curve is taken as $1\sigma$ in the varied uncertainty. Each uncertainty source is treated as correlated across $p_\textrm{T}$ and uncorrelated with all other sources of uncertainty. This smoothing procedure reduces statistical fluctuations in the central values and in each propagated uncertainty component. Therefore, the final uncertainties are slightly smaller than those for the corresponding individual measurement points. The resulting smooth calibration curve and uncertainty band are shown in Figure 6. Figure 7 shows the data/MC ratio as a function of $|\eta^\textrm{trk}|$ for the three measured $p_\textrm{T}$ ranges along with the associated uncertainties. As the changes in detector technology can result in sharp features in the calibration as a function of $|\eta^\textrm{trk}|$, a smooth calibration curve is not derived for these results.
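The smoothing and the propagation of a single uncertainty component can be illustrated with a simplified sketch. It departs from the procedure described above in ways that should be kept in mind: the kernel width is fixed rather than varying, the spline rebinning into 0.1 GeV bins is omitted, and all names are illustrative.

```python
import math


def gaussian_kernel_smooth(x_points, y_points, x_eval, width):
    """Kernel-weighted average of the binned measurements at x_eval.

    A fixed kernel width is used, which is a simplification of this sketch.
    """
    weights = [math.exp(-0.5 * ((x_eval - x) / width) ** 2) for x in x_points]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_points)) / total


def smoothed_uncertainty(x_points, nominal, one_sigma_shift, x_eval, width):
    """Propagate one uncertainty component through the smoothing.

    The inputs are shifted coherently by one standard deviation in every bin
    (treating the source as correlated across bins), smoothed, and compared
    with the smoothed nominal curve.
    """
    shifted = [n + s for n, s in zip(nominal, one_sigma_shift)]
    return (gaussian_kernel_smooth(x_points, shifted, x_eval, width)
            - gaussian_kernel_smooth(x_points, nominal, x_eval, width))
```

Components propagated in this way could then be combined in quadrature, since each source is treated as uncorrelated with the others.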

Width of the $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ distribution
The width of the $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ distribution probes the resolution of both the tracker and the calorimeter; however, the tracking component is relatively small outside of the highest energy bins. The width will also contain components from the modelling of pile-up noise falling into the region of the calorimeter considered in the measurement. These can be reliably corrected for in measurements of the scale, but the effect on the resolution is much harder to mitigate, so the level of agreement between simulation and data, which both include these effects, is considered. The increase in the width of $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ due to the effects of pile-up is measured by looking at the difference between the width in two simulated samples with PU ∼ 20 and PU ∼ 50. For the central (endcap) region the difference is found to be 8% (16%) for $16 < p_\textrm{T}^\textrm{trk} < 20$ GeV and 3% (12%) for $30 < p_\textrm{T}^\textrm{trk} < 150$ GeV. The impact of mismodelling the pile-up is expected to be small for most of the $p_\textrm{T}^\textrm{trk}$ spectrum since the simulation includes pile-up and should capture most of the effect. Figure 8 shows the relative width of the $p_\textrm{T}^\textrm{EM}/p_\textrm{T}^\textrm{trk}$ distribution as a function of track $p_\textrm{T}$ in the central and endcap regions. In the central region the simulation shows about 10% better resolution than the data, while in the endcaps the level of agreement is generally better. The relative width across $|\eta|$ is shown for three $p_\textrm{T}^\textrm{trk}$ bins in Figure 9.
Figure 7: The measured data to simulation ratio of the response as a function of $|\eta^\textrm{trk}|$ for the three $p_\textrm{T}^\textrm{trk}$ ranges: $30 < p_\textrm{T}^\textrm{trk} < 50$ GeV, $50 < p_\textrm{T}^\textrm{trk} < 70$ GeV and $70 < p_\textrm{T}^\textrm{trk} < 100$ GeV. Inner error bars represent uncertainties from limited sample size and the outer error bars give the total uncertainty including both the statistical errors and systematic uncertainties.
In both figures the resolution for tracks selected in simulation is found to contribute significantly at high momentum, but across much of the spectrum it is significantly smaller than the calorimeter contributions, indicating that the discrepancies between simulation and data are more likely due to the simulation of the calorimeter response.
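To illustrate why a small track contribution leaves the total width dominated by the calorimeter, the sketch below assumes that the tracking and calorimeter resolutions are independent and add in quadrature. This assumption, the function name `calorimeter_rel_width` and the input numbers are illustrative only and are not taken from the measurement.

```python
import math


def calorimeter_rel_width(total_rel_width, track_rel_width):
    """Calorimeter-only relative width, assuming the tracker and calorimeter
    contributions are independent and add in quadrature (an assumption)."""
    diff = total_rel_width ** 2 - track_rel_width ** 2
    if diff < 0:
        raise ValueError("track width exceeds total width")
    return math.sqrt(diff)


# Hypothetical relative widths, not values from the paper.
total, track = 0.15, 0.03
calo = calorimeter_rel_width(total, track)
print(f"calorimeter-only relative width: {calo:.3f}")
```

With these hypothetical inputs the calorimeter-only width is about 0.147, within 2% of the total, so under this assumption a track contribution that is five times smaller than the total width changes the result very little.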
In the central region where the statistical uncertainties are lower, $30 < p_{\textrm{T}}^{\textrm{trk}} < 150$ GeV, it is possible to check if any of the systematic effects considered in Section 6 could be the source of the discrepancies. The closure of the fits to obtain the width of the signal component is very good, with differences only at the level of 2% at $p_{\textrm{T}}^{\textrm{trk}}$ of 30 GeV and 1% above 50 GeV. Tightening the selections to reject electrons and muons has a minimal effect on the measured resolution, 0.5%. Shifting the response scale of the background and smearing the background response are each seen to change the resolution in data by 1% in this region of phase space. Tightening the criteria to reject non-lepton backgrounds also results in small changes of 0.5% beyond 30 GeV. Therefore, resolution discrepancies of ∼10% between data and simulation are significant in this $p_{\textrm{T}}^{\textrm{trk}}$ range compared to the potential systematic effects on the extraction of the $E_{\textrm{T}}^{\textrm{EM}}/p_{\textrm{T}}^{\textrm{trk}}$ width. Below 30 GeV the fit closure grows to up to 6% and the uncertainties related to the fitted background shape and the non-lepton backgrounds all increase to 1.5%, so that at low momentum the discrepancies are less significant.
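As a rough arithmetic check of this comparison, the effects quoted above for $p_{\textrm{T}}^{\textrm{trk}}$ near 30 GeV can be combined and set against the ∼10% discrepancy. The text does not state how the effects are combined; the quadrature sum below is an assumption, and the linear sum is shown as a conservative bound. The dictionary keys and the helper `quadrature_sum` are hypothetical names.

```python
import math

# Effects (in %) on the fitted width quoted in the text for the central
# region at a track pT of about 30 GeV.
effects_percent = {
    "fit closure": 2.0,
    "electron and muon rejection": 0.5,
    "background response scale": 1.0,
    "background response smearing": 1.0,
    "non-lepton background rejection": 0.5,
}


def quadrature_sum(values):
    """Combine effects assuming they are uncorrelated (an assumption)."""
    return math.sqrt(sum(v * v for v in values))


combined = quadrature_sum(effects_percent.values())
linear = sum(effects_percent.values())
discrepancy_percent = 10.0  # approximate data/simulation difference

print(f"quadrature sum: {combined:.2f}%, linear sum: {linear:.1f}%")
print(f"discrepancy / quadrature sum: {discrepancy_percent / combined:.1f}")
```

Under either combination the quoted effects remain well below the ∼10% difference between data and simulation.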
This supports the observation in the jet resolution measurement using $p_\textrm{T}$ balance in dijet events that the resolution in simulation is slightly superior to that in data [3] at medium to high $p_\textrm{T}$ and this is therefore an area where it is desirable to improve the simulation.

Figure 9: The fitted width divided by the mean of the signal $E_{\textrm{T}}^{\textrm{EM}}/p_{\textrm{T}}^{\textrm{trk}}$ distribution as a function of the track $|\eta|$ in three $p_{\textrm{T}}^{\textrm{trk}}$ ranges: $30 < p_{\textrm{T}}^{\textrm{trk}} < 50$ GeV, $50 < p_{\textrm{T}}^{\textrm{trk}} < 70$ GeV and $70 < p_{\textrm{T}}^{\textrm{trk}} < 100$ GeV. Also shown is the fitted width of the track $p_\textrm{T}$ divided by the generator-level $p_\textrm{T}$ for true $\tau\rightarrow\pi^{\pm}\nu_{\tau}$ events in simulation to give an illustration of the contribution to the total width from the resolution of reconstructed tracks. The uncertainties shown are only those from the limited number of events in the dataset and simulated samples.

Measurements of the longitudinal energy profile and late showering particles
The longitudinal segmentation of the calorimeter into layers can be used to gain insight into the cause of the discrepancies in the scale and resolution between data and simulation. Backgrounds can bias any measurements of average energies deposited in a single layer. In the electromagnetic calorimeter the main background, $\tau^{\pm}\rightarrow\pi^{\pm}+\pi^{0}+\nu_{\tau}$, will result in significant biases as the $\pi^{0}\rightarrow\gamma\gamma$ showers will be contained within the electromagnetic calorimeter. Therefore, only the energy deposited in the hadronic calorimeter layers is considered, although these measurements are still slightly biased by this background through selection effects because the events are required to pass the hadronic energy fraction requirement. The hadronic calorimeter is less affected by pile-up since low-momentum hadrons are often contained within the electromagnetic calorimeter in addition to the pile-up particles that produce electromagnetic showers. Figure 10 shows the average energy deposited in each layer of the hadronic calorimeter divided by the track momentum in the barrel and endcap regions as a function of the track $p_\textrm{T}$. Events are required to have $0.2 < E_{\textrm{T}}^{\textrm{EM}}/p_{\textrm{T}}^{\textrm{trk}} < 1.1$ to select events across the full response and also minimise the contribution from backgrounds. An arithmetic mean of all the selected events is used because the distribution of the energy in an individual layer is not expected to have a Gaussian shape. It can be seen that in the barrel the simulation overestimates the relative energy in the hadronic calorimeter by 3-5% across the $p_\textrm{T}$ spectrum and this is the case in all layers, although most of the energy is contained within the first two layers. In the endcap region the amount of energy in the hadronic layers is slightly underestimated overall. This is driven by an underestimation of the energy in the first layer.
Both of these features are in line with the results in Section 5; however, the magnitude of the difference in the energy deposited does not explain the size of the effect seen in the total response, indicating that there is mismodelling either in both the electromagnetic and hadronic calorimeters or in the length of the shower, or in a combination of these or other effects. The results are seen to be largely unaffected by pile-up. These results can be used for the tuning of the detector simulation.
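The per-layer average described above, an arithmetic mean of the layer energy divided by the track momentum over events passing the response window, can be sketched as follows. The event format (dictionaries with the keys `p_trk`, `e_total` and `e_had_layers`) and the function name `mean_layer_fractions` are hypothetical and do not correspond to the ATLAS software; the sketch also ignores the other selection requirements of the analysis.

```python
def mean_layer_fractions(events, ep_min=0.2, ep_max=1.1):
    """Arithmetic mean of E_layer / p_trk for each hadronic layer.

    events: list of dicts with hypothetical keys
      'p_trk'        track momentum (GeV)
      'e_total'      total calorimeter energy matched to the track (GeV)
      'e_had_layers' list of energies per hadronic layer (GeV)
    Only events with ep_min < e_total / p_trk < ep_max are used.
    """
    sums = None
    n_selected = 0
    for ev in events:
        response = ev["e_total"] / ev["p_trk"]
        if not (ep_min < response < ep_max):
            continue
        fractions = [e / ev["p_trk"] for e in ev["e_had_layers"]]
        if sums is None:
            sums = [0.0] * len(fractions)
        sums = [s + f for s, f in zip(sums, fractions)]
        n_selected += 1
    if n_selected == 0:
        return []
    return [s / n_selected for s in sums]


# Hypothetical events for illustration only.
events = [
    {"p_trk": 50.0, "e_total": 40.0, "e_had_layers": [10.0, 5.0, 1.0]},
    {"p_trk": 100.0, "e_total": 60.0, "e_had_layers": [30.0, 10.0, 0.0]},
    {"p_trk": 50.0, "e_total": 5.0, "e_had_layers": [1.0, 1.0, 1.0]},  # fails window
]
print(mean_layer_fractions(events))
```

A plain mean is used rather than a fitted peak position because, as stated in the text, the energy in a single layer is not expected to follow a Gaussian shape.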
A second method of using the longitudinal segmentation to probe the sources of the discrepancies is to select events where the pion traverses the electromagnetic calorimeter as a minimally ionising particle before showering in the hadronic calorimeter layers. Such events are characterised by having a very large fraction of their energy in the hadronic calorimeter. Selecting events in which more than 85% of the summed cluster energy is in the hadronic calorimeters, had > 85%, allows the data-to-MC response ratio to be probed in the hadronic calorimeters. Figure 11 shows the fitted response after applying this selection criterion in the barrel and endcap regions. In the barrel region the data-to-MC response ratio is seen to be similar to the inclusive response ratio; however, larger discrepancies are seen in the endcaps after this selection, with the difference between data and simulation increasing from ∼4% to ∼8%. This suggests that the simulation models hadronic showers in the hadronic calorimeter less accurately in the endcaps than in the inclusive case.
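The selection of late-showering pions can be sketched in the same spirit. The event format (lists of cluster energies under the hypothetical keys `e_em` and `e_had`) and the function names are assumptions made for illustration; only the 85% threshold comes from the text.

```python
def hadronic_fraction(e_em_clusters, e_had_clusters):
    """Fraction of the summed cluster energy found in the hadronic calorimeter."""
    e_had = sum(e_had_clusters)
    e_total = sum(e_em_clusters) + e_had
    return e_had / e_total if e_total > 0 else 0.0


def select_late_showering(events, threshold=0.85):
    """Keep events whose hadronic energy fraction exceeds the threshold."""
    return [ev for ev in events
            if hadronic_fraction(ev["e_em"], ev["e_had"]) > threshold]


# Hypothetical events: cluster energies in GeV.
events = [
    {"name": "late shower", "e_em": [1.0], "e_had": [6.0, 3.0]},   # fraction 0.9
    {"name": "early shower", "e_em": [5.0], "e_had": [4.0, 1.0]},  # fraction 0.5
]
selected = select_late_showering(events)
print([ev["name"] for ev in selected])
```

The response fit would then be repeated on the selected subset in data and in simulation to form the data-to-MC ratio for showers that develop mostly in the hadronic calorimeter.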

Conclusion
The energy response of the ATLAS calorimeter has been measured across a wide range of pion momenta with high precision using an innovative technique based on isolated charged pions from the process $W^{\pm}\rightarrow\tau^{\pm}(\rightarrow\pi^{\pm}+\nu_{\tau})\nu_{\tau}$. The calorimeter response is observed to be overestimated by ∼2% in the central region and underestimated by ∼4% in the endcaps in the ATLAS simulation. This supports the observations in the measurement of the jet energy scale, which show a similar structure across the calorimeter [3]. The precision of the measurement of the response is ${\lesssim}1\%$ for $15 < p_\textrm{T} < 185$ GeV ($20 < p_\textrm{T} < 100$ GeV) in the central (endcap) regions, and is <0.6% in the most precise region, $20 < p_\textrm{T} < 120$ GeV in the barrel.
The width of the $E_{\textrm{T}}^{\textrm{EM}}/p_{\textrm{T}}^{\textrm{trk}}$ distribution has also been investigated. This measurement convolves the tracking and calorimeter resolutions. For this quantity the simulation is found to agree with the data to within 15%; however, the data are generally observed to have a wider distribution. Finally, the average energy in the layers of the hadronic calorimeter was probed as a method of investigating the cause of the discrepancies between the simulation and the data. These measurements will be used in the tuning of the ATLAS simulation to further improve the hadron shower modelling and detector geometry.
This powerful new method of measuring the hadronic energy response of the ATLAS calorimeter can achieve a precision of 1% for the response of the calorimeter to charged pions. This detailed understanding of the response of the calorimeter can be used in gaining better understanding of the jet energy scale, its uncertainty for the highest-$p_\textrm{T}$ jets, the energy scale of hadronically decaying $\tau$-leptons, and in measuring the properties of jets.