Studies of the muon momentum calibration and performance of the ATLAS detector with $pp$ collisions at $\sqrt{s}=13$ TeV

This paper presents the muon momentum calibration and performance studies for the ATLAS detector based on the $pp$ collision data sample produced at $\sqrt{s} = 13$ TeV at the LHC during Run 2 and corresponding to an integrated luminosity of 139 fb$^{-1}$. An innovative approach is used to correct for potential charge-dependent momentum biases related to the knowledge of the detector geometry, using the $Z\rightarrow \mu^{+}\mu^{-}$ resonance. The muon momentum scale and resolution are measured using samples of $J/\psi \rightarrow \mu^{+}\mu^{-}$ and $Z\rightarrow \mu^{+}\mu^{-}$ events. A calibration procedure is defined and applied to simulated data to match the performance measured in real data. The calibration is validated using an independent sample of $\Upsilon \rightarrow \mu^{+}\mu^{-}$ events. At the $Z$ ($J/\psi$) peak, the momentum scale is measured with an uncertainty at the 0.05% (0.1%) level, and the resolution is measured with an uncertainty at the 1.5% (2%) level. The charge-dependent bias is removed with a dedicated in situ correction for momenta up to 450 GeV with a precision better than 0.03 TeV$^{-1}$.


Introduction
Muons are a crucial component of the physics programme of the ATLAS experiment [1] at the Large Hadron Collider (LHC) [2]. The discovery of the Higgs boson [3,4] and the measurement of its properties [5], precision tests of the Standard Model [6], and searches for physics processes beyond the Standard Model [7,8] all strongly rely on the performance of muon identification and measurement with the ATLAS detector.
During its second data-taking campaign (Run 2; 2015-2018), ATLAS collected 139 fb$^{-1}$ of $pp$ collisions at a centre-of-mass energy of $\sqrt{s} = 13$ TeV, a data sample of unprecedented size. A recent publication presented the performance of the new and optimised muon reconstruction and identification techniques developed for the analysis of the full Run 2 data sample [9]. The muon momentum measurement was previously published for early Run 2 data, corresponding to 3.2 fb$^{-1}$ collected in 2015 [10].
This paper describes the methods used to calibrate the momentum measurement of muons reconstructed by the ATLAS detector for the full Run 2 data sample. First, the potential charge-dependent bias on the scale of the muon momentum measurement, introduced by the imperfect knowledge of the real detector geometry, is measured in reconstructed collision data with a sample of $Z \rightarrow \mu\mu$ events.$^{1}$ A dedicated correction is derived and applied to the data. Then, $J/\psi \rightarrow \mu\mu$ and $Z \rightarrow \mu\mu$ decays are selected in data and used to measure the resolution and scale of the muon momentum, which are compared with those predicted by simulation. A calibration procedure is applied to simulated events to improve the agreement between the simulation and the data. Finally, this calibration is validated using $J/\psi \rightarrow \mu\mu$ and $Z \rightarrow \mu\mu$ events, in addition to an independent sample of $\Upsilon \rightarrow \mu\mu$ decays.
$^{1}$ Charges of decay products are omitted in the following for simplicity.
In contrast to previous publications, this paper presents a completely new methodology to determine the charge-dependent bias, sensitive to global shifts in the detectors' positions, which could not be measured with previous techniques. Furthermore, the inclusion of new $J/\psi \rightarrow \mu\mu$ data collected with dedicated trigger strategies, the development of new fitting techniques with better convergence properties, and the significantly larger data sample result in an unprecedented precision of the momentum calibration procedure. In addition to the calibrations developed separately for the tracks measured in the inner detector and in the muon spectrometer of the ATLAS detector, for the first time a dedicated calibration for tracks obtained by combining information from both sub-detectors is derived. The validation of the calibration procedure is performed using $\Upsilon \rightarrow \mu\mu$ events as well as $J/\psi \rightarrow \mu\mu$ and $Z \rightarrow \mu\mu$ events in regions with finer granularity than those used in the calibration procedure.
This paper is structured as follows: the ATLAS detector is described in Section 2; the simulated and real data samples used for the measurements are presented in Section 3; the identification of muon candidates is discussed in Section 4; the methodologies for measuring the muon scale and momentum corrections are presented in Section 5; the results of the measurement and the validation of the data-driven corrections derived for simulated samples are presented in Section 6; final conclusions are provided in Section 7.

ATLAS detector
The ATLAS detector [1] at the LHC covers nearly the entire solid angle around the collision point.$^{2}$ It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadron calorimeters, and a muon spectrometer incorporating three large superconducting air-core toroidal magnets. The inner detector system (ID) is immersed in a 2 T axial magnetic field (bending the tracks of charged particles in the transverse plane) and provides charged-particle tracking in the range of $|\eta| < 2.5$. The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track. The first hit is normally found in the insertable B-layer (IBL) installed before Run 2 [11,12]. It is followed by the silicon microstrip tracker (SCT), which typically provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to $|\eta| = 2.0$, typically providing 30 measurements per track. The TRT also provides electron identification information based on the fraction of hits above a higher energy-deposit threshold corresponding to transition radiation.
The calorimeter system covers the pseudorapidity range $|\eta| < 4.9$. In the region $|\eta| < 3.2$, electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering $|\eta| < 1.8$ to correct for energy loss in material upstream of the calorimeters. Hadron calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures with $|\eta| < 1.7$, and two copper/LAr hadron endcap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic energy measurements respectively.
$^{2}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the $z$-axis along the beam pipe. The $x$-axis points from the IP to the centre of the LHC ring, and the $y$-axis points upwards. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$-axis. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln \tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
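The coordinate conventions used throughout can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it relies solely on the standard definitions $\eta = -\ln \tan(\theta/2)$ and $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.

```python
import math

def pseudorapidity(theta):
    """Pseudorapidity eta = -ln tan(theta/2) for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2 wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

Wrapping the azimuthal difference matters for pairs of muons on either side of $\phi = \pm\pi$, where a naive subtraction would overestimate their separation.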
The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroidal magnets, which bend charged-particle tracks in the $r$-$z$ plane, i.e. in the plane formed by the muon momentum vector and the beam axis. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. Three layers of precision chambers, each consisting of layers of monitored drift tubes, cover the region $|\eta| < 2.7$, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range $|\eta| < 2.4$ with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.
Events are selected by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [13]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger further reduces to record events to disk at about 1 kHz.
An extensive software suite [14] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Data sample
The analysis is performed using the $\sqrt{s} = 13$ TeV $pp$ collision data sample recorded by the ATLAS detector between 2015 and 2018. Only events collected in stable beam conditions and with all relevant ATLAS magnets and detector subsystems fully operational are used in the analysis. A combination of single-muon trigger algorithms (for $Z \rightarrow \mu\mu$ events) and trigger algorithms dedicated to $J/\psi \rightarrow \mu\mu$ and $\Upsilon \rightarrow \mu\mu$ topologies [15] was used. Starting from the middle of 2016, due to the increasing instantaneous luminosity of the LHC, events associated with $J/\psi \rightarrow \mu\mu$ and $\Upsilon \rightarrow \mu\mu$ triggers were collected with a dedicated data acquisition stream for $B$-physics and low-mass final states, separated from the main physics stream so that their reconstruction could be delayed if the availability of processing resources was limited [16]. This additional stream is analysed for a muon momentum calibration for the first time in this paper, significantly increasing the number of $J/\psi \rightarrow \mu\mu$ and $\Upsilon \rightarrow \mu\mu$ events available. For the data sample used in this analysis a procedure to calibrate the muon detectors was applied, as described in Ref. [17]. The analysed data correspond to an integrated luminosity of 139 fb$^{-1}$ after trigger and data-quality requirements, with an average number of collisions per bunch crossing $\langle\mu\rangle = 33.7$. The calibration derived in the analysis does not show any significant dependence on $\langle\mu\rangle$.

Simulated samples
Samples of Monte Carlo (MC) simulated inclusive prompt production of $J/\psi \rightarrow \mu\mu$ and $\psi(2S) \rightarrow \mu\mu$ events were generated at leading-order (LO) accuracy using Pythia 8.186 [18] for matrix element (ME) calculations and for the modelling of the parton shower, hadronisation, and the underlying event. The [...] on the radius [49] and a perfect geometry for the MS. They also include samples with the positions of the middle layer of the MS chambers shifted by amounts corresponding to the residual alignment precision [50].

Muon reconstruction and identification
The muon reconstruction is described in detail in Ref. [9]; only a brief summary is provided here. Muons are reconstructed using information from the ID and/or MS sub-detectors, which provide two independent measurements of muons crossing the ATLAS detector. In this analysis, the following reconstruction algorithms are used.
• ID tracks: tracks reconstructed using ID hits only.
• Standalone tracks (SA): muon candidates obtained using MS hits only; to define kinematic quantities at the interaction point, SA tracks are extrapolated to the beam-spot taking into account energy loss and multiple scattering in the material upstream of the muon spectrometer.
• Combined tracks (CB): muon candidates obtained by starting from an MS track and matching it to an ID track. A combined fit to the hits, taking into account energy loss in the calorimeters and multiple scattering in the spectrometer, is performed.
Several working points (WPs) are introduced in Ref. [9], defining sets of quality cuts applied to reconstructed muons. The Medium WP is the baseline for this analysis; this algorithm only selects CB muons in $|\eta| < 2.5$ and applies a set of requirements to reject poorly reconstructed tracks. The High-$p_{\mathrm{T}}$ WP applies tighter cuts on CB muons to ensure an optimal muon reconstruction for analyses interested in the very high $p_{\mathrm{T}}$ regime, with a better momentum resolution. Other working points target either lower background rates, at the expense of a lower efficiency, or higher efficiency but with a lower purity for muon identification; these WPs have a momentum resolution performance similar to the Medium one. The individual ID and SA tracks associated with the CB muons are still accessible and used in the analysis, as discussed in the following. For simplicity, muon candidates reconstructed using the ID, MS or combined ID+MS information are referred to as ID, MS or CB muons, respectively.

Event selection
Proton-proton collision vertices are constructed from reconstructed trajectories of charged particles in the ID with transverse momentum $p_{\mathrm{T}} > 500$ MeV. Events are required to have at least one collision vertex. The vertex with the highest $\sum p_{\mathrm{T}}^{2}$ of reconstructed tracks is selected as the primary vertex of the hard interaction. The data are subjected to quality requirements to reject events in which detector components were not operating correctly [51].
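The primary-vertex choice described above amounts to a simple maximisation, which the following Python sketch illustrates. The data layout (each vertex given as a list of track $p_{\mathrm{T}}$ values in GeV) and the function name are assumptions made for this example only.

```python
MIN_TRACK_PT_GEV = 0.5  # tracks entering the vertices have pT > 500 MeV

def select_primary_vertex(vertices):
    """Return the index of the vertex with the highest sum of track pT^2.

    `vertices` is a list of vertices, each a list of track pT values in GeV
    (a hypothetical layout). Returns None if the event has no vertex.
    """
    if not vertices:
        return None  # such events fail the "at least one vertex" requirement

    def sum_pt2(track_pts):
        # Only tracks above the pT threshold contribute to the sum.
        return sum(pt * pt for pt in track_pts if pt > MIN_TRACK_PT_GEV)

    return max(range(len(vertices)), key=lambda i: sum_pt2(vertices[i]))
```

Because the sum is quadratic in $p_{\mathrm{T}}$, a vertex with one hard track is preferred over a vertex with many soft tracks.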
Events are required to be selected by a combination of the following trigger chains:
• For $Z \rightarrow \mu\mu$ candidate events, unprescaled single-muon trigger chains with the lowest kinematic threshold available in each data sample are used.
- Data collected in 2015: at least one muon with $p_{\mathrm{T}} > 20$ GeV and passing a loose isolation requirement based on the scalar sum of the $p_{\mathrm{T}}$ of tracks in a cone around the muon candidate track. Events are also retained if selected by a second chain requiring at least one muon with $p_{\mathrm{T}} > 40$ GeV without any isolation requirement.
- Data collected in 2016, 2017, 2018: a strategy similar to the one used in 2015 was employed, but with the first chain requiring a muon with $p_{\mathrm{T}} > 26$ GeV passing a tighter isolation requirement, and the second chain requiring at least one muon with $p_{\mathrm{T}} > 50$ GeV.
• For $J/\psi \rightarrow \mu\mu$ and $\Upsilon \rightarrow \mu\mu$ candidate events, trigger chains requiring at least two muons in the event were used. The trigger algorithms also performed common vertex fits to pairs of oppositely charged muon candidates, requiring at least one of the fitted vertices to satisfy fit quality criteria and have an invariant mass consistent with that of a $J/\psi$ or $\Upsilon$ resonance [15]. The last requirement is significantly looser than that applied by the offline analysis, to avoid introducing a bias. The trigger algorithms applied the following kinematic requirements on the muons associated with the candidate resonance:
- Data collected in 2015: both muons must satisfy $p_{\mathrm{T}} > 4$ GeV.
- Data collected in 2016: both muons must satisfy $p_{\mathrm{T}} > 6$ GeV.
- Data collected in 2017 and 2018: both muons must satisfy $p_{\mathrm{T}} > 6$ GeV. Additional requirements were also applied on the angular distance between the two muons associated with the vertex and on the displacement of the fitted vertex in the transverse plane. Events were also selected using auxiliary trigger algorithms requiring the leading (sub-leading) muon to have $p_{\mathrm{T}} > 11$ GeV ($p_{\mathrm{T}} > 6$ GeV) but without any additional requirement.
Events are required to have two oppositely charged CB muons satisfying the Medium WP requirements and spatially matched with the muon candidates reconstructed by the respective trigger chains. Further kinematic requirements are applied to ensure that selected muons are in the regions where the triggers used for the analysis were fully efficient:
• For $Z \rightarrow \mu\mu$ candidate events, the leading muon must satisfy $p_{\mathrm{T}} > 27$ GeV.
• For $J/\psi \rightarrow \mu\mu$ and $\Upsilon \rightarrow \mu\mu$ candidate events, both muons must satisfy $p_{\mathrm{T}} > 6.25$ GeV. For events selected only by the auxiliary trigger chain in the 2017 and 2018 data samples, the requirement on the leading muon is changed to $p_{\mathrm{T}} > 11.25$ GeV.
For each muon, requirements are also applied on the transverse ($d_0$) and longitudinal ($z_0$) impact parameters. The $d_0$ of a charged-particle track is defined in the plane transverse to the beam as the distance from the primary vertex to the track's point of closest approach. The $z_0$ is the distance in the $z$ direction between this track point and the primary vertex; this distance is represented by $z_0 \sin\theta$ in the longitudinal projection. The candidate muons are required to satisfy:
• $|d_0|/\sigma(d_0) < 3$, with $\sigma(d_0)$ representing the uncertainty in the measured $d_0$ value;
• $|z_0 \sin\theta| < 0.5$ mm.
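The two impact-parameter requirements can be expressed as a single predicate, sketched below in Python. The function and argument names are hypothetical. The 0.5 mm longitudinal requirement is taken from the text, while the default $d_0$-significance threshold of 3 is the customary ATLAS choice and should be treated as an assumption of this sketch.

```python
import math

def passes_impact_parameter_cuts(d0_mm, sigma_d0_mm, z0_mm, theta,
                                 max_d0_significance=3.0,  # assumed default
                                 max_z0_sin_theta_mm=0.5):
    """Check the transverse and longitudinal impact-parameter requirements.

    d0_mm, z0_mm : impact parameters relative to the primary vertex (mm)
    sigma_d0_mm  : uncertainty in the measured d0 (mm), must be positive
    theta        : polar angle of the track (radians)
    """
    d0_significance = abs(d0_mm) / sigma_d0_mm
    z0_sin_theta = abs(z0_mm * math.sin(theta))
    return (d0_significance < max_d0_significance
            and z0_sin_theta < max_z0_sin_theta_mm)
```

Keeping the thresholds as keyword arguments makes it easy to study how the non-prompt fraction changes when the requirements are loosened or tightened.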
After application of these selections, the non-prompt component of $J/\psi \rightarrow \mu\mu$ production is reduced to less than 0.5%.

Momentum scale corrections
The muon momentum scale and resolution are studied using $J/\psi \rightarrow \mu\mu$ and $Z \rightarrow \mu\mu$ decays in data and simulated samples. While dedicated procedures are applied to correct the detector alignment [1,49,50], residual misalignments introduce a charge-dependent bias in the momentum measurement. Studies of this effect are discussed in Section 5.1, together with the determination of an appropriate set of corrections for the data. Simulated samples assume ideal detector alignment and thus do not require these corrections.
Although the simulation contains an accurate description of the ATLAS detector, the level of detail is not sufficient to reach the accuracy of 0.1% on the muon momentum scale and the percent level precision on the resolution measured in data. The analysis of this sample improves upon the excellent simulated description of the ATLAS detector and its interaction with muons, encompassing the best knowledge of the detector geometry, material distribution, and physics modelling at the time of event simulation. Section 5.2 details the measurement of momentum scale and resolution in data and simulation, and the determination of corresponding corrections for the simulated samples. Validation studies for the corrections are presented in Section 6.

Charge-dependent momentum scale calibration in data
After dedicated detector alignment procedures are applied, residual effects can still introduce a bias in the momentum measurement of the muon. These effects arise from both the ID and the MS sub-detectors.
The ID alignment is derived by a global $\chi^2$ minimisation of the track-hit residuals [49]. However, residual detector displacements relative to the nominal detector geometry, known as weak modes, may still be present because they are not fully corrected for by the current alignment procedures. ID alignment studies of these modes are periodically performed and data are reconstructed with an improved description of the detector geometry. Nevertheless, residual effects after this correction procedure still prevent the best possible precision being attained, motivating the study discussed in the following.
For the MS, the alignment procedures are less sensitive to charge-dependent biases, though residual effects may remain due to the limited precision of the procedure. An optical alignment system [1] monitors the position of the muon chambers relative to each other and relative to fiducial marks in the detector. The system compensates for chamber position variations as the data are collected. However, several aspects limit the accuracy of this system, including access shafts for detector operations and construction precision in some detector regions. Additionally, the analysis of dedicated data runs, either collecting cosmic-ray data or in proton-proton collisions with the magnetic field generated by the toroid systems switched off, allows the precision of the relative position of the muon chambers to be further constrained. This leads to a precision at the level of tens of micrometres on the sagitta of the muon track, and up to 120-130 $\mu$m in specific detector regions [50]. This residual uncertainty in the alignment system manifests as charge-dependent effects in the muon spectrometer reconstruction procedure.
The precision of the muon momentum reconstruction can be further improved with a dedicated correction procedure accounting for the ID weak modes and MS alignment system uncertainties, which would otherwise degrade the momentum resolution. The charge-dependent bias can be approximated as:
$$\frac{q}{\hat{p}} = \frac{q}{p} + \delta_s, \tag{1}$$
where $q = \pm 1$ is the charge of the muon, $p$ and $\hat{p}$ are the corrected and uncorrected momentum of the muon, respectively, and $\delta_s$ is the strength of the bias. This parameterisation forms the basis for developing a correction to the data, recovering the residual bias and improving the momentum resolution.
To estimate (and later correct) the bias, the large sample of $Z \rightarrow \mu\mu$ decays is used. The mass of the dimuon system, $m_{\mu\mu}$, can be expressed as a function of the positively and negatively charged muons' transverse momenta $p_{\mathrm{T}}^{+}$ and $p_{\mathrm{T}}^{-}$, respectively:
$$m_{\mu\mu}^{2} = 2\, p_{\mathrm{T}}^{+}\, p_{\mathrm{T}}^{-} \left(\cosh\Delta\eta - \cos\Delta\phi\right), \tag{2}$$
where $\Delta\eta$ and $\Delta\phi$ are the differences between the pseudorapidity and azimuthal angles of the two muons. This expression is valid in the approximation of the muon mass being negligible with respect to the energies of the muons, and has been verified to hold in the context of these studies to sub-MeV precision. The bias $\delta_s$ is parameterised as a function of a grid of 48×48 equal-sized $\eta$-$\phi$ detector regions, a granularity chosen to ensure that the measurement is not affected by large statistical fluctuations while still correctly reflecting local biases or deformations. This high granularity allows the transverse momentum $p_{\mathrm{T}}$ of the measurement to be used instead of $p$, which is necessary when comparing ID and MS results, since the two sub-detectors have different bending planes due to their respective magnet systems. Combining the previous two equations, the biased invariant mass $\hat{m}_{\mu\mu}^{2}$ can be expressed as:
$$\hat{m}_{\mu\mu}^{2} = \frac{m_{\mu\mu}^{2}}{\left(1 + \delta_s(\eta^{+},\phi^{+})\, p_{\mathrm{T}}^{+}\right)\left(1 - \delta_s(\eta^{-},\phi^{-})\, p_{\mathrm{T}}^{-}\right)}. \tag{3}$$
Assuming the bias is small, Equation 3 can be approximated as:
$$\hat{m}_{\mu\mu}^{2} \approx m_{\mu\mu}^{2} \left(1 - \delta_s(\eta^{+},\phi^{+})\, p_{\mathrm{T}}^{+} + \delta_s(\eta^{-},\phi^{-})\, p_{\mathrm{T}}^{-}\right). \tag{4}$$
Since $p_{\mathrm{T}}^{+} \sim p_{\mathrm{T}}^{-}$ in $Z$ boson decays, charge-dependent biases largely cancel in the expression of $\hat{m}_{\mu\mu}^{2}$, hence do not impact the average dimuon mass over all detector regions, but only broaden the resolution of the $Z \rightarrow \mu\mu$ peak. Therefore, the reconstructed invariant mass distribution defined in Equation 4 is sensitive to the bias through its impact on the variance of the distribution. The values of the sagitta biases $\hat{\delta}_s(\eta,\phi)$ are evaluated by minimising the variance of the invariant mass distributions; then, the biased momentum of the muon $\hat{p}_{\mathrm{T}}$ is corrected using the following equation:
$$p_{\mathrm{T}} = \frac{\hat{p}_{\mathrm{T}}}{1 - q\, \hat{\delta}_s(\eta,\phi)\, \hat{p}_{\mathrm{T}}}. \tag{5}$$
Finally, the updated $p_{\mathrm{T}}$ values are used to recalculate the invariant mass distribution and a new iteration is started.
Given the large number of degrees of freedom induced by the dependence of the biases on $\eta$ and $\phi$, a direct solution of the equation becomes impractical. Instead, an iterative approach is chosen. At each iteration, the values of $\hat{\delta}_s(\eta,\phi)$ obtained from the previous iteration are used to correct the $\hat{p}_{\mathrm{T}}$ values that enter the next.
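The building blocks of this procedure (the bias parameterisation, the massless dimuon mass expression and the correction step) can be sketched in a few lines of Python. The sketch assumes the sign convention $q/\hat{p} = q/p + \delta_s$, with momenta in TeV and $\delta_s$ in TeV$^{-1}$; the function names are hypothetical, and the variance minimisation over the 48×48 grid is deliberately left out.

```python
import math

def biased_pt(pt, q, delta_s):
    """Apply a sagitta bias, assuming q/p_hat = q/p + delta_s (pT in TeV)."""
    return pt / (1.0 + q * delta_s * pt)

def corrected_pt(pt_hat, q, delta_s_hat):
    """Invert the bias: recover pT from the biased pT and the estimated bias."""
    return pt_hat / (1.0 - q * delta_s_hat * pt_hat)

def dimuon_mass(pt_plus, pt_minus, d_eta, d_phi):
    """Dimuon invariant mass in the massless-muon approximation."""
    return math.sqrt(2.0 * pt_plus * pt_minus
                     * (math.cosh(d_eta) - math.cos(d_phi)))

# A back-to-back pair with equal pT and the same bias on both muons:
pt = 0.0456                      # TeV, roughly half the Z boson mass
m_true = dimuon_mass(pt, pt, 0.0, math.pi)
m_biased = dimuon_mass(biased_pt(pt, +1, 0.4), biased_pt(pt, -1, 0.4),
                       0.0, math.pi)
```

In this configuration the shifts of the two muons cancel to first order, so `m_biased` differs from `m_true` only at second order in $\delta_s\, p_{\mathrm{T}}$. This is why the bias shows up in the width of the peak rather than in its mean when averaging over regions.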
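A complementary method to estimate this bias is also defined by approximating the average $p_{\mathrm{T}}$ of the muon to half the invariant mass of the dimuon pair. In this case, the bias can be approximated by:
$$\hat{\delta}_s(\eta,\phi) = -4\, q\, \frac{\hat{m}_{\mu\mu}(\eta,\phi) - \langle m_{\mu\mu} \rangle}{\langle m_{\mu\mu} \rangle^{2}}, \tag{6}$$
where $\langle m_{\mu\mu} \rangle$ is the average of the invariant mass of the dimuon pairs used to derive the correction, while $\hat{m}_{\mu\mu}(\eta,\phi)$ is the average invariant mass of the dimuon pairs when the muon with the highest transverse momentum is in the given $(\eta,\phi)$ region, and $q$ is the charge of that muon.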
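The mass-shift estimate can be checked numerically on a single idealised dimuon pair. The Python sketch below assumes the convention $q/\hat{p} = q/p + \delta_s$ and the first-order relation $\hat{\delta}_s \approx -4q\,(\hat{m}_{\mu\mu} - \langle m_{\mu\mu} \rangle)/\langle m_{\mu\mu} \rangle^{2}$ that follows from setting $p_{\mathrm{T}} \approx \langle m_{\mu\mu} \rangle / 2$. The function name and the toy configuration are illustrative assumptions.

```python
import math

def sagitta_bias_from_mass_shift(m_hat_region, m_mean, q):
    """First-order bias estimate from the shift of the average dimuon mass.

    m_hat_region : average biased dimuon mass for the region (TeV)
    m_mean       : average dimuon mass over all regions (TeV)
    q            : charge (+1 or -1) of the muon falling in the region
    Returns the estimated bias in TeV^-1.
    """
    return -4.0 * q * (m_hat_region - m_mean) / (m_mean ** 2)

# Toy closure check: back-to-back pair, pT = m/2, bias injected on the mu+ only.
injected = 0.1                                   # TeV^-1
pt = 0.0456                                      # TeV
m_mean = 2.0 * pt                                # unbiased mass of the pair
pt_plus_biased = pt / (1.0 + injected * pt)      # q = +1
m_hat = 2.0 * math.sqrt(pt_plus_biased * pt)
estimate = sagitta_bias_from_mass_shift(m_hat, m_mean, +1)
```

The estimate agrees with the injected value to better than one percent here; the residual difference comes from the higher-order terms dropped in the linearisation, which is one reason an iterative application is needed.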
Both methods additionally assume that the kinematic properties of positively and negatively charged muons produced by $Z$ decays are symmetrical. In addition to the small asymmetry introduced by the weak mixing angle, this assumption breaks down due to the detector acceptance and selection requirements restricting the phase space of the selected muon pairs. The impact of these acceptance and kinematic restrictions can be estimated by applying the same procedure to events simulated with ideal detector alignment, obtaining a residual bias term $\delta_s^{\mathrm{MC}}$. The bias measured in data, $\delta_s^{\mathrm{data}}$, can be corrected by subtracting the values measured in MC simulation, $\delta_s^{\mathrm{MC}}$, to define the final corrections:
$$\delta_s(\eta,\phi) = \delta_s^{\mathrm{data}}(\eta,\phi) - \delta_s^{\mathrm{MC}}(\eta,\phi). \tag{7}$$
Typical values for $\delta_s^{\mathrm{MC}}(\eta,\phi)$ range from up to 0.01 TeV$^{-1}$ in the central pseudorapidity region to up to about 0.05 TeV$^{-1}$ in the forward pseudorapidity region, after event selection. The procedure is applied separately for the muon momentum measured using the CB, ID, and MS information, and dedicated corrections for each measurement are derived. Both procedures target a residual of less than $10^{-3}$ TeV$^{-1}$ through simultaneous iterative updates on the ID, MS, and CB momenta. Convergence is achieved once the target is reached for all momenta. For the first method this happens after five iterations on average, while the second typically requires 25 iterative updates. In both methods, the angular correlations between the two muons have a small impact on the estimate of $\delta_s$, given the fine granularity chosen in $\eta \times \phi$. Figure 1 shows the measured sagitta corrections evaluated on CB momenta, with biases up to 0.4 TeV$^{-1}$. To avoid introducing a bias in the measurement due to the value assumed for $m_{\mu\mu}$, the unbiased value of $m_{\mu\mu}$ is taken from the data when integrating over all $\eta$ and $\phi$ bins for both methods. Results from the two methods, which rely on different assumptions, are compared.
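The bookkeeping behind the final corrections, namely locating a muon in the 48×48 grid and subtracting the simulation map from the data map bin by bin, might look as follows in Python. The pseudorapidity range of the grid (`ETA_MAX = 2.5`) and all names are assumptions of this sketch, and the maps are plain nested lists rather than the histogram objects a real analysis would use.

```python
import math

N_BINS = 48      # equal-sized bins in both eta and phi
ETA_MAX = 2.5    # assumed grid range in eta, not stated in the text

def region_index(eta, phi):
    """Map (eta, phi) to the (i_eta, i_phi) indices of the equal-sized grid."""
    i_eta = int((eta + ETA_MAX) / (2.0 * ETA_MAX) * N_BINS)
    i_phi = int((phi + math.pi) / (2.0 * math.pi) * N_BINS)
    # Clamp so that values on or beyond the edges stay inside the grid.
    i_eta = min(max(i_eta, 0), N_BINS - 1)
    i_phi = min(max(i_phi, 0), N_BINS - 1)
    return i_eta, i_phi

def final_corrections(delta_data, delta_mc):
    """Bin-by-bin difference of the data and simulation bias maps (TeV^-1)."""
    return [[d - m for d, m in zip(row_data, row_mc)]
            for row_data, row_mc in zip(delta_data, delta_mc)]
```

Subtracting the simulation map removes the apparent bias produced by acceptance and selection effects, leaving only the part attributable to detector misalignment.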
For the first method, a comparable bias for the leading and subleading $p_{\mathrm{T}}$ is assumed, while for the second method the approximation of $p_{\mathrm{T}}^{+} \simeq p_{\mathrm{T}}^{-}$ is less precise at high $\eta$ values. The $\delta_s(\eta,\phi)$ corrections obtained with the two methods have a correlation close to 100% and agree at the level of 0.01 TeV$^{-1}$ or better; this difference is taken as an additional systematic uncertainty. Results for the second method are given in Appendix A.
Using the derived bias results, the transverse momenta of the muon are corrected based on Equation 5 using the $\hat{\delta}_s$ derived after the last iteration. The biases are re-evaluated after the application of the corrections and the results are shown in Figure 2 for CB momenta. Both methods address relative differences between nearby $\eta$, $\phi$ regions, as well as biases that are constant across full sub-detector regions. The biases are reduced to less than $2 \cdot 10^{-4}$ TeV$^{-1}$ in all regions of the detector, validating the method to several orders of magnitude better than the typical size of the correction.
To validate the correction procedure, simulated samples with misaligned detector configurations, as described in Section 3.2, were used. The injected biases were produced by a rotation of the ID layers with a linear dependence on the radial distance from the interaction point. The sizes of these biases in the simulation are the same as those observed in data from $Z \rightarrow \mu\mu$ decays before applying the correction procedure. Figure 3 uses these biased detector geometry samples to compare the reconstructed and generated momentum distributions, as evaluated before and after the correction, plotted separately for positive and negative muons. The application of the correction reduces the differences between the positively and negatively charged muons, improving the reconstructed $p_{\mathrm{T}}$ resolution. Further closure tests are performed using simulated samples with distorted geometry on [...] → [...] → [...] and hypothetical narrow resonances decaying into two muons with masses greater than 500 GeV, to estimate the impact of the correction in muon momentum regimes far from that of muons originating from $Z$ decays. Closure is observed at the same level of precision.
Several sources of systematic uncertainty associated with the correction procedure are studied. They relate to both the accuracy of the measurement of $\delta_s$ and the assumptions used in the correction procedure.
The systematic uncertainty originating from the residual non-closure of the correction compared with a perfectly aligned detector simulation is evaluated as follows. First, the $\delta_s$ value obtained after the last iteration on data is used to bias the $p_{\mathrm{T}}$ in a perfectly aligned simulation. This is done using Equation 5, by flipping the sign of $\delta_s$, and using the same granularity in $\eta$ and $\phi$ as used for deriving the correction. The resulting $p_{\mathrm{T}}$ is compared with the unbiased momentum and the difference is taken as a systematic uncertainty. A second component of the residual non-closure is estimated in simulation by injecting a set of constant known biases and by taking the difference between the injected and measured $\delta_s$. The injected biases range from percent fractions up to twice those observed in the data at the first iteration of the correction procedure; the largest resulting difference is taken as the uncertainty for all bins. Both methods show very good closure for the full range of injected biases. The uncertainties originating from the non-closures amount to about 0.02 TeV$^{-1}$, depending on the pseudorapidity region.
A second contribution to the systematic uncertainties arises from extrapolations to momenta outside the phase-space region covered by $Z$ decays. In fact, the relative contribution of the ID and MS detectors to the CB momentum measurement may vary in a non-trivial way. This may lead to biases in the estimate of $\delta_s$ and of the correction. In this case, the residual bias is evaluated as the difference between simulation and data as a function of $\eta$ and $p_{\mathrm{T}}$ using the alternative method of Equation 6. The difference compared with the simulation is used to account for the kinematic dependencies of the $\delta_s$ estimate. The resulting uncertainty is small, compared with the non-closure described previously, in the central regions of pseudorapidity. However, in the forward pseudorapidity regions, it becomes sizeable and evolves as $2 \times 10^{-4}$ TeV$^{-2}$ $\times$ ($p_{\mathrm{T}} - 0.045$ TeV).
The statistical uncertainties due to the observed and simulated number of $Z$ decays are also taken into account. These are approximately an order of magnitude smaller than the first two components, but in the forward region they reach up to 0.02 TeV$^{-1}$.
Given the lack of sufficient $Z \rightarrow \mu\mu$ decays above 450 GeV in data to validate the methodology, the data are not corrected for charge-dependent biases beyond this value. Instead, an uncertainty accounting for the charge-dependent biases before any correction is assigned to the simulated momenta. This is estimated by biasing the simulated muon momenta in a perfectly aligned simulation, according to Equation 5, using the values of $\delta_s$ as measured before any correction. As done for the non-closure uncertainty, the uncertainty is derived by injecting bias into simulated events and taking the difference between the biased and the non-biased muon momentum values, resulting in an uncertainty of up to 0.4 TeV$^{-1}$ as shown in Figure 1.

Muon momentum calibration procedure in simulation
In the following, the muon momentum calibration is defined as the procedure used to identify the corrections to the reconstructed muon transverse momenta in simulation in order to match the measurement of the same quantities in data. This procedure is performed after correcting for the charge-dependent bias discussed in the previous section. The transverse momenta of the ID and MS tracks associated with a CB muon, referred to as $p_{\mathrm{T}}^{\mathrm{ID}}$ and $p_{\mathrm{T}}^{\mathrm{MS}}$, respectively, are used in addition to the transverse momentum of the CB track, referred to as $p_{\mathrm{T}}^{\mathrm{CB}}$. The corrected transverse momentum after applying the calibration procedure, $p_{\mathrm{T}}^{\mathrm{Cor,Det}}$ (Det = CB, ID, MS), is described as:
$$p_{\mathrm{T}}^{\mathrm{Cor,Det}} = \frac{p_{\mathrm{T}}^{\mathrm{MC,Det}} + \sum_{n=0}^{1} s_{n}^{\mathrm{Det}}(\eta,\phi) \left(p_{\mathrm{T}}^{\mathrm{MC,Det}}\right)^{n}}{1 + \sum_{m=0}^{2} \Delta r_{m}^{\mathrm{Det}}(\eta,\phi) \left(p_{\mathrm{T}}^{\mathrm{MC,Det}}\right)^{m-1} g_{m}}, \tag{8}$$
where $p_{\mathrm{T}}^{\mathrm{MC,Det}}$ is the uncorrected transverse momentum in simulation, $g_{m}$ are normally distributed random variables with zero mean and unit width, and the terms $\Delta r_{m}^{\mathrm{Det}}(\eta,\phi)$ and $s_{n}^{\mathrm{Det}}(\eta,\phi)$ describe the momentum resolution smearing and scale corrections respectively, applied in a specific $(\eta,\phi)$ detector region. A possible $s_{2}^{\mathrm{Det}}(\eta,\phi)$ term in the numerator is neglected because it would model effects already corrected in data with the procedure described in Section 5.1.
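A minimal Python sketch of how such a scale-and-smearing correction could be applied to a single simulated muon is shown below. It assumes the standard form of the calibration formula, with scale terms $s_0 + s_1\, p_{\mathrm{T}}$ in the numerator and smearing terms $\Delta r_0\, g_0 / p_{\mathrm{T}} + \Delta r_1\, g_1 + \Delta r_2\, p_{\mathrm{T}}\, g_2$ in the denominator. The function name, the argument names and the per-muon interface are hypothetical, and the parameter values in the usage lines are illustrative rather than measured.

```python
import random

def calibrate_pt(pt_mc, s0, s1, dr1, dr2, dr0=0.0, rng=random):
    """Apply scale and resolution-smearing corrections to a simulated pT.

    pt_mc    : uncorrected simulated pT (GeV)
    s0, s1   : scale terms for the relevant detector region (GeV, unitless)
    dr0..dr2 : smearing terms (GeV, unitless, GeV^-1); dr0 defaults to zero
    rng      : object providing gauss(mu, sigma), e.g. random.Random(seed)
    """
    # Independent unit-width Gaussian random numbers, one per smearing term.
    g0, g1, g2 = (rng.gauss(0.0, 1.0) for _ in range(3))
    numerator = pt_mc + s0 + s1 * pt_mc
    denominator = 1.0 + dr0 * g0 / pt_mc + dr1 * g1 + dr2 * pt_mc * g2
    return numerator / denominator

# Illustrative usage with made-up parameter values:
rng = random.Random(1)
smeared = [calibrate_pt(45.0, -0.02, 0.001, 0.01, 1e-4, rng=rng)
           for _ in range(2000)]
```

With all smearing terms set to zero the function reduces to a pure scale shift, which is a convenient way to test the scale and resolution parts separately.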
The corrections described in Equation 8 are defined in $\eta$-$\phi$ detector regions that are homogeneous in detector technology and performance. All corrections are divided into 18 pseudorapidity regions. In addition, the CB and MS corrections are divided into two $\phi$ bins separating the two types of MS sectors: those that include the magnet coils (small sectors) and those between two coils (large sectors). The small and large MS sectors are subjected to independent alignment techniques and cover detector areas with different material distribution, leading to scale and resolution differences.
The numerator of Equation 8 describes the momentum scales. The $s_{1}^{\mathrm{Det}}(\eta,\phi)$ term corrects for inaccuracy in the description of the magnetic field integral and the dimension of the detector in the direction perpendicular to the magnetic field. The $s_{0}^{\mathrm{Det}}(\eta,\phi)$ term models the effect on the CB and MS momentum from the inaccuracy in the simulation of the energy loss in the calorimeter and other material between the interaction point and the exit of the MS. As the energy loss between the interaction point and the ID is negligible, $s_{0}^{\mathrm{ID}}(\eta)$ is set to zero [10].

The denominator of Equation 8 describes the momentum smearing that broadens the relative $p_{\mathrm{T}}$ resolution in simulation, $\sigma(p_{\mathrm{T}})/p_{\mathrm{T}}$, to properly describe the data. The corrections to the resolution assume that the relative $p_{\mathrm{T}}$ resolution can be parameterized as follows:

$$\frac{\sigma(p_{\mathrm{T}})}{p_{\mathrm{T}}} = \frac{r_{0}}{p_{\mathrm{T}}} \oplus r_{1} \oplus r_{2} \cdot p_{\mathrm{T}} \tag{9}$$

with $\oplus$ denoting a sum in quadrature. In Equation 9, the first term ($r_{0}$) mainly accounts for fluctuations of the energy loss in the traversed material; the second term ($r_{1}$) mainly accounts for multiple scattering, uncertainties related to, and inhomogeneities in, the modelling of the local magnetic field, and length-scale radial expansions of the detector layers; and the third term ($r_{2}$) mainly describes intrinsic resolution effects caused by the spatial resolution of the hit measurements and residual misalignments between the different detector elements. The energy loss term has a negligible impact on the muon resolution in the momentum range considered, and therefore $\Delta r_{0}^{\mathrm{Det}}$ is set to zero.

In a second step, to cross-check the validity of the corrections obtained directly for the CB tracks, the corrected combined momentum from the ID and MS measurements, $p_{\mathrm{T}}^{\mathrm{Cor,ID+MS}}$, is also obtained by combining the ID and MS corrected momenta with a weighted average:

$$p_{\mathrm{T}}^{\mathrm{Cor,ID+MS}} = f \cdot p_{\mathrm{T}}^{\mathrm{Cor,ID}} + (1 - f) \cdot p_{\mathrm{T}}^{\mathrm{Cor,MS}}$$

where the weight $f$ is obtained from the uncorrected momenta by solving $p_{\mathrm{T}}^{\mathrm{MC,CB}} = f \cdot p_{\mathrm{T}}^{\mathrm{MC,ID}} + (1 - f) \cdot p_{\mathrm{T}}^{\mathrm{MC,MS}}$. This assumes that the relative contribution of the two sub-detectors to the combined track remains unchanged before and after momentum corrections.
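The two relations of this paragraph can be sketched numerically as follows. The sketch assumes the resolution parameterization $\sigma(p_{\mathrm{T}})/p_{\mathrm{T}} = r_0/p_{\mathrm{T}} \oplus r_1 \oplus r_2 \cdot p_{\mathrm{T}}$ and a per-muon weight $f$ solved from the uncorrected momenta, $p_{\mathrm{T}}^{\mathrm{MC,CB}} = f \cdot p_{\mathrm{T}}^{\mathrm{MC,ID}} + (1-f) \cdot p_{\mathrm{T}}^{\mathrm{MC,MS}}$. The function names and the example numbers are hypothetical.

```python
import math

def relative_resolution(pt, r0, r1, r2):
    """Quadrature sum r0/pT (+) r1 (+) r2*pT, with pT in GeV (assumed units)."""
    return math.sqrt((r0 / pt) ** 2 + r1 ** 2 + (r2 * pt) ** 2)

def id_weight(pt_mc_cb, pt_mc_id, pt_mc_ms):
    """Solve pt_mc_cb = f * pt_mc_id + (1 - f) * pt_mc_ms for the ID weight f."""
    if pt_mc_id == pt_mc_ms:
        raise ValueError("ID and MS momenta are equal: the weight is undefined")
    return (pt_mc_cb - pt_mc_ms) / (pt_mc_id - pt_mc_ms)

def combine_corrected(pt_mc_cb, pt_mc_id, pt_mc_ms, pt_cor_id, pt_cor_ms):
    """Weighted average of corrected ID and MS momenta, reusing the
    weight found from the uncorrected momenta."""
    f = id_weight(pt_mc_cb, pt_mc_id, pt_mc_ms)
    return f * pt_cor_id + (1.0 - f) * pt_cor_ms

# Illustrative numbers only.
print(relative_resolution(100.0, r0=0.0, r1=0.03, r2=4e-4))
print(combine_corrected(44.0, 45.0, 43.0, pt_cor_id=45.2, pt_cor_ms=43.4))
```

Reusing the uncorrected weight is what encodes the assumption that the relative contribution of the two sub-detectors does not change when the corrections are applied.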

Determination of the $p_{\mathrm{T}}$ calibration constants
The CB, ID, and MS correction parameters contained in Equation 8 are extracted from data using a fitting procedure that compares the invariant mass distributions for $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ candidates in data and simulation, selected as discussed in Section 4.
When extracting correction parameters, muons are assigned to $\eta$-$\phi$ regions of fit (ROFs), which are defined separately for the ID and the MS. The values of $\Delta r_{0}^{\mathrm{ID}}$, $\Delta r_{0}^{\mathrm{MS}}$, $\Delta r_{0}^{\mathrm{CB}}$, and $s_{0}^{\mathrm{ID}}$ are set to zero as previously discussed, while $\Delta r_{2}^{\mathrm{MS}}$ is determined from alignment studies using cosmic-ray data and special runs with the toroidal magnetic field turned off [50], and from detector calibration procedures [17].
The corrections are extracted using the distributions of the dimuon invariant mass, $m_{\mu\mu}^{\mathrm{Det}}$. When fitting a specific ROF, one muon is required to belong to the ROF and the other can be anywhere in the detector. To enhance the sensitivity to $p_{\mathrm{T}}$-dependent correction effects, the dimuon pairs are classified according to the $p_{\mathrm{T}}$ of the muons. For $J/\psi \to \mu\mu$ decays, the fit is performed in two exclusive categories defined by the subleading muon transverse momentum: $6.25 < p_{\mathrm{T}}^{\mathrm{Det,sublead}} < 9$ GeV and $9 < p_{\mathrm{T}}^{\mathrm{Det,sublead}} < 20$ GeV. The $Z \to \mu\mu$ decays are likewise split into two $p_{\mathrm{T}}$ categories, covering 20 to 50 GeV and 50 to 300 GeV.

Templates for the $m_{\mu\mu}^{\mathrm{Det}}$ variables are built using $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ simulated signal samples. The first step in the fitting procedure consists of estimating the contribution of the background processes. For the $Z \to \mu\mu$ selection, the small background component (approximately 0.1%) is extracted from simulation and added to the templates. A much larger (up to 15%) non-resonant background from decays of light and heavy hadrons and from continuum Drell-Yan production is present for events selected in the $J/\psi \to \mu\mu$ channel. It is estimated with a data-driven approach, as this background is difficult to simulate: the dimuon invariant mass distribution in data is fitted in each ROF using a Crystal Ball function [52] for the signal and an exponential function for the background. This background model is then combined with the simulated signal templates used in the fit.
To estimate the scale and smearing parameters, a multi-stage procedure is used. A two-step random walk minimisation is first performed, in which the parameters are sampled within a specified range. The parameter configuration that leads to the best agreement between the invariant mass distributions in simulation and data is kept. In the first step, the best agreement is defined by a simplified metric that compares the means and standard deviations of the distributions. In the second step, a binned $\chi^{2}$ is computed that compares the dimuon mass spectrum in simulation with that of data. Around this best-agreement parameter configuration, signal templates are then generated at discrete intervals and interpolated using moment morphing [53]. Using this signal model, a full minimisation of the binned $\chi^{2}$ is performed. This procedure is repeated iteratively, with the second muon outside the ROF definition calibrated using the results from the previous iteration, until the change of each parameter stays within the root mean squared of its values as estimated from the last five iterations. This happens after 16 iterations. The values of each calibration parameter from the last five iterations are averaged to produce its final value.
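The stopping rule and final averaging of the iterative procedure can be sketched as follows. This is a toy illustration only: `step` is a hypothetical stand-in for one full pass of the template fit, the "root mean squared" of the last five iterations is interpreted here as their standard deviation (an assumption), and the damped oscillation used as input has nothing to do with the real fit.

```python
import numpy as np

def iterate_until_stable(step, start, window=5, max_iter=100):
    """Repeat `step` until every parameter changes by no more than the
    spread of its last `window` values, then average those values."""
    history = [np.asarray(start, dtype=float)]
    for _ in range(max_iter):
        history.append(np.asarray(step(history[-1]), dtype=float))
        if len(history) > window:
            last = np.array(history[-window:])
            spread = last.std(axis=0)  # assumed meaning of "root mean squared"
            change = np.abs(history[-1] - history[-2])
            if np.all(change <= spread):
                # Final value: average over the last `window` iterations.
                return last.mean(axis=0), spread, len(history) - 1
    raise RuntimeError("parameters did not stabilise")

# Toy update: each pass halves and flips the distance to a fixed target.
target = np.array([1.0, -0.5])

def toy_step(params):
    return target - 0.5 * (params - target)

final, spread, n_iter = iterate_until_stable(toy_step, start=[0.0, 0.0])
print(final, spread, n_iter)
```

The returned spread plays the role of the statistical uncertainty of the method, in the sense that it measures how much the parameters still move over the last iterations.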
The calibration parameters obtained from the fits to the data are summarised in Tables 1, 2, and 3, averaged over three regions. The sums in quadrature of the statistical and systematic uncertainties are shown, with the latter dominating. All sources of uncertainty are evaluated by varying the parameters of the template fit. The higher uncertainty for $\Delta r_{1}^{\mathrm{ID}}$ in the second bin is associated with the region of the ID with the largest amount of material, corresponding to the transition between the barrel and endcap regions of its subdetectors. The increase in uncertainty as a function of $|\eta|$ for $s_{1}^{\mathrm{ID}}$ is due to the decrease of the magnetic field integral of the solenoid as a function of $|\eta|$. The larger uncertainties for some of the MS terms in the $1.05 \le |\eta| < 2.0$ region are associated with the low bending power of the magnet system of the MS in part of that region.
The main contributions to the final systematic uncertainty are:
• $J/\psi$-only and $Z$-only fits: The fit is repeated using only the $J/\psi \to \mu\mu$ or only the $Z \to \mu\mu$ decays, with only $s_{1}^{\mathrm{Det}}$ left as a free parameter and the other parameters fixed to their nominal fitted values. The resulting uncertainty is defined as the difference relative to the nominal fitted $s_{1}^{\mathrm{Det}}$ and accounts for the extrapolation from the regions dominated by either the $J/\psi \to \mu\mu$ or the $Z \to \mu\mu$ decays in their respective $p_{\mathrm{T}}$ spectra.
• Kinematic reweighting: As discussed in Section 3.2, the reweighting maps are derived separately for each simulated sample. To evaluate the impact of the reweighting on the analysis, new templates of $m_{\mu\mu}^{\mathrm{ID}}$, $m_{\mu\mu}^{\mathrm{CB}}$, and $m_{\mu\mu}^{\mathrm{MS}}$ are produced. The results obtained when using the different $Z \to \mu\mu$ samples with their respective reweighting schemes and the non-reweighted results for the nominal $Z \to \mu\mu$ samples are used to derive the uncertainty. This systematic uncertainty impacts the overall lineshape by changing the relative weight of various momentum regimes and, by extension, the contributions of their dedicated corrections in the final fit.
• Decay modelling and final-state radiation modelling: The nominal simulated sample of $Z \to \mu\mu$ events, in which the QED final-state radiation is handled by Photos++, is compared with $Z \to \mu\mu$ events simulated with Sherpa 2.2.1, which uses an alternative radiative modelling. The impact of the uncertainties in the decay modelling is significantly smaller than that of the components discussed above, partially due to the already discussed kinematic reweighting.
• $J/\psi$ and $Z$ $p_{\mathrm{T}}$ template range: The $p_{\mathrm{T}}$ ranges of the $J/\psi$ and $Z$ templates are varied. For $J/\psi$, the ranges of the two nominal $p_{\mathrm{T}}$ templates are $p_{\mathrm{T}}^{J/\psi,\mathrm{nom}} \in [6.25, 9]$ GeV and $p_{\mathrm{T}}^{J/\psi,\mathrm{nom}} \in [9, 20]$ GeV. For $Z$, the ranges of the two templates are $p_{\mathrm{T}}^{Z,\mathrm{nom}} \in [20, 50]$ GeV and $p_{\mathrm{T}}^{Z,\mathrm{nom}} \in [50, 300]$ GeV. The boundary between the two templates is varied from 9 GeV to 8 and 12 GeV for $J/\psi$, and from 50 GeV to 40 and 80 GeV for $Z$. This preserves the numbers of events, and covers any additional $p_{\mathrm{T}}$ dependencies.
• Mass binning of $J/\psi$ and $Z$: The number of bins for the templates is reduced from 200 to 150 for $Z$, and from 90 to 60 for $J/\psi$. This systematic uncertainty covers any binning effect in generating templates.
• $Z$ mass window: New templates of $m_{\mu\mu}$ are generated for the $Z \to \mu\mu$ sample, changing the window selection to $75 < m_{\mu\mu} < 115$ GeV for all track types. The high and low mass regions away from the resonant pole of the $Z$ are more sensitive to initial- and final-state radiation contributions, the running of $\alpha_{\mathrm{EM}}$, and other specific choices of the MC generators used in the $Z \to \mu\mu$ simulation.
• $J/\psi$ mass window: The same approach as in the case of the $Z$ is applied, changing the invariant mass window selection to $2.75 < m_{\mu\mu} < 3.4$ GeV for all track types. The region furthest from the $J/\psi$ peak is more sensitive to the shape variation of the combinatorial background, and to the specific choices of the MC generators used in the $J/\psi \to \mu\mu$ simulation.
• $J/\psi$ background: New templates are generated for the $J/\psi \to \mu\mu$ samples using a background parameterisation different from the exponential function. For this systematic uncertainty, background parameterisations including a combination of two Crystal Ball functions and a Chebyshev function are used.
• Closure test and statistical uncertainty: The root mean squared of the last five iterations of each calibration parameter is used as the statistical uncertainty of the method. This accounts for any instabilities in the fit, and the correlations between different parameters. It was observed that small changes in the starting parameters of the fit do not affect its stability.
The largest component of the total uncertainty in the momentum scale originates from the comparison of the $J/\psi$-only and $Z$-only fits. The remaining components are smaller by about a factor of two. For the resolution smearing terms, in the same kinematic regime, the largest uncertainty comes from the $J/\psi$ and $Z$ $p_{\mathrm{T}}$ template fit range. At significantly higher momenta the uncertainty in the $J/\psi$ and $Z$ $p_{\mathrm{T}}$ template fit range still dominates the resolution uncertainty in $p_{\mathrm{T}}^{\mathrm{Cor,CB}}$.
As an additional check, the value of $p_{\mathrm{T}}^{\mathrm{Cor,ID+MS}}$ is compared with $p_{\mathrm{T}}^{\mathrm{Cor,CB}}$. Results from both methods are in agreement within their respective systematic uncertainties. The comparison is performed on simulated

Table 3: Summary of CB momentum resolution and scale corrections for small and large MS sectors, averaged over three main detector regions. The corrections for large and small MS sectors are derived in 18 pseudorapidity regions, as described in Section 5, and averaged, with each region assigned a weight proportional to its $\eta$ width. The energy loss term $\Delta r_{0}^{\mathrm{CB}}$ is negligible and therefore fixed to zero in the fit for all $\eta$. The uncertainties represent the sum in quadrature of the statistical and systematic uncertainties.

The resolution as a function of the muon momentum is shown in Figure 4, (a) for muons in the barrel region ($|\eta| < 1.05$) and (b) for all other muons ($|\eta| > 1.05$). The resolution is estimated by taking half of the smallest interval containing 68% of the distribution of the relative difference between the corrected and the generated single-muon momentum in simulation, for muons satisfying the High-$p_{\mathrm{T}}$ WP. The resolution is approximately 2% (3%) at $p_{\mathrm{T}}$ values of 45 GeV in the $|\eta| < 1.05$ ($|\eta| > 1.05$) region and increases with $p_{\mathrm{T}}$. The resulting CB track momentum resolution is always better than that of the individual measurements in the ID or the MS. At low $p_{\mathrm{T}}$ the ID dominates, while in the intermediate regime from a few tens of GeV to a few hundred GeV both detector systems contribute equally to the measurement. At $p_{\mathrm{T}}$ values higher than a few hundred GeV the resolution is dominated by the MS. The resolution is approximately 14% (19%) at 1 TeV in the $|\eta| < 1.05$ ($|\eta| > 1.05$) region.
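The resolution estimator used here, half of the smallest interval containing 68% of a distribution, can be sketched as follows. The function name `half_width_68` and the Gaussian toy sample are hypothetical; the only assumption about the input is that it holds the relative differences between corrected and generated momenta.

```python
import numpy as np

def half_width_68(values, fraction=0.68):
    """Half of the width of the smallest interval containing `fraction`
    of the entries in `values`."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    k = int(np.ceil(fraction * n))  # number of entries inside the interval
    if k < 2:
        raise ValueError("too few entries to define an interval")
    # Width of every window of k consecutive sorted entries.
    widths = x[k - 1:] - x[: n - k + 1]
    return 0.5 * widths.min()

# Toy check: for a Gaussian sample the estimator is close to its width.
rng = np.random.default_rng(seed=7)
toy = rng.normal(loc=0.0, scale=0.02, size=200_000)
print(half_width_68(toy))
```

Unlike a fitted Gaussian width, this estimator needs no assumption about the shape of the distribution and is insensitive to far tails.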

Performance studies
The samples of $J/\psi \to \mu\mu$, $\Upsilon \to \mu\mu$, and $Z \to \mu\mu$ decays are used to validate the momentum corrections obtained with the template fit method described in the previous sections and to measure the muon momentum reconstruction performance. The detector segmentation applied to $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ decays is chosen to be at least twice as fine as that used for deriving the simulation corrections. Furthermore, the momentum requirements used in both cases are looser than those used for deriving the corrections. The combination of the finer segmentation and looser requirements allows a data-driven validation of the correction methods both within the bins assumed in the template fits and extrapolated in $p_{\mathrm{T}}$ beyond the range of the calibration procedure. To complete the data-driven validation and performance measurements, an additional study is performed using $\Upsilon \to \mu\mu$ decays, which are statistically fully independent of the samples used to derive the calibration corrections.
The invariant mass distributions for the $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ candidates are shown in Figure 5 and compared with corrected simulation. The lineshapes of the two resonances in simulation agree with the data within the systematic uncertainties, demonstrating the overall effectiveness of the $p_{\mathrm{T}}$ calibration.
When the two muons have similar momentum resolution and angular effects are neglected, the relative mass resolution, $\sigma(m_{\mu\mu})/m_{\mu\mu}$, is directly proportional to the relative muon momentum resolution, $\sigma(p)/p$:

$$\frac{\sigma(m_{\mu\mu})}{m_{\mu\mu}} = \frac{1}{\sqrt{2}}\,\frac{\sigma(p)}{p} \tag{12}$$

Similarly, the total muon momentum scale, defined as $s = \langle (p^{\mathrm{meas}} - p^{\mathrm{true}})/p^{\mathrm{true}} \rangle$, is directly related to the dimuon mass scale, defined as $s_{\mu\mu} = \langle (m_{\mu\mu}^{\mathrm{meas}} - m_{\mu\mu}^{\mathrm{true}})/m_{\mu\mu}^{\mathrm{true}} \rangle$:

$$s_{\mu\mu} = \sqrt{(1 + s_{\mu_1})(1 + s_{\mu_2})} - 1$$

where $s_{\mu_1}$ and $s_{\mu_2}$ are the momentum scales of the two muons.

The effectiveness of the momentum calibration is also measured by comparing the mean value, $m_{\mu\mu}$, and the resolution, $\sigma(m_{\mu\mu})$, of the dimuon mass resonances. To measure such quantities, fully analytical fit functions are created for each resonance, modelling both the signal and the background. Because the same definition of the resolution function is used for all resonances, this methodology allows a direct comparison of the resolution quantities extracted separately for each resonance. In turn, this allows a precise measurement of the momentum reconstruction performance across a wide kinematic regime. The fitting procedures used to measure the mean value and resolution quantities are then optimised for each resonance.
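The relations between dimuon mass and single-muon quantities can be motivated with a short derivation. It is a sketch under stated assumptions: the muon mass is neglected, the opening angle $\alpha_{12}$ is taken as perfectly measured, the two momentum measurements are uncorrelated, and both muons have the same relative resolution $\sigma(p)/p$. The symbols $s_{\mu_1}$ and $s_{\mu_2}$ denote the momentum scales of the two muons.

```latex
\begin{align*}
m_{\mu\mu}^{2} &= 2\,p_{1}\,p_{2}\,(1-\cos\alpha_{12})
  && \text{(massless muons)} \\
\frac{\delta m_{\mu\mu}}{m_{\mu\mu}} &=
  \frac{1}{2}\left(\frac{\delta p_{1}}{p_{1}}+\frac{\delta p_{2}}{p_{2}}\right)
  && \text{(first order, fixed } \alpha_{12}\text{)} \\
\frac{\sigma(m_{\mu\mu})}{m_{\mu\mu}} &=
  \frac{1}{2}\left(\frac{\sigma(p_{1})}{p_{1}}\oplus\frac{\sigma(p_{2})}{p_{2}}\right)
  = \frac{1}{\sqrt{2}}\,\frac{\sigma(p)}{p}
  && \text{(uncorrelated, equal resolutions)} \\
\frac{m_{\mu\mu}^{\mathrm{meas}}}{m_{\mu\mu}^{\mathrm{true}}} &=
  \sqrt{(1+s_{\mu_1})(1+s_{\mu_2})}
  \;\;\Rightarrow\;\;
  s_{\mu\mu} \approx \frac{s_{\mu_1}+s_{\mu_2}}{2}
  && \text{(small scales)}
\end{align*}
```

As a numerical example, a relative dimuon mass resolution of 1.3% corresponds to a relative single-muon momentum resolution of $\sqrt{2} \times 1.3\% \approx 1.8\%$.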
In $J/\psi \to \mu\mu$ decays, the intrinsic width of the resonance is negligible compared with the experimental resolution. In contrast to Section 5.2, the mass window for the $J/\psi \to \mu\mu$ validation fits is modified to $2.8 < m_{\mu\mu} < 3.9$ GeV. The lower bound is raised to remove the abrupt spectrum sculpting due to the trigger criteria, which is difficult to model well analytically. The upper bound is extended to include the $\psi(2S)$ resonance and add a lever arm to constrain the estimation of the resolution.
The resonant peak of the $J/\psi$ is modelled by a double-sided Crystal Ball function; the four parameters modelling the tails are extracted from simulation, and subsequently fixed in the fits to data. A second double-sided Crystal Ball function models the $\psi(2S)$ resonance that follows the $1S$ resonance in the $J/\psi$ validation mass window. As for the $1S$ resonance, the parameters modelling the tails are extracted from simulation and kept fixed when fitting to data. The mean value parameter of the resolution function is left free for the $\psi(2S)$ resonance in fits to both simulated and real data, while the resolution parameter is constrained to scale linearly with the resolution parameter of the $1S$ resonance using the ratio of the mean parameters. The non-resonant background and the background from misidentified muons are described by an exponential function.
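A double-sided Crystal Ball shape of the kind described above can be sketched as follows. This is one common unnormalised parameterisation (Gaussian core with a power-law tail on each side, continuous at the transition points) and is not necessarily the exact form used in the analysis. The function names and all parameter values are hypothetical; the masses are approximate world-average values used only for illustration.

```python
import numpy as np

def double_sided_crystal_ball(x, mean, sigma, a_lo, n_lo, a_hi, n_hi):
    """Unnormalised double-sided Crystal Ball: Gaussian core for
    -a_lo <= t <= a_hi, power-law tails outside, t = (x - mean) / sigma."""
    t = (np.asarray(x, dtype=float) - mean) / sigma
    core = np.exp(-0.5 * t ** 2)
    # Clip t so each tail expression is only evaluated where it is valid.
    t_lo = np.minimum(t, -a_lo)
    t_hi = np.maximum(t, a_hi)
    low = np.exp(-0.5 * a_lo ** 2) * (1.0 - (a_lo / n_lo) * (a_lo + t_lo)) ** (-n_lo)
    high = np.exp(-0.5 * a_hi ** 2) * (1.0 + (a_hi / n_hi) * (t_hi - a_hi)) ** (-n_hi)
    return np.where(t < -a_lo, low, np.where(t > a_hi, high, core))

def scaled_width(sigma_1s, mean_1s, mean_2s):
    """Width of the 2S peak constrained to scale linearly with the 1S
    width through the ratio of the fitted means."""
    return sigma_1s * mean_2s / mean_1s

# Illustrative values in GeV (assumptions, not fit results).
mass = np.linspace(2.8, 3.9, 12)
tails = dict(a_lo=1.5, n_lo=3.0, a_hi=1.8, n_hi=4.0)
jpsi = double_sided_crystal_ball(mass, mean=3.097, sigma=0.040, **tails)
psi2s = double_sided_crystal_ball(
    mass, mean=3.686, sigma=scaled_width(0.040, 3.097, 3.686), **tails)
print(jpsi + 0.05 * psi2s)
```

Tying the 2S width to the 1S width removes one free parameter from the fit while still letting the second peak constrain the resolution.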
In $Z \to \mu\mu$ decays, the fits use a convolution of the true lineshape (modelled by a Breit-Wigner function) with an experimental resolution function (a double-sided Crystal Ball). A fit range of $75 < m_{\mu\mu} < 105$ GeV is used. As for the $J/\psi$, the non-resonant background is described by an exponential function. The peak position and width of the Crystal Ball function are used as estimators for the $m_{\mu\mu}$ and $\sigma(m_{\mu\mu})$ variables in each of the $\eta$ and $p_{\mathrm{T}}$ bins.

Figure 6 shows the position of the mean value of the invariant mass distribution, $m_{\mu\mu}$, obtained from the fits to the $Z$ boson and $J/\psi$ samples as a function of the pseudorapidity of the highest-$p_{\mathrm{T}}$ muon for each decay. The distributions are shown for data and corrected simulation, with the ratios of the two in the lower panels. The simulation is in good agreement with the data. Minor deviations are within the momentum scale systematic uncertainties, which are 0.05% in the barrel region and increase with $|\eta|$ to 0.15% for $J/\psi \to \mu\mu$ decays, and are 0.05% in the barrel region and increase with $|\eta|$ to 0.1% for $Z \to \mu\mu$ decays. The systematic uncertainties shown in the plots include the effects of the uncertainties in the calibration constants described in Section 5.2. The observed level of agreement demonstrates that the $p_{\mathrm{T}}$ calibration for combined muon tracks described above provides an accurate description of the momentum scale in all $\eta$ regions, over a wide $p_{\mathrm{T}}$ range. Similar levels of data/MC agreement are observed for the ID and MS components of the combined tracks.

The dimuon mass resolution, $\sigma(m_{\mu\mu})$, is also measured as a function of the leading-muon $\eta$ for the two resonances. The dimuon mass resolution is about 1.3% and 1.6% at small $\eta$ values for the $J/\psi$ and $Z$ bosons, respectively, and increases to 2.1% and 2.4% in the endcaps. This corresponds to a relative muon $p_{\mathrm{T}}$ resolution of 1.8% and 2.3% in the centre of the detector and 3.0% and 3.4% in the endcaps for $J/\psi$ and $Z$ boson decays, respectively.
Uncertainties in the dimuon mass resolution range between 2% and 5% for the $J/\psi$ and between 3% and about 6% for the $Z$ boson, depending on the detector region.
Using the same methodologies as for the $J/\psi \to \mu\mu$ decays, the measurement of the scale dependence and resolution is repeated on the fully independent set of data from $\Upsilon \to \mu\mu$ decays. The $2S$ and $3S$ resonances of the $\Upsilon$ are modelled with the same approach as for the $J/\psi$ resonances. This entails constraining the resolution parameter of the $3S$ resonance to scale linearly with that of the $2S$ resonance, and that of the $2S$ resonance to scale linearly with that of the $1S$ resonance. The mean parameters of the resolution models of the $2S$ and $3S$ resonances are left free independently in data and simulation, as is done for the $J/\psi$ resonance. The background is, however, modelled using a polynomial function derived from data. Figure 8 shows the agreement between data and simulation, after applying the momentum corrections to the simulation, for the invariant mass distributions of the three $\Upsilon \to \mu\mu$ resonances. As for the $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ decays, the momentum scale dependency and the resolution are extracted as a function of the pseudorapidity of the leading muon in the event. Figure 9 compares the data and simulation after all corrections. The data agree with the simulation within the systematic uncertainties. The uncertainty in the momentum scale is up to 0.05% in the central region and up to 0.1% in the forward region. Similarly, the extracted resolution is in agreement within about 3%. Good agreement between the dimuon mass resolution measured in data and simulation is also observed when deriving the corrections independently for the ID and MS components of the combined tracks, as shown in Appendix B.
The relative dimuon mass resolution $\sigma(m_{\mu\mu})/m_{\mu\mu}$ is directly related to the relative momentum resolution of the muons, as shown in Equation 12. A direct comparison of the momentum resolution function determined with $J/\psi$, $\Upsilon$, and $Z$ boson decays can therefore be performed. To remove the effect of the correlation between the measurement of the dimuon mass resolution and the $p_{\mathrm{T}}$ of the muons, the following definition of transverse momentum is used:

$$p_{\mathrm{T}}^{*} = \hat{m}\,\sqrt{\frac{\sin\theta_{1}\,\sin\theta_{2}}{2\,(1-\cos\alpha_{12})}}$$

where $\hat{m}$ is a fixed value, corresponding to the best known value of the mass of the $Z$ boson, the $J/\psi$, and the $\Upsilon$, respectively. The variables $\theta_{1}$ and $\theta_{2}$ are the polar angles of the two muons, and $\alpha_{12}$ is the opening angle of the muon pair. The relative dimuon mass resolution from $J/\psi \to \mu\mu$, $\Upsilon \to \mu\mu$, and $Z \to \mu\mu$ events measured in data is compared with that obtained from calibrated simulation as a function of $p_{\mathrm{T}}^{*}$ in Figure 10. The resolutions are in good agreement. Due to the larger number of events, the $J/\psi \to \mu\mu$ measurement extends to higher momenta than the $\Upsilon \to \mu\mu$ one.

Figure 10: Relative dimuon mass resolution for $J/\psi \to \mu\mu$, $\Upsilon \to \mu\mu$, and $Z \to \mu\mu$ events as a function of the $p_{\mathrm{T}}^{*}$ variable defined in the text. To account for the resolution differences induced by the different pseudorapidity distributions among muons from the three resonances, the data and simulations are re-weighted according to the pseudorapidity distribution of the muon with the highest momentum from $Z \to \mu\mu$ decays. The lower panel shows the ratio of the data to the MC simulation. The error bars represent the statistical uncertainty, while the bands show the systematic uncertainties.
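The transverse-momentum variable defined above can be computed with a few lines of code. The sketch assumes the form $p_{\mathrm{T}}^{*} = \hat{m}\sqrt{\sin\theta_1 \sin\theta_2 / (2(1-\cos\alpha_{12}))}$, with $\hat{m}$ a fixed reference mass, $\theta_{1,2}$ the polar angles of the two muons and $\alpha_{12}$ their opening angle. The function name and the example values are hypothetical.

```python
import math

def pt_star(mass_ref, theta1, theta2, alpha12):
    """Transverse momentum each muon would have if both carried the same pT
    and the pair had invariant mass `mass_ref` (massless muons assumed).
    Angles are in radians; the result has the units of `mass_ref`."""
    return mass_ref * math.sqrt(
        math.sin(theta1) * math.sin(theta2) / (2.0 * (1.0 - math.cos(alpha12)))
    )

# Back-to-back muons emitted at 90 degrees to the beam: pT* is half the mass.
print(pt_star(91.1876, math.pi / 2, math.pi / 2, math.pi))
```

Because only angles and a fixed mass enter, the variable does not inherit the resolution of the measured momenta, which is the property that removes the correlation with the measured mass resolution.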

Conclusion
The momentum performance of the ATLAS muon reconstruction is measured using 139 fb$^{-1}$ of data from $pp$ collisions at $\sqrt{s} = 13$ TeV recorded during the second run of the LHC between 2015 and 2018.
Events from the $Z \to \mu\mu$ process are used to correct the reconstructed data for a charge-dependent bias in the muon momentum measurement associated with detector alignment effects. This bias is reduced on average from up to 0.4 TeV$^{-1}$ to $2 \cdot 10^{-4}$ TeV$^{-1}$ for muons with $p_{\mathrm{T}} < 450$ GeV after the corrections are applied, with an associated uncertainty at the level of 0.03 TeV$^{-1}$ for muons with transverse momentum of about 45 GeV.
The scale and resolution of the muon momentum measurement are studied in detail using $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ decays. These studies are used to correct the simulation, improving the agreement with the data and reducing the systematic uncertainties related to the muon calibration in physics analyses. The improvements in the $p_{\mathrm{T}}$ correction methods described in this paper and the substantial number of $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ decays collected in the Run 2 data sample together improve the precision of the momentum scale by up to a factor of two relative to the previous publication based on 3.2 fb$^{-1}$ of collected data [10]. For $Z \to \mu\mu$ decays, the uncertainty in the momentum scale varies from a minimum of 0.05% for $|\eta| < 1$ to a maximum of 0.15% for $|\eta| \sim 2.5$.
The dimuon mass resolution is about 1.3% (1.6%) at small values of pseudorapidity for $J/\psi$ ($Z$) decays, and increases up to 2.1% (2.4%) in the endcaps. This corresponds to a relative muon $p_{\mathrm{T}}$ resolution of 1.8% (2.3%) at small values of pseudorapidity and 3.0% (3.4%) in the endcaps for $J/\psi$ ($Z$) decays. After applying momentum corrections, the $p_{\mathrm{T}}$ resolutions in data and simulation agree within the quoted uncertainties, which are at the level of, or better than, 5% (6%) for $J/\psi$ ($Z$) decays, depending on the $\eta$ range.
Validation studies performed with $Z \to \mu\mu$ and $J/\psi \to \mu\mu$ decays show that the corrections applied to the simulation bring the agreement with the data to the level of their estimated uncertainties. An additional, statistically independent validation performed using $\Upsilon \to \mu\mu$ decays confirms the correctness of the correction procedure.

Appendix
This appendix complements the results presented in the paper with some additional material. Appendix A provides the measured charge-dependent biases before and after application of the correction procedure for the alternative method discussed in Section 5.1. Appendix B documents the measured performance when applying the corrections for CB tracks as obtained by the combination of corrections to ID and MS measurements.
A Charge-dependent momentum scale calibration in data

Figure 11: Charge-dependent biases on the muon $p_{\mathrm{T}}$ as evaluated on data for CB momenta after alignment and before applying the dedicated corrections using the alternative method described in Section 5.1. The biases are shown separately for the three data-taking periods of (a) 2015 and 2016, (b) 2017 and (c) 2018. The bias is defined in Eq. (1).
As described in the main body of the paper, a complementary method to estimate the charge-dependent bias was also developed. The details of the method are explained in Section 5.1. In this appendix the results obtained with this method are presented. Similarly to Figure 1, Figure 11 shows the measured sagitta corrections as evaluated on CB momenta. It can be seen that the biases also extend to about 0.4 TeV$^{-1}$. After using the correction procedure with the iterative method explained in Section 5.1, using 25 iterations, the residual biases are re-evaluated with the same methodology. The resulting values, as shown in Figure 12, indicate the same reduction of the measured biases to values of less than $10^{-3}$ TeV$^{-1}$.

B Additional performance studies
A second step, used to cross-check the validity of the corrections obtained directly for CB tracks, is introduced in Section 5.2. Because of the different parameterisation, a one-to-one comparison of the results obtained from the ID+MS method, as shown in Tables 1 and 2, with those of the CB method, as shown in Table 3, is not possible. Therefore, the methods used to estimate the performance in Section 6 are extended to the $p_{\mathrm{T}}^{\mathrm{Cor,ID+MS}}$ momenta obtained from the ID+MS correction procedure. Using the same resonant decays from $J/\psi \to \mu\mu$, $\Upsilon \to \mu\mu$, and $Z \to \mu\mu$ events in simulation, the dimuon mass scale and the dimuon mass resolution are compared with the measurements obtained from data. The results are obtained with the same fitting methods detailed in Section 6. For the $J/\psi \to \mu\mu$ decays the results are presented in Figure 13, and they show good agreement between data and simulation for both the mass scale and the mass resolution. Similarly, Figure 14 and Figure 15 show the results obtained for $\Upsilon \to \mu\mu$ and $Z \to \mu\mu$ decays, respectively. As expected from the systematic uncertainty study detailed in Section 5.2, the ID+MS methodology results in a reduced smearing uncertainty but a higher scale uncertainty than the CB methodology. All results show agreement between data and simulation within the quoted uncertainties of the measured values.

Figure 13: (a) Fitted mean mass of the dimuon system for CB muons for $J/\psi \to \mu\mu$ events for data and corrected simulation as a function of the pseudorapidity of the highest-$p_{\mathrm{T}}$ muon. (b) Dimuon invariant mass resolution for CB muons for $J/\psi \to \mu\mu$ events for data and corrected simulation as a function of the pseudorapidity of the highest-$p_{\mathrm{T}}$ muon. The upper panels show the results obtained on data and simulation after all corrections using the alternative ID+MS correction method for CB tracks. The lower panels show the ratios of the data to the MC simulation. The error bars represent the statistical uncertainty; the shaded bands represent the systematic uncertainty in the correction and the systematic uncertainty in the extraction method added in quadrature.

Figure 14: (a) Fitted mean mass of the dimuon system for CB muons for $\Upsilon \to \mu\mu$ events for data and corrected simulation as a function of the pseudorapidity of the highest-$p_{\mathrm{T}}$ muon. (b) Dimuon invariant mass resolution for CB muons for $\Upsilon \to \mu\mu$ events for data and corrected simulation as a function of the pseudorapidity of the highest-$p_{\mathrm{T}}$ muon. The upper panels show the results obtained on data and simulation after all corrections using the alternative ID+MS correction method for CB tracks. The lower panels show the ratios of the data to the MC simulation. The error bars represent the statistical uncertainty; the shaded bands represent the systematic uncertainty in the correction and the systematic uncertainty in the extraction method added in quadrature.

Figure 15: (a) Fitted mean mass of the dimuon system for CB muons for $Z \to \mu\mu$ events for data and corrected simulation as a function of the pseudorapidity of the highest-$p_{\mathrm{T}}$ muon. (b) Dimuon invariant mass resolution for CB muons for $Z \to \mu\mu$ events for data and corrected simulation as a function of the pseudorapidity of the highest-$p_{\mathrm{T}}$ muon. The upper panels show the results obtained on data and simulation after all corrections using the alternative ID+MS correction method for CB tracks. The lower panels show the ratios of the data to the MC simulation. The error bars represent the statistical uncertainty; the shaded bands represent the systematic uncertainty in the correction and the systematic uncertainty in the extraction method added in quadrature.
[3] ATLAS Collaboration, Observation of a new particle in the search for the Standard
[13] ATLAS Collaboration, Performance of the