Determination of the parton distribution functions of the proton using diverse ATLAS data from pp collisions at s=7\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sqrt{s} = 7$$\end{document}, 8 and 13 TeV

This paper presents an analysis at next-to-next-to-leading order in the theory of quantum chromodynamics for the determination of a new set of proton parton distribution functions using diverse measurements in pp collisions at s=7\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sqrt{s} = 7$$\end{document}, 8 and 13 TeV, performed by the ATLAS experiment at the Large Hadron Collider, together with deep inelastic scattering data from ep collisions at the HERA collider. The ATLAS data sets considered are differential cross-section measurements of inclusive W±\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$W^{\pm }$$\end{document} and Z/γ∗\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Z/\gamma ^*$$\end{document} boson production, W±\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$W^{\pm }$$\end{document} and Z boson production in association with jets, tt¯\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$t\bar{t}$$\end{document} production, inclusive jet production and direct photon production. In the analysis, particular attention is paid to the correlation of systematic uncertainties within and between the various ATLAS data sets and to the impact of model, theoretical and parameterisation uncertainties. The resulting set of parton distribution functions is called ATLASpdf21.


Introduction
Precise knowledge of the content of protons, the parton distribution functions (PDFs), is a necessary ingredient for accurate predictions of both Standard Model (SM) and beyondthe-SM (BSM) cross sections at the Large Hadron Collider (LHC). Searching for BSM effects in the deviations of SM parameters from their predicted SM values [1,2] requires better knowledge of PDFs in the kinematic regions where they are already best known [3,4]. This means that many effects which were previously considered negligible must now be examined seriously. In this paper, particular attention is paid to some of these effects, including theoretical scale uncertainties and the correlations of experimental systematic uncertainties between, as well as within, data sets.
With present theoretical knowledge, PDFs can be determined at up to next-to-next-leading order (NNLO) in perturbative quantum chromodynamics (QCD). In order to determine the PDFs to the required precision, a wide coverage of the scale, Q 2 , and x, the fraction of the proton's longitudinal momentum carried by the parton participating in the initial interaction, is required. This is facilitated by combining new inputs with older data from various experiments and by measurement of different processes to constrain different regions of phase space, which span the kinematic region: 10 −5 x 1 and 1 Q 2 10 6 GeV 2 .
Knowledge of the PDFs of the proton, in the kinematic region relevant for the LHC, comes mainly from the precision deep inelastic scattering (DIS) data from ep collisions at the HERA collider, which covers a broad range of Q 2 and x. The PDF set HERAPDF2.0 was determined from HERA data alone [5] using information from e ± p neutral-current (NC) and charged-current (CC) processes. The lower-Q 2 NC data constrain the low-x sea-quark distribution but are not able to distinguish between quark flavours in the sea at low x, or between the down-type quarks,d ands at any x. The difference between the NC e + p and e − p cross sections at high Q 2 , together with the high-Q 2 CC data, constrains the valence distributions. The Q 2 dependence measured in the data constrains the gluon distribution. Additional information about the gluon comes from using HERA cross sections measured at various centre-of-mass energies ( √ s) through the contribution of the longitudinal structure function [6][7][8].
Diverse ATLAS data sets can be used in addition to the HERA data to constrain the PDFs better. In Ref. [9], high precision measurements of the inclusive differential W ± and Z /γ * boson cross sections at 7 TeV were added to the HERA data, resulting in the PDF set ATLASepWZ16, which improved on the HERAPDF2.0 set in various respects. Firstly, the strange content of the sea was determined rather than assumed to be a fixed fraction of the light sea. Indeed, compared to previous determinations, the strange sea was found to be enhanced at low x. Secondly, the accuracy of the valence quark distributions for x < 0.1 was improved. The effect of adding differential tt distributions, in the lepton + jets and dilepton channels at 8 TeV, to the HERA data and the inclusive W, Z /γ * data was studied in Ref. [10]. These tt data are complementary to the W, Z /γ * data in their power to constrain PDFs because they are sensitive to the high-x gluon distribution. The resulting PDF set was called ATLASepWZtop18. In Ref. [11], data on the production of W and Z bosons in association with jets (V + jets) were added to the HERA data and inclusive W , Z /γ * data, resulting in the ATLASepWZVjets20 PDF set. The V + jets data are sensitive to partons at higher x than can be accessed by inclusive W and Z /γ * data and, in particular, they constrain thed and s quarks at higher x.
It would clearly be advantageous to combine ATLAS W, Z /γ * data, tt data and V + jets data in a single QCD fit. This is done in the present paper, and additional ATLAS data sets are also included. This paper presents the first comprehensive and comparative NNLO perturbative QCD analysis of a number of ATLAS data sets with potential sensitivity to parton distributions. The additional ATLAS data sets are described in the following. Firstly, the data on W [12] production and Z /γ * [13] production with 20.2 fb −1 at 8 TeV are added, providing further constraints on the valence PDFs and on the composition of the light-quark sea. Secondly, the direct photon production differential cross sections with 20.2 fb −1 at 8 TeV and 3.2 fb −1 at 13 TeV are added in the form of their ratios [14], whereby many systematic uncertainties cancel out. Thirdly, the tt differential cross sections in the lepton + jets channel from 3.2 fb −1 at 13 TeV [15] are added and, fourthly, inclusive jet production cross sections with 4.5 fb −1 at 7 TeV [16], 3.2 fb −1 at 8 TeV [17] and 3.2 fb −1 at 13 TeV [18] are considered. These direct photon data, tt data and inclusive jet data all have impact on the gluon PDF at medium to high x.
All ATLAS data sets considered in this study have full information about correlated systematic uncertainties. The analysis considers systematic uncertainty correlations between, as well as within, data sets. This is important now that several of the input data sets have systematic uncertainties deriving from jet measurements, since these uncertainties are larger than those from the lepton measurements. Theoretical uncertainties, such as scale uncertainties, are also considered, including their correlations between data sets. The resultant PDF set is called ATLASpdf21.
The structure of the paper is as follows. Section 2 presents the input data sets. Section 3 describes the theoretical framework of the fit. Section 4 presents the fitting methodology, including the definition of the fit χ 2 and details of the parameterisation. Section 5 presents ATLASpdf21 PDFs and then compares them with a fit in which correlations between data sets are not implemented. The impact of each data set is considered and then model uncertainties and further theoretical uncertainties, including scale uncertainties, are considered. Section 6 presents a study of the χ 2 tolerance and a comparison with other modern PDF sets. Section 7 gives the summary and conclusions. Appendix A gives a more detailed exposition of the impact of scale uncertainties for the inclusive W, Z /γ * data. The impacts of inclusive jet data at different centre-of-mass energies are compared in Appendix B. Appendices C and D compare the ATLASpdf21 fit predictions with the ATLAS input data sets and with data sets from other experiments, respectively.

Description of data sets
The combined e ± p NC cross-section measurements of H1 and ZEUS [5] cover a kinematic range of Q 2 , defined as the negative four-momentum transfer squared, from 0.045 GeV 2 to 50,000 GeV 2 and of Bjorken-x, x Bj , which is equal to the fraction of the proton's longitudinal momentum carried by the struck parton at leading order (LO) in QCD, from 6 × 10 −7 to 0.65. The CC cross-section measurements cover a kinematic region Q 2 ∼ 300 GeV 2 to beyond 10 4 GeV 2 and of x from ∼0.65 down to ∼10 −2 . Low-x data, below x Bj = 10 −5 , are excluded from this analysis by requiring Q 2 > 10 GeV 2 , motivated by the poorer fit observed in this region compared to the rest of HERA data [5], which may reflect the need for resummation corrections at low x [19,20]. Full information about correlated systematic uncertainties is provided. There are 169 sources of correlated systematic uncertainty. Total uncertainties are below 1.5% over the Q 2 range of 3 < Q 2 < 500 GeV 2 and remain below 3% up to Q 2 = 3000 GeV 2 .
The ATLAS W, Z /γ * differential cross-section measurements at √ s = 7 TeV based on an integrated luminosity of 4.6 fb −1 are used [9]. A combination of the electron and muon decay channels is used. They access a kinematic range whereby the scale is identified with the measured boson mass range, and the x range is determined by the scale, the proton beam energy and the measured rapidity (y) ranges, such that Q 2 = m 2 and x = (Q/ √ s) e ±y , which gives an x range 0.001 x 0.1. 1 The W ± differential cross sections were measured as a function of the W decay lepton pseudorapidity, η , with an experimental precision of 0.6%-1.0%. The Z /γ * boson rapidity distribution, y Z , was measured in three mass ranges: 46 < m < 66 GeV; 66 < m < 116 GeV and 116 < m < 136 GeV, with the two decay leptons in central-central (c-c) and centralforward (c-f) rapidity ranges, 2 and experimental uncertainties as low as 0.4% for c-c rapidity and 2.3% for c-f rapidity. There are 131 sources of correlated systematic uncertainty. The low-mass off-peak Z /γ * data are not used in the present analysis because they are subject to much larger corrections for non-fixed order, parton shower p T -resummation effects than the other W, Z /γ * data [21]. These low-mass data had little impact on the ATLASepWZ16 PDFs. ATLAS has published differential cross sections and forward-backward asymmetries for Z /γ * production at √ s = 8 TeV, based on 20.2 fb −1 of data [13]. The measurement is presented in terms of three variables for both the c-c and c-f regions of the dilepton rapidity, y . For the c-c region, there are 12 bins of absolute rapidity ranging from 0.0 to 2.4, in intervals of 0.2, divided into 7 bins of the dilepton mass m ranging from 46 to 200 GeV and 6 bins of the polar angle, θ , of the decay lepton in the Collins-Soper frame [22]. For the c-f region, there are 5 bins of absolute 1 The definitions of the scale for the DIS and inclusive vector boson processes considered so far are conventional. They have been defined here in order to give an idea of the kinematic coverage of the fit. For the other data sets considered, the scale definitions are more conveniently given together with the description of the theoretical predictions in Sect. 3. 2 The central-central region requires both decay leptons to have absolute rapidity less than 2.5 and the central-forward region requires one lepton with absolute rapidity less than 2.5 and the other in the range 2.5-4.9. rapidity ranging from 1.2 to 3.6 and 5 mass bins ranging from 66 to 150 GeV in the same 6 bins of the Collins-Soper angle. The dilepton data were extracted in both the dielectron and dimuon channels up to |y | = 2.4, which is extended to |y | = 3.6 in the dielectron channel. The dimuon and dielectron data are combined for the c-c region. Measurements are used as cross sections when the acceptance of the lepton fiducial cuts with respect to the (m , |y |, cos θ * ) bin is greater than 95%, such that the leptons are constructed with high accuracy and the cross sections are known to ∼0.5% accuracy. There are 184 cross-section data points which fulfil these requirements and all of them lie in the c-c region of dilepton rapidity. These are used for the central fit. The selection of the bins in m , y and θ ensures that the NNLO predictions for these high acceptance bins are not highly sensitive to fiducial cuts applied to leptons [23]. The uncertainties of the high-|y | data range from 2% to 33% and the combined central rapidity data reaches a precision ranging from 0.4%, for the Z -mass-peak bins at |y | < 1.0, to 1.8% for the high-mass bins at |y | = 2.4. There are 330 sources of correlated systematic uncertainty. The remaining lower-acceptance data are used only in cross-checks. They are converted to 188 data points for forward-backward asymmetries across the Collins-Soper angle in order to minimise sensitivity to acceptance cuts. These data lie in both the c-c and c-f dilepton rapidity regions. The W inclusive differential cross sections, from 20.2 fb −1 of data at √ s = 8 TeV [12], are measured as a function of muon pseudorapidity, η μ . Data are provided for W + and W − cross sections separately as well as for the W -asymmetry. The separate W + and W − cross sections are used for the fit, rather than the W -asymmetry. The precision of these crosssection measurements varies from 0.8% to 1.5% as a function of η μ . The pseudorapidity covers the same central range as the √ s = 7 TeV inclusive W data. There are 41 sources of correlated systematic uncertainty.
The ATLAS differential cross-section measurements of W production in association with jets (W + jets) are based on data recorded during pp collisions at √ s = 8 TeV, and a total integrated luminosity of 20.2 fb −1 , in the electron decay channel only. Each event contains at least one jet with transverse momentum p T > 30 GeV and rapidity |y| < 4.4. The data are split into W + and W − cross sections. There are 51 sources of correlated systematic uncertainty. As explained in Ref. [11] the differential spectra of the transverse momenta of the W + and W − bosons p W T are fitted in the range 25 < p W T < 800 GeV. The data precision ranges from 10 to 25% from low to high p W T [24]. The ATLAS differential cross-section measurements of Z production in association with jets [25] are based on data recorded during pp collisions at √ s = 8 TeV, and a total integrated luminosity of 20.2 fb −1 , in the electron decay channel only. The measurement was performed as a function of the absolute rapidity of inclusive jets, |y jet |, for several bins of the transverse momentum within 25 GeV < p jet T < 1050 GeV. There are 42 sources of correlated systematic uncertainty. The data precision ranges from ∼10% to ∼30% from low |y jet | and p jet T to high |y jet | and p jet T . The tt differential cross sections were measured at √ s = 8 TeV using 20.2 fb −1 of data in the lepton + jets [26] and dilepton [27] decay channels. The tt cross sections are provided as both normalised and absolute spectra. The absolute spectra are used in the present study since they carry extra information about the normalisation of the cross-sections, which is not input anywhere else in the fit. As detailed in Ref. [10], in the lepton + jets channel the differential spectra that are used are the mass of the tt pair, m tt , and the average top-quark transverse momentum, p t T . Each of these spectra has full information about systematic and statistical bin-to-bin correlations, including statistical correlations between the spectra. There are 55 sources of correlated systematic uncertainty in common between the different spectra. The precision of these lepton + jets data is ∼15% for p t T and ranges from ∼10% to ∼20% from low to high m tt . In the dilepton channel, the spectrum for the rapidity of the tt pair, y tt , is used. The precision of these data ranges from ∼5 to ∼15% from low to high |y tt |. For these dilepton channel data the correlations are provided as a total covariance matrix. Further studies of these top-quark data are presented in Ref. [10].
The tt differential cross sections were measured at √ s = 13 TeV using 36 fb −1 of data in the lepton + jets [15] channel. Similarly to the tt data at 8 TeV, the absolute spectra are used. For the tt data at 13 TeV, two complementary topologies are measured; the 'resolved' and 'boosted' topologies, which differ in that the decay products of the hadronically decaying top quark are well separated for the resolved case and collimated for the boosted case. The 'boosted' topology reaches higher absolute rapidity of the tt pair. The differential spectra used for the PDF fit are the mass of the tt pair, m tt , the average top-quark transverse momentum, p t T , the average top-quark rapidity, y t , and the boosted rapidity of the tt pair, y b tt . These spectra were found to be the most sensitive to the gluon PDF in Ref. [28]. Each of these spectra has full information about systematic and statistical bin-to-bin correlations, including statistical correlations between the spectra. 3 3 The final two data points, for which p t T > 360 GeV, are excluded from the fitted data set. This is because the full statistical correlation matrix could not be inverted when these bins are included. A crosscheck is made in which the fit uses only two of the tt 1-D spectra p t T and m tt at 13 TeV, for which the statistical matrix could be inverted, both including and excluding the final two bins of the p t T spectrum. The resulting PDFs were compared to the PDFs from a fit to the 2-D p t T , m tt spectra. The PDFs of these three fits are almost identical, thus the exclusion of the high p t T bins does not affect the PDF shapes significantly.
There are 88 sources of correlated systematic uncertainty in common between the different spectra and statistical correlations between the spectra are implemented. Double differential spectra in p t T and m tt were also studied. The precision of these lepton + jets data ranges from 15% to 30% for p t T and from 15% to 22% for m tt , with the largest uncertainties in the lowest and highest bins. For both of the rapidity spectra the total uncertainty is ∼15%.
The ratio of the cross sections for inclusive isolatedphoton production at √ s = 8 and 13 TeV comes from integrated luminosities of 20.2 fb −1 and 3.2 fb −1 respectively. The data are presented as a function of photon transverse energy, E γ T , for E γ T > 125 GeV in bins of photon pseudorapidity, η γ , for the central region |η γ | < 2.37. Details of the isolation criteria are given in Ref. [14]. Evaluation of experimental uncertainties in the ratio data takes into account the correlations between the data at the two different centreof-mass energies. The dominant source of correlated uncertainty is the photon energy scale. The uncertainty due to this source is much lower when considering the ratio of cross sections. Nevertheless, 25 correlated uncertainty components from this source remain, and there are 21 further uncertainty components which are treated as correlated or uncorrelated as specified in Ref. [14]. It should be noted that for the present fits the combined luminosity uncertainty at 8 and 13 TeV is not used, so the separate luminosity uncertainties may be correlated with those of other data at these centre-of-mass energies. The experimental precision of the ratios ranges from ∼4% at low E γ T and |η γ | to ∼10% at the largest E γ T and |η γ |.
The ATLAS measurements of inclusive jet cross sections at √ s = 7 TeV [16], 8 TeV [17] and 13 TeV [18] were considered. The data are presented as a function of jet p jet T , in six bins of absolute rapidity from |y jet | = 0.0 to |y jet | = 3.0. Jets were reconstructed using the anti-k t algorithm with jet radius parameter R = 0.4 and R = 0.6 for data at 7 and 8 TeV, and with R = 0.4 for data at 13 TeV. Statistical correlation matrices between bins are provided for all jet data sets. However, it is not possible to fit inclusive jet production data at different centre-of-mass energies simultaneously, because the full experimental systematic uncertainty correlations between these data sets have not been fully specified. The data at 8 TeV are selected for the central fit, firstly because a better understanding of correlated systematic uncertainties has been achieved for these data than for the data at 7 TeV, and secondly because they are available for R = 0.6, unlike for the data at 13 TeV, and a larger R value is considered more reliable theoretically [29]. Details of the data set are therefore given only for the jet data at 8 TeV. The data at 7 and 13 TeV have similar features, the main difference being in the p jet T range probed. These data sets, including the alternative choice of jet radius, are used  [17] has been done to understand correlated sources of uncertainty. There are 320 sources of correlated systematic uncertainty, in addition to the luminosity uncertainty, and the jet energy scale's η intercalibration uncertainty is split into 250 sources. The total uncertainty in the central p jet T range, 300 < p jet T < 600 GeV, is ∼5% for |y jet | < 0.5, rising to ∼10% for |y jet | in the range 2.5-3.0. For low p jet T , 85 < p jet T < 300 GeV, the uncertainty is ∼15% and for high p jet T , 600 < p jet T < 2000 GeV, it is ∼50%. All the ATLAS input data sets for the QCD fit are summarised in Table 1.

Correlations of uncertainties within and between data sets
The correlated systematic uncertainties are applied within each data set with the following small number of exceptions. Firstly, for the W + jets, the Z + jets and the inclusive jet spectra at 8 TeV, the systematic uncertainty due to the unfolding procedure is treated as being uncorrelated, both within and between these spectra. 4 As shown in Ref. [11], this affects the χ 2 of the fits to V + jets, but has little impact on the fitted PDFs. Similarly, the parton shower systematic uncertainty is decorrelated between the p t T and m tt spectra in the tt lepton + jets channel, as done in Ref. [10]. It was established that this decorrelation has a minimal effect on the PDFs, while reducing the fit χ 2 to acceptable levels. This decorrelation 4 The two systematic uncertainties in each of the W + jets and Z + jets spectra related to unfolding (one related to the MC modelling and one to the size of the data samples) are fully decorrelated between spectra and bins within a single spectrum as they contain a large statistical component in both data sets owing to MC simulation statistics. and the aforementioned decorrelation of the unfolding systematic uncertainty in V + jets and inclusive jet data, can be justified because the systematic uncertainties concerned are evaluated from the difference of two Monte Carlo estimates, and thus do not represent well-behaved Gaussian uncertainties. Similar conclusions were reached in a recent study in the MMHT framework [30]. Thirdly, in the inclusive jet data at 8 TeV further decorrelations of such systematic uncertainties, derived from the difference of two Monte Carlo estimates, are considered following Ref. [17]. The experimental systematic uncertainties for the jet energy scale (JES) such as the 'Flavour Response', 'Multi-Jet Balance Fragmentation', 'Pile-up Rho Topology', 5 and the 'Non-Perturbative Correction' uncertainty, are not considered completely correlated between all rapidity bins. Instead, they are split into two or three components as a function of rapidity and p jet T as specified in the various splitting options described in the Appendix of Ref. [17]. For the central fit, the preferred set of splitting options for R = 0.6 is used, in which the JES 'Flavour Response' is split into three components (see Table 6 of Ref. [17]). In the present paper, this is called 'Decorrelation Scenario 2' and it is chosen because it is one of the two preferred options as determined in the analysis in Ref.
[17]. 6 Alternative decorrelation scenarios are also considered in Sect. 5.3.1. 5 The 'Flavour Response' is the systematic uncertainty due to the response difference between quark-and gluon-induced jets, the 'Multi-Jet Balance Fragmentation' represents the jet fragmentation uncertainty in the multi-jet p T balance and the 'Pile-up Rho Topology' takes into account the uncertainty in the density ρ of pile-up activity in a given event. 6 The decorrelations used at next-to-leading order (NLO) in Ref.
[17], include decorrelations of systematic uncertainties due to scale choice. These are not applied in the present NNLO analysis because scale uncertainties are much smaller. NNLO scale uncertainties for the inclusive jets are studied in Sect. 5.3.1. Correlations of systematic uncertainties between data sets are explained below. The luminosity uncertainties are considered fully correlated for all data sets at the same centreof-mass energy. For the data sets considered in this analysis, systematic uncertainties involving electron and muon measurements are small (<1%), whereas systematic uncertainties involving the jet measurements can be much larger (O(10%)). Moreover, the high-precision inclusive W, Z /γ * differential cross-section measurements at 7 TeV and the inclusive Z /γ * triple differential cross-section measurements at 8 TeV both had the electron and muon channel data combined and thus the identities of the systematic uncertainty sources are lost in the combination procedure such that the analysis cannot correlate them with muon and electron uncertainties in other data sets (or between the inclusive Z /γ * measurements at 7 and 8 TeV). Thus, correlations of lepton uncertainties between these inclusive W and Z measurements and the other data sets are neglected. This is not a significant limitation in the study of the effect of correlations between data sets since the lepton uncertainties are small compared to the jet uncertainties, as shown in the recent study of the impact of V + jets data on PDF fits [11]. This study reverted to the use of the separate electron and muon channel data for the inclusive W, Z /γ * data at 7 TeV in order to correlate the electron systematic uncertainties with those of the W + jets data and the Z + jets data at 8 TeV in the electron channel. These studies produce PDFs which are barely different from those in which the electron/muon combined data are used and lepton systematic uncertainty correlations are not applied between the data sets. Indeed, in Sect. 5.2 of the present analysis, it is shown that the impact of the V + jets data within the present analysis, which uses combined electron/muon data for both inclusive W, Z /γ * data at 7 TeV and inclusive Z /γ * data at 8 TeV, is very similar to the impact of these data in Ref. [11], where uncombined electron and muon data are used. Thus, the use of the more accurate combined electron and muon data is preferred.
In this paper, the correlations of the much larger jetmeasurement uncertainties are considered. The W + jets data and the Z + jets data at 8 TeV have common sources of jet systematic uncertainties, which are listed in Table 2. These sources are considered 100% correlated. Since the tt cross sections at both 8 and 13 TeV consider data in the lepton + jets channel, there are some common sources of systematic uncertainty, due to the jet measurements, between these data and the V + jets data. These are also listed in Table 2 and are considered 100% correlated between these data sets. This table also lists correlations of some sources of smaller systematic uncertainties, in the muon measurements and diboson, single-top and Z + jets backgrounds, which can be correlated between the tt measurements. It should be noted that detailed correlations between the tt data at 8 TeV in the dilepton channel and other data sets cannot be applied because the correlations for the dilepton channel are supplied as a covariance matrix and the separate individual sources of uncertainty cannot be separated. This does not have a significant effect on the fit since the dilepton data itself has only a small impact on the fit, as seen in Ref. [10].
Correlations of jet systematic uncertainties between the inclusive jet data and the V + jets data and tt data in the lepton + jets channel have been identified and are listed in Table 2. There is a choice to be made in the implementation of these inter-data-set correlations. The JES 'Flavour Response' and 'Pile-up Rho Topology' are part of the Jet Decorrelation Scenario 2 chosen for the central fit, and thus they have been split into three components according to jet rapidity and p T [17]. Consequently, there are no longer corresponding systematic sources in the V + jets data and tt data. The choice made for the central fit is to correlate only the remaining six sources (in the right-hand column of Table 2) between the data sets. The alternative choice of correlating these two systematic sources with the other data sets but not including them in the jet decorrelation scenario was also investigated. The effect on the resulting PDFs is negligible.
There is a further caveat on how the correlations between the inclusive jet data set and the V + jets and tt data sets are applied, namely that the last two data sets have radius R = 0.4 jets, and hence the systematic uncertainties may not be fully correlated. Checks were made using 100% correlation and no correlation, yielding little difference between the resultant PDFs. For the central fit a correlation of 100% is used.
The systematic uncertainties of the inclusive jet data at different beam energies are correlated with each other, but understanding these correlations in detail is non-trivial. In the present study, these data sets are fitted separately and results are compared. As already stated the data at 8 TeV are used for the central fit.
The measurement of the direct-photon production ratio already considered correlations between the data at 8 TeV and 13 TeV. The photon energy scale is the largest correlated systematic uncertainty between the two measurements. There are no further important correlations with the other data sets. The luminosity uncertainties of the data at 8 TeV and 13 TeV are not combined for the present study. Instead, the 8 TeV luminosity is correlated with that of the other 8 TeV data sets and the 13 TeV luminosity is correlated with that of the other 13 TeV data sets.

Table 2
Systematic uncertainties that are correlated between the W + jets data at 8 TeV, Z + jets data at 8 TeV, tt lepton + jets data at 8 TeV, tt lepton + jets data at 13 TeV and inclusive jets data at 8 and 13 TeV are listed. The names of the systematic uncertainties are those found in the HEPData entries [31]. Entries in the same row are taken as 100% correlated for the V + jets and tt lepton + jets data, which all have jet radius R = 0.4. Different degrees of correlation are considered for the inclusive jet data at R = 0.6, because of the differing choice of jet radius. Where entries are omitted, that systematic uncertainty does not exist for that data set (denoted by '-'). The luminosity uncertainty of data sets at the same centre-of-mass energy are also fully correlated. The JES 'Flavour Response' and JES 'Pile-up Rho topology' are considered fully correlated with other data sets only for cross-checks. They are not correlated for the central fit because they are part of the Decorrelation Scenario 2 which is applied to the inclusive jet measurements, as explained in the text. For this reason they are marked with the symbol * Systematic uncertainty The present analysis uses the xFitter framework [32][33][34]. This program interfaces to theoretical calculations directly or uses fast interpolation grids to make theoretical predictions for the considered processes. The program MINUIT [35] is used for the minimisation. Each step is cross-checked with an independent fit program [36]. This section describes how these predictions are obtained for each data set. For the DIS processes the light-quark coefficient functions are calculated to NNLO in QCD theory as implemented in QCDNUM [37]. The contributions of charm and bottom quarks are calculated in the general-mass variable-flavournumber scheme of Refs. [38,39], known as the optimised TRVFN scheme.
The renormalisation and factorisation scales for the DIS processes are taken to be the conventional choices, μ r = μ f = Q 2 , where Q 2 is the negative four-momentum transfer squared as already stated. For the DIS data electroweak (EW) effects are already unfolded to leading order, such that LO-EW corrections need not be applied, apart from the running of α. NLO-EW corrections are not well defined and no such corrections are applied to the DIS data.
In the present analysis, the photon PDF within the proton is not accounted for and thus data sets with substantial sensitivity to the photon PDF, such as very high-mass Drell-Yan data, are excluded. It is estimated [40] that the photon takes only ∼ 0.3% of the momentum of the proton in our kinematic range.
The DIS processes are the only ones for which fast NNLO QCD predictions can be made analytically, such that they can be used in an iterative fit. For the LHC processes, NNLO calculations are too time-consuming and thus fast interpolation grids are used. However, currently only the tt production processes in the lepton + jets channel have grids available for NNLO calculations. For the remainder of the data sets the grids are available only at NLO and K -factors are used to correct the QCD predictions for the differential cross sections from NLO to NNLO. These K -factors are calculated from the ratio of the NNLO to NLO cross sections, with the same cuts as the data, using a fixed input PDF. These Kfactors have very little dependence on the choice of input PDF for this calculation, and iteration using the output PDF of the fit is not necessary. The calculations used to construct the interpolation grids are usually to leading order (LO) in the EW part of the calculation. Correction from LO to NLO in EW predictions is also applied by a K -factor technique. For some data sets, this is applied together with the QCD predictions' correction from NLO to NNLO while for others it is a separate calculation. Details for each data set are given below and summarised in Table 3. For data sets measuring Ref. [64] final-state jets, there are also non-perturbative corrections to account for hadronisation. These are also applied using a K -factor technique as specified below. 7 The W and Z inclusive cross sections at 7 TeV are calculated at fixed order, to NNLO in QCD theory and to NLO in EW theory, as described in Ref. [9]. The results obtained from DYNNLO 1.5 [41,42] and FEWZ 3.1.b2 [43,44] are compared. A small difference between the NNLO predictions of FEWZ and DYNNLO of up to ∼1% is observed for the W -and Z -peak data. The sensitivity of NNLO programs to the fiducial cuts applied to the data was highlighted again recently [23]. This is considered further in Sect. 3.2. The scales for the Drell-Yan (DY) processes are taken to be the decay dilepton invariant mass appropriate to the centre of the mass bin in the NC case and the W -boson mass in the CC case. The xFitter package uses the APPLgrid code [45] interfaced to the MCFM program [46,47] for fast calculation of the differential W and Z /γ * boson cross sections at NLO in QCD theory and LO in EW theory, and a K -factor technique is used to correct the NLO QCD predictions to NNLO and the LO EW predictions to NLO. These K -factors are within 1-2% of unity for the W ± and Z /γ * in the c-c dilepton rapidity region and within ∼4% of unity in the c-f dilepton rapidity region. The high-mass sideband of Z /γ * production is also subject to background from photon-induced dilepton production, which was estimated using the MRST2004qed photon PDF [48] and subtracted from the data. Compared to the signal, the size of this background is ∼(1.5 ± 0.5)%.
The QCD predictions for the triple differential distributions of Z /γ * production at 8 TeV are made to NLO by MCFM interfaced to APPLgrid, and K -factors are applied for NNLO QCD and NLO EW effects using NNLOjet. These K -factors are within ∼3% of unity. The renormalisation and factorisation scales for these data are taken to be the decay dilepton invariant mass appropriate to the centre of the mass bin. For the present study, Particle Data Group (PDG) values [49] are used for the electroweak parameters in the G μ scheme, in particular the value of sin 2 θ W is sin 2 θ W = 0.23127.
The QCD predictions for the W cross sections at 8 TeV are calculated to NLO and the EW predictions to LO using MG5_aMC@NLO 2.6.4 interfaced to APPLgrid via aMCfast 1.3.0, and the NNLO QCD + NLO EW K -factors are calculated using DYNNLO 1.5. These K -factors are within 1%-2% of unity. The renormalisation and factorisation scales for these data are taken to be the W -boson mass.
The predictions for W + jets production at 8 TeV are obtained at fixed order, up to NNLO in QCD theory and at LO in EW theory, using the N jetti program [50]. Out-puts from the APPLgrid code are used for fast calculation at NLO in QCD theory, and K -factors from the aforementioned N jetti prediction are used to correct this calculation to NNLO. The renormalisation and factorisation scales are set to m 2 W + ( p jet T ) 2 , where the second term in the square root is a sum over the transverse momentum of each jet. Additionally, all non-perturbative corrections, such as for hadronisation effects, are included in the K -factors. Further K -factors are applied for NLO EW corrections as computed by the authors of Sherpa [51]. As in the previous ATLAS analysis of V + jets data [11], the lowest p jet T bin, p jet T < 25 GeV, is not used, since the calculation is effectively only at NLO for such low p T .
The NNLO QCD predictions for Z + jets production at 8 TeV were calculated by the authors of Ref. [52]. The renormalisation and factorisation scales are set to μ r = μ f = 1 2 m 2 + p 2 T, + p T,partons where m is the invariant mass of the electron pair, p T, is the transverse momentum of the electron pair and p T,partons is the sum of the transverse momenta of the outgoing partons. Four sets of K -factors are provided: the first corrects the NLO QCD predictions to NNLO, the second gives corrections for non-perturbative effects, the third corrects from LO to NLO in QED and the fourth applies NLO EW corrections (excluding QED radiation) as computed by Sherpa 2.2.10 . The K -factors for both W + jets and Z + jets production are typically within 10% of unity. Uncertainties in the non-perturbative corrections are supplied for the Z + jets predictions and these are applied as correlated systematic uncertainties. Their impact is very small. Such uncertainties are not supplied for the W + jets predictions.
The NNLO QCD predictions for tt production at 8 TeV were calculated by the authors of Ref. [53] and are available in the form of fast interpolation grids, APPLgrid [45] or fastNLO [54,55], for the data in the lepton + jets channel. The predictions for m tt are given for renormalisation and factorisation scales equal to H T /4, where H T = 2 , and the predictions for p t T are given for scales equal to m T /2, where m T = m 2 t + ( p t T ) 2 and m t = 173.3 GeV is the pole mass. Non-perturbative corrections and their uncertainties are supplied with the data and these uncertainties are applied as correlated systematic uncertainties. For the dilepton channel, APPLgrid is interfaced to MCFM to produce NLO grids, and a K -factor technique is used to correct NLO predictions to NNLO, using K -factors from Ref. [28]. The scales used for the predictions are equal to H T /4. The NNLO/NLO K -factors are ∼7% above unity and constant for y tt . The calculations are for zero width of the top mass. NLO EW corrections for the spectra are also considered, using the additional K -factors given in Ref. [56].
For the 8 TeV data, these corrections can be up to 1% for y tt and y t , up to 2% for m tt , and up to 4% for p t T . The NNLO predictions for tt production at 13 TeV have been calculated by the authors of Ref. [53] in the form of fast interpolation grids [45,54,55] for the data in the lepton + jets channel [57]. The predictions for m tt , y t and y b tt are given for renormalisation and factorisation scales equal to H T /4 and the predictions for p t T are given for scales equal to m T /2, just as for the 8 TeV tt data. The only difference is that at 13 TeV the value m t = 172.5 GeV is used for the top-quark mass, because this is the value used for the interpolation grids. The small difference between the masses used for the central fit for 8 and 13 TeV tt data is studied as part of the model uncertainties and found to be negligible (see Sect. 5.3.1). NLO EW corrections for the spectra are also considered, using K -factors taken from Ref. [56]. For the 13 TeV data, these corrections can be up to 1% for y tt , up to 0.5% for y t , up to 2.5% for m tt , and up to 5% for p t T . Non-perturbative corrections and their uncertainties are again supplied with the data and these uncertainties are applied as correlated systematic uncertainties.
The NLO QCD predictions for direct photon production were taken from MCFM interfaced to APPLgrid, and K -factors from Ref. [58] were applied to obtain NNLO QCD predictions. Resummation of electroweak effects is also applied as in Ref. [59]. The corrections from these K -factors are 2%-15% and 0.5%-10% for the separate 8 and 13 TeV data sets, respectively, but they almost cancel out in the ratio, leaving corrections of less than 2%. Finally, a photon isolation criterion is applied to the predictions to follow that applied to the data as closely as possible (see Ref. [60]). The renormalisation and factorisation scales are set to E γ T . The NLO QCD predictions for inclusive jet production at 7, 8 and 13 TeV are made with NLOjet++ interfaced to APPLgrid. The renormalisation and factorisation scale choices μ r = μ f = p max T and μ r = μ f = p jet T are both available. For the 7 and 8 TeV data, predictions are available for jet radii R = 0.4 and 0.6, while for the 13 TeV data they are available for R = 0.4 only, matching the data. The K -factors to convert NLO QCD predictions to NNLO have been calculated using NNLOjet [61]. These K -factors are rather different for the two scale choices. For the 8 TeV data and scale p max T they range from ∼+10% at low p T to a few percent at high p T for low |y jet |, but the highp T corrections can become slightly negative at high |y jet |. For the scale p jet T , they range from ∼−5% at low p T to a few percent positive at high p T for low |y jet |, but become increasingly negative for all p T at high |y jet |. The K -factors for the 7 TeV data are very similar, and those for the 13 TeV data follow the same trends while being a bit larger. The choice of scale for the central fit is μ r = μ f = p jet T , since this is preferred theoretically [29], but the alternative scale choice μ r = μ f = p max T and other scale variations are considered in Sect. 5.3.1.
The precision of the K -factors, in the predictions for inclusive jet data, depends on the statistical precision of the NNLOjet calculations which evaluated the NLO and NNLO cross sections. For the 8 TeV inclusive jet data, the resulting K -factors suffer from some statistical fluctuations that can be as large as ∼1%. 8 For the central fit, these K -factors are smoothed and the estimated uncertainty in this procedure is applied as being 60% correlated and 40% uncorrelated between p T bins within the same rapidity bin. Alternative treatments are considered in Sect. 5.3.1.
The QCD inclusive jet predictions are corrected for NLO electroweak effects as detailed in Ref. [17]. This correction can reach more than 10% for the highest p jet T in the lowest |y jet | bin, but decreases rapidly as |y jet | increases. It is less than 3% for |y jet | > 1.0.
In order to compare fixed-order QCD calculations with the measured inclusive jet cross sections, corrections for nonperturbative effects must also be applied. These are derived using LO Monte Carlo event generators. There are significant differences between the corrections derived with different event generators, Pythia [62] and Herwig [63], and different sets of tuned parameter values. The corrections can be up to ∼10% at low p jet T . They are different for R = 0.4 and R = 0.6 due to differing interplay of the underlying-event and hadronisation corrections, and for this reason fits to data with both jet radii are investigated. Values of the non-perturbative correction are taken at the centre of the uncertainty band, which is constructed as the envelope of all such corrections considered, and the systematic uncertainty of each correction value, correlated in p jet T and y jet , is taken from the spread of the band (see Ref. [17] for details).
Fits were performed using 7, 8 and 13 TeV inclusive jet data separately, using both choices for the jet radius, R = 0.4 and R = 0.6, when available, and both choices for the scales, p max T and p jet T , but the final choice for the central fit is 8 TeV jets with R = 0.6 and scales μ r = μ f = p jet T , as discussed in Sect. 5.3.1. Table 3 details the code used for NNLO/NLO QCD corrections and NLO/LO EW corrections to ATLAS data in the QCD fit, for ease of reference.

Scale uncertainties and sensitivity to NNLO code
For some of the ATLAS data sets considered, the precision of the data set is so high that it is comparable to the size of the NNLO scale uncertainties. This is the case for the ATLAS W and Z /γ * inclusive data sets at both 7 and 8 TeV, for which the total experimental uncertainty and the scale uncertainties both approach ∼0.5%. In this case, the scale uncertainties are considered as additional theoretical uncertainties which are added to the χ 2 calculation in the same way as experimental systematic uncertainties (see Sect. 4). These scale uncertainties are evaluated as follows. The K -factors are evaluated for separate changes of the renormalisation and factorisation scales by factors of 2 and 0.5. The magnitude of the K -factor difference is symmetrised as )/2 and its sign is preserved as positive if the upward variation of μ r or μ f makes the K -factor increase and negative if it makes the K -factor decrease. For these renormalisation and factorisation scale changes, the ratios of the signed K -factor changes to the K -factors for the nominal scales are calculated to obtain the fractional scale uncertainties. These can then be used in the fit like any other systematic uncertainty, as suggested in Ref. [65].
Due to the similarity of the W and Z processes, both the renormalisation scale and factorisation scale are considered correlated within the W, Z data sets at 7 TeV and between the W and Z data sets at 8 TeV. They are also considered to be correlated between the W and Z data sets at 7 and 8 TeV for the central fit. Studies of alternative approaches for the scale uncertainties are presented in Appendix A.
Scale uncertainties are accounted for in the central fit only for the W and Z inclusive data sets, because the scale uncertainties are comparable to the experimental uncertainties only in this case. For the other ATLAS data sets the experimental uncertainties are larger and scale uncertainties are treated by repeating the fit with varied scales as discussed in Sect. 5.3.1.
As remarked earlier, there is a small difference of up to ∼1% between the NNLO predictions of FEWZ and DYNNLO. A fractional uncertainty can be defined as the ratio of the signed difference between the FEWZ and DYNNLO predictions to the FEWZ prediction, and applied similarly to the scale uncertainties. The inclusion of such a theoretical uncertainty makes no further difference to the fits, provided the scale uncertainties are applied.
Recently, next-to-next-to-next-to-leading-order (N 3 LO) calculations for the total cross section of the Drell-Yan process have become available [66]. It is notable that the N 3 LO result does not quite lie within the scale uncertainties of the NNLO result. The N 3 LO correction to the central value is ∼2% compared to NNLO scale uncertainties of ∼0.5%. Full accounting for N 3 LO corrections, as well as p T resummation effects which affect the prediction of W and Z fiducial cross sections at the level of ∼0.5 − 1% [23], are beyond the scope of the current fixed-order-NNLO QCD fit.

Fit methodology
The DGLAP [67][68][69] evolution equations yield the PDFs at any value of Q 2 given that they are parameterised as a func-tion of x at a starting scale Q 2 0 . In the present analysis, this scale is chosen to be Q 2 0 = 1.9 GeV 2 so that it is below the charm mass threshold m 2 c . A recent combination of HERA heavy-quark data [70] has stimulated a re-analysis of the optimal values of the heavy-quark masses and their variations as used in the optimised TRVFN scheme [39]. The resulting values are m c = 1.41 GeV and m b = 4.2 GeV [71], and these are used for the central fit as applied to the DIS data. A minimum Q 2 cut of Q 2 min = 10.0 GeV 2 is imposed on the HERA data. All these assumptions are varied in the consideration of model uncertainties, as shown in Sect. 5.3.1. The strong coupling constant is fixed to α s (m Z ) = 0.118.
The quark distributions at the starting scale are represented by the generic form The parameterised quark distributions, xq i , are chosen to be the valence quark distributions (xu v , xd v ) and the light anti-quark distributions (xū, xd, xs). The gluon distribution is parameterised with the more flexible form where C g is set to 25 to suppress negative contributions at high x. The parameters A u v and A d v are fixed using the valence-quark number sum rule, and A g is fixed using the momentum sum rule. The normalisation and slope parameters, A and B, of the light-quark sea, xū, xd and xs, are all independent of each other, such that there is no constraint on xd − xū, or ons/(d +ū), either in shape or in normalisation as x → 0. By default it is assumed that xs = xs. The D, E and F terms in the polynomial expansion P i (x) are used only if required by the data, following the procedure described in Ref. [33], whereby parameters are added only if the χ 2 of the fit decreases significantly. This leads to a 21-parameter fit with: PDFs. It has been established that adding further parameters beyond the point at which there is little further change in χ 2 (called the point of 'saturation' of the χ 2 ) only serves to fit noise in the data [72]. Thus, addition of further parameters to the central fit is not considered. However, parameterisation uncertainties are considered in Sect. 5.3.2.
A cross-check was made using Chebyshev polynomials rather than ordinary polynomials for the P i (x) terms. There is no improvement in χ 2 for a comparable 21-parameter fit using the Chebyshev polynomials and the extracted PDFs lie within the total uncertainty bands of the ordinary polynomial fit.
The level of agreement between the data and the predictions from a PDF parameterisation is quantified by the χ 2 per degree of freedom (χ 2 /NDF, for NDF degrees of freedom). The definition of the χ 2 is where D i represent the measured data, T i the corresponding theoretical prediction, δ i,uncor and δ i,stat are the uncorrelated systematic uncertainties and the statistical uncertainties of D i , and the correlated systematic uncertainties, described by γ i j , are accounted for using the nuisance parameters b j . The quantity C stat,uncor,ik is a covariance matrix for both the statistical and uncorrelated systematic uncertainties. Summations over i and k run over all data points and summation over j runs over all sources of correlated systematic uncertainty. For each data set, the first term gives the main contribution to the partial χ 2 of the data set and the second term is a small bias correction term, referred to as the log penalty, which arises because the diagonal term of the matrix, C, is given by with different weighting for statistical uncertainties and uncorrelated systematic uncertainties. This form of the χ 2 is used as standard in HERA and ATLAS PDF fits [5,9,10]. In reporting the χ 2 for the fits, the first and second terms together give the partial χ 2 of the data set. The third term gives the correlated χ 2 arising from the penalty for the nuisance parameters describing correlations of systematic uncertainties within and between data sets. The experimental uncertainties of the fit are first set using the usual tolerance, T = 1, where T 2 = χ 2 = 1. The use of an enhanced tolerance is considered in Sect. 6.

Results
In this section the ATLASpdf21 PDF set is presented. The impact of variations of the central choice of fit settings and parameterisation is discussed in Sect. 5.3. Table 4 gives the total χ 2 per degree of freedom, χ 2 /NDF, of the fit using all data sets and the χ 2 per data point, partial χ 2 /NDP for NDP data points, of each data set. The correlated terms (term 3) are shown separated into groups with common systematic correlations. In order to evaluate the separate contributions of the data sets to this correlated term, the fit is run with its final parameters fixed for each data set separately. These Table 4 χ 2 contributions for the all data sets entering the PDF fit. The partial χ 2 for the individual data sets are given with respect to the number of data points (NDP). They represent the addition of terms 1 and 2 in Eq. (1). The correlated terms (term 3) are shown separated into groups with common systematic correlations. The total value of the correlated term for these groups is also split into the addition of the separate contributions in the order in which they are given in the values follow the total correlated terms in brackets, in order of their appearance in the table. The quality of the fit to the HERA data is χ 2 /NDP = 1.14, comparable to that of HER-APDF2.0, so that there is no tension between the ATLAS data and HERA data. The quality of the fit to the ATLAS W, Z data is χ 2 /NDP = 1.44, and to the ATLAS V + jets, tt and inclusive jet data it is χ 2 /NDP = 1.40. The quality of fit to the ATLAS direct photon data is χ 2 /NDP = 0.7. These χ 2 values are comparable to those obtained by the global PDF fits for similar data sets, but indicate a need to consider the appropriate χ 2 tolerance of the fit. Both of these points are discussed further in Sect. 6.

Comparison of PDFs with and without inclusion of correlations between data sets
The ATLASpdf21 PDFs at Q 2 = 1.9 GeV 2 (the starting scale) are shown in Figs. 1 and 2. Only experimental uncertainties with tolerance T = 1 are shown for the comparisons throughout this section. Full uncertainties including model and parameterisation variations are considered for the ATLASpdf21 fit in Sect. 5.3. Figure 2 also shows the ratio R s = x(s +s)/x(ū +d) and the difference x(d −ū). In this section, all PDFs are presented with uncertainties evaluated using the standard χ 2 tolerance Enhanced tolerance is considered in Sect. 6. Also shown in these figures are the PDFs from a fit in which correlations of systematic uncertainties between the V + jets, tt (lepton + jets) and inclusive jets data sets are not applied, with the exception of the luminosities for the data at the same centre-of-mass energy. 9 The systematic uncertainties which are correlated are specified in Table 2 and their treatment for the central fit is discussed in Sect. 2.2. 9 The W + jets and Z + jets data sets are always considered as a single data set with correlations applied and these correlations are maintained in the present exercise. Consideration of the impact of correlations between the W + jets and Z + jets data sets was studied in Ref. [11].
The differences between the extracted PDFs with and without inter-data-set correlations are mostly in the d-type sector and can reach ∼20% for xd at large x. The PDF difference x(d −ū) and ratio R s , shown in the bottom half of Fig. 2, are particularly sensitive to the use of these correlations. It is important to examine these differences at a scale relevant for precision LHC physics and this is illustrated in Figs. 3 and 4 where the ratios of the PDFs with and without the correlations are shown at Q 2 = 10, 000 GeV 2 . Although these differences are not large compared to current experimental precision, they can nevertheless be important if O(1%) accuracy is to be a realistic goal for PDFs [73].
These plots also allow a comparison of the size of the current experimental uncertainty of the PDFs with and without inclusion of inter-data-set correlations. The sizes are generally similar but the uncertainty in the high-xd, strange and gluon PDFs is larger if such correlations are not taken into account.

Impact of each data set
In this section the impact of each data set is considered. Only experimental uncertainties with tolerance T = 1 are shown for these comparisons. Full uncertainties including model and parameterisation variations are considered for the ATLASpdf21 fit in Sect. 5.3. Figure 5 shows the ratio R s for the ATLASpdf21 fit and compared with a fit in which the inclusive W, Z data at 7 and 8 TeV are removed (left-hand plot), as well as to a fit in which only W, Z data at 8 TeV are removed (right-hand plot). It is clear that without W, Z inclusive data the ratio R s cannot be In contrast, the valence and gluon PDFs are still reasonably well determined without any W, Z data but the input of these data decreases their uncertainties significantly, as illustrated for the xd v and xg PDFs on the left-hand side of Fig. 6.

Impact of W, Z inclusive data
On the right-hand side of Fig. 6 the decrease in the uncertainties of the xd v and xg PDFs from removing only the W, Z data taken at 8 TeV is illustrated, showing that the major decrease comes from retaining the W, Z data taken at 7 TeV. However, the W, Z data taken at 8 TeV have a major role to play in ensuring that xū ∼ xd holds at low x, even though this constraint is not imposed. Without them, one observes xd < xū at low x, as seen in Fig. 7. These data also somewhat reduce the low-x strange distribution and harden the high-x strange distribution, while softening the high-x xd distribution, as also shown in Fig. 7.
There is mild tension between the W, Z data at 8 TeV and the W, Z data at 7 TeV. The partial χ 2 /NDP for the W, Z data at 7 TeV decreases from 68/55 to 50/55 if the W, Z data at 8 TeV are excluded from the fit, and the partial χ 2 /NDP for the W, Z data at 8 TeV decreases from 239/206   [74] indicate that higher-order corrections are likely to be larger than our current estimate of scale uncertainties. A further study of scale uncertainties is given in Appendix A.

Impact of V + jets data
The impact of the V + jets data is shown in Figs. 8 and 9.
There are significant changes in the xd and xs PDF shapes such that the high-x xs and xd PDFs become softer and harder, respectively, with the input of V + jets data. Because of the change in the xd shape, the difference x(d −ū) is also strongly affected. This is shown in Fig. 8. This figure also shows that, without the V + jets data, there is little information about the ratio R s at high x. The changes in Fig. 8 are large because the V + jets data resolve a double minimum in parameter space such that the fit now prefers a hard xd and soft xs at high x, whereas it previously had an additional minimum with xd ∼ xs at high x, which was marginally preferred. These results are similar to those already seen and fully explained in the ATLASepWZVjets20 PDF analysis [11] and the double minimum may be seen in Figure 5 of that paper. A fit with only HERA data, or with HERA plus ATLAS W, Z data at 7 TeV (ATLASepWZ20), prefers the minimum with xd ∼ xs at high x, but once ATLAS V + jets data at 8 TeV are added to the fit the double minimum with a hard xd and soft xs at high x is preferred and the previous minimum disappears. In Figure 3 of Ref. [11], it can also be seen that for ATLASepWZ20 (the fit without V + jets data) the full uncertainties, including model and parameterisation variations, are very large because they cover both minima, whereas the full uncertainties of the ATLASepWZVjets20 fit (including V + jets data) are much smaller because there is no double minimum. There is thus no strong inconsistency between the fits with or without the V + jets data because of the large uncertainties of the fit without the V + jets data. This also applies in the present analysis, but in Fig. 8, only experimental uncertainties are shown. The full uncertainties for the fit without V + jets data in our present analysis are of little interest and are not presented. Model and parametrisation uncertainties for the ATLASpdf21 fit will be discussed in Sect. 5.3. These V + jets data also effect a modest change in the xd v shape and the gluon PDF shape as shown in Fig. 9. The utype quarks are not strongly affected by these data and thus they are not shown. There is no tension between the V + jets data at 8 TeV and other data sets in the fit, since all data are fitted well at the minimum chosen by these V + jets data.

Impact of tt data
The impact of the tt data is shown in the top half of Fig. 10. The high-x gluon distribution is mildly softened when the tt data are added to the fit. This effect is opposite to the one observed in the ATLASepWZtop18 fit. This is because more data which harden the gluon PDF, in particular the V + jets and inclusive jet data, are included in the present fit. The more significant effect is in the uncertainties of the high-x gluon distribution, which are reduced. There is no significant tension between the tt data and the other data in the fit. Figure 10 (bottom half) also shows the impact of removing only the tt data at 13 TeV (left) or only the tt data at 8 TeV (right). It is clear that the data at 8 TeV have the stronger impact on the shape of the xg PDF but both data sets contribute to a modest reduction in the uncertainties. Bottom left: the shape ratio to a fit not including tt data at 13 TeV. Bottom right: the shape ratio to a fit not including tt data at 8 TeV

Impact of photon data and inclusive jet data
There is little impact from the addition of the direct-photon production ratio data apart from a marginal softening of the high-x gluon distribution as shown in Fig. 11 (left). However, it is notable that these data can now be well fitted at NNLO in QCD, given that they have been excluded from PDF fits for the last 20 years because of poor fits to lower-energy data [60,75]. There is minimal tension with other data sets.
The principal impact of the inclusive jet data is on the gluon PDF. The main effect is a considerable decrease in high-x gluon uncertainties, with a mild hardening of the gluon PDF at high x, as shown in Fig. 11 (right). There is minimal tension with other data sets. As explained earlier in Sect. 2, the central ATLASpdf21 fit includes only the inclusive jet data at 8 TeV, with R = 0.6. The full uncertainties of the fit, see Sect. 5.3.1, include the difference between the choices R = 0.6 and R = 0.4. The impact of using inclusive jet data at 7 TeV and at 13 TeV, instead of at 8 TeV, is explored in Appendix B.

Model, theoretical and parameterisation uncertainties
The consideration of additional uncertainties affecting the PDFs is presented in this section. These are classified and

Model and theoretical uncertainties
The class of model uncertainties includes effects due to variations of the heavy-quark masses input to the TRVFN heavyquark-mass scheme for the inclusive DIS calculations, the minimum Q 2 cut on the HERA inclusive DIS data and the value of the starting scale for evolution. The minimum Q 2 cut was varied in the range 7.5 < Q 2 min < 12.5 GeV 2 and the starting scale was varied in the range 1.6 < Q 2 0 < 2.2 GeV 2 . In the inclusive DIS calculations, the heavy-quark masses were varied in the ranges 1.37 < m c < 1.45 GeV and 4.1 < m b < 4.3 GeV [71]. The variations of m c and Q 2 0 are coupled since the requirement Q 2 0 < m 2 c must be met. For this reason the upward variation of m c and the downward variation of Q 2 0 are symmetrised. Figure 12 illustrates the effect of these variations on the gluon distribution since this is the PDF most sensitive to these changes. The impact of the choice of m c and m b is modest. The effect of the variation of the Q 2 min cut is larger but still within the experimental uncertainties. The largest of these uncertainties is due to the choice of Q 2 0 , which gives a gluon PDF differing from the central fit by ∼2σ for x ∼ 0. A further potential model uncertainty comes from the treatment of the jet systematic uncertainties. Alternative decorrelation scenarios are considered as follows: the alternative option in which the JES Flavour Response is split into only two components rather than three, called "Decorrelation scenario 1"; complete decorrelation of the Jet Flavour Response between rapidity bins, called "FR decorrelated"; and no decorrelation, called "Fully correlated". Table 5 gives the total χ 2 /NDP for the jets (including all three terms of Eq. (1)) for alternative correlation scenarios, showing that the difference between full correlations and various choices of decorrelation can have a significant effect on the χ 2 .
However, the difference between the ATLASpdf21 gluon PDFs obtained using jet data with these different correlation scenarios is relatively small compared to the model uncertainties considered above, e.g. the variation of Q 2 0 . There are also no changes in the PDF uncertainties as a result of using different correlation scenarios. Hence, these variations are not considered as a source of significant uncertainty.
A further source of uncertainty for the inclusive jet data comes from the choice of jet radius R. The difference between the PDFs due to a different choice of jet radius for the 8 TeV jets is shown in Fig. 12. The only significant change is in the gluon PDF. This small difference between the PDFs   [29], this paper also inputs tt data from the lepton + jets channel and V + jets data, and for these data sets only R = 0.4 jets are available.
The effect of using jet production data at different centreof-mass energies of 7, 8 and 13 TeV is shown in Appendix B. This is not considered as an additional uncertainty but is presented as a cross-check that the choice of centre-of-mass energy for the jet production data does not have a significant influence on the PDFs extracted.  All model uncertainties are added in quadrature to form a total model uncertainty.
Theoretical uncertainties include the scale uncertainties of the predictions. The largest of these are the scale uncertainties for the inclusive W and Z data which are considered in detail in Sect. 3.2. Since these are treated as correlated systematic uncertainties in the fit χ 2 they are actually already included in what has been labelled as the experimental uncertainty of the fit.
The effect of scale uncertainties for the V + jets data was studied in Ref. [11] and was found to be negligible. The effect of scale uncertainties for the tt lepton + jets data was studied for the present paper and was also found to be negligible. The direct-photon production ratio data are relatively insensitive to scale variations because 100% correlation of the scale uncertainties is assumed between the 8 and 13 TeV data. Since the fit is relatively insensitive to these data, alternative assumptions for these correlations were not pursued.
The scale uncertainty for the inclusive jets at 8 TeV requires further consideration. These K -factors were supplied for two choices of scale: μ r = μ f = p max T or μ r = μ f = p jet T . A scale related to p jet T is favoured over p max T [29], as already mentioned. However, this difference in scale choice is now investigated. A comparison of the gluon PDFs for these two scales is shown in Fig. 13 (left), since this is the PDF most sensitive to this change. It can be seen that there is no significant difference between the resulting gluon PDFs. Changes in the nuisance parameter values for the correlated systematic uncertainties absorb the change in the predictions for the two scales. The χ 2 values for these fits are given in Table 6. The only significant change in χ 2 comes from the inclusive jets, not from other data sets.
Next, scale variations μ r = μ f = 2 p jet T and μ r = μ f = p jet T /2 are considered. The effect on the gluon PDF is shown in Fig. 13 (right). The χ 2 values for these fits are given in Table 6. Again, the only significant change in χ 2 comes from the inclusive jets, not from other data sets.
Finally, a cross-check is performed in which the K -factors are not smoothed but their statistical uncertainties are used as an additional uncorrelated uncertainty. The χ 2 value for this fit is also given in Table 6. It is lower than that for smoothed K -factors because the statistical uncertainties of the unsmoothed K -factors are ∼1%. However, the resulting PDFs are very similar to those obtained using smoothed Kfactors. Since none of the scale variation effects produced significant changes in the PDFs, no further theoretical uncertainty is added for this source.
Thus the theoretical uncertainties considered so far are either already included in the experimental uncertainties of the fit, or they are negligible. It should be noted that the data are also sensitive to the value of α s (m Z ), which affects the shape of the gluon PDF. The correlation between α s (m Z ) and the gluon PDF shape is specified by the DGLAP formalism. A determination of α s (m Z ) is beyond the scope of the current paper, since a correct NNLO treatment requires variation of the K -factor calculations with α s (m Z ), or direct NNLO grids for all processes. This is left for future work. The con-ventional value α s (m Z ) = 0.118 is used, in line with the value used by the global fitting groups, CT [76], MSHT [77] and NNPDF [78].

Parameterisation uncertainties
The optimal number of parameters was determined by 'saturation' of the χ 2 as explained in Sect. 4. However, the effect on the PDFs of adding extra parameters is investigated. Although there is no significant further decrease in χ 2 , some small shape changes are observed when adding an F u v term to the u-valence PDF and/or a Dd term to the xd PDF. Figure 14 shows the impact of adding both of these as free parameters on the xu v , xd v , xū and xd PDFs, which are the PDFs which show the largest variations. The total    Table 7.

Imposing a maximum Q 2 cut on the input data
A further theoretical uncertainty stems from the possibility that subtle effects of BSM physics may be present in highscale data. If these are included in a PDF fit, then the estimates of background to further higher-scale physics from new data could be distorted. A fit is performed in which the maximum scale for events accepted from each process is 500 GeV, or Q 2 = 250, 000 GeV 2 . This removes 82 points from the fit: 63 from the inclusive jets data, 6 from the V + jets data, 11 from the direct-photon ratio data, and one each from the tt p T spectra at 8 and 13 TeV. The χ 2 /NDF value of the fit improves from 2010/1620 to 1870/1538, with the largest improvement coming from the inclusive jets, for which the χ 2 /NDP value improves from 248/171 to 136/108. Figure 15 compares the PDFs at the scale of Q 2 = 10, 000 GeV 2 , with a linear scale in x so as to emphasise the differences at high x. The shapes of the PDFs and their uncertainties are not strongly affected by the cut, and a maximum Q 2 cut is not considered further.

Combination of uncertainties
The PDFs are now presented with experimental uncertainties from the fit and with additional model and parameterisation uncertainties as discussed in this section. These model and parameterisation uncertainties are added in quadrature to the experimental uncertainty. It should be noted that the scale uncertainties for the inclusive W, Z data appear as part of the experimental uncertainty, whereas the scale uncertainties for other data sets are negligible. Figure 16 shows the valence distributions and the xū and xd sea distributions, and Fig. 17 shows the xs and xg distributions and the x(d −ū) distribution. The experimental uncertainties are much larger for the d-type sector than for the u-type sector. The larger fractional uncertainty of the xs PDF in comparison with the xd PDF reflects its smaller absolute size as x → 1. The gluon PDF is well determined for 0.01 < x < 0.3 but not at low or high x. The model uncertainties in all PDFs are moderate, being largest in the gluon PDF. The parameterisation uncertainty of xd affects both xd and xd v since xd sea = xd and the sea and valence d-quarks act together. The xd v distribution is affected at both low and high x because of the valence-quark number sum rule. Similarly, the parameterisation uncertainty of xu v is seen at both low and high x. The parameterisation uncertainty of the gluon PDF comes from the other PDFs through the momentum sum rule. Figure 18 shows the R s distribution and its uncertainties as a function of x, at Q 2 = 1.9 GeV 2 . Figure 19 shows the value of R s , at Q 2 = 1.9 GeV 2 and x = 0.023 and at Q 2 = m 2 Z and x = 0.013, compared with other PDFs. For the ATLASpdf21 fit, the value of R s at this low scale has decreased relative to previous ATLAS fits. The ATLASpdf21 central value of R s ∼ 0.8 is due firstly to the input of the V + jets data, as seen in the change from ATLASepWZ16 (R s ∼ 1.15) to ATLASepWZVjets20 (R s ∼ 1.0), secondly to the new freedom in the low-x parameterisation, whereby the ATLASpdf21 central parameterisation corresponds to the low end of the uncertainty band of the ATLASepWZV-jets20 result (R s ∼ 0.85), and thirdly to the input of the W, Z data at 8 TeV. On the other hand, R s has increased for some of the other PDFs, e.g. MSHT20 [77] compared with MMHT14 [79], and CT18A [76] compared with CT14 [80]. This increase is due to the input of the ATLAS precision inclusive W, Z data at 7 TeV. Thus, the recent evaluations of this ratio of strangeness to light quarks are now in fair agreement. However, the strangeness is still substantially larger at low x than the older values which estimated R s ∼ 0.5. This illustration of the ratio R s at x = 0.023 should be supplemented by a comparison of the full shape of the R s distribution in Sect. 6.  The experimental uncertainties presented for the fitted PDFs were evaluated by the usual criterion of parameter setting, χ 2 = 1. However, global PDF fitting groups such as CT and MSHT use enhanced χ 2 tolerances, χ 2 = T 2 , where T 2 is a dynamic tolerance ranging from 10 to 100. 10 The justification for this comes from consideration of the consistency, or relative inconsistency, of input data sets [84]. In the present case, whereas the HERA data is a single consistent set of data with common correlated uncertainties, the ATLAS data are more disparate and a study to find an appropriate tolerance was performed.
One way to estimate a suitable tolerance for a fit including the ATLAS data is to consider the dynamic tolerance proce- 10 NNPDF uses a completely different method for estimation of PDF uncertainties, but the magnitude of their uncertainties is comparable to those of MSHT and CT. dure of the MSHT group, as first introduced in the MSTW paper [84]. Briefly restated, the method is as follows. The inverse of the Hessian matrix of the fit is diagonalised to give the eigenvector combinations of the parameters of the fit. The eigenvalues are the squares of the uncertainties in these eigenvector combinations of the parameters. The parameter displacements from the global minimum can then be expanded in a basis of rescaled eigenvectors, e ik = √ λ k v ik , where λ k is the k-th eigenvalue and v ik is the i-th component of the kth orthonormal eigenvector. Pairs of eigenvector PDFs have parameters given by a i (k ± ) = a 0 i ± te ik , where a 0 i are the central parameters, and t = 1 should give the usual change in χ 2 of T 2 = χ 2 = 1. In the quadratic approximation, t = T , and this is reasonably well obeyed for the present ATLASpdf21 21-parameter fit, at least up to t = 5.
Having determined the eigenvectors, each of these is treated as a different 'hypothesis' for a theory which can fit the data, and each of them is checked to see if it can describe each of the data sets within the hypothesis-testing criterion, χ 2 n N + √ 2N , as t is increased and decreased along the eigenvector, where χ 2 n is the χ 2 for a particular data set and N is the number of data points for that set. As soon as χ 2  The lower panels illustrate the fractional uncertainties goes above this limit, this sets the maximum value of t, and hence of T , that can be tolerated for this eigenvector in this direction, for this data set. Each eigenvector will have a most constraining data set, for each direction, and in principle the values of the tolerances T for the most constraining data set can be different for each eigenvector and each direction, hence the term 'dynamic'.
. This method is made more sophisticated in a number of ways. Firstly, the hypothesis-testing criterion chosen is not exactly the simple χ 2 n < N + √ 2N , but is χ 2 n < ξ 68 , where ξ 68 is given by is the usual probability density function for the χ 2 distribution with N degrees of freedom. Thus each data set should be described within its 68% confidence level. Secondly, the criterion imposed is not actually χ 2 n < ξ 68 , but χ 2 n < (χ 2 n,0 /ξ 50 ) ξ 68 , to allow for the fact that the central fit χ 2 n,0 for the data set may not be perfect, where perfect would imply χ 2 n,0 = ξ 50 . For the ATLASpdf21 fit the most constraining data sets for the eigenvectors are the W, Z data at 7 TeV or the Z /γ * triple differential cross-section data at 8 TeV (and very occa-   [83], ATLASepWZ16 [9] and ATLASepWZV-jets20 [11]. Left: R s at Q 2 = 1.9 GeV 2 and x = 0.023. Right: R s at Q 2 = m 2 Z and x = 0.013 4.2. These values are all fairly similar, so a single value can be used and, since the tolerance should not be set too large whereby a data set is far outside its 68% CL limit, this suggests the simple choice of T = 3.
In the above procedure to evaluate the dynamic tolerance, the fact that the data sets are not fitted perfectly was taken into account. One may estimate how much of the increased tolerance comes from this part of the procedure by increasing the uncertainties of each data set by χ 2 /NDP for those data, such that χ 2 /NDP ∼ 1 is achieved in the fit. This will clearly increase the uncertainties of the output PDFs. However, a naive application of this procedure would increase all the experimental uncertainties and this seems somewhat unnecessary for the statistical uncertainties, which are well known. Hence, only the systematic uncertainties are increased. Starting with the χ 2 /NDP scaling factors, the fit is iterated, adjusting these factors at each iteration, until a χ 2 /NDP ∼ 1 for each data set is achieved, After three iterations there is no longer any significant change and the scaling factors range from 1.0 to 1.6.
It is non-trivial that for these reweightings of the data the PDF shapes hardly differ from those of an unweighted fit. This shows that the fit is stable for modest reweighting of the input data sets. The uncertainties of the PDFs have increased, but only by a factor which corresponds to a tolerance of up to T ∼ 1.5. Thus the greater part of the dynamic tolerance estimate of T = 3 comes from the application of the hypothesistesting criterion to the χ 2 for the eigenvectors of the fit.
The tolerance T = 3 is applied for the final estimation of PDF uncertainty; however, ATLASpdf21 PDFs with both choices of tolerance, T = 1 and T = 3, will be made available on the LHAPDF PDF repository [85]. Having fixed the tolerance for the experimental uncertainties of the PDFs, the full uncertainties, including experimental, model and param-eterisation uncertainties are compared for the choices T = 3 and T = 1 in Fig. 20. When an enhanced tolerance such as this is applied to the experimental uncertainties, the model and parameterisation uncertainties are far less significant.
Comparisons of the differential cross-section measurements used as input to the fit with the predictions of ATLASpdf21 fit, with T = 3 and full uncertainties, are shown in Appendix C.

Comparison of ATLASpdf21 with global PDFs
In Figs. 21, 22 and 23, the ATLASpdf21 PDFs are compared with the HERAPDF2.0 [5], CT18, CT18A [76], NNPDF3.1 [78], MSHT20 [77] and ABMP16 [81] PDFs. All figures in this section show ATLASpdf21 PDFs with full uncertainties, such that experimental uncertainties are evaluated with T = 3 and model and parametrisation uncertainties are also included. It is observed that the addition of the ATLAS data sets to the HERA data brings the PDFs much closer to those of the global fits than to HERAPDF2.0, particularly for xd v , xd and xū. The xu v and xd v distributions of the ATLASpdf21 fit are in good agreement with CT18, CT18A, MSHT20 and ABMP16, whereas NNPDF3.1 is an outlier. The xū distribution agrees reasonably well with all the other PDFs. The xd distribution is in fair agreement with CT, MSHT and NNPDF, with ABMP presenting as somewhat of an outlier. 11 However, it should be noted that xd from the ATLASpdf21 fit is somewhat higher at high x. This is further illustrated by the x(d −ū) distribution, which changes from negative at high x when fitting only HERA data in HERAPDF2.0, to positive at high x for ATLASpdf21, in agreement with the global PDFs. However, the ATLASpdf21 x(d−ū) distribution is more positive than the global PDFs for x > 0.3, in better agreement with the newer E906 Drell-Yan data [86], rather than the E866 data which is in the global fits. This is explored further in Appendix D. The central value of the xs distribution is larger than those in NNPDF3.1 and CT18 for x ∼ 0.02, but is in good agreement with CT18A and MSHT20. The gluon distribution agrees best with NNPDF3.1, but is in reasonable agreement with the other global PDFs. 12 The distributions of R s for all the PDFs illustrated are now in broad agreement but there are differences in detail. The uncertainties of HERAPDF2.0 and CT18 are larger than the others because they do not include ATLAS 12 The ABMP16 gluon PDF corresponds to α s (m Z ) = 0.1145, a lower value than in the other PDFs illustrated, which all use α s (m Z ) = 0.118. inclusive W, Z data at 7 TeV. 13 The uncertainties of ABMP16 are smaller for the usual reason that they use tolerance T = 1. The input of ATLAS inclusive W, Z data at 7 TeV is the only difference between the CT18 and CT18A analyses. After input of these data the CT18A R s PDF ratio has smaller uncertainties and a larger central value of R s at low x than CT18. CT18A is in good agreement with ATLASpdf21, for both central value and uncertainty, over the full x range illustrated. The PDF analyses of MSHT20 and NNPDF3.1 also use ATLAS inclusive W, Z data at 7 TeV and have a similar size of uncertainty and level of agreement with ATLASpdf21. In addition, the MSHT20 analysis uses ATLAS inclusive W, Z data at 8 TeV, just as the ATLASpdf21 analysis does. Note that NNPDF have updated their study of strangeness recently to NNPDF3.1_strange [83], which has a somewhat larger strangeness at low x than NNPDF3.1. The modern PDFs which use ATLAS inclusive W, Z data, including ATLASpdf21 itself, all agree that strangeness is not suppressed strongly at low x, but there is substantial suppression at high x.
In summary, the ATLASpdf21 fit now agrees with the global fits as well as they agree with each other. Thus ATLAS data seem able to replicate most of the features that the fixedtarget DIS and DY data plus the Tevatron data brought to the global PDFs, see Appendix D. Using only the HERA and ATLAS data allows a more rigorous treatment of correlated systematic uncertainties and, in particular, of correlations between data sets.
For the global fits considered here, the χ 2 values for the data sets included in the ATLASpdf21 fit are HER-APDF2.0: 2262, CT18: 2135, CT18A: 2133, MSHT20: 2218, NNPDF3.1: 2109, compared with ATLASpdf21: 2010. Although the global PDFs from CT, MSHT and NNPDF have more flexible parameterisations, the χ 2 value of the ATLASpdf21 fit is better for the data sets considered. This is a further indication that the ATLASpdf21 parameterisation is sufficiently flexible. Figure 24 shows the uncertainties of the ATLASpdf21 fit, with the enhanced tolerance T = 3, compared with CT18A and MSHT20 PDFs. These two PDFs are chosen because they both use the ATLAS W, Z inclusive cross-section data at 7 TeV and they both evaluate uncertainties in a similar way to the present analysis. In general, the uncertainties of CT18A are somewhat larger than those of MSHT20, even though both represent 68% CL uncertainties. For the xū and xd sea distributions at low x, the uncertainties of ATLASpdf21 lie between those of CT18A and MSHT20, becoming larger at high x, x > 0.1, reflecting the absence of fixed-target DY data. For the xs sea distribution there is a similar pattern, with the low x uncertainties of the ATLAS PDF being comparable to those of CT18A. The uncertainty of the ATLASpdf21 gluon distribution at x < 2 × 10 −3 is very similar to that of both CT18A and MSHT20, but the high x gluon uncertainty is somewhat larger for the ATLASpdf21 fit, reflecting the absence of Tevatron jet data. In the valence sector, the xu v uncertainties of the ATLASpdf21 fit are comparable to those of CT18A at low x, but larger at high x, reflecting the absence of Tevatron W, Z data. For xd v , the uncertainties are larger for all x except x ∼ 0.003-0.008, reflecting the fact that there is less information about the d-quark than the u-quark in the absence of deuterium target data. Appendix D shows comparisons between the ATLASpdf21 predictions and some of these older data sets which, while not fitted, are generally well described.
In order to focus more on the high x regime, which is important in searches for new physics, the comparisons between the ATLASpdf21 fit, with the enhanced tolerance of T = 3, and the CT18A and MSHT20 PDFs, are repeated with a linear-x and log-y scale in Fig. 25. Some discrepancies, not evident before, appear for x ≥ 0.7. In this high x region there is no data to determine any of the PDFs. Nevertheless, there is agreement within ∼2σ .

Conclusion
A PDF fit is presented using selected proton-proton collision data sets recorded by the ATLAS experiment at the LHC together with the final combined inclusive HERA data in order to assess the impact of the ATLAS data with a minimum of other input data. The resulting PDF set is called ATLASpdf21. The addition of ATLAS inclusive W and Z data to HERA data allows a fit to be made without any constraints imposed on the relationship between xū, xd and xs, nevertheless, xd ∼ xū is observed at low x. The strangeness at low x is found not to be as suppressed as found in earlier PDFs such as CT14, MMHT14 and NNPDF3.0, but to be in agreement with modern PDFs such as CT18A, MSHT20 and NNPDF3.1_strange which use these ATLAS inclusive W, Z data. This increase in the low x strange was first seen in the ATLASepWZ12 and ATLASepWZ16 PDF analyses, but is now somewhat moderated by further ATLAS data, and the freedom of the parametrisation at low x. These inclusive W, Z data also exhibit sensitivity to the valence quark distributions and the gluon distribution, considerably reducing their uncertainty. The ATLAS V + jets data constrain the light-quark sea at higher x, such that strangeness is strongly suppressed at high x, and impact the high x gluon distribution. The tt data serve to moderately reduce the uncertainty in the high x gluon distribution. Ratios of direct photon production at different proton-proton collision energies have only a mild impact on the gluon PDF, but it is notable that they may now be reliably fitted to NNLO in QCD. The jet production data are sensitive to the gluon distribution at medium to high x and they reduce its uncertainty considerably.
The role of scale uncertainties is considered for all data sets and these theoretical uncertainties are implemented in the fit for the inclusive W, Z data, where they are most impactful. The effects of such scale uncertainties are small.
A study is made of excluding high-scale data (with a scale above 500 GeV), which may contain subtle effects of physics beyond the Standard Model, from the fit. The resulting PDFs are not strongly affected by this restriction of the fitted kinematic region.
It is observed that the addition of the ATLAS data sets to the HERA data brings the PDFs much closer to the global PDFs of MSHT, CT and NNPDF than to HERAPDF2.0. The ATLASpdf21 PDFs agree with these global fits as well as they agree with each other. Thus, ATLAS data seem to be able to replicate many of the features that the fixed-target deep inelastic scattering and Drell-Yan data plus the Tevatron Drell-Yan data bring to the global PDFs. Using only the HERA and ATLAS data allows a detailed treatment of correlated systematic uncertainties.
Correlations of systematic uncertainties within and between ATLAS data sets are considered, with particular emphasis on the larger systematic uncertainties appertaining to the jet energy scale. Information about correlations between data sets is provided such that the major correlations could be considered by the global fitting groups CT, MSHT and NNPDF in future fits. The effects of these correlations are relatively small, O(∼1%), at the scale Q 2 ∼ 10,000 GeV 2 considered for precision physics at the LHC, but large enough to be considered in future precision data analyses such as the measurement of m W and sin 2 θ W , given that such a level of accuracy is now desired to resolve effects of physics beyond the Standard Model in the measurements of Standard Model parameters.
Acknowledgements We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently. We acknowledge the support of ANPCyT, Argentina; YerPhI, Armenia; ARC, Australia;

Data Availability Statement
This manuscript has no associated data or the data will not be deposited. [Authors' comment: All ATLAS scientific output is published in journals, and preliminary results are made available in Conference Notes. All are openly available, without restriction on use by external parties beyond copyright law and the standard conditions agreed by CERN. Data associated with journal publications are also made available: tables and data from plots (e.g. cross section values, likelihood profiles, selection efficiencies, cross section limits, ...) are stored in appropriate repositories such as HEPDATA (http:// hepdata.cedar.ac.uk/). ATLAS also strives to make additional material related to the paper available that allows a reinterpretation of the data in the context of new theoretical models. For example, an extended encapsulation of the analysis is often provided for measurements in the framework of RIVET (http://rivet.hepforge.org/). This information is taken from the ATLAS Data Access Policy, which is a public document that can be downloaded from http://opendata.cern.ch/record/413 [opendata.cern.ch].] Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/. Funded by SCOAP 3 .

Appendix A Scale uncertainties in W, Z inclusive data
In this appendix, different choices for the treatment of the scale uncertainties in inclusive W, Z data are considered. For these data sets the experimental uncertainties are comparable to the scale uncertainties of NNLO predictions (∼0.5%).
These theoretical uncertainties can therefore have some significant impact on the PDF fit. For the central fit, these theoretical uncertainties are applied in the same way as correlated systematic uncertainties. Changes in renormalisation and factorisation scales are taken to be correlated between W and Z data and to be correlated between the 7 and 8 TeV data sets. Two alternatives are considered here: (i) the scale uncertainties are not correlated between the 7 and 8 TeV data sets and (ii) scale uncertainties are not applied at all.
The results of fits for these two cases, compared with the central fit, are shown as a ratio at a scale Q 2 = 10,000 GeV 2 , relevant for LHC physics, in Fig. 26. If should be noted that the central fit is shown with just experimental uncertainties, evaluated with tolerance T = 1, so that its uncertainties are directly comparable with those of the fit without scale uncertainties. The uncertainties are very similar in size. The differences between the shapes of the PDFs are not large, but they can be important if the desired accuracy of the PDFs is O(∼1%). The difference between the cases where the scale uncertainties are applied as being correlated or uncorrelated between the 7 and 8 TeV inclusive W, Z data sets is shown by the green line in these figures and it can be seen that it is generally a smaller effect.

B Comparison of the impact of inclusive jet data at different centre-of-mass energies
It is not possible to fit inclusive jet production data at different centre-of-mass energies simultaneously, because the full experimental systematic uncertainty correlations between these data sets are not known. The inclusive jet production data at 8 TeV with R = 0.6 were selected for input to the central fit. In this appendix, fits using the inclusive jet production data at 7 or 13 TeV instead of the data at 8 TeV are compared with the central fit. The data at 7 TeV shown here were extracted for R = 0.6 whereas the data at 7 TeV were extracted only for R = 0.4. The gluon PDF and R s PDF ratio using these jet production data sets at 7 and 13 TeV are shown in their ratio to the central fit results in Fig. 27. The scale choice was p jet T for all the jet production data sets included in this figure. Since the effect of using R = 0.4 or R = 0.6 is very similar, as illustrated for the jet production data at 8 TeV, the differences between the PDFs are dominated by the change in centre-of-mass energy. The PDFs using the jet production data at 8 TeV lie between those of the jet production data at 7 TeV and the jet production data at 13 TeV. However, these differences are not significant compared to the full uncertainties of the PDFs evaluated with T = 3. It is interesting to observe that the N 3 LO cross section for Z bosons is expected to be ∼2% lower than the NNLO cross section [74], which would bring the data into agreement with theory without need of shifts of systematic uncertainties (1) are allowed to vary to minimise the χ 2 . The red band represents the full uncertainty (experimental (evaluated with T = 3) + model + parameterisation) of the fit prediction. It is interesting to observe that the N 3 LO cross section for Z bosons is expected to be ∼2% lower than the NNLO cross section [74], which would bring data into agreement with theory without need of shifts of systematic uncertainties.

D Comparison with extra data sets not included in the fit
The ATLASpdf21 fit does not include data from the Tevatron or from fixed-target DY data. In this appendix the predictions of the ATLASpdf21 fit for some of these data sets, which were found to be most impactful in the global fits, are explored and found to be satisfactory. In Figs. 38 and 39 the ATLASpdf21 fit is shown in comparison with Tevatron W and Z data.
The χ 2 /NDF values for the CDF data are 31/28 for the Z data and 35/13 for the W -asymmetry data.
The χ 2 /NDF values for the D0 data are 23/28 for the Z data, 25/13 for the W -electron asymmetry data and 13/10 for the W -muon asymmetry data. The ATLASpdf21 fit therefore provides a fair description, χ 2 /NDF = 126/92, of these Tevatron data, which mostly influence the high-x valence quarks.  Fig. 39 The differential cross-section measurements of (left) Z , (right) W (electron decay channel) and (bottom) W (muon decay channel) in Refs. [90][91][92] (black points) as a function of the absolute rapidity of the Z boson or absolute pseudorapidity of the decay lepton. The bin-to-bin uncorrelated part of the data uncertainties is shown as black error bars, while the total uncertainties are shown as a yellow band. The cross sections are compared with the predictions computed with the PDFs resulting from the ATLASpdf21 fit. The solid line shows the predictions without shifts of the systematic uncertainties, while for the dashed line the b j parameters associated with the experimental systematic uncertainties as shown in Eq. (1) are allowed to vary to minimise the χ 2 . The red band represents the full uncertainty (experimental (evaluated with T = 3) + model + parameterisation) of the fit prediction It is also interesting to consider the description of the fixedtarget Drell-Yan data from E866 and E906, since these are uniquely able to constrain the difference x(d −ū) at high x. Figure 40 compares the predictions of the ATLASpdf21 fit with the E866 pD/ pp data [93] in three mass regions. The χ 2 /NDF values are 9.6/10, 14/14 and 21/15 for the low-, intermediate-and high-mass regions, respectively. Thus, the description of the E866 data, which mostly give information about high-x sea quarks, is good. However, the high-mass region, which covers larger Bjorken-x, is not as well fitted as the lower-mass regions. The ATLASpdf21 fit is in better agreement with the new data from E906 [86] in the high-mass region, as seen in Fig. 41, which compares ATLASpdf21 and other PDFs with the xd/xū ratios extracted from E866 and E906.