Determination of the theory uncertainties from missing higher orders on NNLO parton distributions with percent accuracy

We include uncertainties due to missing higher order corrections to QCD computations (MHOU) used in the determination of parton distributions (PDFs) in the recent NNPDF4.0 set of PDFs. We use our previously published methodology, based on the treatment of MHOUs and their full correlations through a theory covariance matrix determined by scale variation, now fully incorporated in the new NNPDF theory pipeline. We assess the impact of the inclusion of MHOUs on the NNPDF4.0 central values and uncertainties, and specifically show that they lead to improved consistency of the PDF determination with an ensuing moderate reduction of PDF uncertainties at NNLO.


Introduction
The uncertainty on the parton distribution functions (PDFs) that enter any prediction of physical processes is one main bottleneck for precision physics at the LHC.Thanks to methodological progress, especially the use of machine learning techniques and the increase of experimental information, we have recently achieved a determination of PDFs, NNPDF4.0 [1], whose nominal precision reaches the percent level.It is clearly crucial to assess whether this level of precision is reliable, and whether it is matched by the same level of accuracy.
A considerable effort has gone into assessing the impact on uncertainties of the methodology used for the determination of PDFs, and specifically the way it propagates the information contained in the data onto the PDF uncertainty (see e.g. the recent studies in Refs.[2,3]).However, PDF uncertainties, as given in all standard PDF sets, such as NNPDF4.0[1], CT18 [4], MSHT20 [5] or ABMP16 [6], do not include theoretical uncertainties, i.e., the uncertainties that affect the predictions that are compared to the data in the process of determining PDFs from a set of experimental data.The only exceptions are the parametric uncertainty related to the value of the strong coupling α s , which is routinely included since the early days of LHC physics [7], and nuclear uncertainties that affect e.g.deep-inelastic scattering (DIS) data on nuclear targets (such as neutrino DIS data), that are for instance included in the NNPDF4.0PDF determination [8,9].
In principle, theoretical uncertainties may come from a variety of different sources, both parametric (such as the values of the heavy quark masses) and non-parametric (such as the aforementioned nuclear corrections).Theory uncertainties related to missing higher orders in QCD computations -MHOUs henceforth -are particularly relevant, because they affect any prediction.The current typical perturbative accuracy of QCD computations is next-to-next-to-leading order (NNLO), with N 3 LO corrections only known in a small number of cases [10].At NNLO, MHOUs are typically of the order of a few percent or bigger.For LHC precision observables used for PDF determination, such as gauge boson or top-pair production, this is surely comparable to the experimental systematic uncertainties, and often larger or even much larger than the experimental statistical uncertainty.Since the uncertainty on the experimental measurement and on the theoretical prediction enter in a completely symmetric way in the figure of merit used for PDF determination [2], there is no justification to include the former and not the latter if they are of comparable sizes.
In Refs.[11,12] we have presented a methodology for the systematic inclusion of theory uncertainties in PDF fits through a theory covariance matrix, and for the computation of the theory covariance matrix related to MHOU through correlated scale variations.A first application of this methodology to the construction of a set of NLO PDFs with MHOUs, based on the NNPDF3.1 [13] PDF set and methodology, was also presented in these references, but no global NNLO PDF set with inclusion of MHOUs is currently available.Such a construction is now greatly facilitated by the availability of the EKO [14] evolution code, and its inclusion in a new pipeline for producing theory predictions for PDF determination [15], recently used for the construction of the NNPDF4.0QEDPDF set [16] and currently adopted by NNPDF as a standard.Other approaches to the determination of MHOUs have been proposed [17][18][19][20][21][22]; also, MHOUs on NNLO PDFs have been recently estimated based on an approximate N 3 LO PDF determination [23].
It is the purpose of this paper to include MHOUs in the NNPDF4.0NLO and NNLO sets of parton distributions using the methodology of Refs.[11,12].The final deliverables of the paper are thus new versions of the NNPDF4.0global PDF determination, with more accurate central values and uncertainties that now also account for the inclusion of MHOUs in the process of PDF determination.It was indeed shown in Refs.[11,12] that the main effect of including MHOUs in PDF determination is to modify central values, specifically leading to better perturbative convergence.Parton distributions with MHOUs included in the PDF uncertainty should henceforth become the default choice.
This paper is organized as follows.First, in Section 2 we succinctly review the formalism of Refs.[11,12] for the determination of a covariance matrix accounting for MHOUs and its implementation in a PDF fit, specifically referring to its recent implementation in the EKO evolution code.In Section 3 we present the MHOU covariance matrices at NLO and NNLO and validate the NLO covariance matrix by comparing the estimated MHOUs against the known NNLO results.The main deliverables of this paper, namely the NNPDF4.0NLO and NNLO PDFs with MHOUs, are presented in Section 4, where they are compared, both in terms of central values and uncertainties, to their counterparts without MHOUs, thereby assessing the impact of MHOUs on the consistency of the PDF determination.Finally, in Section 5 we discuss the delivery and usage of the PDFs with MHOUs and summarize our results.In an Appendix we provide explicit expressions for the missing higher order terms that are generated upon performing renormalization and factorization scale variation, and that are used in Section 2 to construct the MHOU covariance matrix.

The MHOU covariance matrix
The inclusion of MHOUs is done by supplementing theory predictions with a covariance matrix that accounts for their expected correlated variation upon inclusion of higher-order corrections.This in turn is estimated through scale variation.The whole construction is explained in detail in Refs.[11,12], to which we refer for a detailed discussion.In this section we provide a brief summary of the main aspects of the procedure, with specific reference to its implementation in the EKO [14] code that, as mentioned, is part of the new NNPDF theory pipeline [15] used in this paper.In particular, for ease of reference, in this Section we adopt the same notation as in the EKO documentation, even though they depart somewhat from those of Ref. [14].
We first summarize the way perturbative expansions of various quantities are defined; we then review the way MHOUs on hard cross sections and anomalous dimensions can be estimated by mean of scale variation, and we finally summarize the construction of the MHOU covariance matrix.The expressions that are needed for scale variation up to N 3 LO are summarized in Appendix A.

Perturbative expansion and factorization
We start by writing down explicitly the perturbative expansion of an observable factorized in terms of a hard cross-section and PDFs, with the main goal of establishing notation.Given this, we only consider the case of inclusive electroproduction with a single parton species.The hadronic observable is then a structure function F (Q 2 ) that depends on a physical scale Q 2 , and it is written in terms of a partonic quantity, the coefficient function C(Q 2 ), perturbatively computed as an expansion in the strong coupling and a PDF f (Q 2 ).In Mellin space we simply have The partonic coefficient function (for hadronic processes the partonic cross-section) is expanded perturbatively, and at N k LO it is given by where the LO cross-section is O(a m s ): so for deep-inelastic structure functions m = 0 (in the case of F 2 , F 3 ) or m = 1 (in the case of F L ), for top pair production m = 2 and so on.
The scale dependence of the strong coupling a s (Q 2 ) and of the PDF f (Q 2 ) are given by At N k LO the beta function and anomalous dimensions are respectively given by The coefficients β j are known up to k = 4 (N 4 LO or five loops) [24][25][26][27], while the coefficients γ j are known exactly up to k = 2 and approximately for k = 3 (N 3 LO or four loops) [23,[28][29][30][31][32][33][34][35].The perturbative expansion of the solutions to Eqs. (2.4) and (2.5), which are needed in order to compute the scale variation terms that we are interested in, are given explicitly in Appendix A.
The solution to Eq. (2.5) can be written in the form of an evolution kernel operator (EKO) E(µ 2 ← µ 2 0 ) [14] given by where P denotes path ordering.Of course, including the anomalous dimension to N k LO accuracy yields a PDF f (Q 2 ) whose scale dependence has a resummed (next-to-) k -leading-logarithmic (N k LL) accuracy.The perturbative expansion of this solution is given up to (fixed, i.e. not resummed) order a 3 s in Appendix A.

MHOUs from scale variation
Theoretical predictions at hadron colliders depend on two quantities that are computed perturbatively: the partonic cross sections or coefficient functions, Eq. (2.3), and the anomalous dimensions, Eq. (2.7), that determine the scale dependence, Eq. (2.8), of the PDF.Both quantities can be expressed as a series in the strong coupling a s (Q 2 ), in turn perturbatively given, through the solution to Eq. (2.4), in terms of the value of the strong coupling at a reference scale, typically a s (M Z ).The MHOU on the predictions is due to the truncation of these perturbative expansions at a given order.In principle, if a variable-flavor-number scheme [36,37] is used, a further MHOU is introduced by the truncation of the perturbative expansion of the matching conditions that relate PDFs in schemes with a different number of active flavors.These uncertainties, especially those related to the matching at the charm threshold, are very important if one is interested in PDFs below the charm threshold, such as for instance when trying to determine the intrinsic charm PDF [38,39].However, if one is interested in precision LHC phenomenology, then physics predictions are produced in a n f = 5 scheme, but PDFs are also determined by comparing to data predictions whose vast majority is computed in the n f = 5 scheme.Hence, the matching uncertainties only affect the small amount of data below the bottom threshold (no data below the charm threshold are used), and then through the MHOU at the bottom threshold, which is very small.The MHOU related to the matching conditions are thus subdominant and we will neglect them here.
We thus focus on MHOUs on the hard cross-sections and anomalous dimensions.The estimate of these MHOUs from scale variation are obtained by producing various expressions for a perturbative result to a given accuracy, that differ by the subleading terms that are generated when varying the scale at which the strong coupling is evaluated.Starting with the coefficient function, Eq. (2.3), we thus construct a scale-varied N k LO coefficient function which fixes the scale-varied coefficients Cj (ρ r ) in terms of the starting C j .Explicit expressions are given up to N 3 LO in Appendix A. At any given order C and C differ by subleading terms: their difference is taken as an estimate of the missing higher orders, and it may be used for the construction of a MHOU covariance matrix, as summarized in Sect.2.3.We refer to this way of estimating MHOUs on partonic cross-sections as renormalization scale variation.Through the same procedure, we may obtain an estimate of the MHOU on the anomalous dimension, Eq. (2.7).Namely, we construct a scale-varied N k LO anomalous dimension by requiring that which fixes the coefficients γj (ρ f ) in terms of γ j : of course, the corresponding expressions are the same as those obtained by expressing the Cj (ρ f ) in terms of C j , in the particular case m = 1.Again the subleading difference between γ and γ may be taken as an estimate of the MHOU on anomalous dimensions.This uncertainty then translates into a MHOU on the PDF f (Q 2 ) when this is expressed through Eq. (2.8) in terms of the PDFs at the parametrization scale.We refer to this estimate of the MHOU on the scale dependence of the PDF as factorization scale variation.By substituting the scale-varied anomalous dimension γ(α(µ 2 ), ρ f ) in the expression Eq. (2.8) of the PDF one can show [12] that factorization scale variation can be equivalently performed directly at the level of the PDF, by defining a scale-varied PDF f (Q 2 , ρ f ) whose scale dependence is given by a scale-varied evolution kernel operator (EKO) Ē( and the scale-varied EKO Ē, computed at N k LL, differs by subleading terms from the original EKO: The scale-varied EKO can be constructed as where at N k LL (i.e. with the anomalous dimension computed at N k LO) the additional evolution kernel These two different ways of performing factorization scale variation, by varying the scale of the anomalous dimension or varying the scale of the PDF were respectively referred to as scheme A and scheme B in Ref. [12].
A third way, referred to as scheme C in Ref. [12], consists of using the scale-varied PDF, Eqs.(2.13-2.15),namely in the factorized expression, Eq. (2.2), but including K(a s (ρ f Q 2 ), ρ f ) in the coefficient function instead of the PDF.This amounts to evaluating the PDF at a different scale, but with a modified coefficient function.
The corresponding explicit expressions are also given for completeness in Appendix A.
In standard practice, factorization scale variation is usually performed using scheme C, because this does not require changing the PDFs, which are typically taken as given from an external provider.However, in the context of a PDF determination, factorization scale variation through scheme B, Eq. (2.13), is simplest, as it only requires modifying the EKO used to compute PDF evolution.This is the way we will perform factorization scale variation in this paper.

Construction of the covariance matrix
The MHOU due to the perturbative truncation of the partonic cross-sections and the scale dependence of the PDFs, respectively estimated through renormalization scale variation, Eqs.(2.9) and (2.10), and factorization scale variation according to scheme B of Ref. [12], Eqs.(2.13) and (2.16), are included through a MHOU covariance matrix.This is constructed as follows [11,12].
First, we define the shift in theory prediction for the i-th datapoint due to renormalization and factorization scale variation where T i (ρ f , ρ r ) is the prediction for the i-th datapoint obtained by varying the renormalization and factorization scale by a factor ρ r , ρ f respectively.Note that in Refs.[11,12] the scale variations were parametrized by their logarithms, i.e. through parameters κ r = ln ρ r , κ f = ln ρ f .Next, we choose a correlation pattern for scale variation, as follows [11,12]: • factorization scale variation is correlated for all datapoints, because the scale dependence of PDFs is universal; • renormalization scale variation is correlated for all datapoints belonging to the same category, i.e. either the same observable (such as, for instance, fully inclusive DIS cross-sections) or to different observables for the same process (such as, for example, the Z transverse momentum and rapidity distributions).
Note that this requires a categorization of processes: for instance we consider charged-current and neutralcurrent deep-inelastic scattering as separate processes.The particular process categorization adopted in this work is discussed in Section 3.1.These choices correspond to the assumptions that factorization and renormalization scale variation fully capture the MHOU on anomalous dimensions and partonic cross-sections respectively, and that missing higher order terms are of a similar nature and thus of a similar size in all processes included in a given process category.Different assumptions are consequently possible, for instance decorrelating the renormalization scale variation from contributions to the same process from different partonic sub-channels, or introducing a further variation of the scale of the process on top of the renormalization and factorization scale variation discussed above (see Section 4.3 of Ref. [12] for a more detailed discussion).
We then define a MHOU covariance matrix, whose matrix element between two datapoints i, j is where the sum runs over the space V m of the m scale variations that are included; the factorization scale ρ f is always varied in a correlated way, the renormalization scales ρ r i , ρ r j are varied in a correlated way (ρ r i = ρ r j ) if datapoints i and j belong to the same category, but are varied independently if i and j belong to different categories, and n m is a normalization factor.The computation of the normalization factor is nontrivial because it must account for the mismatch between the dimension of the space of scale variations when two datapoints are in the same category (so there is only one correlated set of renormalization scale variations) and when they are not (so there are two independent sets of variations).These normalization factors were computed for various choices of the space V m of scale variations and for various values of m in Ref. [12], to which we refer for details.As in Refs.[11,12] we consider scale variation by a factor 2, so In Ref. [12] various different choices for the space of allowed variations were considered.These included, among others: the 9-point prescription, in which κ r , κ f are allowed to both take all values (0, ± ln 4), with m = 8 (eight variations about the central value); and the commonly used 7-point prescription, with m = 6, which is obtained from the former by discarding the two outermost variations, in which κ r = + ln 4, κ f = − ln 4 or κ r = − ln 4, κ f = + ln 4. We will show in Section 3 that, upon validation of the MHOU covariance matrix, the 7-point and 9-point prescription have a similar behavior, in agreement with what was already found in Ref. [12].Other prescriptions, with a more limited set of independent scale variations, where shown in Ref. [12] to perform less well, and we will not consider them any further.The explicit expressions for the MHOU covariance matrix with the 7-point and 9-point prescription are respectively given in Eqs.(4.18-4.19)and Eq.(4.15) of Ref. [12].
The set of assumptions that include the correlation patterns of renormalization and factorization scale variations, the process categorization, the range of variation of the scales, and the specific choice of variation points involves a certain degree of arbitrariness.This is inevitable given that the MHOU is the estimate of the probability distribution for the size of an unknown quantity which has an unique true value, and thus it is intrinsically Bayesian.The only way to validate this kind of estimate is by comparing its performance to cases in which the true value is known, as we shall do in Section 3.2.

The MHOU covariance matrix and its validation
We now compute and validate MHOU covariance matrix obtained using the procedure discussed in the previous section: first, we present its construction based on a suitable dataset categorization, and then its validation at NLO where the next-order corrections are known, so the true MHOUs can be determined exactly.

The covariance matrix at NLO and NNLO
In order to determine the covariance matrix we must first choose a dataset and process categorization.The dataset used for the determination of the NNPDF4.0MHOUPDFs is the same as the dataset used for the determination of the NNLO NNPDF4.0PDFs, see Ref. [1] for details.This same dataset is adopted both for the NLO and NNLO NNPDF4.0MHOUPDFs discussed here.In Ref. [1] a somewhat different dataset was used for the NLO PDF determination, in particular excluding datapoints for which NNLO corrections are sizable and including at NLO some data for which NNLO corrections were not available at the time of the writing of that paper.
Here we wish to adopt exactly the same dataset at NLO and NNLO in order to be able to analyze the impact of the inclusion of MHOUs on perturbative convergence without changes in dataset acting as a confounding effect.Note that this involves first, including in the NLO dataset also datapoints for which NNLO corrections are known to be very large, and furthermore including in both datasets datapoints for which downward scale variation leads to a low scale.The use of a MHOU covariance matrix should take care of both issues.Indeed, data with large NNLO corrections should have correspondingly large MHOUs, to the extent that the estimate based on scale variation is accurate.Also, scale variation of low scale data in the worst case will induce sizable shifts and thus large MHOUs that will possibly reduce the constraining power of these data in the direction of the shift, thereby effectively deweighting the data.
As explained in the previous Section, process categories correspond to classes of processes for which missing higher order terms are likely to be of similar enough origin.Therefore the correlation between the MHOU on any pair of predictions for processes in the same category can be computed to good approximation as if they were two datapoints for the same physics process.We thus group processes into nine process categories, namely, neutral-current deep-inelastic scattering (DIS NC); charged-current deep-inelastic scattering (DIS CC); and the following seven hadronic production processes: top-pair; Z, i.e. neutral-current Drell-Yan (DY NC); W ± , i.e. charged current Drell-Yan (DY CC); single top; single-inclusive jets; prompt photon; dijet.With these choices the covariance matrices at NLO and NNLO can be computed from Eq. (2.20).Results at NLO and NNLO computed using the 7-point prescription are shown in Fig. 3.1.As expected, the absolute value of the matrix elements is almost always smaller at NNLO than at NLO: the reduction is in fact typically by more than a factor 2 and on average almost one order of magnitude.However, the pattern of correlations appears to be quite stable upon changes of perturbative order.Note that all data points, including those that belong to different experiments, are correlated through MHOUs on perturbative evolution.This is a significant difference in comparison to a typical experimental covariance matrix.
The relative uncertainties on individual points (i.e. the square-root of the diagonal covariance matrix elements) before and after the inclusion of the MHOU and the MHOU itself are compared in Fig. 3.2 at NLO and NNLO.It is clear that for hadronic processes at NLO the MHOU uncertainty is on average the same size as the experimental uncertainty, while at NNLO it is subdominant.For DIS the difference between NLO and NNLO is less marked, but at NNLO the uncertainty is again subdominant, except at small x and Q 2 .Consequently, we might expect the effect of MHOUs at NNLO to be mostly through correlations, and thus to mostly impact PDF central values, and less so PDF uncertainties, except at small x, for the PDF combinations that dominate the small-x behavior, i.e. gluon and singlet.

Validation
The MHOU covariance matrix at NLO can be validated by comparing it to the known difference between NLO and NNLO predictions.This comparison can be performed using various estimators, originally proposed in Ref. [11].We present here results of this validation, both for our default 7-point prescription, as well as for the 9-point prescription discussed in Section 2.3.
We define a normalized shift vector, whose i-th component δ i is the normalized shift of the i-th datapoint due to the change in theory prediction from NLO to NNLO for fixed PDF, namely where the index i runs over all the datapoints, and T NNLO i and T NLO i are respectively the NNLO and NLO theory predictions both computed using the NLO PDF set.
The simplest validation consists of comparing the shift δ i to the uncertainty on individual points (also normalized), i.e. to the square root of the diagonal entries of the normalized NLO MHOU covariance matrix range of scale variation on a process-by-process basis, it is unclear to which extent the NLO behavior could be generalized to higher orders: perhaps this approach could be pursued once more orders are known, along the lines of Refs.[17,[19][20][21].This validation is however very crude, in that it does not test correlations at all.These can be checked by comparing the eigenvalues of the covariance matrix to the projection of the shift along its eigenvectors.It is important to realize that the shift δ i is a vector in the N dat -dimensional space of data, of which the independent eigenvectors of the covariance matrix span a small subspace S with dimension N sub ≪ N dat .In our case, N dat = 4616 while N sub = 22 for 7-point scale variation, and N sub = 48 for 9-point (see formulae in Appendix A of Ref. [11] with p = 9 process classes).Therefore, a further nontrivial requirement is that the shift vector be mostly contained within the subspace S.
We can perform both tests quantitatively as follows [11].First, we determine the eigenvectors e α i and the eigenvalues λ α = (s α ) 2 of the MHOU covariance matrix, with s α > 0.Then, we determine the N sub projections δ α of the shift vector δ i on the eigenvectors e α i , i.e.
Finally, we determine the component of the shift vector in the N sub dimensional subspace S: and the orthogonal component which is the part of the shift vector that is missed by the MHOU covariance matrix.
We can now test whether correlated uncertainties are correctly accounted for by checking whether s α are of comparable size of δ α -in principle, assuming MHO terms to be Gaussianly distributed, 68% of δ α should be smaller than or equal to s α .We can further test how much of the shift vectors lies in the subspace S by determining the length |δ miss | of the missed vector, and the angle between the full shift vector and its component contained in the S subspace Clearly, if the shift was entirely explained by the MHOU covariance matrix, then |δ miss | = 0, |δ S | = |δ|, and θ = 0. Related to this, it is interesting to observe that the way scale variation prescriptions affect the final result significantly differs when using a MHOU covariance matrix approach, in comparison to the frequently adopted method of estimating MHOUs by taking the envelope of results that are found when varying the scales (as e.g.discussed in Sect.12.4 of Ref. [40]).In the latter case, if the shift produced by scale variation is for some reason unnatural, taking the envelope may lead to overestimated uncertainties.This is in fact the standard argument for favoring the 7-point prescription over the 9-point prescription, as the latter may generate unnaturally large scale ratios.In contrast, with a covariance matrix formalism, an unnatural shift corresponds to a large eigenvalue associate to an eigenvector along which the actual shift is instead small, or perhaps even zero.The effect of this is generally moderate or innocuous: the large eigenvalue means that the best fit can move in the direction of the corresponding eigenvector at little cost in χ 2 , but if the actual shift is small, nothing should be gained by moving in that direction.
Consequently, what matters in judging the effectiveness of the theory covariance matrix is whether the largest components of the shift vector are well reproduced by the corresponding theory covariance matrix eigenvalues (and specifically not underestimated).In practice, we order the shift projections |δ α | by decreasing size, and we compare them in Fig. 3.4 to the covariance matrix eigenvalues |s α |, both for the 7point and the 9-point prescriptions.We also show in figure the length of the missed component |δ miss |.There is good agreement between shift projections and predicted MHOUs for the largest eigenvectors using both prescriptions.For smaller eigenvectors there is also generally good agreement, but the 9-point prescription somewhat underestimates the size of individual components of the shift.
The size of the missed component of shift vector can be seen in Fig. 3.4 to be similar to its largest eigenvector component for both prescriptions, i.e. relatively small in comparison to the full shift, given that the first ten components or so are of comparable size.Indeed, this can be seen from the angle Eq. (3.6) between the shift vector and its projection in the subspace S, which is tabulated in Tab.3.1 both for individual datasets and the full dataset.The two prescriptions perform both surprisingly well, given the very small size of the S subspace, with a very small difference between the two prescriptions despite the difference by more than a factor 2 in the size of the S subspace.For almost all datasets the direction of the shift and its projection in the S subspace are quite close and in some cases very close, the only exception being NC DIS and to a lesser extent DY, especially CC.Table 3.1.The angle θ, Eq. (3.6), for the 7-point and the 9-point prescriptions for individual process categories and for the total dataset.The dimension N sub of the S subspace is also shown.In summary, we conclude that the NLO MHOU covariance matrix accounts quite well for the uncertainty due to the missing NNLO corrections, with only the uncertainty on the DY prediction somewhat underestimated by scale variation, and the performance of the 7-point and 9-point prescription fairly close.Specifically, the 9-point prescription performs slightly better in terms of capturing the subspace in which the shift lies, while the 7-point performs somewhat better in terms of correctly describing the size of the uncertainty in the subspace.In the sequel we will adopt the 7-point prescription as a default.

The NNPDF4.0MHOU determination
We now turn to the main deliverables of this paper, namely the NNPDF4.0NLO and NNLO PDF sets with MHOUs, which are obtained by repeating the corresponding NNPDF4.0PDF determinations, but now also including a MHOU covariance matrix determined with a 7-point prescription, as discussed in Section 2. The underlying dataset is identical to that used for the determination of the NNPDF4.0NNLO PDFs [1].As already mentioned in Section 3.1, here we adopt exactly the same dataset at NLO and NNLO, while in Ref. [1] a somewhat different dataset was used at NLO.Hence, we compare here four PDF sets: NLO and NNLO, with and without MHOUs, all determined based on the same underlying data.
Note that the NLO PDFs without MHOUs shown here are unsuitable for phenomenology, because they include data for which NNLO corrections are very large, and were thus excluded from the NNPDF4.0NLOdataset of Ref. [1].However in this study we prefer to compare PDFs produced using exactly the same code and the same dataset, and that only differ in perturbative order and in the presence of MHOUs, so that the  effect of the latter can be assessed without any confounding effect, however small.Note also that the NNPDF4.0NNLO without MHOUs shown here are equivalent but not identical to the published NNPDF4.0PDFs [1]: they differ from them because of the correction of a few minor bugs in the data implementation, and because of the use of a new theory pipeline [15] for the computation of predictions, which in particular includes a new implementation of the treatment of heavy quark mass effects that differs from the previous one by subleading terms.The impact of these changes was assessed in Appendix A of Ref. [16], and was found to be very limited, so that for any application the NNPDF4.0NNLO MHOU PDFs presented here can be considered to be the counterpart of the published NNPDF4.0NNLO PDFs (without MHOU) [1].

Fit quality
In Tabs.4.1-4.4we report the number of data points and the χ 2 per data point in the NLO and NNLO NNPDF4.0PDF determinations before and after inclusion of MHOUs.When MHOUs are not included, the covariance matrix is defined as in Ref. [1], namely, it is the sum of the experimental covariance matrix C and of a theory covariance matrix accounting for missing nuclear corrections S (nucl) , as determined in Refs.[8,9], and whose impact is discussed in Section 8.6 of Ref. [1].When MHOUs are included, the covariance matrix also contains the contribution Eq. (2.20) discussed in Sections 2.3-3.1, that we call S (7pt) .
Note that the MHOU contribution is respectively excluded or included both in the definition of the χ 2 used by the NNPDF algorithm (i.e. for pseudodata generation and in training and validation loss functions), and in the covariance matrix used in order to compute the values given in Tabs.4.1-4.4.Note also that the experimental covariance matrix used in order to compute the values given in Tabs.4.1-4.4differs from that used in the NNPDF algorithm, because the latter treats multiplicative uncertainties according to the t 0 method [41] in order to avoid the d'Agostini bias, while the former is just the published experimental covariance matrix.In Tab.4.1 datasets are aggregated according to the process categorization of Section 3.1.Individual data sets are displayed in Tab.4.2 (NC and CC DIS), in Tab.4.3 (NC and CC DY), and in Tab.4.4 (top pairs, single-inclusive jets, dijets, isolated photons, and single top).The naming of the datasets follows Ref. [1].Note finally that the χ 2 values shown in Tab.4.1 cannot be obtained by taking the weighted average of those from Tabs.4.2-4.4,i.e. by adding each χ 2 value multiplied by the corresponding number of datapoints and dividing the result by the total number of datapoints, because several of the measurements reported in Tabs.4.2-4.4 are correlated to each other, and furthermore, upon inclusion of the MHOUs, the covariance matrix correlates all points to each other, as discussed in Sect.3.1 and shown in Fig. 3.1.These correlations are lost when showing χ 2 values for data subsets.For the same reason, the total χ 2 shown in Tab.4.1 is not the weighted average of individual values.
Tables 4.1-4.4show that upon inclusion of the MHOU covariance matrix the total χ 2 decreases for both the NLO and NNLO fits, but the decrease is more substantial at NLO.Even after inclusion of the MHOU, the NLO χ 2 remains somewhat higher than the NNLO one.Inspection of Tab.4.2-4.4shows that this is in fact due to a small number of datasets (specifically ATLAS low-mass Drell-Yan), and we have further verified that this is in turn due to a small number of very accurately measured data points (excluded  by the NLO cuts of Ref. [1]) for which NNLO corrections are very substantially underestimated by scale variation.However, for the majority of datapoints and of process categories, the MHOU covariance matrix correctly accounts for the mismatch between data and theory predictions at NLO due to missing NNLO terms, consistently with the validation of Section 3.2.

PDFs and PDF uncertainties
Individual PDFs at NLO and NNLO, with and without MHOUs, are compared in Fig. 4.1 at Q = 100 GeV.We show the gluon, singlet, valence (V , V 3 , V 8 ), and triplet (T 3 , T 8 , T 15 ) distributions (see Section 3.1.1 of Ref. [1]), all shown as a ratio to the NNLO PDFs with MHOUs.The corresponding one sigma uncertainties are shown in Fig. 4.2.The change in central value due to the inclusion of MHOUs is generally moderate at NNLO; at NLO it is significant for the gluon and singlet, but quite moderate for all other PDF combinations.Inspection of Fig. 4.2 shows that the PDF uncertainty at NNLO in the data region remains on average unchanged upon inclusion of MHOUs, though in the singlet sector it increases at small x, especially for the gluon where the increase is up to x ∼ 0.1.At NLO the uncertainty is generally reduced in the nonsinglet sector, while in the singlet sector the uncertainty increases for all x, especially for the gluon.This is consistent with the observation of Sect.4.1 that at NLO the MHOU from scale variation does not fully account for the large shift from NLO to NNLO for some datasets.The somewhat counter-intuitive fact that the uncertainty on the PDF does not increase and may even be reduced upon inclusion of an extra source of uncertainty in the χ 2 was already observed in Refs.[8,9] and demonstrates the increased compatibility of the data due to the MHOU.
The effect of the inclusion of MHOUs on PDF uncertainties is both x-and PDF-dependent, hence in order to obtain an overall quantitative assessment it is necessary to look at the PDF uncertainty on physics predictions.This can be obtained through the ϕ estimator, which was introduced in Ref. [42], and is defined as where by ⟨χ 2 ⟩ we denote the average value of the χ 2 (per datapoint) evaluated for each single replica and averaged over replicas, while χ 2 is the value shown in Tab.4.1 and computed using the best-fit PDF, i.e. the average over replicas.For a single datapoint, ϕ is just the ratio of the PDF uncertainty over the data uncertainty; for many uncorrelated datapoints it is the square root of the average value of the ratio of the    Dataset PDF variance to the data variance; and for correlated datapoints it is this quantity when computed in the basis of eigenvectors of the experimental covariance matrix, thus averaging ratios of the diagonal elements of the theory covariance matrix in this basis to the eigenvalues of the experimental covariance matrix.Hence, ϕ directly measures the PDF uncertainty on the predictions in units of the experimental uncertainties, and thus provides an estimate of the consistency of the data.Indeed, a value ϕ < 1 means that on average the uncertainties in the predictions are smaller than those of the original data, indicating that consistent data are being combined successfully by the underlying theory (see also the discussion in Section 6 of Ref. [12]).The value of ϕ before and after inclusion of the MHOUs is shown at NLO and NNLO in Tab.4.5.It is clear that upon inclusion of MHOUs ϕ is always either unchanged or reduced.The reduction is more significant for processes that are sensitive to nonsinglet combinations, such as charged-current Drell-Yan, in agreement with the behavior of the PDF uncertainties of Fig. 4.2, and on average it is more marked at NNLO than at NLO.
The reduction of ϕ means that PDF uncertainties on physical predictions in the data region are reduced on average by the inclusion of MHOUs.It is interesting to observe that this reduction, while apparent at NLO at least for some PDF combination, is not always visible at NNLO in the PDF plots of Figs.where instead an increased uncertainty is seen, especially in the singlet sector.It should be however observed that the the uncertainty displayed in Figs.4.1-4.2 is the diagonal uncertainty, which combines correlated and uncorrelated uncertainties in quadrature.On the other hand, the computation of physical observables involves different PDF combinations, and beyond LO always an integration over different values of the momentum fraction, all of which are correlated to each other.These correlations are fully included in the ϕ indicator.Now, as seen in Fig. 3.1, MHOUs are generally highly correlated: for example, because MHOUs are smooth, they have a similar impact on datapoints which are kinematically close.Of course, the correlated uncertainty is always smaller or equal to the uncorrelated one.Hence the different behavior of diagonal PDF uncertainties and ϕ shows that the correlation of MHOUs leads in turn to highly correlated MHOUs on PDFs.Quite apart from this, note that of course the ϕ indicator does not provide any information on the behavior of uncertainties in the extrapolation region, hence when computing physical predictions outside the kinematic region covered by the current dataset the PDF uncertainty may well increase upon inclusion of MHOUs.
It is interesting to contrast the behavior of the ϕ indicator seen in Tab.4.5 to that which was discussed in Ref. [12] (at NLO only), see Sect.6 (Tab.6 and especially Tab.8) of that work.Firstly, it should be noticed that the ϕ value before inclusion of MHOU in that reference was more than twice as large as it is here.This is due to the fact that uncertainties in the NNPDF4.0PDF set, discussed here, are significantly smaller than in the then available NNPDF3.1 PDF set.The reason is the improvement in methodology, even with fixed underlying dataset, as extensively discussed in Sect.8.2 (see Fig. 46) of Ref. [1].Indeed, for NNPDF4.0NNLO, ϕ = 0.16 (Tab.31 of Ref. [1]) while for NNPDF3.1 NNLO, ϕ = 0.36 (Tab.8 of Ref. [12]).
Furthermore, in Ref. [12] it was observed that upon addition of a MHOU term to the covariance matrix used in the fit one would expect the uncertainty of the result to increase by an amount which for a single datapoint would just be the sum in quadrature of the MHOU term and the experimental uncertainty.For several correlated measurements, this can again be formalized in terms of an expected increased of the ϕ indicator based on the experimental and MHOU covariance matrices (see Eq. (6.5) of Ref. [12]).The value of ϕ was then observed to increase upon the addition of MHOUs, but by less than the expected amount, and this was interpreted as a sign of the increased compatibility of the data upon inclusion of the MHOUs.
Here instead upon inclusion of MHOUs the value of ϕ is actually reduced, and this despite the fact that with NNPDF4.0 methodology the value of ϕ is lower to begin with.This suggests that the improvement in data compatibility is now rather more significant.This is surely at least in part due to the fact that whereas for NNPDF3.1 hadron collider data played a subdominant role in comparison to DIS data, the converse is true for NNPDF4.0(see the discussion in Sects.7.2.4 and 7.2.5. of Ref. [1]).Because higher-order corrections are more substantial for hadronic processes than for DIS, the impact of MHOUs is accordingly enhanced.Also, because uncertainties with NNPDF4.0 methodology are smaller for equal data uncertainties, the effect of tension between data from different processes due to MHO corrections is enhanced, and the impact of the improved compatibility upon inclusion of MHOUs accordingly enhanced.We conclude that the evidence points towards the fact that data compatibility is increased by the inclusion of PDF uncertainties, especially at NNLO.
A priori, PDF sets with and without MHOUs should not necessarily be compatible within uncertainties, given that the latter do not include an existing source of uncertainty.As a matter of fact, they do agree in the nonsinglet sector, where MHOU are at most of about the same size as the PDF uncertainty before inclusion of MHOUs, but they generally do not in the singlet sector, where NNLO corrections can be very large.
Inclusion of MHOUs generally moves the NLO PDFs towards the NNLO, thereby improving perturbative convergence, except for the gluon.In fact, even for the singlet, while the NLO moves towards the NNLO upon inclusion of MHOUs, the NNLO result remains well outside the NLO uncertainty band, especially at small x.Again, this shows that in the singlet sector there are large NNLO corrections to the NLO result that are underestimated by MHOUs determined through scale variation.At small x this can be understood as the consequence of unresummed small-x logarithms [43] whose increase with perturbative order is not accounted for by scale variation.
The general conclusion is thus that the inclusion of MHOUs estimated through scale variation improves data compatibility.This results in a moderate shift of PDF central values, and a reduction of PDF uncertainties on physics predictions in the data region, demonstrated by a reduction of the ϕ indicator, with diagonal PDF uncertainties mostly unchanged at NNLO and reduced in the nonsinglet sector at NLO, but generally somewhat increased for the gluon both at NLO and NNLO.At NLO the inclusion of MHOUs manages to account for the effect of MHO terms on fit quality while having a moderate impact on PDF uncertainties and central values in the nonsinglet sector, and a more significant impact on central values with an increase in uncertainties in the singlet sector, while not fully accounting for the largest missing NNLO corrections.In the small-x extrapolation region PDF uncertainties generally increase upon inclusion of MHOUs, both at NLO and NNLO.
A more detailed study of perturbative stability and the effect of the inclusion of MHOUs on it is reserved to a companion publication [44], in which the PDF determination and consequently the results presented here are extended to N 3 LO.We also refer to this work for more extensive PDF comparisons, including comparisons at the level of flavor-basis PDFs and parton luminosities, as well as for first studies of the implication of the inclusion of MHOUs at various perturbative orders on predictions for LHC cross-sections.

Delivery and outlook
We have presented NLO and NNLO sets of parton distributions for which the PDF uncertainty includes not only the uncertainty coming from the data and that from the analysis methodology used to go from the data to the PDFs, but also the uncertainty coming from the perturbative truncation of the computations used in order to get the theory predictions that are compared to data (MHOUs).We followed the methodology that was developed in Refs.[11,12] and used there to construct the first NLO PDF sets including MHOUs.This methodology can now be used to determine MHOUs up to N 3 LO, thanks to the availability of the EKO evolution code [14], and the interfacing of the NNPDF code [45] both to it and to flexible tools for the computation of physical processes (such as the YADISM module [46] for DIS) through a new and streamlined theory pipeline [15].
The MHOUs on PDFs discussed in this paper should be treated as an extra contribution to the PDF uncertainty: they reflect the uncertainty in the theory predictions used in PDF determination and are thus on a par with the experimental uncertainty on the data themselves.They are consequently independent of the further MHOU on the hard cross-section of the processes which are being predicted.The total uncertainty on predictions must therefore be obtained by combining the PDF uncertainty, now also including a MHOU component, with the MHOU uncertainty on the hard cross section computation.The latter is typically determined as the envelope of a 7-point scale variation (see e.g.Ref. [47]).However, it is also possible to include MHOUs on the hard cross section by constructing a theory covariance matrix for the hard corss section itself.An advantage of doing so is that it is then also possible to keep into account the correlation between the theory uncertainty in the process used for PDF determination, and that on the hard crosssection, which might become relevant if the experimental uncertainties are small and the predicted process was also used for PDF determination [22,48].This can be done by determining the cross-correlation of MHO and PDF uncertainties between the predicted process and those used for PDF determination [49].
In this respect, it is interesting to observe that an altogether different option for the inclusion of MHOUs on predictions that accounts for MHOUs on PDFs, and their full correlation to MHOUs on the hard crosssections, is to include the scale variation in the Monte Carlo sampling [22].A comparative study of this methodology to that adopted in the present paper, as well of different prescriptions for scale variation (such as, for instance different prescrriptions for the connstruction of the theory covariance matrix of Sect.2.3) will be left for future studies.
It is delivered as a set of N rep = 100 Monte Carlo replicas, and it is denoted as

NNPDF40 nnlo as 01180 mhou
It should be considered a more accurate version of the published NNPDF4.0NNLO PDF set [1].This PDF set is also made available via the NNPDF collaboration website https://nnpdf.mi.infn.it/nnpdf4-0-mhou/, where we also make available the other PDF sets presented in Section 4. These include the NLO PDFs with MHOUs based on the same dataset used at NNLO NNPDF40 nlo as 01180 mhou nocuts and the corresponding baseline NLO and NNLO sets without MHOUs NNPDF40 nlo as 01180 qcd nocuts NNPDF40 nnlo as 01180 qcd .
The latter NNLO set is equivalent to but differs from the published NNPDF4.0NNLO because of the adoption of a new theory pipeline, and was already presented in Ref. [16] (see in particular Appendix A of that reference).The availability of PDFs that include MHOUs in their uncertainty is a step forward in achieving highaccuracy PDFs that can be used for precision phenomenology at the percent level.We will soon extend the results presented here to approximate N 3 LO [44] (for which approximate results are already available [23]).This will allow us to discuss the convergence of the perturbative expansion both of PDFs and perturbative observables, and the effect of the inclusion of MHOUs upon it.Further detailed studies of the phenomenological implications of the NNPDF4.0PDFs, including MHOUs and the approximate N 3 LO PDFs, are left for future studies [50].

DFigure 3 . 1 .
Figure 3.1.The MHOU covariance matrix Eq. (2.20) computed with the 7-point prescription at NLO (left) and NNLO (right).Note that range of the color scale is by one order of magnitude wider in the NLO case.

. 2 )Figure 3 . 2 .Figure 3 . 3 .
Figure 3.2.Comparison of the experimental and MHO contributions to the relative uncertainty, defined as the square root of the diagonal element of the covariance matrix normalized to the value of the theory prediction, at NLO (top) and NNLO (bottom), for all datapoints.The experimental, MHO, and total uncertainties are shown in yellow, red and blue respectively.

Figure 3 . 4 .
Figure 3.4.The projection |δ α | Eq. 3.3 of the shift vector along each eigenvector of the MHOU covariance matrix compared to the square root |s α | of the corresponding eigenvalue, for both the 7-and the 9-point prescription.The plots are ordered by decreasing size of the projections and the results are shown both as absolute and as a ratio.In the absolute panel, the length |δ miss | Eq. (3.5) of the missed component Eq. (3.5) is also shown.

Figure 4 . 1 .
Figure 4.1.The NLO and NNLO PDFs with and without MHOUs at Q = 100 GeV determined in this work.The gluon, singlet, valence (V , V 3 , V 8 ), and triplet (T 3 , T 8 , T 15 ) PDFs are shown.All curves are normalized to the NNLO with MHOUs.The bands correspond to one sigma uncertainty.

Figure 4 . 2 .
Figure 4.2.Relative one sigma uncertainties for the PDFs shown in Fig. 4.1.All uncertainties are normalized to the corresponding central NNLO PDFs with MHOUs.

Table 4 .
1.The number of data points and the χ 2 per data point for the NLO and NNLO NNPDF4.0PDF sets without and with MHOUs.Datasets are grouped according to the process categorization of Section 3.1.

Table 4 . 4 .
Same as Tab.4.1, for (from top to bottom) top pair, single-inclusive jet, isolated photon and single top production.

Table 4 .
5. The ϕ estimator Eq. (4.1) for PDFs at NLO and NNLO with and without MHOUs for the process categories of Section 3.1.