Reconstruction of Monte Carlo replicas from Hessian parton distributions

We explore connections between two common methods for quantifying the uncertainty in parton distribution functions (PDFs), based on the Hessian error matrix and Monte-Carlo sampling. CT14 parton distributions in the Hessian representation are converted into Monte-Carlo replicas by a numerical method that reproduces important properties of CT14 Hessian PDFs: the asymmetry of CT14 uncertainties and positivity of individual parton distributions. The ensembles of CT14 Monte-Carlo replicas constructed this way at NNLO and NLO are suitable for various collider applications, such as cross section reweighting. Master formulas for computation of asymmetric standard deviations in the Monte-Carlo representation are derived. A correction is proposed to address a bias in asymmetric uncertainties introduced by the Taylor series approximation. A numerical program is made available for conversion of Hessian PDFs into Monte-Carlo replicas according to normal, log-normal, and Watt-Thorne sampling procedures.


Introduction
Modern parton distribution functions (PDFs) [1][2][3][4][5][6] are provided with estimates of uncertainties from multiple origins. These estimates are essential for understanding of the accuracy of collider predictions, both for precision measurements and for new physics searches. PDFs are determined from a global statistical analysis of diverse experimental measurements, in deep-inelastic scattering, in production of vector bosons and jets, and in other hard-scattering processes. An optimal parametrization of the PDFs is obtained by a minimization of the figure-of-merit function (χ 2 ) that quantifies the level of agreement between experimental data and theoretical predictions. The minimization is performed with respect to the PDF parameters of interest, and with respect to nuisance parameters associated with theory, experiment and the data analysis procedure. Once the best-fit PDF parametrization is found, additional parametrizations are constructed for estimating the total PDF uncertainty. In this paper, we explore the connection between two methods for quantifying PDF uncertainties, one based on the diagonalization of the Hessian error matrix [7], and one on the stochastic (Monte-Carlo) sampling of parton distributions [8,9]. PDF uncertainties can also be determined by the Lagrange multiplier [10] and offset [11] methods, but the PDFs obtained with the Hessian and Monte-Carlo techniques are the most commonly used.
In any method of PDF error analysis, the ultimate goal is to provide information about the probability distribution in the space of PDF parameter values. This information can be presented in several forms. Much of PDF research [1,2,[4][5][6] relies on an analytic JHEP03(2017)099 the symmetric and asymmetric uncertainties of the CT14 Hessian sets, and also generate positive-definite replica parametrizations if desired. The resulting PDF uncertainties with MC replicas are closer to CT14 asymmetric uncertainties, compared to the Watt-Thorne method. Reconstruction of the full probability distribution from the Hessian sets reveals several subtleties, see section 2B. In particular, naive Monte-Carlo sampling of the PDFs using the Taylor series results in biased asymmetric uncertainty bands that can be corrected.
It is also instructive to compare statistical properties of the CT14 MC replica ensemble (obtained by conversion) and NNPDF MC replica ensemble (obtained using a genetic algorithm). Despite drastic differences in the two replica generation methods, we demonstrate in section 3B that the statistical properties of two MC replica ensembles have deep similarities, and explain why.
In section 2 we compare formulas for the estimation of Hessian and MC uncertainties and derive prescriptions for the generation of MC replicas that generalize the Watt-Thorne prescription. These prescriptions account for asymmetry of PDF errors in the same manner as in the CT14 Hessian set. At the end of the section, we show how to construct individual replica sets that reproduce the positivity requirement imposed on the CT14 Hessian sets. Section 3 compares PDFs, parton luminosities, predictions for LHC cross sections, and their uncertainties obtained with the CT14 Hessian and MC PDFs. In section 3.2 we examine statistical properties of the replica ensemble, and consider the agreement of individual replicas with the experiments in the global fit. Section 4 contains concluding remarks and information on the availability of the CT14 replica PDFs. A computer program is presented for generation of MC replicas according to the new prescriptions that optimally reproduce the Hessian uncertainties and positivity.
2 Generation of Monte-Carlo replicas from Hessian error PDFs

Master formulas for Hessian PDFs
The CT14 parton distribution functions (PDFs) were generated in a global analysis of QCD, by fitting theoretical calculations, at LO, NLO and NNLO, to experimental data for a wide variety of high-Q 2 processes [1]. Once the central (most probable) combination of PDF parameters is found by minimization of the log-likelihood function χ 2 , special "error PDF sets" are constructed to characterize and propagate PDF uncertainties. The error PDFs do not specify all features of the probability distribution in the fit, only a part needed for estimating uncertainties in typical applications. In the Hessian method used by CTEQ, the error PDFs, called "eigenvector sets", keep information about the first and second moments of the probability distribution, sufficient for estimation of central values, standard deviations, and PDF-driven correlations. A PDF ensemble based on MC replicas, when obtained directly from the fit, can in principle reproduce the primordial probability distribution with better accuracy; in practice, its size is commonly limited to no more than a thousand replicas, also enough for reproducing up to the second moments.
Therefore, the error sets primarily quantify the lowest two moments of the primordial probability; and a Hessian eigenvector set is often the only published output from the PDF JHEP03(2017)099 fit. For the applications utilizing MC replicas, especially when large numbers of replicas are to be generated on the fly, one may wish to convert the Hessian PDF ensemble into an MC replica one, according to the procedure that will be laid out. [A computer code implementing this procedure is available for downloading.] The converted MC replicas must ideally preserve the statistical features of the Hessian ensemble. They cannot provide more information about the primordial higher moments than what is available in the Hessian ensemble.
Let us focus for a moment on the construction and properties of the CT Hessian sets. The PDFs f n (x, Q) are parametrized at Q = Q 0 = 1.3 GeV by a functional form F n (x, a) which depends on a set of D adjustable parameters, a = {a 1 , a 2 , a 3 , . . . , a D }. A test statistic χ 2 ( a) is defined as in [26,27] to measure the difference between theory and data. Its explicit definition is reproduced in the appendix. The minimum of χ 2 ( a) defines the central fit, and the variation of χ 2 in a neighborhood of the minimum provides the uncertainty in the fit. Thus, the approximation in the vicinity of the minimum, has two parts. First, the central combination of the parameters, which corresponds, without loss of generality, to all zero a i values ( a = {0, 0, . . . 0} ≡ 0), gives the "best" fit between theory and data, for which χ 2 is minimized: χ 2 ( 0) = min χ 2 ≡ χ 2 0 . Second, the approximately quadratic behavior of χ 2 very close to the minimum allows one to define independent directions in a space, corresponding to eigenvectors z i of the Hessian matrix H ij . The eigenvectors are used to determine the uncertainty on PDFs f and on any quantities X(f ) that depend on them, by computing variations of χ 2 along each independent direction [7].
In this scheme, the independent directions z i are determined by relying on the quadratic approximation for χ 2 ; at the same time, extreme displacementsz ±i along each eigenvector direction are found from variations of the true (not perfectly quadratic) χ 2 . This is illustrated in the left inset of figure 1, showing contours of constant values of the exact and approximate χ 2 for a certain pair of parameters, {z i , z j }. The minimum of the exact χ 2 is reached at the axes origin; the solid and dashed contours are drawn for constant increases JHEP03(2017)099 of ∆χ 2 i with respect to the minima of the exact χ 2 and its quadratic approximation. The z directions are along the axes of the approximate χ 2 ellipsoid. The extreme displacements z ±i are found by demanding the increase T 2 in the true, not approximate, χ 2 . They correspond to the (filled) red circles at the intersections of the z axes and the irregular contours of the true χ 2 in the left-hand side of figure 1. Note that both the functional form of the true χ 2 and the value of T 2 depend on the context of the analysis. In CT fits [26,27], the true χ 2 that determines the uncertainties consists of two tiers of conditions, quantifying the global agreement with the experiments, and deviations from individual experiments, cf. the appendix. The parameter T is set to 10 at 90% confidence level (c.l.). The complex nature of true χ 2 implies that, most generally,z +i =z −i ; still, its overall features are captured well by the approximate quadratic χ 2 .
This introduces one source of asymmetry in PDF errors, due to the cubic and higher sign-odd powers of z in the full χ 2 (z). The impact of this source is reduced by rescaling Keep in mind that disagreements of the primordial z-dependent probability with the Gaussian behavior are weak to start with, and that the Hessian sets specify the probability (i.e., χ 2 ) only at three points per an eigenvector direction. Aside from the above conditions at the three points, we will not need to know the rescaling function z i (R i ) in the rest of the discussion. 1 The probability distribution in terms of R is even closer to the Gaussian one: (2. 2) The transformation is illustrated by the right-hand side of figure 1. This approximation is known from experience to hold well in CT fits within the 90% c.l. region around the best fit, even though large |R i | (greater than 1.5) may be excluded by tier-2 penalties on χ 2 that enforce agreement with the individual fitted experiments [26,27]. Next, we consider a function X( R) of the parameters R i , such as the PDF or even the QCD cross section. This function may have asymmetric uncertainties, regardless of symmetry of P( R). Expanding X( R) near the global χ 2 minimum in the Taylor series up to the second derivatives, we have The derivatives can be estimated by finite differences,

JHEP03(2017)099
In these equations, the X values are taken at the global minimum, at the red circles on the axes in figure 1, . . , 0) ; (2.8) and at the blue squares on the diagonals, We see that the 2D Hessian eigenvector sets corresponding to R i = ±1 are sufficient for computing the first and diagonal second derivatives, ∂X/∂R i and ∂ 2 X/∂R 2 i . The mixed second derivatives would require 2D(D − 1) additional error sets for {R i , R j } = {±2 −1/2 , ±2 −1/2 }. Those off-diagonal error sets are not provided in the conventional PDF analyses.
The Taylor expansion of X R can be used to construct various master formulas for the estimation of Hessian PDF uncertainties. Keeping only linear terms on the right-hand side of eq. (2.3), one obtains If R i = ±1 are reached at the boundary of the 68% c.l. region (and the Hessian eigenvector sets are determined according to this convention), the maximal variation of X within the unit hypersphere of R, yields the Hessian symmetric PDF uncertainty at 68% c.l., denoted by δ H 68 X [7]. However, the published CT14 parametrizations correspond to the 90% c.l. Using the same master formula, we find the 90% c.l. uncertainty from the CT14 sets, satisfying δ H 90 X ≈ 1.65 δ H 68 X. If the diagonal second-order derivatives are also included, eqs. (2.10) and (2.11) are modified to Also, asymmetric uncertainties on X can be estimated by [13] δ H> (2.14)

JHEP03(2017)099
The absolute values of the asymmetric uncertainties are generally not the same for the positive (>) and negative (<) displacements [13]. For example, X +i − X 0 and X −i − X 0 must have opposite signs if X varies linearly with R i , but they can have the same sign if the nonlinearity of X is large and hence result in a zero contribution to either δ H> 68 X or δ H< 68 X. In the CT convention, low-number eigenvectors, corresponding to the bestdetermined directions, tend to be more symmetric (with opposite-sign variations), while high-number eigenvectors can be strongly asymmetric.

Probability density distribution for X
We will now formulate a set of requirements for Monte-Carlo replica generation that will be applied in the next subsection to devise our replica generation method. A reader interested in the specific formulas for replica generation may skip this subsection and proceed to the next.
The Monte-Carlo replicas quantify the full probability density distribution P(X) for a QCD observable X, going beyond the intervals of fixed probability available with the Hessian PDFs. A priori, P(X( R)) = P( R). The probability P(X( R)) might be extracted directly from the fit by using a Lagrange multiplier scan or stochastic sampling of the exact log-likelihood χ 2 . But, more often than not, the outcomes of the fit are initially stored in the form of the Hessian PDFs f a (x, Q; R). The complete functional forms for the PDFs may be unavailable or difficult to sample. Then, one relies on constructing the Monte-Carlo replicas from the Taylor series for the Hessian PDF values provided in {x, Q} space, and without invoking the exact parametrization form.
In addition to the central prediction X 0 , the Hessian PDFs specify two values of X per each eigenvector direction, that is, X ±i . This information is generally insufficient for reconstructing the true P(X) by Monte-Carlo sampling. Even the confidence of probability intervals for P(X) may remain undetermined! Recall that, by their construction and using the parameter distribution P( R), the 68% c.l. Hessian values {X −i , X 0 , X +i } correspond to the cumulative probabilities of about 16%, 50%, and 84%. When one naively samples X( R) assuming a multi-Gaussian distribution for R, one is not guaranteed to obtain the same cumulative probabilities for {X −i , X 0 , X +i } from the distribution P(X) of Monte-Carlo replicas, because of probability folding. When folding occurs, the 68% probability interval of the Monte-Carlo replica set distributed according to P(X) deviates from the 68% c.l. Hessian uncertainty interval found using P( R). In the Gaussian case, the central combination R = 0 of the PDF parameters, associated with the central CT14 PDF, is located at the mode, median, and mean of P( R), as those coincide. On the other hand, the central CT14 prediction for X does not automatically correspond to either the mode, median, or mean of P(X), because those can differ.
Indeed, when X( R) is not monotonic, several combinations ( R(X)) k may produce the same value of X. Replicas of such X obey a folded probability P(X), meaning that the cumulative probability for getting X( R) is not equal to that for the corresponding R. In the one-dimensional case, a simple example is X = |R| with a normally distributed R on [−∞, +∞]. Then X obeys a half-normal distribution on X ∈ [0, +∞]; JHEP03(2017)099 X 0 = 0 corresponds to the cumulative probabilities of 50% and 0% according to P(R) and P(X(R)), respectively.
If X(R) depends on one eigenvector parameter R and is continuous and monotonic in K intervals [r 0 , r 1 ],. . . , [r k−1 , r k ], then X(R) is invertible in each interval: R = R k (X) for r k−1 ≤ R ≤ r k . We indicate that x lies within [a, b] using a notation Θ (x ∈ [a, b]) , with Θ(L) equal to 1 or 0 if the statement L is true or false, respectively.
The function X(R) is generally not normally distributed, even when R is. That is, while the probability density distribution P(X) does not need to be Gaussian. It satisfies as follows from the normalization conditions for the probability densities, Going back to the truncated Taylor series with a non-negligible ∂ 2 X/∂R 2 , where the derivatives are estimated by the finite differences from X 0 and X ± , we could solve for R(X) and find up to two solutions for R per each X value. [P(X) is then given by eq. (2.16) with K = 2.] Such truncated Taylor approximation is clearly not monotonic. We expect it to deviate from the true X(R) when the displacements R are large, and to produce a folded probability distribution P(X) with a displaced median value and confidence intervals, as compared to the true P(X). We may even end up with a problematic arrangement when the central CT14 Hessian prediction is outside of the nominal 68% probability interval of the replicas. These observations continue to hold when R has D components.
On the other hand, in a variety of important applications, our group observed that the Lagrange multiplier and Hessian methods render similar probability intervals. Most recently, the two methods were shown to predict compatible 68% and 90% confidence intervals for the common LHC observables, such as jet, Higgs, and tt production cross sections [1,[28][29][30][31]. This indicate that the impact of the second derivatives is mild in constrained regions, credible estimates for P(X) can be found via the sampling of the Hessian PDFs after correcting for the Taylor series' imperfections. At intermediate x, where the nonlinearities in the PDF uncertainties are weak, the Taylor expansion performs well. The discrepancies are more pronounced at small and large Bjorken x with fewer experimental constraints. Nonlinearities and folding effects are small for well-constrained eigenvector parameters R i and can be substantial for poorly constrained ones.

JHEP03(2017)099
With these observations in mind, we design a Monte-Carlo sampling procedure for the PDFs using the Taylor series in the case of mild nonlinearities. First, rather than always sampling the PDFs f a (x, Q; R), we may sample a function X of f a (x, Q; R) that reduces the impact of ∂ 2 X/∂R 2 i . For example, at moderate x, one could just sample the PDFs directly, i.e., set Q)) ∼ R · const may be more desirable for suppressing the nonlinearities; we will show that the log-normal sampling leads to more physical behavior. We explore this possibility when generating positivedefinite PDF replicas in section 2.4.
Second, we need to assign cumulative probabilities to the input X values. Within the accuracy of the Hessian approximation, we can make use of the knowledge that the Hessian central value and Hessian uncertainties are close to the median and 68% central probability intervals for the PDFs, respectively. This "obvious" conclusion follows from the Lagrange Multiplier studies. It can be violated by naive sampling, hence we enforce it as a separate requirement. Furthermore, for mildly asymmetric populations, the median, with the cumulative probability of 0.5, tends to lie close to the mean given by the standard formula [32,33]. In the various sampling scenarios that we had tried, we confirmed that the median of a PDF replica distribution differed little from the mean, except for in the extreme regions. It suffices to assume that the central Hessian PDF coincides with the mean of the PDF replica ensemble, easily computable from the replicas' values and close to the median value. So when the replica ensemble is first generated, the median/mean at each x and Q may disagree from the central CT14 PDF as a consequence of the probability folding caused by the Taylor expansion, as well as because of residual fluctuations. We correct for this mismatch by a uniform shift in all PDF replica values, as explicated in the next section.
In section 3, we show that this sampling procedure reproduces the central CT14 Hessian prediction (coinciding with the mean of the MC replicas by construction), and the 68% probability intervals of the MC replicas approximate both the Hessian symmetric and asymmetric uncertainties.

Generation of Monte-Carlo replicas
We will now present the formulas to generate the Monte-Carlo PDF replicas. We produce N rep sets of the PDF parameters, , which are randomly distributed according to the probability density P( R). We will again rely on the experience that P( R) is typically close to the standard normal distribution, as in eq. (2.2), with the mean (or, central) values R = 0 corresponding to the best-fit PDF set, and the standard deviations that are found from the Hessian PDFs. Therefore the replicas R (k) will be sampled from the standard normal distribution in eq. (2.2).
The Monte-Carlo replica X (k) of a QCD observable, X, will be constructed as (2.19) in terms of the value X 0 for the central Hessian set, a random shift d (k) ≡ d( R (k) ), and a constant shift ∆ applied to all replicas (k = 1, . . . , N rep ). Examples of X( R) include the JHEP03(2017)099 PDF, parton luminosity, cross section, or the logarithm of PDF discussed below. The ∆ parameter can be set to zero, as in the original Watt-Thorne method. But as was argued in section 2.2, we find it helpful to allow a shift of all replica values by a constant amount ∆. When X is a PDF, we expect that X( R) ≈ X( R ) = X 0 within the accuracy of the Hessian approximation. [This relation does not hold for an arbitrary X.] Monte-Carlo sampling of the Taylor series may disagree with this condition. The Taylor expansion may lead to systematic biases, as discussed above. That is, X 0 may differ from the mean X over the replica ensemble, 20) even when N rep is large. The mean X also fluctuates around X 0 , as the number of replicas changes, but these fluctuations are small when N rep is above a few hundred. If our goal is to reproduce the Hessian central value ("truth") and Hessian uncertainties as closely as possible, we may correct such small offsets of the mean, after all d (k) are computed, by shifting all replicas by a constant amount We then get X = X 0 , up to a small numerical uncertainty, when the replicas are generated according to eqs. (2.19) and (2.21). This choice will be adopted as the default. The random displacements d (k) are found using eq.
for the k-th replica. For the PDFs, X( R) = f a (x, Q 0 ; R), this prescription preserves the usual sum rules obeyed by the PDFs, since each replica is a linear combination of the Hessian eigenvector sets, which satisfy the sum rules individually.
Symmetric MC replicas. If only the first derivatives are retained in the Taylor series, then we have On average, X (k) are symmetrically distributed in the positive and negative directions. The mean of the replicas, remains close to the central Hessian value X 0 : X = X 0 with good accuracy, and the global shift ∆ can be neglected. The MC uncertainty on X is then quantified by a standard deviation, It is expected to approach the 68% c.l. symmetric Hessian uncertainty δ H 68 X in eq. (2.11), in the N rep → ∞ limit.
Asymmetric MC replicas. In analogy to eq. (2.12) for the Hessian uncertainties, the diagonal second derivatives ∂ 2 X/∂R 2 i can be included as (2.24) In this case eq. (2.19) must include the shift term ∆ = d to satisfy X = X 0 . Now the asymmetric error estimators are given by the standard deviations that include only positive (negative) displacements from the mean value: These are the MC counterparts of the asymmetric Hessian uncertainties, δ H≷ 68 , provided by eqs. (2.13) and (2.14).
Compared to the previous work on the MC replicas, our prescription thus includes several new features: the generation of MC replicas using asymmetric displacements (2.24) derived from the Taylor series, a constant shift ∆ to ensure agreement between the central Hessian set and the mean of the replicas, and asymmetric standard deviations δ MC≷ X. Watt and Thorne [18] provided the same formulas (2.22), (2.23) for generating the symmetric MC replicas. Their prescription for generating MC replicas with asymmetric displacements is different from the Taylor-series displacements in (2.24) and is given by with ∆ = 0. We advocate using the asymmetric standard deviations (2.25) and (2.26) as estimators for 68% c.l. PDF uncertainties, as they are simple, numerically close to the Hessian asymmetric estimators, cf. the next section, and converge rapidly with the number of replicas. 2 In terms of the relative significance, using δ MC≶ X with ∆ shifts, together with the asymmetric standard deviations, is most important for reproducing the asymmetric Hessian uncertainties. As a secondary effect, mild numerical differences were also noticed between the Taylor-series formula (2.24) and Watt-Thorne formula (2.27) for displacements d (k) .

Monte-Carlo replicas for CT14 parton distributions
The formulas derived in section 2.3 are applicable for generating MC replicas for the PDFs themselves, with an extra modification. We perform a truncated Taylor series expansion of f a (x, Q; R) at every x point, assuming small deviations from the central value X 0 . In unconstrained x regions, the linear expansion is not sufficient -in fact, it may lead to unphysical solutions such as negative cross sections. Parametrizations for Hessian distributions from the CT family satisfy positivity constraints, f a (x, Q) ≥ 0, and render non-negative physical cross sections. To realize the positivity of each MC replica, and eventually to better approximate the non-linear probability distribution, we construct the replicas by Gaussian sampling of ln [f a (x, Q)], rather than by sampling f a (x, Q) directly.

JHEP03(2017)099
The final distribution of CT14 MC replicas includes two families at NNLO in the QCD coupling strength, with 1000 replica sets each, as well as two counterpart families at NLO. Replicas f Replicas of the second type (log MC, or MC2) are generated by sampling of a Gaussian distribution for X = ln [f ] , whose variance is estimated as δ MC X 2 , and the mean value is and var [f ] = e µ+2σ 2 (e σ 2 − 1). Consequently the "log MC" replicas reproduce the Hessian PDF uncertainty in well-constrained x regions and stay non-negative in the poorly constrained regions. The difference between the two types of replicas is illustrated by figure 2, showing PDFs for the CT14 NNLO 56 Hessian eigenvector sets (blue circles) and 1000 MC replicas obtained with linear sampling (orange triangles) and logarithmic sampling (red diamonds). The MC replicas are shifted as described above to have f MC equal to f 0 , the central PDF of the Hessian set. The vertical axis is scaled as |f /f 0 | 0.2 sign (f ) to better visualize relative variations of both signs in an extended magnitude range.
The value of 1 corresponds to the replica PDF coinciding with that of the central set. The majority of replicas produce PDF values that are close to the central PDF set, within the intervals enclosing 1 where many replica symbols overlap. Some replicas for less constrained PDFs produce very large deviations, corresponding to the values far from 1. The Hessian and log-sampled replicas are non-negative by construction, facilitating positivity of physical cross sections. The linearly sampled replicas can take negative values in extrapolation regions (i.e., lie below the horizontal line at zero). In practice, it is desirable to have non-negative cross sections for individual replicas, even though the uncertainty band does extend to zero at some confidence level.

Estimation of PDF uncertainties
To justify the use of the replica sets for estimating the PDF uncertainties, we must first show that the mean and standard deviation of the replica PDFs themselves are consistent with the results obtained directly from the Hessian error PDFs. A complete set of the comparisons is available at [34]. The PDFs are generated by sampling LHAPDF6 grid files using the mcgen code [35]. Figure 3 shows the values and error bands, for the ratios of g(x, Q 0 ), d(x, Q 0 ),ū(x, Q 0 ), ands(x, Q 0 ) PDFs to the respective central CT14 PDFs, at the initial scale Q 0 = 1.  Lastly, in figure 6, we compare the CT14 NNLO Hessian PDFs to a MC1 version that does not shift the replicas by a constant amount ∆. In this case, the whole bands are shifted compared to the Hessian band across all x, by following their wiggly mean PDFs. [The reader can generate the unshifted replica ensemble with the mcgen program if desired.] As was argued in the previous section, the ∆ shift is introduced to eliminate such spurious variations introduced by truncation of the Taylor series.
It is also useful to compare the luminosity functions and their uncertainties, for the MC replicas and the Hessian PDFs. The luminosity function of a parton-parton pair, for production of a final state M 2 at collider energy √ s, is defined as in [36], with an additional constraint that the rapidity of the final state, given by y  y cut in magnitude: The gluon-gluon and quark-antiquark luminosity functions, calculated from the CT14 Hessian and MC PDFs according to this formula, are shown in figure 7 and 8 as functions of M 2 with √ s = 13 TeV. We impose a constraint |y| ≤ y cut = 5 to exclude contributions to the integral from regions x < 10 −5 , which otherwise may bias the shown luminosities at invariant masses below 40 GeV. For x < 10 −5 , the CT14 PDFs are not constrained by the experimental data, and the final state particles are likely to be produced in the forward region outside of the experimental acceptance of the LHC detectors. With the constraint, the comparison of luminosities is more relevant to the LHC measurements. We see that the Hessian and MC uncertainties agree well across most mass range both in gg and qq sectors, with somewhat larger deviations observed at lowest and highest masses.
Taken together, figures 3, 4, and 5 demonstrate that our specifications for the construction of the replica PDFs do yield a satisfactory consistency with the central CT14 PDFs, and with the Hessian error PDFs. Therefore we expect that the uncertainties of  cross sections calculated using the replica PDFs will agree with those based on error propagation via the Hessian PDFs. Comparisons of the most common LHC cross sections and cross section asymmetries, computed with the CT14 Hessian and MC PDFs, are presented at [34]. By our design, in the regions where the PDFs are well constrained, the ensemble average of the predictions from either CT14MC1 or MC2 PDF set is expected to be close to the central prediction given by the central CT14 Hessian PDF set. Similarly, the asymmetric standard deviations given by the CT14MC PDF set are expected to be very close to the 68% c.l. uncertainties of the CT14 (Hessian) PDF error set, except when the x value is very large (more than 0.7) or small (less than 10 −5 ). Some examples of cross sections are computed using the Applgrid fast interface [37] for interpolation of NLO cross sections computed with the help of Mad-Graph aMC@NLO [38], and aMCfast [39]. The Applgrid input cross section tables are available at [40] from a study of cross sections based on PDF4LHC15 PDFs [25,41]. Specifically, we computed CT14 Hessian, MC1 and MC2 rapidity distributions with no or minimal experimental cuts for production W ± , Z 0 , tt, ttγγ, W +c (W − c) production at the of final-state particles. For W +c (W − c) production, we neglect small contributions with initial-state c or b quarks. For NLO single-inclusive jet and dijet production, we use public APPLgrid files in the bins of ATLAS measurements [42], created with the program NLO-JET++ [43,44]. Similarly, the W charge asymmetry in CMS experimental bins [45,46] is computed with APPLgrid from [47]. An example of the cross sections on the website is shown in figure 9. For ease of comparisons, the PDF uncertainties are plotted both for the cross section values and for ratios to the central CT14 prediction.

Only an ensemble of MC replicas is meaningful
The results presented above provide a practical prescription for estimating asymmetric Hessian PDF uncertainties using an ensemble of Monte-Carlo replicas. As was already emphasized, some information about the primordial probability distribution in the global fit is lost in the process; however, the prescription is simple and provides a reasonable estimate in most cases. The method is most reliable and unique when the PDF uncertainty is small, meaning that the Taylor series converges fast. When the uncertainty is large, we see more variations, and several alternatives are conceivable. For example, our MC replicas are constructed so that their mean set coincides with the central set of the Hessian ensemble JHEP03(2017)099  within numerical roundoff errors, as long as x is not too large or too small. Instead, we could choose the central set of the Hessian ensemble to coincide with the median or mode sets of the MC replicas. Similarly, the PDF uncertainty bands may be defined by the positive and negative standard deviations discussed above, or by the boundaries of the 68% c.l. interval centered on the "central PDF set" chosen above. We have checked that the various prescriptions agree well at intermediate x.
It is important to note that most replicas are poor fits to the hadronic data used in the global analysis; however, their averages and standard deviations defined in section 2 provide excellent approximations for the Hessian central PDF set and 68% c.l. uncertainties. This is demonstrated in figure 10, showing histograms of χ 2 values for the global data (3174 data points; left panel) and for the combined HERA-1 data (579 data points [48]; right panel), for the 1000 replicas in the CT14 NNLO MC1 and MC2 ensembles. The vast majority of replicas yield very large χ 2 values for the global data, and even for the single experiment. The random fluctuations of the individual replicas, which result in large χ 2 values for any single replica, will largely cancel in the ensemble averages.
It is straightforward to understand why the result shown in figure 10 occurs. Imagine that we construct a D-dimensional vector a whose coordinates are given by random variates a i sampled from a standard normal distribution. The length of a will often turn out to be significantly larger than 1. If those parameter values are used for the PDFs, χ 2 ( a) will tend to be much larger than the minimal χ 2 in the fit, especially if the number of dimensions D is large. [Recall that the volume of a D-dimensional unit ball vanishes in the limit of large D. The vast majority of replicas will have several parameters far outside of the unit ball, i.e., away from the best fit by many standard deviations.] The expected value of the length of a can be easily found as

JHEP03(2017)099
The final approximation in (3.2) follows from Stirling's formula for the gamma function. For D = 28, eq. (3.2) gives | a| ≈ 5.24; that is, a typical displacement vector of a CT14 replica is more than five standard units in length. As a result, most replicas have a χ 2 value that significantly exceeds the CT14 best-fit value of about 3250 for the global data and 590 for the HERA-1 data. If the standard deviation of the normal distribution is rather about 6 units (corresponding to ∆χ 2 ≈ 100 at 90% c.l.), the average displacement vector corresponds to ∆χ 2 ≈ (5.24·6) 2 ≈ 1000 for the global data and ∆χ 2 ≈ 180 (scaled down by 579/3174) for the HERA-1 data. This is far outside the typical 1σ for the χ 2 distribution, given by T 2 ≈ 36, 2N pts ≈ 80(35), or another common estimator! These estimates are in good agreement with the actual χ 2 averages over the CT14 replicas, denoted as χ 2 rep , and equal in the case of the MC1 (MC2) NNLO ensembles to ≈ 4300 (4200) for the global χ 2 , and ≈ 720 (730) for the HERA-1 χ 2 .
It is interesting to note [49] that the χ 2 distribution for the CT14 MC replicas is quite similar in shape to the χ 2 distributions for the global data of NNPDF replicas, obtained with an entirely different methodology [3,[14][15][16][17]. For example, for the four NNPDF3.0 NLO fits listed in table 1 of ref. [50], the equivalent of our χ 2 rep is 600-1000 units higher than the χ 2 value for the average PDF set of all replicas. 4 This can be qualitatively understood by noticing that the tensions between the individual experiments can be effectively accommodated by introducing a tolerance T exp ∼ 2 on the global χ 2 at the 68% c.l. [26,51,Section 7], and that the neural network parametrizations effectively have of order D = 100−250 free parameters. This predicts χ 2 rep −χ 2 min ∼ T 2 exp D = 400−1000, in fair agreement with the actual NNPDF3.0 outcomes.
Individual replica PDFs thus cannot be considered as alternative PDFs with approximately the same accuracy as the central fit, or even with the accuracy of the Hessian error PDFs. The replica PDFs are meaningful only inside an ensemble, predictions based on them must be calculated by averaging over the ensemble.

Conclusions
The paper explored methods for the conversion of Hessian PDF error sets into Monte-Carlo replica PDFs. We observed that parameters of the PDFs approximately obey a multidimensional normal distribution in the vicinity of the global χ 2 minimum. A statistical sample of MC replicas dependent on the probability distribution in the D−dimensional parameter space can be reconstructed under the Gaussian assumption from the distribution of the Hessian error PDFs on the (D − 1)-dimensional boundary of some confidence region enclosing the best fit. The CT14 Hessian error PDFs allow the user to estimate the PDF errors and account for the asymmetries due to mild deviations from the Gaussian shape and nonlinear nature of PDF parametrization. The CT14 PDFs are parameterized to be explicitly non-negative, as desired for obtaining sensible cross sections and uncertainties. In section 2 we have generalized the algorithms for the generation of MC replicas and 4 The NNPDF table constructs χ 2 rep from the χ 2 values between theoretical predictions for an individual PDF replica and true (not fluctuated) experimental data. The average is over the ensemble of the NNPDF replicas.  for the estimation of PDF uncertainties necessary to reproduce these features. Based on these results, we have generated two replica ensembles, designated as CT14MC1 and CT14MC2 PDFs. The average PDF sets and the asymmetric standard deviations of MC1 and MC2 closely reproduce the central sets and the 68% c.l. asymmetric uncertainties of the respective Hessian ensembles, cf. section 3. With the naive conversion, the mean set over the replicas may deviate from the central Hessian PDF set because of the truncation of Taylor series. After we apply a correction, the MC mean and central Hessian sets coincide, preventing potential discrepancies in the predictions. In addition, the CT14MC2 PDF replicas, obtained by sampling of a log-normal distribution, are explicitly non-negative.
To achieve good agreement between the CT14 Hessian and MC uncertainty bands, we recommend to construct the MC replicas by random sampling according to the algorithm summarized as eqs. ( Our results in section 2 clarify several aspects of the replica generation method that were not addressed in the previous work. Section 3 demonstrates that, in most of the x range, where the PDF uncertainties are small, we observe good agreement between the JHEP03(2017)099 CT14 Hessian, MC1, and MC2 error bands. In the extrapolation regions, where the linear approximations cease to be adequate, differences between the Hessian, MC1 and MC2 error bands are indicative of intrinsic ambiguities in the replica generation methods. At such extreme x, they may affect the combination of the Hessian and MC ensembles, as in the recommended PDF4LHC procedure, or PDF replica re-weighting. These ambiguities require consideration when the converted replicas are utilized in PDF reweighting or PDF combinations. The positivity constraint on CT14 MC2 yields more physical behavior in poorly constrained x intervals.
Once the MC replicas are generated by conversion, we examine their statistical properties in section 3.2. We point out that many individual MC replica sets yield poor χ 2 , so only their statistical combinations, such as the mean, standard deviation, etc., are meaningful in applications. [The majority of MC replicas aren't good fits, their combination is.] On the other hand, the χ 2 distributions of MC replicas obtained by conversion (such as CT14 MC) or genetic algorithm (such as NNPDF) are very similar. This reflects the underlying commonalities of the two methods, leading to comparable PDF errors in the two approaches in spite of the distinct error definitions and fitting procedures.
The CT14 MC1 and MC2 ensembles at NNLO and NLO accuracy, together with a fast standalone driver program for their interpolation, can be downloaded at [52]. They are also distributed as a component of the LHAPDF6 library [53]. A public C++ code mcgen is made available [35] for generation of MC PDF replicas using the normal, log-normal, and Watt-Thorne sampling methods, and with or without including the ∆ shifts. mcgen can be run as a standalone program or together with the Mathematica package MP4LHC for combination of PDF ensembles according to the meta-parametrization method [19]. After N input Hessian error PDF sets are read in the form of LHAPDF6 grids, N rep output replicas are generated by random displacements of the Hessian replicas. Besides the replica generation, mcgen supports various algebraic operations with PDFs in the format of LHAPDF6 grid files, such as addition, averaging, and multiplication of the tables in which the PDF values are stored.

JHEP03(2017)099
The most probable solutions for CT14 PDFs are found by minimization of a global log-likelihood function which sums contributions χ 2 n from N exp fitted experiments, and a contribution χ 2 th specifying theoretical conditions ("Lagrange Multiplier constraints") imposed on some PDF parameters. In turn, the terms χ 2 n are given by The χ 2 n contribution is a function of the PDF parameters a and systematic nuisance parameters λ. For a k-th data point, T k , D k , s k, and β kα are the theoretical prediction, central experimental value, uncorrelated experimental uncertainty, and systematic correlation matrix, respectively.
The minimum of the χ 2 global function is found iteratively by the method of steepest descent using the program MINUIT. The boundaries of the 90% c.l. region around the minimum of χ 2 global , and the eigenvector PDF sets quantifying the associated uncertainty, are found by iterative diagonalization of the Hessian matrix [7,10], which finds independent, or eigenvector, combinations of the PDF parameters a i . These combinations are denoted by z i , with the best-fit parameter combination corresponding to z 1 = z 2 = . . . = 0. The 90% c.l. boundary around the best fit is determined by applying two tiers of criteria, based on the increase in the global χ 2 global summed over all experiments, and the agreement with individual experimental data sets [26,27,54]. The first type of condition demands that the global χ 2 does not increase above the best-fit value by more than ∆χ 2 = T 2 , where the 90% c.l. region corresponds to T ≈ 10. The second condition introduces a penalty term P in χ 2 when establishing the confidence region, which quickly grows when the fit ceases to agree with any specific experiment within the 90% c.l. for that experiment. The effective function χ 2 ef f = χ 2 global + P is scanned along each eigenvector direction until χ 2 eff increases above the tolerance bound, or rapid χ 2 eff growth due to the penalty P is triggered. The penalty term is constructed as P = Nexp n=1 (S n ) p θ(S n ) (A.5) from the equivalent Gaussian variables S n that obey an approximate standard normal distribution independently of the number of data points N pts,n in the experiment. Every

JHEP03(2017)099
S n is a monotonically increasing function of the respective χ 2 n given in [54,55]. The power p = 16 is chosen so that (S n ) p sharply increases from zero when S n approaches 1.3, the value corresponding to 90% c.l. cutoff.
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.