# Precision determination of the strong coupling constant within a global PDF analysis

## Abstract

We present a determination of the strong coupling constant \(\alpha _s(m_Z)\) based on the NNPDF3.1 determination of parton distributions, which for the first time includes constraints from jet production, top-quark pair differential distributions, and the *Z* \(p_T\) distributions using exact NNLO theory. Our result is based on a novel extension of the NNPDF methodology – the correlated replica method – which allows for a simultaneous determination of \(\alpha _s\) and the PDFs with all correlations between them fully taken into account. We study in detail all relevant sources of experimental, methodological and theoretical uncertainty. At NNLO we find \(\alpha _s(m_Z) = 0.1185 \pm 0.0005^\text {(exp)}\pm 0.0001^\text {(meth)}\), showing that methodological uncertainties are negligible. We conservatively estimate the theoretical uncertainty due to missing higher order QCD corrections (N\(^3\)LO and beyond) from half the shift between the NLO and NNLO \(\alpha _s\) values, finding \(\Delta \alpha ^\mathrm{th}_s =0.0011\).

## 1 Introduction

The value of the strong coupling constant \(\alpha _s\left( m_Z \right) \) is a dominant source of uncertainty in the computation of several LHC processes. This uncertainty is often combined with that on parton distributions (PDFs), with which it is strongly correlated. However, while PDF uncertainties have decreased considerably over the years, as is clear for example by comparing the 2012 [1] and 2015 [2] PDF4LHC recommendations, the uncertainty on the \(\alpha _s\) PDG average [3] has remained substantially unchanged since 2010 [4]. As a consequence, the uncertainty on \(\alpha _s\) is now the dominant source of uncertainty for several Higgs boson production cross-sections [5].

Here we present a determination of \(\alpha _s\) which exploits the most recent PDFs obtained with the NNPDF methodology, namely NNPDF3.1 [16]. This updates a previous determination of \(\alpha _s\) [17, 18] based on NNPDF2.1 [19, 20]. In comparison to this previous PDF set, NNPDF3.1 represents a substantial improvement in terms of input dataset, theoretical calculations, and fitting methodology. Specifically, NNPDF3.1 is the first PDF set to make such an extensive use of LHC data as to be dominated by them. It is in fact the first global analysis to simultaneously use differential top, inclusive jet, and *Z* \(p_T\) distribution data, all using exact NNLO theory. Indeed, typical PDF uncertainties are of the order of one to three percent in the data region for NNPDF3.1, about a factor of two smaller than they were for NNPDF2.1.

This greater precision in the PDF determination requires a corresponding improvement in the methodology used for the \(\alpha _s\) extraction. In our previous work [17, 18], PDF replicas were determined for a number of fixed values of \(\alpha _s\), which was then extracted from the \(\chi ^2\) profile versus \(\alpha _s\) of the best fit PDF, obtained as an average over the replicas. Here instead, both \(\alpha _s\) and PDFs are determined from a simultaneous minimization in their combined parameter space. As we will discuss below, this new method corresponds roughly to determining the value and uncertainty on \(\alpha _s\) from the error ellipse of the multivariate measurement in the \(\left( \alpha _s, \mathrm{PDF}\right) \) hyperspace, and the old method corresponds to performing a scan of \(\alpha _s\) along the best-fit PDF line, see Fig. 1 for a schematic illustration. In a situation when the variables are highly correlated, especially if the semi-axes of the ellipse are of very different length, the procedure used in our previous work might lead to an underestimate of the uncertainty in \(\alpha _s\). Hence the new procedure becomes very relevant, now that some PDF uncertainties are rather small.

It turns out that the implementation of this simultaneous minimization within the NNPDF methodology is nontrivial: it requires the development of a suitable generalization of the standard NNPDF approach, which we call the correlated replica method. Using this strategy, \(\alpha _s\) can be treated like any other quantity that depends on the PDFs. In particular, its central value and uncertainty can be determined by performing statistics over a replica sample. This means that, for example, the uncertainty on \(\alpha _s\) is the standard deviation of an ensemble of \(\alpha _s\) values. As we shall see, this allows for a determination of \(\alpha _s\) with small experimental uncertainties, and negligible methodological uncertainties. With all other uncertainties so strongly reduced, accurately estimating theoretical uncertainties becomes a pressing problem. This is specifically problematic in the case of missing higher-order uncertainties (MHOUs), for which no fully satisfactory method has been developed. Here we will conservatively estimate the theoretical uncertainty due to missing higher order QCD corrections (N\(^3\)LO and beyond) from half the shift between the NLO and NNLO \(\alpha _s\) values.

This paper consists of two main parts. First, in Sect. 2 we present the correlated replica method used for the determination of \(\alpha _s\), explain how it is used to estimate the associated PDF uncertainties, and compare it with the method used in previous NNPDF determinations. Then, in Sect. 3 we present our determination of \(\alpha _s\) at NLO and NNLO together with a careful assessment of all sources of uncertainty. Possible future developments are briefly outlined in Sect. 4.

## 2 The correlated Monte Carlo replica method

As discussed in the introduction, the \(\alpha _s\) determination presented here differs from our previous one [17, 18] because now the value of \(\alpha _s\) and its uncertainty are determined from a correlated fit together with the PDFs. After briefly summarizing the main aspects of the NNPDF methodology and the way it was used to determine \(\alpha _s\) in Refs. [17, 18], we describe the main idea of the new method, and then discuss the details of its implementation. Only the salient aspects of the NNPDF methodology will be recalled here; the reader is referred to the original literature (see Ref. [16], of which we follow the notation, and references therein) and recent reviews [2, 21, 22] for a more detailed discussion.

### 2.1 General strategy

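Given experimental data *D* with components \(D_i\), a set of \(N_\mathrm {rep}\) replicas \( D^{(k)}\) of the data is generated by means of Eq. (2.1), using the experimental covariance matrix *C*.

In the figure of merit of the fit, Eq. (2.3), \(T_i\) is the theory prediction for the *i*th datapoint, dependent on the set of parameters \(\theta \), and \(C_{t_0}\) is the covariance matrix used in the fit. Recall that in the presence of multiplicative uncertainties, \(C_{t_0}\) cannot be directly identified with the experimental covariance matrix *C* used for pseudodata generation Eq. (2.1) lest the fit be biased [23], and must thus be constructed instead using a suitable procedure such as the \(t_0\) method [24] (see also [25]).

The best-fit parameters \(\theta ^{(k)}\) are then determined through the minimization, Eq. (2.4), of this figure of merit for the *k*th data replica.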
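As a sketch consistent with the definitions in the text (the overall normalization and the precise notation of Eqs. (2.3) and (2.4) are assumptions, not taken from the source), the figure of merit and the determination of the *k*th PDF replica can be written as

\[
\chi ^2(\theta , D) \propto \sum _{i,j} \left( T_i[\theta ]-D_i\right) \left( C_{t_0}^{-1}\right) _{ij}\left( T_j[\theta ]-D_j\right) ,
\qquad
\theta ^{(k)} = \underset{\theta }{\mathrm{arg\,min}}\; \chi ^2\left( \theta , D^{(k)}\right) .
\]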

In summary, the standard NNPDF methodology produces a set of replicas \( D^{(k)}\) of the original data, and uses them to construct a set of PDF replicas which correspond to parameters \(\theta ^{(k)}\), where *k* runs over the replica sample.

The theory predictions \(T_i\), which enter the figure of merit of the fit Eq. (2.3), depend not only on the PDF parameters \(\theta \), but also on theory parameters, specifically the value of \(\alpha _s\). Therefore, in general we can view the figure of merit as a function \(\chi ^2(\alpha _s,\theta ,D)\). In standard NNPDF determinations, \(\alpha _s\) is treated as a fixed parameter, along with all other theory parameters, such as quark masses, CKM matrix elements, the fine structure constant, and so on. On the other hand, it is well known (see e.g. Ref. [27] for an early reference) that PDFs are strongly correlated to the value of \(\alpha _s\), so a determination of the combined PDF+\(\alpha _s\) uncertainty on a process which depends on both requires knowledge of the PDFs as \(\alpha _s\) is varied. With this motivation, NNPDF sets are routinely released for different fixed values of \(\alpha _s\), where the procedure of generating data replicas \(D^{(k)}\) and determining PDF replicas specified by the best-fit parameters \(\theta ^{(k)}\) is repeated several times for different values of \(\alpha _s\).

In our previous work [17, 18], \(\alpha _s\) was determined by first producing PDF fits for a range of values of \(\alpha _s\). The \(\chi ^2(\alpha _s)\) of the mean of all the replicas was then fitted to a parabola as a function of \(\alpha _s\). This methodology has two main drawbacks. The first is that, as mentioned, the PDFs are strongly correlated to the value of \(\alpha _s\). With this method, however, the \(\chi ^2\) profile is determined as a function of \(\alpha _s\) along the line in \(\theta \) space which corresponds to the best-fit \(\theta \) at each particular value of \(\alpha _s\), without taking into account the variations in \(\theta \) space. Hence, as illustrated in Fig. 1, with the methodology of Refs. [17, 18] the resulting uncertainty on \(\alpha _s\) could be somewhat underestimated.

The second drawback is more subtle. In the NNPDF procedure, the PDF uncertainty is determined from statistics over the replica sample, so a one-sigma interval is determined by computing a standard deviation over replicas. Whether or not this corresponds exactly to a one-sigma (i.e. \(\Delta \chi ^2 =1\)) interval in \(\alpha _s\) space is unclear. In fact, in PDF determinations based on Hessian minimization in parameter space, the \(\Delta \chi ^2 =1\) criterion is modified by a suitable tolerance factor [28, 29, 30], which possibly accounts for data inconsistencies or parametrization bias. It is unclear, but certainly possible, that PDF uncertainties estimated in the NNPDF fits also include, at least to some extent, such a tolerance.

Ideally, we would like a method of determining \(\alpha _s\) in which the uncertainty on \(\alpha _s\) is determined on exactly the same footing as the PDF uncertainty, and which thus yields the full probability distribution for \(\alpha _s\), marginalised with respect to the PDF parameters. The goal is to treat \(\alpha _s\) on the same footing as the vector of parameters \(\theta \) that determine the PDFs, i.e. to simultaneously minimize the figure of merit with respect to both \(\alpha _s\) and \(\theta \). This is difficult in practice, because the dependence on \(\alpha _s\) appears in the theoretical predictions, which, for reasons of computational efficiency, are provided in the form of pre-computed grids determined before the fit using the APFELgrid framework [31, 32].

This difficulty can be overcome through the correlated replica method, as we now explain. The method relies on the concept of “correlated replica”, or c-replica for short. A c-replica is a correlated set of PDF replicas, all obtained by determining the best-fit \(\theta ^{(k)}\) Eq. (2.4) but with different (fixed) values of \(\alpha _s\): given the data replica \(D^{(k)}\), the minimization Eq. (2.4) is performed several times, for a range of fixed values of \(\alpha _s(m_Z)\). A c-replica thus corresponds to as many standard NNPDF replicas as the number of values of \(\alpha _s\) for which the minimization has been performed, all obtained by fitting to the same underlying data replica \(D^{(k)}\).

The best-fit value of \(\alpha _s\) is then determined for the *k*th c-replica by minimizing as a function of \(\alpha _s\) the figure of merit \(\chi ^2\) Eq. (2.3) computed with \(\theta ^{(k)}(\alpha _s)\) as \(\alpha _s\) is varied for fixed *k*. Namely, we first define the figure of merit computed for the *k*th c-replica, \(\chi ^{2(k)}(\alpha _s)\) Eq. (2.5), in which the dependence of the theory predictions *T* and thus of the figure of merit Eq. (2.3) on \(\alpha _s\) is both explicit, and implicit through the best-fit parameters \(\theta ^{(k)}(\alpha _s)\). We then determine the best-fit value of \(\alpha _s\) for the *k*th c-replica as the minimum \(\alpha _s^{(k)}\) of \(\chi ^{2(k)}(\alpha _s)\), Eq. (2.6).

Determining the best-fit \(\alpha _s\) for the *k*th c-replica by first minimizing with respect to \(\theta \) and then minimizing with respect to \(\alpha _s\) is equivalent to simultaneously minimizing in the \((\alpha _s,\theta )\) hyperspace, provided the same figure of merit is used for PDF and \(\alpha _s\) determination. For instance, the absolute minimum in \((\alpha _s,\theta )\) is the solution to the coupled equations obtained by requiring the derivatives of \(\chi ^2(\alpha _s,\theta ,D^{(k)})\) with respect to both \(\theta \) and \(\alpha _s\) to vanish.
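Written out as a sketch consistent with the definitions in the text (the exact notation of Eqs. (2.5) and (2.6) is an assumption), the figure of merit of the *k*th c-replica and the corresponding best-fit \(\alpha _s\) are

\[
\chi ^{2(k)}(\alpha _s) = \chi ^2\left( \alpha _s, \theta ^{(k)}(\alpha _s), D^{(k)}\right) ,
\qquad
\alpha _s^{(k)} = \underset{\alpha _s}{\mathrm{arg\,min}}\; \chi ^{2(k)}(\alpha _s),
\]

while the stationarity conditions for a simultaneous minimum are

\[
\frac{\partial \chi ^2}{\partial \theta }\left( \alpha _s,\theta ,D^{(k)}\right) = 0,
\qquad
\frac{\partial \chi ^2}{\partial \alpha _s}\left( \alpha _s,\theta ,D^{(k)}\right) = 0 .
\]

Solving the first condition at fixed \(\alpha _s\) gives \(\theta ^{(k)}(\alpha _s)\). The total derivative of \(\chi ^{2(k)}(\alpha _s)\) then reduces to the partial derivative with respect to \(\alpha _s\), because the term proportional to \(\partial \chi ^2/\partial \theta \) vanishes, so the minimum of \(\chi ^{2(k)}(\alpha _s)\) also satisfies the second condition.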

One thus ends up, for each data replica \(D^{(k)}\), with a best fit value \((\alpha _s^{(k)},\theta ^{(k)})\) of both \(\alpha _s\) and the PDF parameters. That is, from each c-replica we extract a single best fit value \(\alpha _s^{(k)}\) – an “\(\alpha _s\) replica” – exactly on the same footing as all the other fit parameters. The ensemble of values \(\alpha _s^{(k)}\) obtained from all the c-replicas then provides a representation of the probability density of \(\alpha _s\) from which we can perform statistics in the usual way. Interestingly, this means that we can now not only compute the best fit \(\alpha _s\) and its uncertainty as the mean and standard deviation (or indeed 68% confidence interval) using the \(\alpha _s\) replicas, but also the correlation between \(\alpha _s\) and the PDFs or indeed any PDF-dependent quantity.
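The statistics over \(\alpha _s\) replicas described above can be illustrated with a minimal sketch. It is not the code used in this analysis: the function names and all numbers are hypothetical, and the PDF-dependent quantity is a made-up observable evaluated replica by replica.

```python
import math
import statistics


def alphas_summary(alphas_replicas):
    """Central value and uncertainty as mean and standard deviation over replicas."""
    return statistics.mean(alphas_replicas), statistics.stdev(alphas_replicas)


def correlation(xs, ys):
    """Pearson correlation between two equally long replica samples."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative numbers only, not results of the fit.
alphas = [0.1180, 0.1185, 0.1190, 0.1183, 0.1187]
observable = [10.0, 10.5, 11.0, 10.3, 10.7]  # hypothetical PDF-dependent quantity

central, sigma = alphas_summary(alphas)
rho = correlation(alphas, observable)
```

Because each \(\alpha _s^{(k)}\) is paired with the PDF replica obtained from the same data replica, the correlation is computed over the same index *k* for both samples.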

In summary, the correlated replica method is akin to the standard NNPDF methodology in that it starts by producing a set of replicas of the original data, but uses these to construct a set of correlated \(\alpha _s\)-dependent PDF replicas, the c-replicas, which correspond to parameters \(\theta ^{(k)}(\alpha _s)\) when *k* runs over the replica sample and \(\alpha _s\) takes a number of discrete values. From each c-replica a best-fit \(\alpha _s^{(k)}\) can then be determined, so each c-replica yields an \(\alpha _s\) replica, with \(\alpha _s^{(k)}\) defined by Eq. (2.6).

Hence, the correlated replica method exploits the fact that in the NNPDF approach it is sufficient to know the best-fit set of parameters for each replica, and all other information is contained in the replica sample. The price to pay for this is that the statistics of the \(\alpha _s\) fitting is inevitably more demanding than with the method of Refs. [17, 18] because we now have to fit a different parabola for each c-replica. The issues arising from this will be discussed in the next section.

### 2.2 Implementation

Building on the conceptual strategy described above, we now present the practical implementation of the correlated replica method. As already mentioned, the best-fit \(\alpha _s^{(k)}\) Eq. (2.6) for the *k*th c-replica is determined by fitting a parabola to the figure of merit \(\chi ^2(\alpha _s)\), viewed as a function of \(\alpha _s\), known at the discrete set of \(\alpha _s\) values for which best-fit \(\theta ^{(k)}(\alpha _s)\) are available. The reliability of the quadratic approximation to \(\chi ^{2(k)}\) Eq. (2.5) and the stability of the position of the minimum upon inclusion of higher order terms can be studied using standard methods and will be discussed in Sect. 3.2 below.
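The parabolic fit can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used here: an unweighted least-squares fit, at least three distinct \(\alpha _s\) values, and hypothetical function names. The \(\alpha _s\) values are centred and rescaled before solving the normal equations to keep the 3×3 system well conditioned.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def parabola_minimum(alphas, chi2s):
    """Least-squares fit of chi2 = a*x^2 + b*x + c and position of its minimum.

    Returns None when the fitted parabola has no minimum (a <= 0).
    """
    if len(alphas) < 3 or len(alphas) != len(chi2s):
        raise ValueError("need at least three (alpha_s, chi2) points")
    centre = sum(alphas) / len(alphas)
    scale = max(abs(a - centre) for a in alphas)
    xs = [(a - centre) / scale for a in alphas]
    # Moments entering the normal equations of the quadratic fit.
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, chi2s)) for p in range(3)]
    matrix = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    det = det3(matrix)

    def with_column(col):
        # Cramer's rule: replace one column of the matrix by the right-hand side.
        return [[rhs[r] if c == col else matrix[r][c] for c in range(3)]
                for r in range(3)]

    a = det3(with_column(0)) / det
    b = det3(with_column(1)) / det
    if a <= 0:
        return None
    return centre + scale * (-b / (2 * a))


# Illustrative profile with a known minimum at alpha_s = 0.1180.
grid = [n / 1000 for n in range(116, 122)]
profile = [1.2 + 4.0e4 * (a - 0.1180) ** 2 for a in grid]
best_alphas = parabola_minimum(grid, profile)
```

Returning `None` for a parabola without a minimum lets the caller discard such profiles explicitly rather than propagate a meaningless value.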

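The finite-size uncertainties on the central value of \(\alpha _s\) and on its uncertainty are estimated by bootstrapping: the set of \(\alpha _s\) replicas is resampled with replacement, and \(\alpha _s^{(\mathrm{res}, i)}\) denotes the central value computed from the *i*th resample. The bootstrapping estimate of the finite-size uncertainty on the central value of \(\alpha _s\), \(\Delta _{\alpha _s}\) Eq. (2.12), is then the standard deviation of the set of \(\alpha _s^{(\mathrm{res}, i)}\). The finite-size uncertainty on the \(\alpha _s\) uncertainty, \(\Delta _{\sigma }\) Eq. (2.13), is estimated by bootstrapping in the same way.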
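A bootstrap estimate of this kind can be sketched as follows. The function name, the number of resamples, and the seeds are assumptions made for illustration, and the input sample is synthetic rather than taken from the fit.

```python
import random
import statistics


def bootstrap_uncertainties(alphas_replicas, n_resamples=1000, seed=0):
    """Bootstrap estimates of the finite-size uncertainty on the mean and on
    the standard deviation of a sample of alpha_s replicas."""
    rng = random.Random(seed)
    n = len(alphas_replicas)
    means, sigmas = [], []
    for _ in range(n_resamples):
        # Draw n replicas with replacement from the original sample.
        resample = rng.choices(alphas_replicas, k=n)
        means.append(statistics.fmean(resample))
        sigmas.append(statistics.stdev(resample))
    return statistics.stdev(means), statistics.stdev(sigmas)


# Synthetic Gaussian sample, for illustration only.
_rng = random.Random(1)
sample = [_rng.gauss(0.1185, 0.0005) for _ in range(200)]
delta_central, delta_sigma = bootstrap_uncertainties(sample, n_resamples=300, seed=2)
```

For a Gaussian sample of size *N* the first number is expected to be close to the standard deviation divided by \(\sqrt{N}\), which provides a quick sanity check.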

It turns out that, when determining the best-fit \(\theta ^{(k)}(\alpha _s)\) through the standard NNPDF minimization algorithm, a certain amount of fluctuation of individual values of \(\chi ^2(\alpha _s)\) about the parabolic best-fit is observed. In other words, the \(\chi ^2\) profiles as a function of \(\alpha _s\) are not very smooth. It is therefore advantageous to introduce an improvement of the algorithm, called batch minimization, which increases its accuracy at the cost of increasing the time required for fitting.

Furthermore, when using the standard NNPDF minimization, occasionally the fit fails to satisfy a number of convergence and quality criteria (see Sect. 3.3.2 of Ref. [26]), in which case it is discarded. Consequently, for some c-replicas \(\chi ^2(\alpha _s)\) is not available for all \(\alpha _s\) values. One must then decide on a sensible criterion for c-replica selection, with the most restrictive criterion being to only accept c-replicas for which all \(\chi ^2(\alpha _s)\) values are available, and the least restrictive one to accept c-replicas for which at least three \(\chi ^2(\alpha _s)\) values are available so a parabola can be fitted. We now discuss batch minimization and replica selection criteria in turn.

The idea of batch minimization is to refit a given set of data replicas more than once. In order to improve the smoothness of the \(\chi ^2\) profiles obtained by the direct use of NNPDF minimization, we exploit the fact that the minimization algorithm is not deterministic, and thus simply rerunning the minimization from a different random seed leads to a slightly different answer. Each of these refits is called a batch. For each c-replica *k* and each \(\alpha _s\) value we then end up with several best-fit results \(\theta ^{(k)}_i(\alpha _s)\), where *i* runs over batches.

We then pick for each c-replica *k* and for each \(\alpha _s\) value the batch which gives the best \(\chi ^2\). We also impose the condition that at least two of the batches for the given c-replica and \(\alpha _s\) value have converged, in order to mitigate the influence of outliers that narrowly pass the post-selection fit criteria. The dependence of results on the number of batches used can then be assessed a posteriori by comparing results found with different numbers of batches.
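The batch selection rule can be sketched as follows, assuming a hypothetical data layout in which each batch of a c-replica is a dictionary mapping \(\alpha _s\) to its \(\chi ^2\), with `None` marking a fit that did not converge. The function names are illustrative.

```python
def best_of_batches(batch_chi2s, min_converged=2):
    """Lowest chi2 among converged batches, or None if fewer than
    min_converged batches converged (None marks a non-converged batch)."""
    converged = [chi2 for chi2 in batch_chi2s if chi2 is not None]
    if len(converged) < min_converged:
        return None
    return min(converged)


def merge_batches(profiles_by_batch):
    """Combine the batches of one c-replica into a single chi2 profile.

    profiles_by_batch: list of dicts {alpha_s: chi2 or None}, one per batch.
    Returns a dict {alpha_s: best chi2 or None}.
    """
    alphas = sorted(set().union(*profiles_by_batch))
    return {a: best_of_batches([p.get(a) for p in profiles_by_batch])
            for a in alphas}


# Illustrative numbers only: three batches for one c-replica.
batches = [
    {0.117: 1.20, 0.118: 1.10, 0.119: None},
    {0.117: 1.18, 0.118: None, 0.119: None},
    {0.117: None, 0.118: 1.12, 0.119: 1.30},
]
merged = merge_batches(batches)
```

In this example the point at 0.119 is dropped because only one batch converged there, even though that batch has a valid \(\chi ^2\).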

After batch minimization, we end up with a set of c-replicas \(\theta ^{(k)}(\alpha _s)\) where, however, for several c-replicas, results may be missing for one or more \(\alpha _s\) values because convergence was not achieved. We must thus determine the minimum number of \(\alpha _s\) values \(N_\mathrm {min}\) such that a c-replica is accepted. The threshold \(N_\mathrm {min}\) is chosen to ensure the stability of results. Curves with too few points lead to an unreliable parabolic fit, and thus an unreliable best-fit \(\alpha _s^{(k)}\) for that c-replica. This then leads to outlier values of \(\alpha _s^{(k)}\) and a spuriously large value of the uncertainty on the \(\alpha _s^{(k)}\) determination. On the other hand, once the number of points is sufficient for a reliable parabolic fit, requiring more points does not improve the determination of \(\alpha _s^{(k)}\), but it reduces the number of c-replicas which are retained in the final sample, which in turn increases the finite-size uncertainty.
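The acceptance criterion based on \(N_\mathrm {min}\) amounts to a simple filter. The sketch below uses the same hypothetical layout as a dictionary from \(\alpha _s\) to \(\chi ^2\), with `None` for missing points, and illustrative function names.

```python
def count_available(profile):
    """Number of alpha_s values for which a chi2 is available."""
    return sum(chi2 is not None for chi2 in profile.values())


def select_creplicas(profiles, n_min):
    """Keep the c-replicas with at least n_min available chi2 values."""
    if n_min < 3:
        raise ValueError("a parabola needs at least three points")
    return [p for p in profiles if count_available(p) >= n_min]


# Illustrative profiles on a five-point grid.
profiles = [
    {0.116: 1.3, 0.117: 1.2, 0.118: 1.1, 0.119: 1.2, 0.120: 1.3},
    {0.116: None, 0.117: 1.2, 0.118: 1.1, 0.119: 1.2, 0.120: None},
    {0.116: None, 0.117: None, 0.118: 1.1, 0.119: 1.2, 0.120: None},
]
kept_loose = select_creplicas(profiles, n_min=3)
kept_tight = select_creplicas(profiles, n_min=5)
```

Raising `n_min` can only shrink the retained sample, which is the origin of the trade-off with the finite-size uncertainty.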

Therefore, the optimal value of \(N_\mathrm {min}\) arises from a trade-off between the uncertainty on \(\alpha _s^{(k)}\) from the parabolic fitting, and the finite-size uncertainty. In order to take both criteria into account, we fix \(N_\mathrm {min}\) by minimizing the bootstrapping uncertainty \(\Delta _{\sigma }\) Eq. (2.13). However, in order to make sure that the selection is not too tight, we do not minimize \(\Delta _{\sigma }\) itself. Rather, we first multiply it by a penalty factor that depends on the number of points. This is in turn determined as the 99% confidence level factor from a two-sided Student *t* distribution. Indeed, if the distribution of best-fit \(\alpha _s^{(k)}\) over replicas is Gaussian, then the difference between the sampled and true central value follows a Student *t* distribution with \(N_\mathrm {rep}-1\) degrees of freedom, zero mean and scale parameter \(\Delta _{\sigma }/\sqrt{N_\mathrm {rep}}\). The interval around the mean at a given confidence level is then obtained by rescaling the standard deviation to \(\Delta _{\sigma } T_{\mathrm{CL},(N_\mathrm {rep}-1)}\), where \( T_{\mathrm{CL},N}\) is the two-sided confidence factor of the Student *t* distribution with *N* degrees of freedom at confidence level CL. Accordingly, we choose a \(99\%\) confidence level, and we determine \(N_\mathrm {min }\) as the value yielding the minimum of \(\Delta _{\sigma } T_{0.99,(N_\mathrm {rep}-1)}\). Also in this case, the dependence of results on the choice of selection criteria can be studied a posteriori, and will be discussed in Sect. 3.2.

## 3 The strong coupling constant from NNPDF3.1

We now present the main result of this work, namely the determination of \(\alpha _s\left( m_Z \right) \) based on the methodology discussed in Sect. 2. We first present the best-fit result for \(\alpha _s\) and its experimental uncertainty, determined through the correlated replica method. We then discuss methodological and theoretical uncertainties. Finally, we collect our result and briefly compare it to other recent determinations from PDF fits and to the PDG average.

### 3.1 Best-fit results for \(\alpha _s\) and statistical uncertainty

We have determined \(\alpha _s\left( m_Z \right) \) both at NLO and NNLO using the methodology and dataset of the NNPDF3.1 global analysis [16]. The only difference in the fit settings is the theoretical description of the inclusive jet production datasets at NNLO. Here we use exact NNLO theory [33] for the ATLAS [34] and CMS [35] inclusive jet measurements at 7 TeV, and discard the other jet datasets used in NNPDF3.1 for which the NNLO calculation is not available (note that, as in NNPDF3.1, only ATLAS data in the central rapidity bin are included). To ensure a consistent comparison, the input datasets of the NLO and NNLO fits are identical, up to small differences in the kinematical cuts as explained in [16].

Specifically, we determine \(\alpha _s\) by generating a set of 400 data replicas, and from them a set of 400 c-replicas each with 21 values of \(\alpha _s\), thus corresponding to a total of 8400 PDF replicas correlated as discussed in Sect. 2.1. These c-replicas are generated for \(\alpha _s\left( m_Z \right) \) ranging between 0.106 and 0.130, varied in steps of \(\Delta _{\alpha _s}=0.002\) between 0.106 and 0.112 and between 0.128 and 0.130, and in steps of \(\Delta _{\alpha _s}=0.001\) between 0.112 and 0.128, adding up to the total of 21 values. From these we determine \(\alpha _s\) replicas, which form a representation of the probability distribution of \(\alpha _s\).
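The counting of \(\alpha _s\) values can be checked with a short sketch. The variable names are illustrative, and the grid is built in integer units of 0.001 to avoid floating-point drift when accumulating steps.

```python
# Grid of alpha_s(m_Z) values in units of 0.001.
coarse_low = range(106, 112, 2)    # 0.106, 0.108, 0.110 (steps of 0.002)
fine = range(112, 128, 1)          # 0.112 ... 0.127 (steps of 0.001)
coarse_high = range(128, 131, 2)   # 0.128, 0.130 (steps of 0.002)
alphas_grid = [n / 1000 for n in (*coarse_low, *fine, *coarse_high)]

n_creplicas = 400
n_pdf_replicas = n_creplicas * len(alphas_grid)
```

The three pieces contribute 3, 16, and 2 values respectively, which gives the 21 values per c-replica and the 8400 PDF replicas quoted above.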

The 379 c-replicas selected for the NNLO determination are shown in Fig. 2. The color scale of each curve indicates the best-fit \(\alpha _s\) value. It is apparent that the vast majority of the curves exhibit an approximately parabolic behaviour. The probability distributions of the best-fit values \(\alpha _s^{(k)}\) Eq. (2.6) which correspond to each c-replica, both at NLO and at NNLO, are shown in Fig. 3, where the markers indicate the value of \(\alpha _s^{(k)}\) for each specific c-replica. These probability densities have been determined using the Kernel Density Estimate method, see [36]. We find that the probability distribution for \(\alpha _s\left( m_Z \right) \) is both shifted to higher values and broadened when going from NNLO to NLO. The decrease of the best-fit value of \(\alpha _s\left( m_Z \right) \) when going from NLO to NNLO has been repeatedly observed before (see Table 1 of Ref. [37] for an extensive set of examples), also in our previous determination [17, 18], while the broadening is due to the poorer quality of the NLO fit.

The impact of individual data subsets can be studied by considering the partial \(\chi ^2_p\) for each subset *p*, defined as the figure of merit Eq. (2.3) with the summation over *i*, *j* now restricted to data which belong to the specific subset *p*. The \(\alpha _s\) fit procedure through the correlated replica method is then just repeated but using this partial \(\chi ^2_p\). Namely, for each c-replica the partial \(\chi ^{2(k)}_p\) for process *p* is computed, a parabola is fitted to it, the corresponding minimum \(\alpha _{s,p}^{(k)}\) of the parabola is determined, and the resulting set of minima is used to find the value of \(\alpha _s\left( m_Z \right) \) and its uncertainty.

Table 1. Number of data points at NLO and NNLO corresponding to the different subsets of the input experimental data considered here. These eight subsets add up to the total dataset

Data subset | NLO | NNLO |
---|---|---|
Fixed-target charged lepton DIS | 973 | 973 |
Fixed-target neutrino DIS | 908 | 908 |
Collider DIS (HERA) | 1221 | 1211 |
Fixed-target Drell–Yan | 189 | 189 |
Collider Drell–Yan | 378 | 388 |
Inclusive jets | 164 | 164 |
*Z* \(p_T\) distributions | 120 | 120 |
Top quark pair production | 26 | 26 |
Total | 3979 | 3979 |

We consider eight data subsets *p*: top production, the *Z* \(p_T\) distributions, collider and fixed-target Drell–Yan, inclusive jets, and deep-inelastic scattering (DIS) either at HERA or at fixed-target experiments, in the latter case separating charged lepton and neutrino beams. The number of data points corresponding to each of these data subsets is shown in Table 1. Not unexpectedly, the \(\chi ^{2(k)}_p\) profiles for data subsets turn out to be rather less parabolic than the total \(\chi ^2\), especially for processes such as neutrino DIS or fixed-target Drell–Yan that have weak sensitivity to \(\alpha _s\).

When determining \(\alpha _s\left( m_Z \right) \) from the partial \(\chi ^{2(k)}_p\), we do not repeat the replica selection and simply use the same replicas selected for the total dataset. Consequently, we must apply a form of post-selection, whereby each time a parabola for \(\chi ^{2(k)}_p\) has no minimum the corresponding c-replica is ignored. At NNLO, for five out of eight data subsets we retain all 379 c-replicas, while for jets, neutrino DIS, and fixed-target Drell–Yan, we retain only 376, 366, and 302 c-replicas respectively. The results for the partial \(\alpha _s\left( m_Z \right) \) determined from \(\chi ^{2}_p\) for the various families of processes are collected in Fig. 4. The central value and uncertainty shown are respectively determined as the median and 68% symmetric confidence level interval from the corresponding partial \(\alpha _{s,p}^{(k)}\). This is because the analogue of Fig. 3 for individual processes turns out to be rather non-Gaussian, especially for processes such as fixed-target Drell–Yan that only have a weak handle on \(\alpha _s\).
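A median and 68% interval of this kind can be computed as in the following sketch. It assumes that the symmetric 68% interval is the central interval between the 16th and 84th percentiles, which is an interpretation rather than a statement from the text; the function name is hypothetical.

```python
import statistics


def median_and_68_interval(values):
    """Median and central 68% interval (16th to 84th percentile) of a sample."""
    # quantiles(..., n=100) returns the 99 percentile cut points;
    # index 15 is the 16th percentile and index 83 the 84th.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return statistics.median(values), cuts[15], cuts[83]


# Illustrative sample: the integers 0..100.
median, lower, upper = median_and_68_interval(list(range(101)))
```

Unlike the mean and standard deviation, these estimators are insensitive to a few outlying minima, which matters when the distribution is far from Gaussian.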

The values of \(\alpha _s\left( m_Z \right) \) shown in Fig. 4 should be interpreted with some care. Indeed, the partial \(\chi ^2_p\) is in each case computed using PDF c-replicas determined from the minimization of the global \(\chi ^2\). These are in general different from the c-replicas that would be determined by simultaneous minimization of \(\chi ^2_p\) with respect to \(\alpha _s\) and the PDFs. Therefore, the values of \(\alpha _{s,p}\) in Fig. 4 cannot be interpreted as the best-fit values of \(\alpha _s\left( m_Z \right) \) for a given subset *p*. They instead provide an estimate of the pull on the best-fit \(\alpha _s\left( m_Z \right) \) value that specific families of processes have within the global fit subject to the constraints from the rest of the data.

Moreover, even their interpretation as pulls is only approximate. Firstly, the replica selection is applied to the total \(\chi ^2\) rather than to each partial \(\chi ^2_p\), so that several partial \(\chi ^{2(k)}_p\) profiles turn out not to have a minimum. Furthermore, the total \(\chi ^2\) includes cross-correlations which are lost when determining partial \(\chi ^2_p\), because the covariance matrix \(C_{t_0}\) in Eq. (2.3) is generally nonzero even when *i* and *j* belong to different data subsets. For instance, inclusive jet, *Z* \(p_T\), and Drell–Yan measurements from the same experiment (ATLAS, or CMS) are correlated amongst themselves by the common luminosity uncertainty. Finally, partial \(\alpha _s\) values are correlated through the underlying PDFs, implying that the pulls should not be expected to combine additively into the final result.

As expected, and discussed in the introduction and in Sect. 2.1, we find that the best-fit values of \(\alpha _s\left( m_Z \right) \) determined with the old method [17, 18] and with the new correlated replica method are in good agreement, both for the global dataset and for the data subsets. The small differences in central values are most likely due to uncertainties related to the finite size of the replica sample, which, as discussed in [17, 18], can be non-negligible when the old method is used. On the other hand, also as expected, neglecting the correlation between \(\alpha _s\) and PDFs as in the old method leads in general to an underestimate of the uncertainty on \(\alpha _s\). This effect is more marked for processes such as fixed-target Drell–Yan and neutrino DIS that have a limited sensitivity to \(\alpha _s\), because in this case the difference in length of the semi-axes of the error ellipse in Fig. 1 is large.

This determination of \(\alpha _s\left( m_Z \right) \) from the total \(\chi ^2\) also offers a complementary way of quantifying how much each family of processes constrains the final best-fit value, by plotting the contribution of each data subset to the total \(\chi ^2\). Specifically, we show in Fig. 6 the cumulative differences at NNLO, \(\chi ^2_p(\alpha _s)-\chi ^2_p(0.1185)\), between each partial \(\chi ^2_p\) and its value computed at the global best-fit \(\alpha _s\left( m_Z \right) \) value, neglecting cross-correlations between different data subsets. The plot is divided into two halves: above zero, only positive differences are shown, and below zero, only negative ones. Thus, when all differences are positive the plot shows the breakdown of the total \(\chi ^2\) into the contribution of different experiments (up to neglected cross-correlations), while when some of them are negative the lower part of the plot shows by how much the \(\chi ^2\) of the individual experiments shown has improved in comparison to their value at the global minimum (\(\alpha _s(m_Z)=0.1185\)). In order to increase readability, the plot is displayed twice, with two different choices of scale on the *y* axis.

From this comparison, we observe that the LHC data significantly contribute to constraining \(\alpha _s\). In particular, it is interesting to note that the 26 data points from top-quark pair production lead to a significant contribution to the total \(\chi ^2\) away from the best-fit, even though the global dataset contains almost 4000 data points. Similar considerations apply to the *Z* \(p_T\) distributions. This means that there is a small range of values of \(\alpha _s\) where these two groups of processes are consistent with the rest of the data entering the fit, thereby providing a tight constraint on \(\alpha _s\).

### 3.2 Methodological uncertainties

In view of the rather small experimental uncertainty on the final value of \(\alpha _s\left( m_Z \right) \), Eqs. (3.1)–(3.2), we need to assess possible uncertainties associated to the various aspects of our methodology described in Sect. 2. Specifically, we discuss here the methodological uncertainties associated to c-replica selection, batch minimization, the quadratic approximation to \(\chi ^2\) profiles, and the treatment of correlated systematics.

The replica selection algorithm determines an optimal value of \(N_\text {min}\), the minimal number of \(\alpha _s\) values for which results must be available for a given c-replica to be selected. We have varied this value from its minimum \(N_\text {min}=3\) (needed in order to fit a parabola) to a high value \(N_\text {min}=18\) (meaning that at most three \(\alpha _s\) values can be missing in order for a c-replica to be retained). Results for the number of c-replicas passing the criterion and the ensuing value of \(\alpha _s\) are collected in Table 2 for a number of choices. In each case we also show the finite-size uncertainty \(\Delta _{\alpha _s}\) on the best-fit \(\alpha _s\) estimated by bootstrapping, Eq. (2.12).

The number of surviving c-replicas varies significantly; all the starting 400 c-replicas pass the loosest criterion (i.e., it is always possible to fit a parabola to any c-replica), but only \(N_\text {rep}=12\) c-replicas pass the most restrictive criterion. However, even with this most restrictive criterion the finite-size uncertainty is below the permille level. For the value selected by the algorithm, the finite-size uncertainty is of order \(0.03\%\), i.e. almost a factor of 20 smaller than the experimental uncertainty Eq. (3.1), and it does not decrease further even when all c-replicas are kept. The finite-size uncertainty on the \(\alpha _s\) uncertainty \(\Delta _{\sigma }\) itself Eq. (2.13) is comparable in all cases.

Table 2: Dependence of the NNLO determination of \(\alpha _s\left( m_Z \right) \) on the minimum number of \(\alpha _s\) values per c-replica \(N_\mathrm{min}\) (see Sect. 2.2). In each case, the best-fit value and statistical uncertainty on \(\alpha _s\) are shown, together with the number of surviving c-replicas \(N_\mathrm{rep}\) and the bootstrapping uncertainty \(\Delta _{\alpha _s}\), Eq. (2.12). The value chosen using the selection criterion of Sect. 2.2, which leads to the final value of \(\alpha _s\left( m_Z \right) \) Eq. (3.1), is \(N_\text {min}=6\) (third row of the table, in boldface)

| \(N_\text {min}\) | \(\alpha _s\left( m_Z \right) \) | \(N_\mathrm{rep}\) | \(\Delta _{\alpha _s}\) |
|---|---|---|---|
| 18 | \(0.11842\pm 0.00031~(0.3\%)\) | 12 | 0.00009 |
| 15 | \(0.11844\pm 0.00044~(0.4\%)\) | 92 | 0.00005 |
| **6** | \(\mathbf{0.11845\pm 0.00052~(0.4\%)}\) | **379** | |
| 3 | \(0.11844 \pm 0.00056~(0.5\%)\) | 400 | 0.00003 |

Table 3: Results for the NNLO determinations of \(\alpha _s\left( m_Z \right) \) using different combinations of the three available batches. In each case we show the best-fit value of \(\alpha _s\left( m_Z \right) \), the minimum number of \(\alpha _s\) values per c-replica \(N_\mathrm{min}\), and the corresponding number of surviving c-replicas \(N_\mathrm{rep}\). The last row (in boldface) corresponds to our final result Eq. (3.1)

| Batches | \(\alpha _s\left( m_Z \right) \) | \(N_\mathrm{min}\) | \(N_\mathrm{rep}\) |
|---|---|---|---|
| I | \(0.11831\pm 0.00065~(0.5\%)\) | 9 | 310 |
| II | \(0.11828\pm 0.00062~(0.5\%)\) | 14 | 216 |
| III | \(0.11822 \pm 0.00072~(0.6\%)\) | 13 | 369 |
| I + II | \(0.11844 \pm 0.00054~(0.5\%)\) | 11 | 225 |
| I + III | \(0.11841 \pm 0.00058~(0.5\%)\) | 13 | 158 |
| II + III | \(0.11841 \pm 0.00060~(0.5\%)\) | 14 | 288 |
| **I + II + III** | \(\mathbf{0.11845 \pm 0.00052~(0.4\%)}\) | | **379** |

We next turn to discuss batch minimization. The results shown in Table 2 all correspond to the NNLO baseline which uses batch minimization with three batches. In order to assess the impact of batch minimization, in Table 3 we compare results obtained with each of the three batches, with the three possible pairs, and combining the three batches. In each case we show the final best-fit \(\alpha _s\left( m_Z \right) \) and experimental uncertainty, the value of \(N_\mathrm{min}\), the minimum number of \(\alpha _s^{(k)}\) values per c-replica, and the number of surviving c-replicas \(N_\mathrm{rep}\).

It is clear from this comparison that as more batches are combined, results become more stable. The values of \(N_\mathrm{min}\) are on average larger with two batches, and larger still with three, but without a reduction of the number of surviving c-replicas \(N_\mathrm{rep}\) as was observed in Table 2. With three batches, \(N_\mathrm{rep}\) is largest even though \(N_\mathrm{min}\) is also largest. This means that, thanks to batch minimization, the number of available \(\alpha _s^{(k)}\) values per replica is on average higher. It follows that the finite-size uncertainty is reduced by batch minimization, thus leading to the very small uncertainties shown in Table 2.

Table 4: Results for the NNLO determinations of \(\alpha _s\left( m_Z \right) \) when the \(N_\mathrm{trim}\) outer values of \(\alpha _s\) are not used and the fit is restricted to a smaller range. In the bottom part of the table (last two rows) we also show results found discarding values asymmetrically, at the upper or lower edge of the range. In each case we show the number of discarded \(\alpha _s\) values, the fitted range, the best-fit value of \(\alpha _s\left( m_Z \right) \), and the number of surviving c-replicas \(N_\mathrm{rep}\). The first row (in boldface) corresponds to our final result Eq. (3.1)

| \(N_\mathrm{trim}\) | Fitted \(\alpha _s\left( m_Z \right) \) range | \(\alpha _s\left( m_Z \right) \) | \(N_\mathrm{rep}\) |
|---|---|---|---|
| **0** | \(\mathbf{[0.106,0.130]}\) | \(\mathbf{0.11845\pm 0.00052~(0.4\%)}\) | **379** |
| 2 | [0.108, 0.128] | \(0.11846 \pm 0.00045~(0.4\%)\) | 218 |
| 5 | [0.110, 0.126] | \(0.11852\pm 0.00051~(0.4\%)\) | 290 |
| 10 | [0.114, 0.124] | \(0.11869 \pm 0.00046~(0.4\%)\) | 32 |
| 15 | [0.115, 0.120] | \(0.11822 \pm 0.00079~(0.7\%)\) | 10 |
| 4 | [0.113, 0.130] | \(0.11850 \pm 0.00058~(0.5\%)\) | 296 |
| 5 | [0.106, 0.124] | \(0.11855 \pm 0.00059~(0.5\%)\) | 197 |

We conclude that the value of \(\alpha _s\left( m_Z \right) \) found using three batches is the most accurate. We observe that even the shift between the three-batch value and the single-batch value which differs most from it is about a third of the finite-size uncertainty. We take this as further evidence that there is no extra contribution of methodological origin due to batch minimization to be added to the statistical uncertainty. We finally observe that the two-batch result is in fact consistent within its very slightly larger uncertainty, thus justifying the use of only two batches at NLO.
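The size of these shifts can be read off Table 3 directly. The short Python sketch below uses only the numbers quoted in that table: it computes the shift of each single-batch value with respect to the three-batch value and compares the largest one with the uncertainty quoted for that batch. The variable names are illustrative.

```python
# Central values and uncertainties copied from Table 3.
three_batch = 0.11845
single = {
    "I": (0.11831, 0.00065),
    "II": (0.11828, 0.00062),
    "III": (0.11822, 0.00072),
}

# Shift of each single-batch central value with respect to the three-batch one.
shifts = {name: three_batch - value for name, (value, _) in single.items()}

# Batch with the largest shift, and that shift in units of its own uncertainty.
worst = max(shifts, key=lambda name: abs(shifts[name]))
ratio = abs(shifts[worst]) / single[worst][1]

for name, shift in shifts.items():
    print(f"batch {name}: shift = {shift:+.5f}")
print(f"largest shift: batch {worst}, ratio to its uncertainty = {ratio:.2f}")
```

With these inputs the largest shift is that of batch III, 0.00023, which is roughly a third of the 0.00072 uncertainty quoted for that batch.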

We next turn to the methodological uncertainties related to the quadratic fitting of \(\chi ^2\) profiles. We have studied this in three different ways: by removing outer values of \(\alpha _s\left( m_Z \right) \) from the fit; by adding higher order terms to the fitting function; and by changing the fitting variable. We discuss each in turn.

First, we have repeated the NNLO determination removing \(\alpha _s\) values that are farthest from the best-fit value \(\alpha _s\left( m_Z \right) =0.1185\), fitting a smaller range of values around the minimum. As a further consistency check, we have removed \(\alpha _s\) values asymmetrically. Results are shown in Table 4; in each case we show the number of discarded \(\alpha _s\) values \(N_\mathrm{trim}\), the resulting fitted range, the best fit \(\alpha _s\left( m_Z \right) \) and uncertainty, and the number of surviving c-replicas \(N_\mathrm{rep}\). Here too, the behaviour is consistent with expectations. As the fitted range is reduced, the experimental uncertainty increases and the number of surviving c-replicas decreases (thereby also increasing the finite-size uncertainty). The central value, however, is extremely stable; the shift in central value when restricting the range is always more than a factor two smaller than the experimental uncertainty. In fact, the shift is never larger than \(\Delta =0.00010~(0.08\%)\) unless the number of surviving c-replicas becomes of order ten, in which case the finite-size uncertainty (recall Table 2) is of the same order or larger.

*r* is the number of degrees of freedom of the model, *n* is the number of fitted points, and \(\ln (L)\) is the log-likelihood associated with the model.
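The quantities *r*, *n* and \(\ln (L)\) defined here are those that enter the Akaike information criterion and its small-sample correction [38, 39]. The formula itself is not reproduced in this section, so the following Python sketch assumes the standard textbook form, \(\mathrm{AIC}=2r-2\ln (L)\) plus the correction \(2r(r+1)/(n-r-1)\), together with a Gaussian likelihood for least-squares residuals. The function names and the residual values are hypothetical and serve only to show how a quadratic and a cubic fit to the same points might be compared.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximised Gaussian log-likelihood of least-squares residuals,
    with the variance estimated as RSS / n (an assumption of this sketch)."""
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    return -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)


def aicc(log_likelihood, r, n):
    """Akaike information criterion with the standard small-sample correction."""
    if n <= r + 1:
        raise ValueError("the corrected criterion needs n > r + 1")
    return 2 * r - 2 * log_likelihood + 2 * r * (r + 1) / (n - r - 1)


# Hypothetical residuals of a quadratic (r = 3) and a cubic (r = 4) fit
# to the same n = 8 points of a chi^2 profile.
res_quadratic = [0.12, -0.08, 0.05, -0.11, 0.09, -0.04, 0.07, -0.10]
res_cubic = [0.11, -0.08, 0.05, -0.10, 0.09, -0.04, 0.06, -0.10]

aicc_quadratic = aicc(gaussian_log_likelihood(res_quadratic), 3, len(res_quadratic))
aicc_cubic = aicc(gaussian_log_likelihood(res_cubic), 4, len(res_cubic))
print(f"AICc quadratic = {aicc_quadratic:.2f}, AICc cubic = {aicc_cubic:.2f}")
```

The model with the lower value is preferred. In this toy the cubic term reduces the residuals only marginally, so the penalty for the extra parameter outweighs the gain and the quadratic fit is favoured.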

Table 5: Same as Table 2, comparing the default parabolic fitting (in boldface) of the \(\chi ^2(\alpha _s)\) profiles with those with a transformed input, both \(\chi ^2\left( \ln (1+\alpha _s)\right) \) and \(\chi ^2\left( \exp (\alpha _s)\right) \)

| Fitting argument | \(\alpha _s\left( m_Z \right) \) | \(N_\mathrm{rep}\) |
|---|---|---|
| \(\mathbf{\alpha _s}\) | \(\mathbf{0.11845 \pm 0.00052~(0.4\%)}\) | **379** |
| ln | \(0.11845 \pm 0.00052~(0.4\%)\) | 379 |
| exp | \(0.11849 \pm 0.00052~(0.4\%)\) | 379 |

*f* is sufficiently smooth and monotonic, should lead to the same best-fit value of \(\alpha _s\). The results of fitting \(\alpha _s\) from the transformed profiles Eq. (3.5) with \(f(\alpha _s)=\exp (\alpha _s)\) and \(f(\alpha _s)=\ln (1+\alpha _s)\) are shown in Table 5. The argument of the log is shifted so that \(f(\alpha _s)\) admits a Taylor expansion in powers of \(\alpha _s\).

Reassuringly, we find extreme stability with respect to these transformations of the fitting argument.
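The idea behind this check can be sketched in a few lines of Python. The toy below is a simplification, not the fit used for Table 5: it takes a \(\chi ^2\) profile that is exactly parabolic in \(\alpha _s\), passes a parabola through three points in the transformed variable \(f(\alpha _s)\), and maps the vertex back to \(\alpha _s\). The profile parameters, the three sample points and the function names `parabola_vertex` and `chi2` are hypothetical.

```python
import math


def parabola_vertex(p1, p2, p3):
    """Abscissa of the vertex of the parabola through three (t, y) points."""
    (a, fa), (b, fb), (c, fc) = p1, p2, p3
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    return b - 0.5 * num / den


def chi2(alpha, alpha0=0.1185, sigma=0.0005):
    """Toy profile, exactly quadratic in alpha_s (assumed minimum and width)."""
    return ((alpha - alpha0) / sigma) ** 2


alphas = [0.116, 0.118, 0.121]       # hypothetical alpha_s values

# Each entry: (transformation f, its inverse).
transforms = {
    "identity": (lambda a: a, lambda t: t),
    "ln": (lambda a: math.log(1.0 + a), lambda t: math.exp(t) - 1.0),
    "exp": (math.exp, math.log),
}

best = {}
for name, (f, f_inv) in transforms.items():
    points = [(f(a), chi2(a)) for a in alphas]
    best[name] = f_inv(parabola_vertex(*points))   # minimum mapped back to alpha_s
    print(f"{name:8s} best-fit alpha_s = {best[name]:.6f}")
```

Because the profile is no longer exactly parabolic in the transformed variable, the minima found with `ln` and `exp` differ slightly from the input value, but by far less than the width of 0.0005 assumed for the toy profile.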

Table 6: Best-fit value of \(\alpha _s\left( m_Z \right) \) and experimental uncertainty found using three different forms of the \(t_0\) covariance matrix (see text); the second row corresponds to the central result Eq. (3.1). The number of c-replicas selected in each case is also shown

| \(t_0\) | \(\alpha _s\left( m_Z \right) \) | \(N_\mathrm{rep}\) |
|---|---|---|
| I | \(0.11844 \pm 0.00052~(0.4\%)\) | 379 |
| II | \(0.11845 \pm 0.00052~(0.4\%)\) | 379 |
| III | \(0.11841 \pm 0.00051~(0.4\%)\) | 356 |

The use of the \(t_0\) procedure in principle leads to a further methodological uncertainty related to the choice of the prior used for the construction of the \(t_0\) matrix, which should therefore be assessed. In order to determine the final result Eq. (3.1) the \(t_0\) matrix was constructed using the best-fit PDF set from batch II of Table 3. We have repeated the determination constructing the \(t_0\) matrix from the best-fit PDF set of either of the other two batches. Results are collected in Table 6. It is clear that, using the consistent \(t_0\) method, results are extremely stable. We can conservatively estimate the uncertainty due to the choice of \(t_0\) from the largest shift seen in Table 6 as \(\Delta _{t_0}=0.00004~(0.03\%)\).
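As a quick check of this estimate, the shifts can be computed from the central values of Table 6, taking row II (the central result) as the reference:

\[
|0.11844 - 0.11845| = 0.00001, \qquad |0.11841 - 0.11845| = 0.00004 ,
\]

\[
\Delta _{t_0} = 0.00004, \qquad \frac{0.00004}{0.11845} \approx 0.03\% .
\]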

### 3.3 Theoretical uncertainties from missing higher orders

A determination of \(\alpha _s\left( m_Z \right) \) depends on the perturbative order of the QCD calculations on which it relies. Therefore, at any fixed order it is affected by a missing higher order uncertainty (MHOU). In older, and also in some more recent, determinations of \(\alpha _s\left( m_Z \right) \) (for determinations in PDF fits specifically, see Refs. [17, 42, 43]) no attempt was made to estimate the MHOU, and sometimes NLO or NNLO values of \(\alpha _s\left( m_Z \right) \) were quoted with the understanding that they might differ by an amount greater than the quoted uncertainty due to this missing uncertainty. However, as the experimental uncertainty decreases, an estimate of the MHOU becomes mandatory; in the context of PDF fits this was done e.g. in Ref. [18]. Indeed, this uncertainty, usually estimated by scale variation, is typically dominant in more recent determinations [9, 10, 11, 12, 13, 14, 15].

The NLO estimate of the MHOUs in Eq. (3.9) is reassuringly in good agreement with the observed shift Eq. (3.8). The NNLO uncertainty Eq. (3.10) is also consistent with expectations based on the Cacciari-Houdeau (CH) [44] uncertainty estimate of Ref. [18], where the value of \(\alpha _s\left( m_Z \right) \) determined using the NNPDF2.1 set was found to lead to \(\Delta ^\text {CH, NNLO}=0.0009\). Indeed, PDF uncertainties in the NNPDF3.1 set are generally smaller than those of NNPDF2.1 by a factor of two or more, due to the significant impact of LHC data in the more recent determination.

In addition, the shift between NLO and NNLO PDFs is found to be smaller in NNPDF3.1 than in previous NNPDF sets [45], presumably because MHO terms pull in different directions and thus partly cancel each other to a greater extent in a more global fit. Indeed, we find a similar increase of perturbative stability of PDFs and of the associated \(\alpha _s\left( m_Z \right) \) by repeating the analysis presented here for reduced datasets [46]. Therefore, the reduction of the MHOU by a comparable factor in Eq. (3.8) in comparison to Ref. [18] is expected.

Nevertheless, the very small value of the MHOU at NNLO, Eq. (3.10), even smaller than the already small experimental uncertainty Eq. (3.1), may seem rather too optimistic. There are furthermore several reasons of principle and practice why the reliability of the CH method in the present case is dubious. The main one is that the implementation of the method suggested in Ref. [18] relies on a guess for an underlying “true” value \(\alpha _s^{(0)}\), and for a leading-order value \(\alpha _s^\text {LO}\), neither of which is known. The result Eqs. (3.9)–(3.10) is obtained by varying \(\alpha _s^\text {LO}\in [0.10,0.14]\) and \(\alpha _s^{(0)}\in [0.110,0.125]\). These are, however, largely arbitrary choices, and the final answer relies on them.

On top of the missing higher fixed-order QCD corrections, several other aspects of the theory used in the simultaneous determination of \(\alpha _s\left( m_Z \right) \) and PDFs also lead to uncertainties. These include the values of the heavy quark masses, standard model parameters (specifically CKM matrix elements and electroweak couplings), electroweak corrections, QCD resummation corrections [47, 48], QCD power corrections, and nuclear corrections. Many of these uncertainties were assessed in the NNPDF3.1 PDF determination that we are relying upon [16], and found to be smaller than PDF uncertainties. In particular, the dependence on the charm mass in previous PDF determinations is substantially reduced in NNPDF3.1 and likely rather smaller than the MHOU, thanks to the presence of an independently parametrized charm PDF [49], and electroweak corrections are carefully kept under control thanks to the choice of suitable kinematic cuts. But PDF uncertainties mix with the experimental uncertainty on \(\alpha _s\left( m_Z \right) \), with which they are strongly correlated, and are in fact indistinguishable from it, as discussed in Sect. 2.1, so the hierarchy of uncertainties on PDFs and \(\alpha _s\left( m_Z \right) \) is the same. We conclude that we have evidence that most of these theoretical uncertainties are sub-dominant in comparison to the experimental uncertainty Eq. (3.1), and thus even more so in comparison to the MHOU Eq. (3.11).

### 3.4 Final results and comparisons

This result can be compared to the previous one [18] based on NNPDF2.1, \(\alpha _s^\text {NNLO}(m_Z) = 0.1173 \pm 0.0007^\text {exp}\pm 0.0009^\text {th}\). In comparison to this older result, the central value of \(\alpha _s(m_Z)\) has increased by \(\Delta \alpha _s=+0.0012\). As far as uncertainties are concerned, both the theoretical and experimental uncertainties on this previous result are larger, if one compares like with like. The experimental uncertainty should actually be compared to Eq. (3.3), as it was obtained with the same method; that uncertainty is somewhat underestimated because the method neglects the correlation between PDFs and \(\alpha _s\). The theory uncertainty should be compared to Eq. (3.10), which is also based on the CH method.

We conclude that, in comparison to Ref. [18], the current result is more precise, though with more conservatively estimated uncertainties.
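For orientation, the comparison can be made explicit with the NNLO numbers quoted in the abstract and above. The shift in central value with respect to Ref. [18] is

\[
\Delta \alpha _s = 0.1185 - 0.1173 = +0.0012 .
\]

If the experimental, methodological and theoretical components of the present result are combined in quadrature (an assumption made here for illustration), the total uncertainty is

\[
\sqrt{0.0005^2 + 0.0001^2 + 0.0011^2} = \sqrt{1.47\times 10^{-6}} \approx 0.0012 ,
\]

which is entirely dominated by the theoretical component.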

In Fig. 8 we compare the NNLO result of Eq. (3.12) to our previous result [18], to the current PDG average [3], and to two recent determinations obtained from simultaneous fits of PDFs and \(\alpha _s\left( m_Z \right) \), ABMP16 [43] and MMHT2014 [42]. We find good agreement with the PDG average as well as with the MMHT2014 and NNPDF2.1 determinations. It has been suggested [50, 51] that the lower ABMP16 value can be partly explained by the use of a fixed-flavour number scheme with \(N_f=3\) for the treatment of DIS data. It is interesting to observe that the current ABMP16 value is higher than previous values of \(\alpha _s\left( m_Z \right) \) obtained by the same group [52], from which the ABMP16 analysis differs in particular because of the inclusion in Ref. [43] of LHC top production and *W* and *Z* production data (described with \(N_f=5\)).

*Z* \(p_T\) distribution) but also for top and jet production.

## 4 Summary and outlook

In this work we have presented a new determination of the strong coupling constant \(\alpha _s\left( m_Z \right) \) jointly with a global determination of PDFs which, by relying on NNPDF3.1, for the first time includes a large amount of LHC data using exact NNLO theory in all cases. In comparison to a previous determination based on NNPDF2.1, our results exploit the new correlated replica method that is equivalent to the simultaneous fit of PDFs and \(\alpha _s\). This new method thus fully accounts for the correlations between PDFs and \(\alpha _s\) in the determination of the best-fit value of \(\alpha _s\) and of the associated uncertainty.

We find that the determination of \(\alpha _s\left( m_Z \right) \) is considerably stabilized by the use of a wide set of different processes and data, and we provide evidence that a global simultaneous determination of \(\alpha _s\left( m_Z \right) \) and PDFs leads to a more stable and accurate result than the one obtained from subsets of data. We thus obtain a value of \(\alpha _s\left( m_Z \right) \) which is likely to be more precise and more accurate than previous results based on similar techniques. We find that the LHC data consistently lead to an increase in the central value of \(\alpha _s\left( m_Z \right) \), and observe good overall consistency between the datasets entering the global fit. Our NNLO determination turns out to be in agreement within uncertainties with previous results from global fits and with the PDG average.

The main limitation of our result comes from the lack of a reliable method to estimate the uncertainties related to missing higher order perturbative corrections. Theoretical progress in this direction is needed, and perhaps expected, and would be a major source of future improvement. For the time being, even with a very conservative estimate of the theoretical uncertainty, our result provides one of the most accurate determinations of \(\alpha _s\left( m_Z \right) \) available, and thus provides valuable input for precision tests of the Standard Model and for searches for new physics beyond it.

## Notes

### Acknowledgements

E. S. and J. R. are supported by a European Research Council Starting Grant “PDF4BSM”. J. R. is also supported by the Netherlands Organization for Scientific Research (NWO). The work of Z.K. is supported by the European Research Council Consolidator Grant “NNLOforLHC2” and by the Executive Research Agency (REA) of the European Commission under the Grant Agreement PITN-GA-2012-316704 (HiggsTools). S. C. is supported by the HICCUP ERC Consolidator Grant (614577). S. C. and S. F. are supported by the European Research Council under the European Union’s Horizon 2020 research and innovation Programme (Grant agreement no 740006). R. D. B. and L. D. D. are supported by UK STFC Grants ST/L000458/1 and ST/P0000630/1.

## References

1. M. Botje et al., The PDF4LHC Working Group Interim Recommendations. arXiv:1101.0538
2. J. Butterworth et al., PDF4LHC recommendations for LHC Run II. J. Phys. G **43**, 023001 (2016). arXiv:1510.03865
3. Particle Data Group Collaboration, C. Patrignani et al., Review of particle physics. Chin. Phys. C **40**(10), 100001 (2016)
4. Particle Data Group Collaboration, K. Nakamura, Review of particle physics. J. Phys. G **37**, 075021 (2010)
5. LHC Higgs Cross Section Working Group Collaboration, D. de Florian et al., Handbook of LHC Higgs Cross Sections: 4. Deciphering the Nature of the Higgs Sector. arXiv:1610.07922
6. G. Altarelli, The QCD Running Coupling and its Measurement. PoS **Corfu2012**, 002 (2013). arXiv:1303.6065
7. G.P. Salam, The strong coupling: a theoretical perspective. arXiv:1712.05165
8. J. de Blas, M. Ciuchini, E. Franco, S. Mishima, M. Pierini, L. Reina, L. Silvestrini, Electroweak precision observables and Higgs-boson signal strengths in the Standard Model and beyond: present and future. JHEP **12**, 135 (2016). arXiv:1608.01509
9. M. Johnson, D. Maître, Strong coupling constant extraction from high-multiplicity Z+jets observables. arXiv:1711.01408
10. H1 Collaboration, V. Andreev et al., Determination of the strong coupling constant \(\alpha _s(m_Z)\) in next-to-next-to-leading order QCD using H1 jet cross section measurements. Eur. Phys. J. C **77**(11), 791 (2017). arXiv:1709.07251
11. T. Klijnsma, S. Bethke, G. Dissertori, G.P. Salam, Determination of the strong coupling constant \(\alpha _s(m_Z)\) from measurements of the total cross section for top-antitop quark production. Eur. Phys. J. C **77**(11), 778 (2017). arXiv:1708.07495
12. ATLAS Collaboration, M. Aaboud et al., Determination of the strong coupling constant \(\alpha _s\) from transverse energy-energy correlations in multijet events at \(\sqrt{s} = 8\) TeV using the ATLAS detector. arXiv:1707.02562
13. B. Bouzid, F. Iddir, L. Semlala, Determination of the strong coupling constant from ATLAS measurements of the inclusive isolated prompt photon cross section at 7 TeV. arXiv:1703.03959
14. CMS Collaboration, S. Chatrchyan et al., Determination of the top-quark pole mass and strong coupling constant from the t t-bar production cross section in pp collisions at \(\sqrt{s}\) = 7 TeV. Phys. Lett. B **728**, 496 (2014). arXiv:1307.1907
15. D. Britzger, K. Rabbertz, D. Savoiu, G. Sieber, M. Wobisch, Determination of the strong coupling constant from inclusive jet cross section data from multiple experiments. arXiv:1712.00480
16. NNPDF Collaboration, R.D. Ball et al., Parton distributions from high-precision collider data. Eur. Phys. J. C **77**(10), 663 (2017). arXiv:1706.00428
17. S. Lionetti et al., Precision determination of \(\alpha _s\) using an unbiased global NLO parton set. Phys. Lett. B **701**, 346–352 (2011). arXiv:1103.2369
18. R.D. Ball, V. Bertone, L. Del Debbio, S. Forte, A. Guffanti et al., Precision NNLO determination of \(\alpha _s(M_Z)\) using an unbiased global parton set. Phys. Lett. B **707**, 66–71 (2012). arXiv:1110.2483
19. The NNPDF Collaboration, R.D. Ball et al., Impact of heavy quark masses on parton distributions and LHC phenomenology. Nucl. Phys. B **849**, 296 (2011). arXiv:1101.1300
20. The NNPDF Collaboration, R.D. Ball et al., Unbiased global determination of parton distributions and their uncertainties at NNLO and at LO. Nucl. Phys. B **855**, 153 (2012). arXiv:1107.2652
21. S. Forte, Parton distributions at the dawn of the LHC. Acta Phys. Polon. B **41**, 2859 (2010). arXiv:1011.5247
22. J. Gao, L. Harland-Lang, J. Rojo, The Structure of the Proton in the LHC Precision Era. arXiv:1709.04922
23. G. D’Agostini, On the use of the covariance matrix to fit correlated data. Nucl. Instrum. Methods A **346**, 306–311 (1994)
24. The NNPDF Collaboration, R.D. Ball et al., Fitting parton distribution data with multiplicative normalization uncertainties. JHEP **05**, 075 (2010). arXiv:0912.2276
25. R.D. Ball, S. Carrazza, L. Del Debbio, S. Forte, J. Gao et al., Parton distribution benchmarking with LHC data. JHEP **1304**, 125 (2013). arXiv:1211.5142
26. NNPDF Collaboration, R.D. Ball et al., Parton distributions for the LHC Run II. JHEP **04**, 040 (2015). arXiv:1410.8849
27. A.D. Martin, W.J. Stirling, R.G. Roberts, The alpha-s dependence of parton distributions. Phys. Lett. B **356**, 89–94 (1995). arXiv:hep-ph/9506423
28. J. Pumplin et al., Uncertainties of predictions from parton distribution functions. 2. The Hessian method. Phys. Rev. D **65**, 014013 (2001). arXiv:hep-ph/0101032
29. A.D. Martin, R.G. Roberts, W.J. Stirling, R.S. Thorne, Uncertainties of predictions from parton distributions. I: experimental errors. Eur. Phys. J. C **28**, 455–473 (2003). arXiv:hep-ph/0211080
30. A.D. Martin, W.J. Stirling, R.S. Thorne, G. Watt, Parton distributions for the LHC. Eur. Phys. J. C **63**, 189 (2009). arXiv:0901.0002
31. V. Bertone, S. Carrazza, J. Rojo, APFEL: a PDF Evolution library with QED corrections. Comput. Phys. Commun. **185**, 1647 (2014). arXiv:1310.1394
32. V. Bertone, S. Carrazza, N.P. Hartland, APFELgrid: a high performance tool for parton density determinations. Comput. Phys. Commun. **212**, 205–209 (2017). arXiv:1605.02070
33. J. Currie, E.W.N. Glover, J. Pires, NNLO QCD predictions for single jet inclusive production at the LHC. Phys. Rev. Lett. **118**(7), 072002 (2017). arXiv:1611.01460
34. ATLAS Collaboration, G. Aad et al., Measurement of the inclusive jet cross-section in proton-proton collisions at \(\sqrt{s}=7\) TeV using 4.5 fb\(^{-1}\) of data with the ATLAS detector. JHEP **02**, 153 (2015). arXiv:1410.8857. [Erratum: JHEP **09**, 141 (2015)]
35. CMS Collaboration, S. Chatrchyan et al., Measurements of differential jet cross sections in proton-proton collisions at \(\sqrt{s}=7\) TeV with the CMS detector. Phys. Rev. D **87**, 112002 (2013). arXiv:1212.6660
36. S. Carrazza, S. Forte, Z. Kassabov, J. Rojo, Specialized minimal PDFs for optimized LHC calculations. Eur. Phys. J. C **76**(4), 205 (2016). arXiv:1602.00005
37. A.D. Martin, W.J. Stirling, R.S. Thorne, G. Watt, Uncertainties on \(\alpha _S\) in global PDF analyses. Eur. Phys. J. C **64**, 653–680 (2009). arXiv:0905.3531
38. H. Akaike, A new look at the statistical model identification. IEEE Trans. Autom. Control **19**(6), 716–723 (1974)
39. K.P. Burnham, D.R. Anderson, Multimodel inference: understanding AIC and BIC in model selection. Sociol. Methods Res. **33**(2), 261–304 (2004)
40. The NNPDF Collaboration, R.D. Ball et al., A first unbiased global NLO determination of parton distributions and their uncertainties. Nucl. Phys. B **838**, 136 (2010). arXiv:1002.4407
41. G. D’Agostini, *Bayesian reasoning in data analysis: A critical introduction* (World Scientific, Singapore, 2003)
42. L.A. Harland-Lang, A.D. Martin, P. Motylinski, R.S. Thorne, Uncertainties on \(\alpha _S\) in the MMHT2014 global PDF analysis and implications for SM predictions. Eur. Phys. J. C **75**(9), 435 (2015). arXiv:1506.05682
43. S. Alekhin, J. Blümlein, S. Moch, R. Placakyte, Parton distribution functions, \(\alpha _s\) and heavy-quark masses for LHC run II. Phys. Rev. D **96**(1), 014011 (2017). arXiv:1701.05838
44. M. Cacciari, N. Houdeau, Meaningful characterisation of perturbative theoretical uncertainties. JHEP **1109**, 039 (2011). arXiv:1105.5152
45. S. Forte, Theory systematics in PDFs, Talk at the CERN theory institute (2017)
46. S. Forte, Z. Kassabov, J. Rojo, L. Rottoli, Theoretical Uncertainties and Dataset Dependence of Parton Distributions, in *Les Houches 2017: Physics at TeV Colliders Standard Model Working Group Report* (2018). arXiv:1803.07977
47. M. Bonvini, S. Marzani, J. Rojo, L. Rottoli, M. Ubiali, R.D. Ball, V. Bertone, S. Carrazza, N.P. Hartland, Parton distributions with threshold resummation. JHEP **09**, 191 (2015). arXiv:1507.01006
48. R.D. Ball, V. Bertone, M. Bonvini, S. Marzani, J. Rojo, L. Rottoli, Parton distributions with small-x resummation: evidence for BFKL dynamics in HERA data. arXiv:1710.05935
49. NNPDF Collaboration, R.D. Ball, V. Bertone, M. Bonvini, S. Carrazza, S. Forte, A. Guffanti, N.P. Hartland, J. Rojo, L. Rottoli, A determination of the charm content of the proton. Eur. Phys. J. C **76**(11), 647 (2016). arXiv:1605.06515
50. The NNPDF Collaboration, R.D. Ball et al., Theoretical issues in PDF determination and associated uncertainties. Phys. Lett. B **723**, 330 (2013). arXiv:1303.1189
51. R. Thorne, The effect on PDFs and \(\alpha _S(M_Z^2)\) due to changes in flavour scheme and higher twist contributions. Eur. Phys. J. C **74**(7), 2958 (2014). arXiv:1402.3536
52. S. Alekhin, J. Blümlein, S. Moch, The ABM parton distributions tuned to LHC data. Phys. Rev. D **89**, 054028 (2014). arXiv:1310.3059

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funded by SCOAP^{3}