Uncertainties on $\alpha_S$ in the MMHT2014 global PDF analysis and implications for SM predictions

We investigate the uncertainty in the value of the strong coupling $\alpha_S(M_Z^2)$ when allowing it to be a free parameter in the recent MMHT global analyses of deep-inelastic and related hard scattering data that was undertaken to determine the parton distribution functions (PDFs) of the proton. The analysis uses the standard framework of leading twist fixed--order collinear factorisation in the ${\overline {\rm MS}}$ scheme. We study the constraints on the value of $\alpha_S(M_Z^2)$ coming from the individual data sets by repeating the NNLO and NLO global analyses at various fixed values of $\alpha_S(M_Z^2)$ about its optimum values, spanning the range $\alpha_S(M_Z^2)=0.108$ to $0.128$ in units of $0.001$, and making all PDFs sets available. The inclusion of the measurements of the cross section for inclusive $t\bar{t}$ production in the global fit allows us to explore the correlation between the values taken for the mass $m_t$ of the top quark and $\alpha_S(M_Z^2)$. We find that the best fit values are $\alpha_S(M_Z^2)=0.1201\pm0.0015$ and $0.1172\pm 0.0013$ at NLO and NNLO respectively, with the central values changing to $\alpha_S(M_Z^2)=0.1195$ and $0.1178$ when the world average of $\alpha_S(M_Z^2)$ is used as a data point. We investigate the interplay between the uncertainties on $\alpha_S(M_Z^2)$ and the uncertainties on the PDFs. In particular we calculate the cross sections for key processes at the LHC and show how the uncertainties coming from the standard MMHT2014 PDFs and from $\alpha_S(M_Z^2)$ can be provided independently and can be combined.


Introduction
There has been a continual improvement in the precision and in the variety of the data for deep-inelastic and related hard-scattering processes. Noteworthy additions in the years since the MSTW2008 analysis [1] have been the HERA combined H1 and ZEUS data on the total [2] and charm structure functions [3], and the variety of new data sets obtained at the LHC, as well as updated Tevatron data (for full references see [4]). Moreover the procedures used in the global PDF analyses of these data have been refined, allowing the partonic structure of the proton to be determined with ever-increasing accuracy and reliability. These improvements are important as it is necessary to quantify the Standard Model background as accurately as possible in order to isolate possible experimental signals of New Physics. One area that needs careful attention, at the present level of accuracy, is the treatment of the strong coupling, α S itself, in the global analyses. Here we extend the recent MMHT2014 global PDF analysis [4] to study the uncertainties on α S and their implications for predictions for processes at the LHC.
2 Treatment of α S (M 2 Z ) in the MMHT2014 analysis We refer to Fig. 1 for an overview of the treatment and of the values of α S obtained in the MMHT2014 global PDF analysis [4]. At both NLO and NNLO the value of α S (M 2 Z ) is allowed to vary just as another free parameter in the global fit. The best fit values are found to be as indicated by the dark arrows in Fig. 1. The corresponding total χ 2 profiles versus α S (M 2 Z ) are shown in Fig. 2. These plots clearly show the reduction in the optimum value of α S (M 2 Z ) as we go from the NLO to the NNLO analysis. In the next section we show how the individual data sets contribute to make up this χ 2 profile versus α S (M 2 Z ). It is sometimes debated whether one should attempt to extract the value of α S (M 2 Z ) from the PDF global fits or to simply use a fixed value taken from elsewhere -for example, to use the world average value [5]. However, we believe that very useful information on the coupling can be obtained from PDF fits, and hence have performed fits where this is left as a free parameter. As the extracted values of α S (M 2 Z ) in the NLO and NNLO MMHT2014 analyses [4] reasonably bridge the world average of α S (M 2 Z ) = 0.1185 ± 0.0006 [5], we regard these as our best fits. We note it is a common result in PDF analyses, and in other extractions of the strong coupling, for the best fit value to fall slightly as the order of the theoretical calculations increases. However, in order to explore further, as well as leaving α S (M 2 Z ) as a completely independent parameter, the MMHT2014 analyses were repeated including the world average value (without the inclusion of DIS data to avoid double counting) of α S (M 2 Z ) = 0.1187 ± 0.0007 as a data point in the fit. This changed the preferred values to as indicated by the short arrows in Fig 1. Each of these is about one standard deviation away from the world average, so our PDF fit is entirely consistent with the independent determinations of the coupling. Moreover, the quality of the fit to the data (other than the single 'data'  [4]. The dashed arrows indicate the values found in the MSTW2008 analysis [1]. We also show the world average value, which we note was obtained assuming, for simplicity, that the NLO and NNLO values are the same -which, in principle, is not the case. The short arrows are also of interest as they indicate the NLO and NNLO values which would have been obtained from the MMHT2014 global analyses if the world average value (obtained without including DIS data) were to be included in the fit. However, the default values α S,NLO = 0.120 and α S,NNLO = 0.118 were selected for the final MMHT2014 PDF sets 'for ease of use'; indeed, the small values of ∆χ 2 are the minute changes in χ 2 global in going from the optimal to these default fits.
point on α S (M 2 Z )) increases by about 1.5 units in χ 2 at NLO and just over one unit at NNLO when the coupling was included as a data point.
However, ultimately for the use of PDF sets by external users it is preferable to present the sets at common (and hence 'rounded') values of α S (M 2 Z ) in order to compare and combine with PDF sets from other groups, for example as in [6,7,8,9]. At NLO in the MMHT2014 analysis [4] we hence chose α S (M 2 Z ) = 0.120 as the default value for which the PDF sets with full error eigenvectors are made available. This is essentially identical to the value for the best PDF fit when the coupling is free, and still very similar when the world average was included as a constraint. At NNLO, α S (M 2 Z ) = 0.118 was chosen as a rounded value, very near to both the best fit value and the world average, and the fit quality is still only 1.3 units in χ 2 higher than that when the coupling was free. This is extremely close to the value determined when the world average is included as a data point. Hence, in MMHT2014 [4], we chose to use α S (M 2 Z ) = 0.118 as the default for NNLO PDFs, a value which is very consistent with the world average. At NLO we also made a set available with α S (M 2 Z ) = 0.118, but in this case Global fit: χ 2 0 = 3268.6 for 3040 pts.
Global fit: χ 2 0 = 2719 for 2707 pts. the χ 2 increases by 17.5 units from the best fit value. In [4] we also made available PDF sets corresponding to the best fit for α S (M 2 Z ) values ±0.001 relative to the default values in order for users to determine the α S (M 2 Z )-uncertainty in predictions if so desired. We will return to the issue of PDF+α S (M 2 Z ) uncertainty later. Before we continue we should specify how the running of α S (Q 2 ) is treated. There is more than one definition of the coupling commonly used in QCD phenomenology. Although the various prescriptions are all formally equivalent since they differ only at higher orders, numerical differences of the order of up to 1% can occur. We use the definition based on the full solution of the renormalisation group equation, in MS scheme, at the appropriate order, with boundary condition defined by the value of α S (M 2 Z ). This is identical to the definition in public codes such as pegasus [10] and hoppet [11], and is now effectively the standard in PDF analyses. 1 It differs, for example, from solutions to the renormalisation group equations truncated at a particular order.

Description of data sets as a function of α S
The NNLO MMHT2014 global analysis [4] was based on a fit to 40 different sets of data on deep-inelastic and related hard scattering processes. There were 10 different data sets of structure functions from the fixed-target charged lepton-nucleon experiments of the SLAC, BCDMS, NMC and E665 collaborations, six different neutrino data sets on F 2 , xF 3 and dimuon production from the NuTeV, CHORUS and CCFR collaborations, two Drell-Yan data sets from E886/NuSea, six different data sets from HERA involving the combined H1 and ZEUS structure function data, seven data sets from the Tevatron giving the measurements of inclusive jet, W and Z production by the CDF and D0 collaborations and, finally, nine data sets from the ATLAS, CMS and LHCb collaborations at the LHC. In addition, the NLO fit also used jet data from the ATLAS, CMS and H1 and ZEUS collaborations, which were not used at NNLO because it was judged that at present there is not sufficient knowledge of the full jet NNLO cross section; jet production at the Tevatron, on the other hand, is much nearer to threshold than at the LHC, so the threshold approximation to the full NNLO calculation is much more likely to provide a reliable estimate in this case. The goodness-of-fit quantity, χ 2 n , for each of the data sets, n = 1, ...40, is given for the NLO and NNLO global fits in the Table 5 of [4], and the χ 2 definition is explained in Section 2.5 of the same article. The references to all the data that are fitted are also given in [4].
In the NNLO global fit of [4], let us denote the contribution to the total χ 2 from data set n by χ 2 n,0 . Here we explore the χ 2 n profiles as a function of α S (M 2 Z ) by repeating the global fit for different fixed values of α S (M 2 Z ) in the neighbourhood of the optimum value given in (2). The results are shown in Figs. 3-7, where we plot the χ 2 n profiles for each data set n as the difference from the value at the global minimum, χ 2 n,0 , when varying α S (M 2 Z ). The points (•) in Figs. 3-7 are generated for fixed values of α S (M 2 Z ) between 0.108 and 0.128 in steps of 0.001. These points are then fitted to a quadratic function of α S (M 2 Z ) shown by the continuous curves. By definition we expect the profiles to satisfy (χ 2 n − χ 2 n,0 ) = 0 at α S (M 2 Z ) = 0.1172, corresponding to the value of α S (M 2 Z ) at the NNLO global minimum. Ideally, a data set should show a quadratic minimum about this point. Of course, in practice, the various data sets may pull, in varying degrees, to smaller or larger values of α S (M 2 Z ). There is a small amount of point-to-point fluctuation for the values of (χ 2 n − χ 2 n,0 ), even near the minimum, but near the minimum this is generally only at the level of fractions of a unit in χ 2 for a given data set. The fluctuations become larger as we go to values of α S (M 2 Z ) far from the minimum, particularly for lower α S (M 2 Z ), mainly because changes in χ 2 with small changes in α S (M 2 Z ) are becoming much greater. However, some of the "jumps" for individual sets near α S (M 2 Z ) = 0.108 imply that the global minimum in χ 2 for this choice of α S (M 2 Z ) is rather flat in certain parameter directions, with some relatively easy trade-off between the data sets which are poorly fit, and a transition to a different, approximately degenerate global minimum occurring with a small change in α S (M 2 Z ). Indeed, we have verified that at α S (M 2 Z ) = 0.108 there is a local minimum where "jumps" are eliminated, but with slightly higher global χ 2 than the result where there are "jumps". This highlights the fact that the PDF uncertainty is difficult to define properly for a value of α S (M 2 Z ) which is far from optimal and leads to many data sets being badly fit. We repeat this exercise at NLO. Then the profiles will satisfy (χ 2 n − χ 2 n,0 ) = 0 at α S (M 2 Z ) = 0.1201. We include in the plots the NLO points (as triangles) and show the corresponding quadratic fit by a dashed curve. In Fig. 8 we show the χ 2 n profiles for the LHC and HERA jet data that were included in the NLO fit. Here the bullet points and profile curve correspond to the NLO fit. These data were not included in the NNLO fit. Z ) for the different data sets is somewhat reduced. Thus the overall fit to this subset of the data is marginally better at NNLO. The difference α S,NNLO < α S,NLO is clearly evident in the majority of the corresponding plots.
The recent combined H1 and ZEUS structure function data from HERA prefer a value of α S (M 2 Z ) of about 0.120 at NNLO. Perhaps the only surprising result is the α S (M 2 Z ) behaviour of the combined data for F charm 2 which prefers a very low value of α S (M 2 Z ) at NNLO, whereas the uncombined data had a perfect quadratic behaviour about 0.118, see Fig.5 of [13]. Note, however, that the combined data contains some points at the lowest Q 2 which were not available as an individual data set. These data, particularly at low Q 2 , are sensitive to the value of the charm mass m c , and there is a correlation between its value and α s (M 2 Z ) [14]. This will be studied again with the up-to-date data in a future article.
The longitudinal structure function F L leads off with an α S term, and so the value of (χ 2 n − χ 2 n,0 ) depends more sensitively on α S (M 2 Z ). The NNLO plot shows an excellent quadratic dependence on α S (M 2 Z ), centred at 0.118. The NNLO coefficient functions for F L (x, Q 2 ) [15,16] are positive and significant, and the NLO fit tries to mimic these with a higher value of α S (M 2 Z ). Indeed, the data for F L , and also the E866/NuSea pp Drell-Yan cross sections data, are clearly more quadratic at NNLO than at NLO, with minima closer to the best-fit values. This indicates a strong preference for the NNLO description, which is not so apparent if only the global bestfit values χ 2 n,0 are known. As the E866/NuSea data for pd/pp Drell-Yan production are a ratio of cross sections, the sensitivity to the value of α S (M 2 Z ) is small. The Tevatron data, as well as the ATLAS W ± , Z production data and the ATLAS highmass Drell-Yan data, show, at NNLO, α S (M 2 Z ) profiles with quadratic behaviour with minima HERA e + p NC 920 GeV NLO NNLO CMS jets (7 TeV): χ 2 n,0 = 138.1 for 133 pts.
H1 99-00 e + p incl. jets: χ 2 n,0 = 14.2 for 24 pts. close to the best fit values. Again, the profiles are improved to those at NLO. The counter example are the LHCb data, which have profiles which are more reasonable at NLO than at NNLO. In general, the charge-lepton asymmetry measurements arising from W ± production at the Tevatron and the LHC, which are a ratio of cross sections, have much less constraint on the value of α S (M 2 Z ). Judging from the values of (χ 2 n − χ 2 n,0 ) away from the different minima of the various data sets or, rather, the steepness of the quadratic forms in α S (M 2 Z ), we see that there is a tendency for data at lower energies or lower Q 2 to have more constraint on the optimum global value of α S (M 2 Z ). This is to be anticipated, as we will see in Section 6.
There is a particularly strong, but also complicated, relationship between the value of α S (M 2 Z ) and the fit to data on the inclusive cross section for tt production, σ tt . We show the χ 2 profiles at NLO and NNLO in Fig. 9. Clearly there is a preference for a lower value of α S (M 2 Z ) at NNLO than at NLO, and a strong constraint in both cases, with χ 2 increasing by a large number of units, certainly compared to the number of data points, for small changes in α S (M 2 Z ). Indeed, nominally σ tt provides one of the strongest constraints of any data set for the lower limit of α S (M 2 Z ) at NLO and the upper limit of α S (M 2 Z ) at NNLO. However, the picture is more complicated than for other data sets due to the very strong correlation with the value of the mass m t of the top quark.
In the global fits the theory calculation of σ tt is performed with a preferred value of the top quark pole mass of 172.5 GeV, since this is the default in PYTHIA, used to extract the cross section in many of the measurements. Moreover, the majority of the cross sections are quoted for this value of m t . This value is also consistent with the world average of the measured value of 173.34 GeV [5]. However, we allow a 1 GeV uncertainty on the value of m t which can be thought of as accounting for the uncertainty in the value of m t itself and also for the small variation in the extracted cross sections with m t used; in general this is about a third the size of the variation of the calculation of σ tt with m t , and the net effect is an effective uncertainty a little lower than 1 GeV. To be specific, m t is left as a free parameter in the fit, but there is a χ 2 penalty of χ 2 mt = (m t − 172.5 GeV) 2 applied to keep the value close to the preferred value. This penalty is included in the values in Fig. 9. The allowed variation in m t away from the preferred central value of 172.5 GeV results in the NLO fit preferring a low value of m t = 171.7 GeV and the NNLO fit preferring a high value of m t = 174.2 GeV. The low value of m t in the global fit and the high value of α S (M 2 Z ) preferred by σ tt when α S (M 2 Z ) is varied, both occur for the same reason. That is, the NLO cross section tends to undershoot the data, and raising α S (M 2 Z ) and lowering m t both raise the cross section, leading to better agreement.
The NNLO correction to the cross section in the pole mass scheme is moderate, but large compared to the most precise data, and hence the NNLO cross section tends to be too high. This Tevatron, ATLAS, CMS σ tt : χ 2 n,0 = 6.6 for 13 pts.
Tevatron, ATLAS, CMS σ tt : χ 2 n,0 = 7.9 for 13 pts. leads to the opposite pulls to those at NLO, i.e NNLO prefers α S (M 2 Z ) low and m t high. Within the global fit we find that the allowed variation with accompanying penalty for deviations from m t = 172.5 GeV results in m t values at the best fit values of α S (M 2 Z ) which are of order 1 − 2σ away from either our preferred value or the world average, so have no particular inconsistency, but it is useful to examine the interplay between α S (M 2 Z ) and m t in rather more detail. Z ) of only ∼ 0.0005. We do note, however, that without a penalty for m t variation the best global fits are at m t = 168 GeV and m t = 180 GeV at NLO and NNLO respectively, so some penalty is clearly necessary. Ultimately, the value of α S (M 2 Z ) determined by the global fit is very insensitive to the value of m t used, and indeed, to the σ tt data, because these correspond to relatively few data points. Indeed, if these are left out of the global fit the change in the optimum value of α S (M 2 Z ) is only of order 0.0001−2 at NLO and NNLO. However, the interplay between α S (M 2 Z ) and m t is more dramatic for the σ tt data alone, as we will now show. . As α S (M 2 Z ) decreases there is a preference for a smaller mass, hence if the central value of m t had been chosen higher than 172.5 GeV for example, the best fit to σ tt would be for a higher value of α S (M 2 Z ). The constraint on α S (M 2 Z ) in the upper direction would be weakened slightly, however this data set does provide a significant constraint in this direction. If the penalty had been less severe, e.g. a increase in χ 2 tt for ∆m t = 2 GeV rather than ∆m t = 1 GeV, the best value of m t and α S (M 2 Z ) would not change significantly, as the fit quality does not improve for masses of m t < 171.7 for any α S (M 2 Z ), even discounting the penalty. However, the χ 2 tt curves for lower values of α S (M 2 Z ), i.e. 0.119 and 0.118 are falling quite steeply as m t decreases in the vicinity of m t = 172 GeV, so the increase in χ 2 tt with decreasing α S (M 2 Z ) seen in Fig. 9 (left) would be less severe if m t was allowed to choose smaller values, and the constraint on the lower values of α S (M 2 Z ) would be reduced somewhat. Hence, at NLO, alternative treatments of m t would allow a slightly higher best fit α S (M 2 Z ) than the default treatment, and a little scope for a relaxation of the lower limit on α S (M 2 Z ). At NNLO it is again clear that higher values of α S (M 2 Z ) prefer higher values of m t . However, for α S (M 2 Z ) = 0.118 or 0.119 the value of m t corresponding to the best fit is m t = 180 GeV or more. Again, there is little variation in the best value of χ 2 tt for 168 GeV< m t < 178 GeV,  Fig. 11 (right) to be less steep. Hence, at NNLO alternative treatments of m t would allow a slightly higher best fit value of α S (M 2 Z ) than the default treatment, and a little scope for a relaxation of the upper limit on α S (M 2 Z ). Hence, the overall conclusion is that some added freedom in m t would lead to potentially rather small changes in the minima of the χ 2 curves in Fig. 11, but a reduced rate of increase of χ 2 away from the minima. The implications of this will be discussed in the next section.
5 Uncertainty on α S (M 2 Z ) and calculation of PDF+α S (M 2 Z ) uncertainty First, recall that in the MMHT2014 analysis [4] we determined the uncertainties of the PDFs using the Hessian approach with a dynamical tolerance procedure. We obtained PDF 'error' eigenvector sets, each corresponding to 68% confidence level uncertainty, where the vectors are orthogonal to each other and span the PDF parameter space.
In order to determine the uncertainty on α S (M 2 Z ) at NLO and NNLO we begin by using the same technique as in the MSTW study of Ref. [13]; that is, for the 'error' eigenvectors we apply the tolerance procedure to determine the uncertainty in each direction away from the value at the best fit when one data set goes beyond its 68% confidence level uncertainty. The values at which each data set does reach its 68% confidence level uncertainty, plus the value of α S (M 2 Z ) for which each data set has its best fit (within the context of a global fit) are shown at NLO and NNLO in Fig. 12. However, unlike Fig. 7 of [13] we do not show all data, as with the increased number of sets there are now too many to show clearly on a single figure. Moreover, as seen earlier, many data sets have very little dependence, and hence produce very little constraint. Hence, we show those where both limits are within the range of α S (M 2 Z ) explicitly studied, i.e. 0.108 − 0.128 or where one limit is within 0.005 of the best fit value of α S (M 2 Z ). None of the data sets omitted using these criteria have a significant pull on α S (M 2 Z ). The dominant constraint on α S (M 2 Z ) in the downwards direction at NLO is from the top pair cross section data and, using the dynamical tolerance procedure, gives an uncertainty of ∆α S (M 2 Z ) = −0.0014. In the upwards direction it is the BCDMSp data with an uncertainty of ∆α S (M 2 Z ) = +0.0012. At NNLO the dominant downward constraint comes from NuTeV F 3 (x, Q 2 ) data which gives ∆α S (M 2 Z ) = −0.0012 and in the upwards direction it is the top pair cross section data, where the uncertainty is ∆α S (M 2 Z ) = +0.0008. There are a number of other data sets which give almost as strong constraints. For instance, at NLO in the downwards direction we find that SLAC deuterium data give ∆α S (M 2 Z ) = −0.0018 and in the upwards direction H1 jets give ∆α S (M 2 Z ) = +0.0019. At NNLO in the downwards direction SLAC deuterium data and CDF jet data give ∆α S (M 2 Z ) ≈ −0.0014, and in the upwards direction, at NNLO, the BCDMSp data give ∆α S (M 2 Z ) = +0.0014. In all cases there are other data sets that are not much less constraining than those mentioned explicitly. Hence, in no case is it a single data set which is overwhelmingly providing the dominant constraint on the upper or lower limit of α S (M 2 Z ). Similarly, no single data sets would change the central value by more than 0.001 if it were to be omitted.
Two of the four dominant constraints nominally come from σ tt , and at NLO we have α S (M 2 Z ) = 0.1201 +0.0012 −0.0014 and at NNLO α S (M 2 Z ) = 0.1172 +0.0008 −0.0012 . However, in the previous section we highlighted the interplay between α S (M 2 Z ) and m t when examining the fit quality of the σ tt data. We demonstrated that if some extra flexibility is allowed on the choice of central value of m t and/or on the 1-σ uncertainty that is used, then the constraints are relaxed to some degree. Hence, we are reluctant to treat the constraint from the data on σ tt completely rigorously. In order to see quite how we should deal with the constraints nominally due to these data we first check which data sets provide the next tightest constraint. If we were simply to ignore the constraints from σ tt we would find a change in uncertainty at NLO of ∆α S (M 2 Z ) = −0.0012 → −0.0017 and at NNLO ∆α S (M 2 Z ) = +0.0008 → +0.0014. These are significant, but hardly dramatic changes, and it would be no surprise if some alternative treatment of the default top mass resulted in changes of a similar type. Hence, it might be suitable to take these values as a simple alternative, arguing that the constraints from σ tt are not sufficiently greater than those from other data sets either to ignore the possible effects of alternative treatments of the mass m t or to warrant a completely thorough investigation at this stage. 3 However, there is the additional feature to note -whichever criterion we use we have some, albeit not too dramatic, asymmetry in the α S (M 2 Z ) uncertainty. There is no strong reason to apply this slight asymmetry, as the χ 2 profile for the global fit follows the quadratic curve very well at both NLO and NNLO, and the degree of asymmetry obtained using the dynamical tolerance procedure is arguably within the "uncertainty of the uncertainty". Hence at NLO and NNLO we average the two uncertainties (obtained without the σ tt constraint) obtaining When considering the uncertainty on the prediction for a physical quantity we should include the uncertainty on α S (M 2 Z ), as well as that on the PDFs. This is particularly important for cross sections that at leading order are proportional to a power of the coupling, such as σ tt or σ Higgs , which are proportional to α 2 S . A naive procedure would be to compute the error as The constraint from σ tt data does provide the dominant constraint in one direction for eigenvector 15 at NNLO. However, very nearly as strong a constraint is provided by other data sets and the eigenvector only provides at the very most 40% of the uncertainty on one distribution, the gluon, at any x value, in practice at high x. Hence, a slightly increased tolerance for this eigenvector would have minimal impact on any PDF uncertainties. 4 Taking a weighted average of the values in eqs. (4) and (5) would result in values slightly nearer to the world average, reflecting the fact that the dynamical tolerance procedure used to determine the uncertainty results in a ∆χ 2 global > 1.
where ∆σ α S is variation of the cross section when α S (M 2 Z ) is allowed to vary over a given range. However, it is inconsistent to use different values of α S in the partonic hard subprocess cross section and in the PDF evolution. Moreover, in a global PDF analysis, there are non-negligible correlations between the PDFs and the value of α S .
In the MSTW study [13] of the PDF+α S (M 2 Z ) uncertainties arising from the MSTW2008 analysis we advocated using our best fit value of α S (M 2 Z ) as the central value for PDF predictions, and then provided additional eigenvector sets at ±0.5σ and ±1σ values of α S (M 2 Z ). The uncertainty was then calculated by taking the envelope of the predictions using all these eigenvector sets. This still seems like an appropriate algorithm for use with the dynamical tolerance procedure of obtaining uncertainties. However, it can only really be applied if the central prediction is obtained using the PDFs defined at the best fit value of α S (M 2 Z ), which is no longer the case, and moreover was a rather complicated and time-consuming procedure.
Since the MSTW study [13] was undertaken it has been shown that, within the Hessian approach to PDF uncertainties, the correct PDF+α S (M 2 Z ) uncertainty on any quantity can be obtained by simply taking the PDFs defined at α S (M 2 Z ) ± ∆α S (M 2 Z ) and treating these two PDF sets (and their accompanying value of α S (M 2 Z )) as an extra pair of eigenvectors [18]. In short, the full uncertainty is obtained by adding the uncertainty from this extra eigenvector pair in quadrature with the PDF uncertainty. So we are back to the naive form (6), but now, importantly, with the correlations between the PDFs and α S included. This has the advantages of both being very simple, but also separating out the α S (M 2 Z ) uncertainty on a quantity explicitly from the purely PDF uncertainty. Strictly speaking, the method only holds if the central PDFs are those obtained from the best fit when α S (M 2 Z ) is left free, and if the uncertainty ∆α S (M 2 Z ) on α S (M 2 Z ), that is used, is the uncertainty obtained from the fit. If we use PDFs defined at α S (M 2 Z ) = 0.118 at NNLO we are still very near the best fit, and the error induced will be very small. At NLO a larger error will be induced by using the PDFs defined at α S (M 2 Z ) = 0.118 than those at α S (M 2 Z ) = 0.120. Any choice of ∆α S (M 2 Z ) of 0.001 − 0.002 should only induce a small error. Hence, overall we now advocate using this approach with NLO PDFs defined at α S (M 2 Z ) = 0.120 and NNLO PDFs defined at α S (M 2 Z ) = 0.118. The value of ∆α S (M 2 Z ) is open to the choice of the user to some extent, but it is recommended to stay within the range ∆α S (M 2 Z ) that we have found. In Section 7 we apply the above procedure to determine the PDF+α S (M 2 Z ) uncertainties on the predictions for the cross sections for benchmark processes at the Tevatron and the LHC, but first we examine the change in the PDF sets themselves with α S (M 2 Z ).

Comparison of PDF sets
It is informative to see the changes in the PDFs obtained in global fits for fixed values of α S (M 2 Z ) relative to those obtained for the central value; we only consider the NNLO case here, but note that the NLO PDFs behave in a similar way. These are shown in Figs. 13-15      the various PDFs as a function of x for Q 2 = 10 4 GeV 2 -a value of Q 2 relevant to data from the LHC. In almost every case the changes in the PDFs for the coupling varied in the range 0.116 < α S (M 2 Z ) < 0.120 are well within the PDF uncertainty bounds. As expected, the gluon distribution for x < 0.1 is larger for α S (M 2 Z ) = 0.116 and smaller for α S (M 2 Z ) = 0.120: a change which preserves the product α S g, which approximately determines the evolution of F 2 (x, Q 2 ) with Q 2 at low x. This is the dominant constraint on the gluon, and a smaller low x gluon leads to a larger high-x gluon (and vice versa) due to the momentum sum rule. The u and d PDFs have the opposite trend as α S (M 2 Z ) changes. At small x values this is a marginal effect, due to the interplay of a variety of competing elements. At high x the decreasing quark distribution with increasing α S is due to the quicker evolution of quarks to lower x. The insensitivity of the strange quark PDF to variations of α S (M 2 Z ) at low x is partly just due to the relative insensitivity of all low-x quarks, but is also partially explained by the comments in the previous section about the MMHT analysis [4] of dimuon production in neutrino interactions -where the changes in α S (M 2 Z ) are, to some extent, compensated by changes in the B(D → µ) branching ratio parameter.
In Fig. 16 we compare the changes in the gluon PDF for different fixed values of α S (M 2 Z ) at a much lower value of Q 2 , namely Q 2 = 10 GeV 2 . Here the gluon PDF is much more sensitive to the value of α S (M 2 Z ), and the changes in the gluon PDF lie outside its uncertainty bounds. The message is clear. At the high value of Q 2 = 10 4 GeV 2 the long evolution length means that the gluon PDF in the relevant broad x range about x ∼ 0.01 is determined by PDFs at larger x, and is relatively insensitive to the parameters of the starting distributions.

Benchmark cross sections
In this section we show uncertainties for cross sections at the Tevatron, and for 7 TeV and 14 TeV at the LHC. Uncertainties for 8 TeV and 13 TeV will be very similar to those at 7 TeV To calculate the cross section we use the same procedure as was used in [4]. That is, for W, Z and Higgs production we use the code provided by W.J. Stirling, based on the calculation in [19], [20] and [21], and for top pair production we use the procedure and code of [22]. Here our primary aim is not to present definitive predictions or to compare in detail to other PDF sets, as both these results are frequently provided in the literature with very specific choices of codes, scales and parameters which may differ from those used here. Rather, our main objective is to illustrate the procedure for estimating realistic PDF+α S (M 2 Z ) uncertainties.

W and Z production
We begin with the predictions for the W and Z production cross sections. The results at NNLO are shown in Table 1. In this case the cross sections contain zeroth-order contributions in α S , with positive NLO corrections of about 20%, and much smaller NNLO contributions. Hence a smaller than 1% change in α S (M 2 Z ) will only directly increase the cross section by a small fraction of a percent. The PDF uncertainties on the cross sections are 2% at the Tevatron and slightly smaller at the LHC -the lower beam energy at the Tevatron meaning the cross sections have more contribution from higher x where PDF uncertainties increase. The α S uncertainty σ PDF unc. α S unc.  Table 1: Predictions for W ± and Z cross sections (in nb), including leptonic branching, obtained with the NNLO MMHT2014 parton sets. The PDF and α S uncertainties are also shown, where the α S uncertainty corresponds to a variation of ±0.001 around its central value. The full PDF+α S (M 2 Z ) uncertainty is obtained by adding these two uncertainties in quadrature, as explained in Section 5.
is small, about 0.6% at the Tevatron and close to 1% at the LHC, being slightly larger at 14 TeV than at 7 TeV. Hence, the α S uncertainty is small, but more than the small fraction of a percent expected from the direct change in the cross section with α S . In fact the main increase in cross sections with α S is due to the change in the PDFs with the coupling, rather than its direct effect on the cross section. From Fig. 14 we see that the up and down quark PDFs increase with α S below x ∼ 0.1 − 0.2 due to increased speed of evolution. From Fig. 13 we note that the strange quark PDF increases a little with α S at all x values. As already mentioned the Tevatron cross sections are more sensitive to the high-x quarks, which decrease with increasing α S , so this introduces a certain amount of anti-correlation of the cross section with α S . However, the main contribution is from low enough x that the distributions increase with α S , so the net effect is an increase with α S a little larger than that coming directly from the α s dependence of the cross section. As the energy increases at the LHC the contributing quarks move on average to lower x and the increase of the cross section with α S increasesvery slightly more so at 14 TeV than at 7 TeV. However, even at 14 TeV the total PDF+α S uncertainty obtained by adding the two contributions in quadrature, is only a maximum of about 25% greater (for W − ) than the PDF uncertainty alone if ∆α S (M 2 Z ) = 0.001 is used.

Top-quark pair production
In Table 2 we show the analogous results for the top-quark pair production cross section. At the Tevatron the PDFs are probed in the region x ≈ 0.4/1.96 ≈ 0.2, and the main production is from the qq channel. As we saw, the quark distributions are reasonably insensitive to α S (M 2 Z ) in this region of x, as it is the approximate pivot point of the PDFs. Hence, there is only a σ PDF unc. α S unc.  Table 2: Predictions for tt cross sections (in nb), obtained with the NNLO MMHT2014 parton sets. The PDF and α S uncertainties are also shown, where the α S uncertainty corresponds to a variation of ±0.001 around its central value. The full PDF+α S (M 2 Z ) uncertainty is obtained by adding these two uncertainties in quadrature, as explained in Section 5. small change in cross section due to changes in the PDFs with α S . However, the cross section for tt production begins at order α 2 S , and there is a significant positive higher-order correction at NLO and still an appreciable one at NNLO. Therefore, a change in α S a little lower than 1% should give a direct change in the cross section of about 2%. This is roughly the change that is observed. This is compared to a PDF only uncertainty of nearly 3% due to sensitivity to higher x quarks that occurs for W, Z production.
At the LHC the dominant production at higher energies (and with a proton-proton rather than proton-antiproton collider) is gluon-gluon fusion, with the central x value probed being x ≈ 0.4/7 ≈ 0.06 at 7 TeV, and x ≈ 0.4/14 ≈ 0.03 at 14 TeV. As seen from the left plot of Fig. 13 the gluon decreases with increasing α S (M 2 Z ) below x = 0.1 and the maximum decrease is for x ∼ 0.02 − 0.03. The α S (M 2 Z ) uncertainty on σ tt for 7 TeV is about 2%, almost as large as at the Tevatron, with the gluon above the pivot point still contributing considerably to the cross section, so the indirect α S (M 2 Z ) uncertainty due to PDF variation largely cancels. For 14 TeV the lower x probed means that most contribution is below the pivot point and there is some anti-correlation between the direct α S variation and the indirect, with a reduced α S uncertainty of 1.5%. At this highest energy the PDF only uncertainty has also reduced to about 2% due to the decreased sensitivity to the uncertainty in high-x PDFs, the gluon in this case. At the Tevatron and for 7 TeV at the LHC the α S (M 2 Z ) uncertainty is a little smaller than the PDF uncertainty, and the total is about 1.3 times the PDF uncertainty alone. At 14 TeV they are very similar in size, so the total uncertainty, for ∆α S (M 2 Z ) = 0.001 is about √ 2 that of the PDF uncertainty.

Higgs boson production
In Table 3 we show the uncertainties in the rate of Higgs boson production from gluon-gluon fusion. Again, the cross section starts at order α 2 S and there are large positive NLO and NNLO contributions. Hence, changes in α S of about 1% would be expected to lead to direct changes in the cross section of about 3%. However, even at the Tevatron the dominant x range probed, i.e. σ PDF unc. α S unc.   Table 4: Predictions for Higgs Boson cross sections (in nb), obtained with the NLO MMHT 2014 parton sets.The PDF and α s are shown, with the α s uncertainty corresponding to a variation of ±0.001 around the central value (α S (M 2 Z ) = 0.120). The full PDF+α S (M 2 Z ) uncertainty is obtained by adding these two uncertainties in quadrature, as explained in Section 5.
x ≈ 0.125/1.96 ≈ 0.06, corresponds to a region where the gluon distribution falls with increasing α S (M 2 Z ) and at the LHC where x ≈ 0.01 − 0.02 at central rapidity the anti-correlation between α S (M 2 Z ) and the gluon distribution is near its maximum. Hence, at the Tevatron the total α S (M 2 Z ) uncertainty is a little less than the direct value at a little more than 2%, and at the LHC it is reduced to 1.5%. In the former case this is a little less than the PDF uncertainty of ∼ 3%, with some sensitivity to the relatively poorly constrained high-x gluon, while at the LHC the PDF uncertainty is much reduced due to the smaller x probed, and is similar to the α S (M 2 Z ) uncertainty. Hence for ∆α S (M 2 Z ) = 0.001 the uncertainty on the Higgs boson cross section from gluon-gluon fusion is about √ 2 that of the PDF uncertainty alone.
We also repeat the study at NLO for the Higgs cross section. The results are shown in Tables 4 and 5

Conclusions
The PDFs determined from global fits to deep-inelastic and related hard-scattering data are highly correlated to the value of α S (M 2 Z ) used, and any changes in the values of α S (M 2 Z ) must be accompanied by changes in the PDFs such that the optimum fit to data is still obtained. In [4] we produced PDF and uncertainty eigenvector sets for specific values of α S (M 2 Z ), guided by the values obtained when it was left as a free parameter in the fit. In this article we explicitly present PDF sets and the global fit quality at NLO and NNLO for a wide variety of α S (M NNLO: α S (M 2 Z ) = 0.1172 ± 0.0013 (68% C.L.), already presented in [4], but also present the uncertainties. We show the variation of the fit quality with α S (M 2 Z ) of each data set, within the context of the global fit, and see which are the more and less constraining sets, and which prefer higher and lower values. We see that most data sets show a systematic trend of preferring a slightly lower α S (M 2 Z ) value at NNLO than at NLO, but note that no particular type of data strongly prefers a high or low value of α S (M 2 Z ). HERA and Tevatron data tend to prefer higher values, but are not the most constraining data.
There are examples of fixed target DIS data which prefer either high or low values and similarly for the LHC data sets, which are new compared to our previous analysis [13]. Indeed our best values of α S (M 2 Z ) are almost unchanged from α S (M 2 Z ) = 0.1202 (NLO) and α S (M 2 Z ) = 0.1171 (NNLO). They are also very similar to the values obtained by NNPDF of α S (M 2 Z ) = 0.1191 (NLO) [23] and α S (M 2 Z ) = 0.1173 (NNLO) [24]. However, our extraction disagrees with the recent value α S (M 2 Z ) = 0.1132 (NNLO) in [25]. We find agreement at the level of one sigma or less with the world average value of α S (M 2 Z ) = 0.1187 ± 0.0005, and this improves when we include the world average (without the DIS determinations included) as a data point in our fit, when we obtain α S (M 2 Z ) = 0.1195 (NLO) and α S (M 2 Z ) = 0.1178 (NNLO). Hence, our NNLO value including α S (M 2 Z ) as an external constraint is in excellent agreement with the preferred value, α S (M 2 Z ) = 0.118, for which eigenvector sets are made available. The PDF sets obtained at the 21 different values of α S (M 2 Z ) at NLO and NNLO can be found at [26] and are available from the LHAPDF library [27]. They should be useful in studies of α S (M 2 Z ) by other groups. In order to calculate the PDF+α S (M 2 Z ) uncertainty we now advocate the approach pioneered in [18] of treating PDFs with α S (M 2 Z )±∆α S (M 2 Z ) as an extra eigenvector set. As shown in [18], provided certain conditions are met (at least approximately), the α S (M 2 Z ) uncertainty may be correctly added to the PDF uncertainty by simply adding in quadrature the variation of any quantity under a change in coupling ∆α S (M 2 Z ) as long as the change in α S (M 2 Z ) is accompanied by the appropriate change in PDFs required by the global fit. As examples, we have calculated the total cross sections for the production of W , Z, top quark pairs and Higgs bosons at the Tevatron and LHC. For W and Z production, where the LO subprocess is O(α 0 S ) and is quark-initiated, the combined "PDF+α S " uncertainty is not much larger than the PDF-only uncertainty with a fixed α S . However, the additional uncertainty due to α S is more important for top quark pair production and Higgs boson production via gluon-gluon fusion, since the LO subprocess now is O(α 2 S ), though the details depend on the correlation between α S (M 2 Z ) and the contributing PDFs.
In addition, we note that for any particular process the details of the uncertainty can now be explicitly calculated in a straightforward way using the PDFs we have provided in this paper, together with the procedure for combining PDF and α S (M 2 Z ) uncertainty discussed in Section 5.
Moreover, it is also straightforward to apply the procedure to determine the uncertainties coming from combinations of PDF sets obtained by global analyses of different groups. Using techniques given in [28,29,30,31] it is possible to combine different PDF sets at a preferred value of α S (M 2 Z ) such that the central value and the uncertainty of the combination is correctly obtained. The procedure to determine the uncertainty due to variations of α S (M 2 Z ) is as follows. If each group used in the combination also makes available sets of PDFs obtained by repeating their global fits 5 with α S (M 2 Z ) ± ∆α S (M 2 Z ), then an additional pair of PDF sets representing the α S (M 2 Z ) variation of the combination can be obtained just by taking the average of the PDFs from each group obtained at α S (M 2 Z ) + ∆α S (M 2 Z ), and by taking the average at α S (M 2 Z ) − ∆α S (M 2 Z ). As a result the PDF+α S (M 2 Z ) uncertainty for any quantity calculated using the combined set is just the PDF induced uncertainty at the preferred value of α S (M 2 Z ) added in quadrature to the α S (M 2 Z ) uncertainty determined from the two combined sets defined at α S (M 2 Z ) ± ∆α S (M 2 Z ). Hence, a user may determine for any process the optimum prediction, the PDF uncertainty, the α S (M 2 Z ) uncertainty and the complete PDF+α S (M 2 Z ) uncertainty arising from the combination of a whole collection of different PDFs. 5 For instance, if α S (M 2 Z ) = 0.118 is the preferred value then repeating global fits at α S (M 2 Z ) = 0.117 and α S (M 2 Z ) = 0.119 would be sufficient to quantify the uncertainty due variations of α S .