An investigation of the $\alpha_S$ and heavy quark mass dependence in the MSHT20 global PDF analysis

We investigate the MSHT20 global PDF sets, demonstrating the effects of varying the strong coupling $\alpha_S(M_Z^2)$ and the masses of the charm and bottom quarks. We determine the preferred value, and accompanying uncertainties, when we allow $\alpha_S(M_Z^2)$ to be a free parameter in the MSHT20 global analyses of deep-inelastic and related hard scattering data, at both NLO and NNLO in QCD perturbation theory. We also study the constraints on $\alpha_S(M_Z^2)$ which come from the individual data sets in the global fit by repeating the NNLO and NLO global analyses at various fixed values of $\alpha_S(M_Z^2)$, spanning the range $\alpha_S(M_Z^2)=0.108$ to $0.130$ in units of $0.001$. We make all resulting PDFs sets available. We find that the best fit values are $\alpha_S(M_Z^2)=0.1203\pm 0.0015$ and $0.1174\pm 0.0013$ at NLO and NNLO respectively. We investigate the relationship between the variations in $\alpha_S(M_Z^2)$ and the uncertainties on the PDFs, and illustrate this by calculating the cross sections for key processes at the LHC. We also perform fits where we allow the heavy quark masses $m_c$ and $m_b$ to vary away from their default values and make PDF sets available in steps of $\Delta m_c =0.05~{\rm GeV}$ and $\Delta m_b =0.25~{\rm GeV}$, using the pole mass definition of the quark masses. As for varying $\alpha_S(M_Z^2)$ values, we present the variation in the PDFs and in the predictions. We examine the comparison to data, particularly the HERA data on charm and bottom cross sections and note that our default values are very largely compatible with best fits to data. We provide PDF sets with 3 and 4 active quark flavours, as well as the standard value of 5 flavours.


Introduction
In recent years there has been a significant improvement in both the precision and in the variety of the available data for deep-inelastic and related hard-scattering processes which can be used in determinations of the parton distribution functions (PDFs). This has been matched by the increasing precision of the theoretical calculations for the accompanying cross sections. These have both contributed to the recent PDF update, known as the MSHT20 PDFs [1], which supersede the previous MMHT2014 PDFs [2] obtained using the same general framework. Particular additions to the data used in the global fit have been the final HERA combined H1 and ZEUS data on the total and the heavy flavour cross sections, some final precision Tevatron asymmetry data and new Drell Yan, top quark pair, jet and Z p T data sets obtained at the LHC. For some of the LHC data this is the first of our PDF determinations for which the full NNLO calculations have been available. Additionally, the procedures used in the global PDF analyses have been improved, particularly the parameterisation, allowing the partonic structure of the proton to be determined with improved accuracy and reliability. Indeed, a new result is that the NNLO PDF set is found to be greatly favoured in comparison to the NLO PDF set [1]. However, in our default PDF determination we have presented PDFs with fixed, pre-determined values of the strong coupling constant α S (M 2 Z ), and only made passing reference to the preferred values. Here we extend the recent MSHT20 global PDF analysis to study in turn the preferred value and uncertainty on α S (M 2 Z ), the constraints from individual data sets in the fit, and the implications for predictions for processes at the LHC. Moreover, we make available NLO and NNLO PDF sets for various fixed values of α S (M 2 Z ), spanning the range α S (M 2 Z ) = 0.108 to 0.130 in units of 0.001. Similarly, our default fit uses fixed pole masses of the charm and bottom quarks, m c = 1.4 GeV and m b = 4.75 GeV. Here, we extend the MSHT20 global PDF analysis [1] to study the dependence of the PDFs, and the quality of the comparison to data, under variations of these masses away from their default values. We investigate the resulting predictions for processes at the LHC. We make available central PDF sets for m c = 1.2 − 1.6 GeV in steps of 0.05 GeV and m b = 4.00 − 5.50 GeV in steps of 0.25 GeV, and also make available the standard MSHT20 PDFs, as well as the sets with alternate masses, in the 3 and 4 flavour number schemes. 2 The strong coupling α S (M 2 Z ) In our default PDF study we fix the value α S (M 2 Z ) = 0.118 at NNLO, in order to be consistent with the world average value [3]. At NLO we consider the same value α S (M 2 Z ) = 0.118, and also use α S (M 2 Z ) = 0.120 since the best-fit value of α S (M 2 Z ) in NLO PDF studies consistently lies ∼ 0.002 above that at NNLO [4]. In [1] we noted, however, that as for the MMHT2014 PDFs the best fit value of α S (M 2 Z ) at NNLO is just a little below the default of α S (M 2 Z ) = 0.118 while at NLO it is again close to α S (M 2 Z ) = 0.120. Here we present the variation with α S (M 2 Z ) in The corresponding total χ 2 profiles versus α S (M 2 Z ) are shown in Fig. 1. The points indicate the fits performed with different fixed α S (M 2 Z ) values whilst the line represents a quadratic fit. These plots indeed clearly show the reduction in the optimum value of α S (M 2 Z ) as we go from the NLO to the NNLO analysis. It is also clear that the global χ 2 shows a very good quadratic behaviour as a function of α S (M 2 Z ), even for the extreme α S (M 2 Z ) values taken well away from the best fits. We also provide in Table 1 the ∆χ 2 values as one moves away from the best fit values of α S (M 2 Z ) of 0.1203 at NLO or 0.1174 at NNLO. In the next section we show how the individual data sets contribute to produce this χ 2 profile versus α S (M 2 Z ), and also determine the uncertainty on α S (M 2 Z ). It is a matter of debate whether one should actually extract the value of α S (M 2 Z ) from PDF global fits or simply use a fixed value, i.e. the world average value [3]. Our opinion has always been that a very accurate and precise value of the coupling can be obtained from PDF fits, and hence we have traditionally performed fits in order to determine this parameter and its uncertainty. Indeed the extracted value of α S (M 2 Z ) in the NNLO MSHT analyses continues the trend of our extraction of being close to the world average of α S (M 2 Z ) = 0.1179 ± 0.001 [3], and as in previous studies our NLO value is a little higher, a result frequently seen in extractions of α S (M 2 Z ). Hence, the result from our PDF fit is entirely consistent with the independent determinations of the coupling. We also note that the quality of the global fit to the data increases by only 2.7 units in χ 2 at NNLO when we move away from our absolute best-fit value to the default of α S (M 2 Z ) = 0.118, and so our default PDFs give an extremely good representation of our PDFs at the best fit value of α S (M 2 Z ). Hence, since for the use of PDF  sets by external users it is preferable to present PDFs at common (and hence 'rounded') values of α S (M 2 Z ) in order to compare and combine with PDF sets from other groups, for example as in [5][6][7][8][9], we continue to define those extracted for the choice α S (M 2 Z ) = 0.118 as the default. At NLO we also make a set available with the same α S (M 2 Z ) = 0.118 following the same reasoning, but in this case the χ 2 increases by 50 units from the best fit value. Therefore the default PDFs also contain the set at the round value of α S (M 2 Z ) = 0.120, extremely close to the best-fit value -differing by only 1.1 units of χ 2 . In [1] we provided PDF sets corresponding to the best fit for α S (M 2 Z ) values ±0.001 and ±0.002 relative to the default values, in order for users to determine the α S (M 2 Z ) uncertainty in predictions if so desired. Here we will extend the range of α S (M 2 Z ) values provided, and return to the issue of PDF+α S (M 2 Z ) uncertainty later.
2.1 Description of data sets as a function of α S (M 2 Z ) The MSHT20 global analysis [1] presented PDF sets at LO, NLO and NNLO in α S , where for NNLO we use the splitting functions calculated in [10,11] and for structure function data, the massless coefficient functions calculated in [12][13][14][15][16][17]. The fit is based on a fit to 61 different sets of data on deep-inelastic and hard scattering processes 1 . These comprise: 10 structure functions data sets from the fixed-target charged lepton-nucleon experiments of the SLAC, BCDMS, NMC and E665 collaborations; 6 neutrino data sets on F 2 , xF 3 and dimuon production from the NuTeV, CHORUS and CCFR collaborations; 2 fixed-target Drell-Yan data sets from E886/NuSea; eight data sets from HERA involving the combined H1 and ZEUS structure function data and heavy flavour structure function data; 8 data sets from the Tevatron, namely measurements of inclusive jet, W and Z production by the CDF and DØ collaborations; and finally, in a dramatic increase from the MMHT2014 study, 27 data sets from the ATLAS, CMS and LHCb collaborations at the LHC. The goodness-of-fit, χ 2 n , for each of the data sets is given for the NLO and NNLO global fits in the Tables 6 and 7 of [1], and the χ 2 definition is explained in Section 2.4 of the same article. The references to all of the data that are fitted are also given in [1].
For the NNLO global fit of [1], we denote the contribution to the total χ 2 from data set n by χ 2 n and we investigate the χ 2 n profiles as a function of α S (M 2 Z ) by repeating the global fit for different fixed values of α S (M 2 Z ) in the neighbourhood of α S (M 2 Z ) = 0.118. The results for data sets which show significant dependence on α S (M 2 Z ) are shown in Figs. 2 -5, where we plot the χ 2 n profiles when varying α S (M 2 Z ) for data set n as the difference from the value at the global minimum, χ 2 n,0 . Unlike in [4] we do not show all data sets, largely because the number has now expanded significantly. The points (+) in Figs. 2 -5 are generated for fixed values of α S (M 2 Z ) between 0.108 and 0.130 in steps of 0.001. These are then fitted to a quadratic function of α S (M 2 Z ) over the central region of the α S (M 2 Z ) variation, shown by the continuous curves, and included as a guide to the eye. The profiles satisfy (χ 2 n − χ 2 n,0 ) = 0 at α S (M 2 Z ) = 0.1174, corresponding to the value of α S (M 2 Z ) at the NNLO global minimum. If all data sets behaved in the same manner with respect to α S (M 2 Z ) then each would show a quadratic minimum about this point. Of course, in practice, the various data sets pull in varying degrees to smaller or larger values of α S (M 2 Z ). There is also some point-to-point fluctuation for the values of (χ 2 n − χ 2 n,0 ), even near the minimum, but this is generally small. A small number of data sets show some non-quadratic behaviour, but these do not include the sets with the most significant dependence on α S (M 2 Z ). Note that the fact that the minimum of χ 2 for an individual set within the global fit may be very different from the value of α S (M 2 Z ) preferred by the global fit highlights the issues in obtaining a value of α S (M 2 Z ) by comparison with a single data set using global fit PDFs, as discussed in [18].
We comment first in detail on the NNLO profiles, then we repeat this exercise at NLO, where the (χ 2 n − χ 2 n,0 ) profiles and fits are shown on the same figures (Figs. 2 -5) as the NNLO. The profiles in this case satisfy (χ 2 n − χ 2 n,0 ) = 0 at α S (M 2 Z ) = 0.1203. We make fewer comments on the NLO profiles as it was clearly shown in [1] that the NNLO fit is now preferred by the data, with NNLO needed to adequately describe many of the newer LHC data sets which became available for the MSHT20 analyses.
The fixed-target structure function data play an important role in constraining the value of α S (M 2 Z ), and there is some tension between these data sets. These are shown in Fig. 2 Z ) values around 0.114 and 0.120 respectively. The BCDMS and SLAC F d 2 data come from similar regions of x and Q 2 and are both subject to the deuteron corrections described in Section 2.2 of [1]. These are determined from a function with 4 free parameters which are allowed to vary, and have uncertainties, as α S (M 2 Z ) varies in the fits. However, the correction depends on x only, so there is perhaps some indication from this observation of α S (M 2 Z ) difference between proton and deuteron that the deuteron correction may prefer also some Q 2 dependence. The neutrino F 2 and xF 3 data prefer α S (M 2 Z ) ∼ 0.115 and 0.121 respectively. Neutrino dimuon production (not shown) has little dependence on α S (M 2 Z ), since the extra B(D → µ) branching ratio parameter, which we allow to vary with a 10% uncertainty, can partially compensate for the changes in α S (M 2 Z ). The combined H1 and ZEUS structure function data from HERA, shown in the top row of Fig. 3, do not provide a strong constraint on α S (M 2 Z ), though this is largely because the data play such a central role in the fit that the gluon distribution varies with α S (M 2 Z ) in a manner very much aligned with keeping the fit to these data at its optimum. The most sensitive inclusive data, i.e. the 920 GeV e + p data, mildly (given the very large number of points in this data set -402) prefer a value of α S (M 2 Z ) of about 0.120 at NNLO, while the heavy flavour structure function data prefer a quite low value of α S (M 2 Z ) at NNLO. These data, particularly at low Q 2 , are sensitive to the value of the charm mass m c , and there is some correlation between its value and α S (M 2 Z ), as discussed later in Section 3. The Tevatron data sets with the most interesting effects and largest constraints on the value of α S (M 2 Z ) are shown in the second row and third row (left) of Fig. 3. The Tevatron    data consist largely of Z-rapidity data and charge-lepton asymmetry measurements arising from W ± production, which are a ratio of cross sections, and therefore generally have little dependence on the value of α S (M 2 Z ). However, the latest DØ W -asymmetry, which is the most precise of these measurements, does have some limited sensitivity at NNLO, at least in the vicinity of the global best fit α S (M 2 Z ) values, with a preferred value very close to the global best-fit value. Nonetheless, given its small number of data-points and small χ 2 , the profiles are subject to significant noise, whilst the quadratic behaviour clearly reduces significantly away from these central α S (M 2 Z ) variations. The most constraining data sets from the Tevatron are therefore, predictably, the jet data sets. The CDF jet data prefer a slightly higher value, near α S (M 2 Z ) ∼ 0.119, at NNLO and disfavour low values, while the DØ jet data have a clear preference for higher values near α S (M 2 Z ) ∼ 0.124. Tevatron data on the total top pair production cross-section are also a part (along with LHC data) of the σ tt dataset, whose χ 2 variation is given in the bottom left of Fig. 4, which will be commented on later.
The big change since the MMHT2014 analysis in terms of data is the number of high precision LHC data sets and the variety of NNLO cross sections which can be used for these in the PDF analysis. Some of these display direct sensitivity to α S (M 2 Z ), e.g. inclusive jet cross sections, data on top-pair production and the p T distribution of the Z boson (all shown in Fig. 4), while others, e.g. determinations of the rapidity dependence of W and Z boson production have relatively indirect dependence on α S (M 2 Z ), but provide constraints due to their extreme precision. These are shown largely in Fig. 5, with the LHCb data in the bottom two rows of Fig. 3. Again, we only show those LHC data sets with clear sensitivity to α S (M 2 Z ). Note, however, that in some cases there is little apparent sensitivity since, similarly to the HERA data, the best fit PDFs change with α S (M 2 Z ) in such a manner as to compensate the direct α S (M 2 Z ) dependence. In general there are two contrasting trends in the LHC data. Overall, those data sets with direct dependence on α S (M 2 Z ) in their cross sections tend to more frequently prefer lower values of α S (M 2 Z ), as is clear in Fig. 4. This includes the ATLAS and CMS 7 TeV inclusive jet data, the ATLAS 8 TeV Z p T data, the ATLAS tt single differential dilepton data and CMS tt single differential data and CMS W + c jet data. Note, however, that the differential tt data (third row of Fig. 4) is calculated with a fixed value of m t = 172.5 GeV, whereas the total cross section (bottom row left of Fig. 4) allows the mass to vary (with a penalty), and the best fit value varies as α S (M 2 z ) does. In addition, the W + c jet cross section calculation (shown in Fig. 5 top left) is at NLO. In detail, the ATLAS 7 TeV jets data, CMS 7 TeV jets data, CMS 8 TeV single differential tt data, and ATLAS 8 TeV Z p T data at NNL0 all prefer α S (M 2 Z ) values in the ∼ 0.111 − 0.112 range. Meanwhile some of the other such data sets, including perhaps the most constraining jet data set in the fit, the 8 TeV CMS inclusive jet data, prefers an α S (M 2 Z ) value near the best fit value. In that case the χ 2 in the vicinity of the best fit depends relatively weakly on α S (M 2 Z ) compared to the large number of points, perhaps suggesting the high-x gluon varies in such a way as to moderate the α S (M 2 Z ) dependence for this data set. The 2.76 TeV CMS inclusive jet data, ATLAS 8 TeV single differential tt dilepton data, Tevatron and LHC total tt cross-section data also prefer a moderate α S (M 2 Z ) value. Finally, the CMS 7 TeV W + c jet data shown in the top left of Fig. 5 also support a slightly lowered value of α S (M 2 Z ) ∼ 0.115. In contrast, the precision W, Z data from ATLAS and CMS, shown in the remainder of Fig. 5, tend to prefer higher values of α S (M 2 Z ). At NNLO these tend to be only slightly raised relative to the best fit, but the trend at NLO is both clearer and more noticeable. In particular the CMS 8 TeV W data, ATLAS high-precision 7 TeV W, Z data, ATLAS 8 TeV High-mass Drell-Yan and ATLAS 8 TeV Z data χ 2 profiles minimise in the α S (M 2 Z ) ∼ 0.120 region, while for the ATLAS 8 TeV W data this occurs in the α S (M 2 Z ) ∼ 0.128 region, though in some cases the profiles are rather flat in the vicinity of the minima.
Given the relatively weak dependence of the cross section to α S (M 2 Z ) of these data sets the effect is most likely due to the manner in which α S (M 2 Z ) affects the evolution of the quark and anti-quark PDFs. Some of the high-rapidity LHCb data prefers lower values, for example the LHCb 7 TeV Z → ee and the LHCb W asymmetry p T > 20 GeV data shown in the bottom two rows (third row right and bottom row left) of  Fig. 3) and the LHCb 8 TeV Z → ee (not shown here) have a slight preference for the best fit α S (M 2 Z ) value. In any case these LHCb data sets have less constraining power. Hence, as with other general types of data, the LHC data do not provide a consistent trend for either low or high α S (M 2 Z ) compared to the overall best fit, as different individual data sets can pull in different directions.
At NLO similar conclusions for the α S (M 2 Z ) pulls of the different data set types and individual data sets can be drawn. For the structure function data the NNLO corrections to the structure functions are positive and speed up the evolution. In order to compensate for these missing corrections, the optimum values of α S (M 2 Z ) is generally larger than at NNLO. The difference α S,NNLO < α S,NLO is clearly evident in the majority of the corresponding plots in Fig. 2. The behaviour of the HERA data sets ( Fig. 3 top row) is similar to that at NNLO, with the heavy flavour structure function data again preferring a low value of α S (M 2 Z ). As for the Tevatron jet data, both the CDF and DØ jet data prefer a higher value of α S (M 2 Z ) at NLO than NNLO, in order to compensate for the missing positive NNLO corrections. The DØ W asymmetry data on the other hand at NLO are completely dominated by fluctuations due to the small number of points and weak dependence on α S (M 2 Z ). As a result we plot no quadratic fit in this case. For this dataset, whilst a preference for α S (M 2 Z ) around the best fit value could be seen for NNLO, this is no longer the case at NLO, with no clear trend obvious. The LHC data sets directly sensitive to α S (M 2 Z ), such as the jets, tt and Z p T (Fig. 4) generally show a similar behaviour at NLO to NNLO, mainly preferring lower values of α S (M 2 Z ) and doing so for the same data sets as reported at NNLO. As was the case for the Tevatron jets, for these LHC data sets there is perhaps a slight tendency for slightly increased values of α S (M 2 Z ) being preferred relative to NNLO. One of the most notable examples of this is the total tt crosssection data (from Tevatron and LHC) at NLO preferring a value around 0.124, compared to its preference for the best fit value at NNLO, however in this specific case this is the result of a poor fit at NLO, reflecting large NNLO corrections missing at that order. Finally, we comment on the precise LHC Drell-Yan data sets of Fig. 5 at NLO. We have seen in [1] that several of these, particularly the ATLAS 7 and 8 TeV precision W , Z data sets, are poorly fit at NLO, clearly preferring NNLO, therefore few conclusions can be drawn at NLO as a result. This is evidenced by the significantly different behaviour of the χ 2 profiles with α S (M 2 Z ) for these data sets at NLO relative to NNLO. Several data sets can also be observed to show less quadratic behaviour at NLO than NNLO, for example not just the DØ W asymmetry but also the LHCb 7 TeV Z → ee, LHCb 2015 W, Z, CMS 7 TeV jets, ATLAS 7 TeV W, Z and ATLAS 8 TeV Z data sets.

The best fit values and uncertainty on
Ever since the MSTW08 analysis of PDFs [19] we have determined the uncertainties of the PDFs using the Hessian approach with a dynamical tolerance procedure. That is, we obtain PDF 'error' eigenvector sets, each corresponding to 68% confidence level uncertainty, where the eigenvectors are orthogonal to each other and span the PDF parameter space. As in the MSTW and MMHT studies, we again determine the uncertainty on α S (M 2 Z ) at NLO and NNLO by using the same technique as for the PDF eigenvector uncertainties, i.e. we apply the tolerance procedure to determine the uncertainty in each direction away from the value at the best fit when one data set goes beyond its 68% confidence level uncertainty. The values at which each data set does reach its 68% confidence level uncertainty, and the value of α S (M 2 Z ) for which each data set has its best fit (within the context of a global fit) are shown at NLO and NNLO in Fig. 6, where we include only the most constraining data sets.
The dominant constraint on α S (M 2 Z ) in the downwards direction at NNLO is the CMS 8 TeV data on the W rapidity distribution, which gives ∆α S (M 2 Z ) = −0.0013, though this is closely followed by the ATLAS 8 TeV Z data, the SLAC deuteron data and ATLAS 8 TeV high-mass Drell Yan data, which give ∆α S (M 2 Z ) = −0.0017, −0.0018, −0.0019 respectively. In the upwards direction, at NNLO, the BCDMS proton structure function data give ∆α S (M 2 Z ) = +0.0012. This is very closely followed by the CMS tt single differential data and ATLAS tt single differential dilepton data, though both of these data sets have the caveat that the top mass is fixed, as discussed above. A bound of ∆α S (M 2 Z ) = +0.0018 is provided by the SLAC proton structure function data. In both directions there are other data sets that are not much less constraining than those mentioned explicitly. Hence, it is far from being a single data set which is overwhelmingly providing a dominant constraint on the upper or lower limit on α S (M 2 Z ). The dominant constraint at NLO in the downwards direction is from the top pair cross section data and, using the dynamical tolerance procedure, this gives an uncertainty of ∆α S (M 2 Z ) = −0.0014. However, even though the mass dependence is correctly accounted for when varying α S (M 2 Z ), the fit to this data set is poor compared to that at NNLO, with χ 2 = 17.5 at NLO at the best fit value of α S (M 2 Z ), as opposed to χ 2 = 14.3 at NNLO. Moreover, this is achieved for m t = 163.7 GeV, an unrealistically low pole mass value, and as α S (M 2 Z ) decreases m t has to decrease further in order to try to maintain the fit quality. It was noted in [1] that at NLO a number of data sets were simply fit poorly, and the tt total cross section is one of these, with large NNLO corrections missing from the cross section calculation. Hence, we do not use the constraint from this data set as our lower limit of α S (M 2 Z ). The next strongest constraint in the downwards direction is ∆α S (M 2 Z ) = −0.0017 from LHCb 7 and 8 TeV W, Z data, and we take this value as our lower limit. An almost identical constraint is provided by SLAC deuterium structure function data. In the upwards direction the strongest constraint is again from BCDMS proton structure function data with an uncertainty of ∆α S (M 2 Z ) = +0.0013. The next strongest constraint is nominally from the CMS 8 TeV tt single differential data with ∆α S (M 2 Z ) = +0.0014. However, given the fixed value of m t in the cross section used we would not include this constraint in any case; we note that there is a detailed study of the constraints on both α S (M 2 Z ) and m t in [20]. Constraints of ∆α S (M 2 Z ) ≈ +0.0018, +0.0019, +0.0021 are set by the following LHC data sets respectively: ATLAS 8 TeV Z p T distribution, ATLAS 7 TeV inclusive jets, and the CMS 7 TeV inclusive jets, although in the latter case the profile indicates a potential lack of quadratic behaviour. As at NNLO, the SLAC proton structure function data also provides an upper bound, this time of ∆α S (M 2 Z ) = +0.0023. The uncertainties in the upwards and downwards directions are slightly asymmetric, but for simplicity we chose to symmetrise these. Hence at NLO and NNLO we average the two uncertainties (obtained without the σ tt constraint). We obtain This corresponds to ∆ NLO χ 2 global = 19 and ∆ NNLO χ 2 global = 17. These are the sort of tolerance values typical of the PDF eigenvectors, though a little towards the higher end.
The NNLO value of α S (M 2 Z ) is well within 1σ of the world average of 0.1179 ± 0.001, while the NLO value is consistent within 2σ. This is not surprising as most determinations of α S (M 2 Z ) in the world average are obtained at NNLO, so it is effectively a NNLO value, which is lower than an NLO value would be. Hence, we present the values in eqs. (3) and (4) as independent measurements of α S (M 2 Z ), but acknowledge that at NNLO, taking both this determination and the world average into account then a round value of α S (M 2 Z ) = 0.118 is an appropriate one at which to present the PDFs. At NLO we would recommend the use of α S (M 2 Z ) = 0.120 as the preferred value for the PDFs, but have also made eigenvector sets available at α S (M 2 Z ) = 0.118.

The PDF+α S (M 2 Z ) uncertainty on cross sections
Within the Hessian approach to PDF uncertainties it has been shown that the correct manner in which to account for the PDF+α S (M 2 Z ) uncertainty on any quantity, with the correlations between the PDFs and α S included, can be obtained by simply taking the PDFs defined at α S (M 2 Z ) ± ∆α S (M 2 Z ) and treating these two PDF sets (with their accompanying value of α S (M 2 Z )) as an extra pair of eigenvectors [21]. The full uncertainty is obtained by adding the uncertainty from this extra eigenvector pair in quadrature with the PDF uncertainty, i.e.
This procedure has the benefit of being both simple, but also separating out the α S (M 2 Z ) uncertainty on a quantity explicitly from the purely PDF uncertainty. Strictly speaking, the method only completely holds if the central PDFs are those obtained from the best fit when α S (M 2 Z ) is left free, and if the uncertainty ∆α S (M 2 Z ) on α S (M 2 Z ) that is used is the uncertainty obtained from the fit. If instead we use PDFs defined at α S (M 2 Z ) = 0.118 at NNLO we are still very near the best fit, and the error induced by not expanding the eigenvector pair about the best fit value of α S (M 2 Z ) will be very small. At NLO a distinctly larger error will be induced by using the PDFs defined at α S (M 2 Z ) = 0.118 rather than those at α S (M 2 Z ) = 0.120. Any choice of ∆α S (M 2 Z ) of 0.001 − 0.002, as opposed to the α S (M 2 Z ) uncertainty in the last subsection, should only induce a small error. Hence, we advocate using this approach with NLO PDFs defined at α S (M 2 Z ) = 0.120 and NNLO PDFs defined at α S (M 2 Z ) = 0.118. The value of ∆α S (M 2 Z ) is open to the choice of the user to some extent, but it is recommended to stay close to the values of ∆α S (M 2 Z ) that we have found. A simple, and perfectly consistent choice might be to use ∆α S (M 2 Z ) = 0.001, similar to that for the world average. In Section 2.5 we apply the above procedure to determine the PDF+α S (M 2 Z ) uncertainties on the predictions for the cross sections for benchmark processes at the Tevatron, the LHC and at a potential future circular collider (FCC). First, we examine the change in the PDF sets themselves with α S (M 2 Z ).

Comparison of PDF sets with varying
It is informative to see the changes in the PDFs obtained in global fits for fixed values of α S (M 2 Z ) relative to those obtained for the central value. We only consider the NNLO case here, but note that the NLO PDFs behave in a very similar way. These are shown in Figs. 7-9 at Q 2 = 10 4 GeV 2 .
In general, the changes in the PDFs for the coupling varied in the range 0.116 < α S (M 2 Z ) < 0.120 are within the PDF uncertainty bounds. In more detail, the gluon distribution for x < 0.1 is larger for α S (M 2 Z ) = 0.116 and smaller for α S (M 2 Z ) = 0.120. This approximately preserves the product α S g, which largely determines the evolution of F 2 (x, Q 2 ) with Q 2 at low x. This is the dominant constraint on the gluon, and then the additional constraint of the momentum sum rule means a smaller low x gluon leads to a larger high-x gluon (and vice versa). The u and d PDFs have the opposite trend as α S (M 2 Z ) changes. At small x values this is a marginal effect, with the change in the gluon with α S (M 2 Z ) maintaining the evolution of the small-x structure function, and hence also the small-x quarks. At high x the decreasing quark distribution with    increasing α S is due to the quicker evolution of quarks. This is seen explicitly also in the plots for the valence distributions. The relative insensitivity of the strange quark PDF to variations of α S (M 2 Z ) at low x is partly just due to the insensitivity of all low-x quarks, but is also explained by changes in α S (M 2 Z ) being, to some extent, compensated by changes in the B(D → µ) branching ratio parameter, which we allow to be free in the fit, with a 10% uncertainty. At high x, the strange PDF shows the opposite trend to the u and d quark PDFs, increasing with α S (M 2 Z ). This occurs in order to compensate for the reduction of the u and, in particular, the d valence quark PDFs in this region, ensuring the charge weighted sum remains approximately constant.
Finally, in Fig. 10, we show the effect of different fixed values of α S (M 2 Z ) on the gluon at the lower scale of Q 2 = 10 GeV 2 , much closer to the starting scale. Here, the changes in the gluon PDF now lie notably outside the uncertainty bands of the α S (M 2 Z ) = 0.118 central fit. The reason for this difference is that at lower scales the gluon PDF is more sensitive to the larger variations in the low scale α S (Q 2 ) value. In contrast, at the high value of Q 2 = 10 4 GeV 2 relevant for LHC physics, the long evolution length means that the gluon in the data region around x ∼ 0.01 is determined by evolution and hence a convolution over the PDFs at larger x, leaving it more insensitive to the α S (M 2 Z ) value.

Benchmark cross sections
In this section we show uncertainties for cross sections at the Tevatron, and for 8 TeV and 13 TeV at the LHC, as well as for a 100 TeV FCC (pp). Uncertainties for 7 TeV and 14 TeV will be very similar to those at 8 TeV and 13 TeV, respectively. We calculate the cross sections for W and Z boson, Higgs boson via gluon-gluon fusion and top-quark pair production. For the W/Z ratio there will be almost complete cancellation in the α S (M 2 Z ) uncertainties.
We calculate the PDF and α S (M 2 Z ) uncertainties for the MSHT20 PDFs [1] at the default values of α S (M 2 Z ). We use a value of ∆α S (M 2 Z ) = 0.001 as an example: we provide our PDF sets with α S (M 2 Z ) changes in units of 0.001 and this is very similar to the uncertainty in the world average. However, for values similar to ∆α S (M 2 Z ) = 0.001 a linear scaling of the change in the prediction with α S (M 2 Z ) can be applied to a very good approximation. As explained in Section 2.3, the full PDF+α S (M 2 Z ) uncertainty may then be obtained by adding the two uncertainties in quadrature.
To calculate the cross section at NNLO in QCD perturbation theory we use the same procedure as in Section 9 of [1]. As there we use LO electroweak perturbation theory, with the qqW and qqZ couplings defined by and other electroweak parameters as in [19]. We take the Higgs mass to be m H = 125 GeV and the top pole mass is m t = 172.5 GeV. For the tt cross section we use top++ [22]. Here our primary aim is not to present definitive predictions or to compare in detail to other PDF sets, as both these results are frequently provided in the literature with very specific choices of codes, scales and parameters which may differ from those used here. Rather, our main objective is to illustrate the size of the PDF+α S (M 2 Z ) uncertainties.

W and Z production
The predictions for the W and Z production cross sections at NNLO are shown in Table  2. In this case the cross sections contain zeroth-order contributions in α S , with positive NLO corrections of about 20%, and much smaller NNLO contributions. Hence, an approximately 1% change in α S (M 2 Z ) will only directly increase the cross section by a small fraction of a percent. The PDF uncertainties on the cross sections are about 2% at the Tevatron and slightly smaller at the LHC; the lower beam energy at the Tevatron meaning the cross sections have more contribution from higher x, where PDF uncertainties increase. For these cross sections the α S uncertainty is small, about 0.6% at the Tevatron and close to 1% at the LHC, being slightly larger at 13 TeV than at 8 TeV, and larger again at 100 TeV. Hence, the α S uncertainty is small, but more than the small fraction of a percent expected from the direct change in the cross section with α S . This is because the main increase in cross sections with α S is due to the change in the PDFs with the coupling, rather than its direct effect on the cross section. From Fig. 8 we see that the up and down quark PDFs increase with α S below x ∼ 0.1 − 0.2, due to increased speed of evolution. From Fig. 7 we see that the strange quark PDF increases a little with α S at all x values. As already mentioned, the Tevatron cross sections are more sensitive to the high-x quarks, which decrease with increasing α S , so this introduces a certain amount of anti-correlation of the cross section with α S . However, even at the Tevatron the main contribution is from low enough x that the distributions increase with α S . The net effect is therefore an increase with α S , which is a little larger than that coming directly from the α S σ PDF unc. α S unc.  dependence of the cross section. As the energy increases at the LHC the contributing quarks move to lower x and the increase of the cross section with α S increases. This is a smaller effect than the increase in the PDF uncertainty itself at 100 TeV due to the very small x PDFs sampled. For any collider scenario the total PDF+α S uncertainty obtained by adding the two contributions in quadrature, is only a maximum of about 20% greater than the PDF uncertainty alone, if ∆α S (M 2 Z ) = 0.001 is used.

Top-quark pair production
In Table 3 we show the analogous results for the top-quark pair production cross section. At the Tevatron the PDFs are probed in the region x ∼ 0.2, and the main production source is the qq channel. The quark distributions are reasonably insensitive to α S (M 2 Z ) in this region of x, as it is in the approximate region of the transition point of the PDFs, where evolution switches from PDFs decreasing with scale to increasing. Hence, there is only a small change in cross section due to changes in the PDFs with α S . However, the cross section for tt production begins at order α 2 S , and there is a significant positive higher-order correction at NLO, and still an appreciable one at NNLO. Therefore, a change in α S a little lower than 1% should give a direct change in the cross section of about 2% or slightly more, which is indeed the change that is observed. This is to be compared with a slightly smaller PDF uncertainty of nearly 2%.
At the LHC the dominant production mechanism, due to the higher energy and protonproton nature of the collisions is gluon-gluon fusion, with the central x value probed being σ PDF unc. α S unc.  x ≈ 0.05 at 8 TeV, and x ≈ 0.03 at 13 TeV. As seen from the left plot of Fig. 7 the gluon decreases with increasing α S (M 2 Z ) below x = 0.1 and the maximum decrease is for x ∼ 0.01. The α S (M 2 Z ) uncertainty on σ tt at 8 TeV is slightly less than 2%, almost as large as at the Tevatron, with the gluon above the pivot point still contributing considerably to the cross section, such that the indirect α S (M 2 Z ) uncertainty due to PDF variation largely cancels. At 13 TeV the lower x probed means that most contribution is below the pivot point and there is some anticorrelation between the direct α S variation and the indirect impact via the PDFs, with a reduced α S uncertainty of 1.5%. At this energy the PDF only uncertainty has also reduced to about 2% due to the decreased sensitivity to the uncertainty in high-x PDFs, the gluon in this case. At 100 TeV we have x ≈ 0.004, and the PDF uncertainty has approximately minimised, while the anti-correlation between the gluon and α S (M 2 Z ) has increased such that there is a reduced α S uncertainty of 1.2%. At the 8 TeV and 13 TeV LHC the α S (M 2 Z ) uncertainty is similar to the PDF uncertainty, and the total is about 1.4 times the PDF uncertainty alone. At the Tevatron and 100 TeV FCC the α S (M 2 Z ) uncertainty is slightly larger, such that the total uncertainty, for ∆α S (M 2 Z ) = 0.001 is about 1.6-1.7 that of the PDF uncertainty.

Higgs boson production
In Table 4 we show the uncertainties in the rate of Higgs boson production from gluon-gluon fusion. As with top-pair production the cross section starts at order α 2 S and there are large positive NLO and NNLO contributions. Therefore, changes in α S of about 1% would be expected to lead to direct changes in the cross section of about 2 − 3%. However, even at the Tevatron the dominant x range probed, i.e. x ≈ 0.06, corresponds to a region where the gluon distribution falls with increasing α S (M 2 Z ), so there is some anti-correlation. At the LHC where x ≈ 0.01 − 0.02 at central rapidity the anti-correlation between α S (M 2 Z ) and the gluon distribution is near its maximum, and at the FCC where x ≈ 0.001, anti-correlation remains high. Hence, at the Tevatron the total α S (M 2 Z ) uncertainty is a little less than the direct value, i.e. a little more than 2%, and at the LHC and FCC it is reduced to about 1.5%. In the former σ PDF unc. α S unc.  case this is slightly less than the PDF uncertainty of ∼ 2.8%, with some sensitivity to the relatively poorly constrained high-x gluon, while at the LHC and FCC the PDF uncertainty is much reduced, due to the smaller x probed, and is smaller than the α S (M 2 Z ) uncertainty. Hence for ∆α S (M 2 Z ) = 0.001 the Higgs boson cross section from gluon-gluon fusion is about 1.6-1.7 that of the PDF uncertainty alone.

.1 Choice of the range of heavy-quark masses
In the study of heavy-quark masses that accompanied the MMHT PDFs [23] we varied the charm and bottom quark masses, defined in the pole mass scheme, from 1.15 GeV to 1.55 GeV, in steps of 0.05 GeV, and m b from 4.25 GeV to 5.25 GeV in steps of 0.25 GeV. This was an asymmetric range about our default value of m c = 1.4 GeV, and was because in this previous study for both charm and bottom the preferred mass values were towards the lower end of the range. In the present study, as we will show, there is no longer such a clear preference for lower values, so we choose for m c the symmetric range from 1.2 GeV to 1.6 GeV, while for m b we expand our range slightly from 4 GeV to 5.5 GeV.
Let us consider this range compared to the constraint from other determinations of the quark masses. These are generally quoted in the MS scheme, and in [3] where the two uncertainties are almost completely correlated. This disfavours m c ≤ 1.2 − 1.3 GeV and m b ≤ 4.6 − 4.7 GeV. There is some indication from PDF fits for a slightly lower m pole than that suggested by the simple use of the perturbative series out to the order at which it starts to show lack of convergence for the central pole mass value. As the fit quality prefers values slightly in this direction, we allow some values a little lower than this in our scan. In the upper direction the fit quality clearly deteriorates, so our upper values are not too far beyond the central values quoted above. We now consider the variation with m c and m b in more detail.  We repeat the global analysis in [1] for values of m c = 1.2−1.6 GeV in steps of 0.05 GeV. As in [1] we use the "optimal" version [26] of the TR' general mass variable flavour number scheme  GM-VFNS [27]. This uses the NNLO coefficient functions calculated in [28], and at NNLO we require some degree of approximation in the vicinity of Q 2 ∼ m 2 h as the O(α 3 S ) heavy flavour coefficient functions in the fixed flavour number scheme (FFNS) are still not known exactly, though the leading small-x term [29] and threshold logarithms [30,31] have been calculated. We assume all heavy flavour is generated by evolution from the gluon and light quarks, i.e. there is no intrinsic heavy flavour. We perform the analysis with α S (M 2 Z ) left as a free parameter in the fit at both NLO and NNLO, but also present our results at fixed values of the coupling of α S (M 2 Z ) = 0.118 at NLO at NNLO. We will concentrate on the results and PDFs with fixed coupling, as the standard MSHT PDFs were made available at these values. At NLO PDFs are made available with α S (M 2 Z ) = 0.118 and our default value of 0.120. We present results in terms of the χ 2 for the total set of data in the global fit and for just the data on the reduced cross section,σ cc(bb) , for open charm production at HERA [32]. For these variations, as well as the fit ∆χ 2 values shown by the points, we also provide a quadratic fit line as a guide to the behaviour. This is shown at NLO with α S (M 2 Z ) = 0.118 in Fig. 11. The variation in the quality of the fit to the HERA combined charm and bottom cross section data is very significant. The heavy flavour data clearly prefer a value close to m c = 1.5 − 1.55 GeV, above our default value of m c = 1.4 GeV. The deterioration is clearly such as to make values of m c < 1.3 GeV strongly disfavoured. However, there is a different variation in the fit quality to the global data set, with a clear preference for values near to m c = 1.35 GeV. The main constraint comes from the χ 2 for the inclusive HERA cross section data, shown in Fig. 12, but there is also a distinct preference for a low value of the mass from the χ 2 for the NMC structure function data, shown in Fig. 13, where the data for x ∼ 0.01 and Q 2 ∼ 4 GeV 2 are sensitive to the turn-on of the charm contribution to the structure function. There is also clear sensitivity in the ATLAS 7 TeV W, Z data, the ATLAS 8 TeV Z data and Z p T data and the DØ W asymmetry, see Figs. 14 and 15. Overall, there is some element of tension between the preferred value from the global fit and the fit to charm data. We do not attempt to make a rigorous determination of the best value of the mass or its uncertainty as we believe there are more precise and better controlled methods for this. However, a rough indication of the uncertainty could be obtained from the χ 2 profiles by treating m c in the same manner as the standard PDF eigenvectors and applying the dynamic tolerance procedure.

Dependence on m c
The analogous results for the global fit quality with α S (M 2 Z ) left free are shown in Table  5 The results of the same analysis at NNLO are shown for α S (M 2 Z ) = 0.118 and α S (M 2 Z ) left free in Fig. 16 and Table 6, respectively, where again in the latter case the corresponding α S (M 2 Z ) values are shown. The variation in the fit quality for the HERA combined charm and bottom cross section data is much reduced compared to NLO. The heavy flavour data clearly prefer a value at NNLO close to the default of m c = 1.4 GeV. The deterioration is clearly such     as to make very low values of m c strongly disfavoured, in contrast to MMHT14. The variation in the fit quality to the global data set is quite similar this time, with a preference for values near to m c = 1.35 GeV. Compared to NLO there is little constraint coming from the inclusive HERA cross section data, shown in Fig. 17. However, there is still a distinct preference for a low value of the mass from NMC structure function data shown in Fig. 18, where again the data for x ∼ 0.01 and Q 2 ∼ 4 GeV 2 are sensitive to the turn-on of the charm contribution to the structure function and prefer a lower value giving quicker evolution. At NNLO there is also a more clear similar effect for NuTeV F 2 (x, Q 2 ) data in Fig. 19 (left). The ATLAS 7 TeV W, Z data again distinctly prefer a high value of m c , see Fig. 19 (right). At NLO the fit to these data is so poor that it is difficult to attach as much importance to this result, but at NNLO it is more significant. The larger charm mass means suppression of the charm quark distribution. The charm quark (and antiquark) make more contribution to W production at  the LHC via charm-antistrange annihilation (or strange-anticharm) than to Z production, due to charm-anticharm annihilation. Hence, a charm suppression lowers the W +,− cross section compared to the Z cross section, particularly at lower rapidity where charm and strange quark distributions are more comparable to those of the up and down quark than at high x where valence quarks dominate. This is what the data prefer, and charm suppression effectively acts in the same manner as strange quark enhancement, which is a well-known feature of these data. Similar charm suppression is effectively seen in the NNPDF3.1 PDFs [33], but in that case due to direct suppression for 0.01 < x < 0.1 in the input fitted charm rather than via the mass. There is also significant sensitivity to the charm mass in the ATLAS 8 TeV Zp T data and the CMS W + c jet data, shown in Fig. 20.
The analogous results for the global fit quality with α S (M 2 Z ) left free at NNLO are shown in Table 6  Broadly speaking, the results at NLO and NNLO are similar, but with greater χ 2 variation at NLO. In both cases the preferred value of m c in the global fit is now only slightly below our default value, and much closer than for the MMHT14 study [23]. We repeat essentially the same procedure for the bottom quark mass, m b . We now vary the values of m b in the range 4 − 5.5 GeV in steps of 0.25 GeV. The results for the NLO PDFs with α S (M 2 Z ) = 0.118 are shown in Fig. 21. In this case we again provide a quadratic fit line to the ∆χ 2 values, in order to indicate the behaviour, though in this case the changes are sufficiently small that very small fluctuations in the fit quality make the quadratic behaviour less clear. There is a tendency to prefer slightly low values of m b , similar to the results in [23].

Dependence on m b
For the predictions to the heavy flavour cross section data, the preference is for low values of m b ∼ 4.5 − 4.75 GeV but this is slightly lower, m b ∼ 4.25 − 4.5 GeV in the global fit. There is not a very clear pull from most data sets other than the heavy flavour data, so it is largely a cumulative effect, though at NLO the ATLAS 7 TeV W, Z data quite strongly disfavours m b > 5 GeV and the inclusive HERA combined data also prefer lower values. In the case of m b it is likely the fit quality of some data sets is affected as much, or even more, by the change in the details of the running of the coupling as by the change in the PDFs. The results for the NNLO fit with α S (M 2 Z ) = 0.118 are shown in Fig. 22. The global fit is fairly weakly dependent on m b , and prefers a value m b = 4.25 − 4.75 GeV. As in the NLO case the χ 2 for the prediction forσ cc(bb) is better for slightly higher values of m b and the χ 2 minimises for m b = 4.75 GeV, which is our default value. The inclusive HERA combined data again prefers lower values of m b , and in this case this is the dominant reason for the global fit having a minimum in χ 2 a little below the fit to heavy flavour data.
In summary, the constraints on m b are relatively weak, and at both NLO and NNLO there is general compatibility with our default value of m b = 4.75 GeV, particularly from the most direct constraint from HERA heavy flavour data, although the global fit prefers a slightly lower value.  Figs. 23 and 24. We see at Q 2 = 5 GeV 2 (that is, quite close to the transition point Q 2 = m 2 c ) that the change in the gluon is well within its uncertainty band, though there is a slight increase in the gluon, mainly at smaller x with higher m c . The increased gluon with higher m c quickens the evolution of the structure function, which is suppressed by larger mass, so that these effects compensate each other. Similarly the light quark singlet distribution increases slightly at low Q 2 and small x for larger m c to make up for the smaller charm contribution to the small x structure function, and decreases a little at higher x in order to maintain momentum conservation. This trend is maintained at larger scales (e.g. as shown at Q 2 = 10 4 GeV 2 in Fig. 24), helped at small x by the increased gluon, with a crossing point at x = 0.06. For both the gluon and the singlet quark distributions, however, even at low Q 2 the changes are within uncertainties for these variations in m c . The charm distribution increases at low Q 2 for decreasing m c , and vice versa, simply due to the change in evolution length, ln(Q 2 /m 2 c ). This effect is well outside the uncertainty band of the central fit, but this should be expected. We have identified the transition point at which heavy flavour evolution begins with the quark mass. This has the advantage that the boundary condition for evolution is zero up to NLO (with our further assumption that there is no intrinsic charm), though there is a finite O(α 2 S ) boundary condition at NNLO in the GM-VFNS, available in [34]. In principle the results on the charm distribution at relatively low scales, such as that in Fig. 23 are sensitive to this choice of transition point at finite order, though as the order in QCD increases the correction for changes due to different choices of transition point arising from the corresponding changes in the boundary conditions become smaller and smaller, ambiguities always being of higher order than the calculation, see e.g. [35]. At scales more typical of LHC physics, however, the relative change in evolution length for the charm distribution is much reduced, as are the residual effects of choices relating to choice of transition point and intrinsic charm. This can be seen in both the significantly reduced variation in the charm PDF with varying m c in Fig. 24, and also partly in the notably reduced uncertainty bands on the central fit charm PDF. As a result, at these scales the change in the charm distribution is of the same general size as the PDF uncertainty for fixed m c for much of the x range, with a crossing point at x ∼ 10 −4 , although the variation around 10 −3 − 0.2 is larger, as seen in Fig. 24. However, as seen in Section 3, the variations performed for m c here are wider than favoured by the χ 2 variations or by the allowed variations in masses determined by other means. We also note that the charm structure function at these high scales is reasonably well represented by the charm distribution, while at low scales, which certainly includes Q 2 = 5 GeV 2 , this is not true. Indeed, at NNLO the boundary condition for the charm distribution is negative at very low x if the transition point is m 2 c , but this is more than compensated for by the gluon and light quark initiated cross section.

Changes in the PDFs
The relative changes in the gluon and light quarks for variations in m b are significantly reduced, due to the much smaller impact of the bottom contribution to the structure functions from the charge-squared weighting, as can be seen in Figs. 25 and 26, where we show NNLO PDFs for m b = 4 GeV to m b = 5.5 GeV. At Q 2 = 50 GeV 2 the relative change in the bottom distribution for a ∼ 10% change in the mass is similar to that for the same type of variation for m c . However, the extent to which this remains at Q 2 = 10 4 GeV 2 is greater than the charm    case due to the smaller evolution length.

Effect on benchmark cross sections
In this section we show the variation with m c and m b for cross sections at the Tevatron, for 8 TeV and 13 TeV at the LHC (variations for 7 TeV and 14 TeV will be very similar to those at 8 TeV and 13 TeV respectively) as well as for a 100 TeV FCC (pp). We calculate the cross sections for W and Z boson, Higgs boson via gluon-gluon fusion and top-quark pair production. Again our primary aim is not to present definitive predictions or to compare in detail to other PDF sets, but to illustrate the relative influence of varying m c and m b for these benchmark processes.
We show the predictions for the default MSHT20 PDFs, with PDF uncertainties, and the relative changes due to changing m c from 1.25 GeV to 1.55 GeV, and m b from 4.25 GeV to 5.25 GeV, i.e. changing the default values by approximately (but a little larger than) 10% in each direction in each case. The dependence of the benchmark predictions on the value of m c in Tables 7 -9 largely reflects the behaviour of the gluon with x. The changes in cross section to good approximation scale linearly in variation of masses away from the default values.

W and Z production
We begin with the predictions for the W and Z production cross sections. The results at NNLO are shown in Table 7. The PDF uncertainties on the cross sections are 2% at the Tevatron and slightly smaller at the LHC and larger at the FCC. The m c variation is about 0.4% at the Tevatron, mainly smaller at the LHC, and much larger for the FCC, i.e. about 1.2 − 1.5%. In all cases increased m c leads to an increase in the cross sections. This is due to increased light quarks at small x, though the decrease in the charm quark has the opposite effect, most significantly at the LHC. The larger effect at the FCC is a consequence of the smaller x sampled, where light quarks changes are larger while decreases in the charm quark are smaller. Changes in the cross section with m b variation are 0.3% at most.

Top-quark pair production
In Table 8 we show the analogous results for the top-quark pair production cross section. The PDF uncertainties on the cross sections are around 2% at the Tevatron and at the LHC, but a little smaller at 100 TeV as there is less sensitivity to the high-x gluon. The m c variation is about 0.5% at the Tevatron, between approximately 0.3% and 0.4% at the LHC and around 0.6% at the FCC. At the Tevatron the cross section decreases with increasing m c due to the decrease in high x light quarks seen in Fig. 24, and the dominance of the quark channel at this collider. At the LHC and FCC, where gluon gluon fusion is the dominant production mechanism, the cross section is positively correlated with m c due to the increase in the gluon distribution. Again, changes with m b are smaller but follow the same pattern as for m c .

Higgs boson production
In Table 9 we show the uncertainties in the rate of Higgs boson production from gluon-gluon fusion. For this process the cross section always increases with increasing m c , due to positive correlation between the gluon and m c . At the Tevatron the resultant uncertainty is ∼ 0.3%. At the LHC it is a little larger at ∼ 0.5 − 0.6%, whereas at the FCC it has increased to about 1% due to the greater change in the small-x gluon. Again changes with m b are smaller, but follow the same trend as for m c .
We recommend that in order to estimate the total uncertainty due to PDFs and the quark masses it is best to add the uncertainty due to the variation in quark mass in quadrature with the PDF uncertainty, or the PDF+α S uncertainty, if the α S uncertainty is also used.

PDFs in three-and four-flavour-number-schemes
In our default studies we work in a general-mass variable-flavour-number-scheme (GM-VFNS) with a maximum of 5 active flavours. This means that we start evolution at our input scale of Q 2 0 = 1 GeV 2 with three active light flavours. At the transition point m 2 c the charm quark starts evolution, from a non-zero value at NNLO and beyond, and then at m 2 b the bottom quark also starts evolution. The evolution is in terms of massless splitting functions, and at high Q 2 the contribution from charm and bottom quarks lose all mass dependence other than that input     via the perturbative boundary conditions at the chosen transition point. The explicit mass dependence is included at lower scales, but falls away like inverse powers as Q 2 /m 2 c,b → ∞. We do not currently ever consider the top quark as a parton, though this would probably need to change for detailed studies at 100 TeV.
We could alternatively keep the information about the heavy quarks only in the coefficient functions, i.e. the heavy quarks would only be generated in the final state. This is called a fixed-flavour-number-scheme (FFNS). For example, we could decide that neither charm and bottom exist as partons, and this would be a 3-flavour FFNS. Alternatively we could let charm evolution turn on, but never allow the bottom quark to be treated as a parton -a 4-flavour FFNS. We will use this notation for PDFs where the bottom quark is absent, but strictly speaking it is a GM-VFNS with a maximum of 4 active flavours as the charm quark will not exist for scales below m 2 c . One might produce the partons for the 3-and 4-flavour FFNS by performing global fits in these schemes. However, as discussed in [36], the fit to structure function data is not optimal in these schemes. Indeed, definite evidence for this has been provided in [26,37,38]. Moreover, much of the data (for example, on inclusive jets and W, Z production at hadron colliders) is not known to NNLO in these schemes, and is very largely at scales where m c,b are very small compared to the scales, and effectively act as if massless. So it seems clear that the GM-VFNS are more appropriate. Hence, in [39] it was decided to make available PDFs in the 3-and 4flavour schemes simply by using the input PDFs obtained in the GM-VFNS, but with evolution of the bottom quark, or both the bottom and charm quark, turned off. This procedure was continued in [40] and [23], and is the common choice for PDF groups who fit using a GM-VFNS but make PDFs available with a maximum of 3-or 4-active flavours. Hence, here, we continue to make this choice for the MSHT20 PDFs.
We make PDFs available with a maximum of 3 or 4 active flavours for the NLO central PDFs and their uncertainty eigenvectors for both the standard choices of α n f,max =5 S (M 2 Z ) of 0.118 and 0.120, and for the NNLO central PDF and the uncertainty eigenvectors for the standard choice of α n f,max =5 S (M 2 Z ) of 0.118. 2 We also provide PDF sets with α S (M 2 Z ) displaced by 0.001 from these default values, in order to facilitate the calculation of α S uncertainties in the different flavour schemes. Finally, we make available PDF sets with different values of m c and m b in the different fixed-flavour schemes.
By default, when the charm or bottom quark evolution is turned off, we also turn off the contribution of the same quark to the running coupling. This is because most calculations use this convention when these quarks are treated entirely as final state particles. This results in the coupling running more quickly above m c and m b . In particular, if the coupling at Q 2 0 is chosen so that α The variation of the PDFs defined with a maximum number of 3 and 4 flavours, compared to our default of 5 flavours, is shown at Q 2 = 10 4 GeV 2 in Fig. 27 for NNLO PDFs. The the differences are due to two different effects. For fewer active quarks there is less gluon branching, so the gluon is by definition larger if the flavour number is smaller. However, it is also the case that as Q 2 increases the coupling gets smaller as there are fewer active quarks.
Hence, evolution is generally slower, which means that partons decrease less quickly for large x and grow less quickly at small x. The latter effect dominates for the evolution of the light quarks, so these are smaller at small x where evolution acts to increase the distribution and vice versa at high x. However, for the gluon the two effects compete at small x. Overall, the reduction in quark branching is the greater effect, mainly due to the fact that this sets in suddenly at flavour transition points, and hence directly affects the evolution immediately. The effect on the evolution of the coupling sets in directly at transitions points, but the resultant effect on the value of the coupling, and hence on PDF evolution, is cumulative, and hence less direct. At high x for the gluon the two effects act in the same direction, leading to a large enhancement of the gluon for smaller active flavour number in this region. As expected, the effects are the same in the 3 and 4 flavour schemes, but larger in the former.
Note the NLO sets without the α S (M 2 Z ) value indicated in the name (e.g. MSHT20nlo mcrange nf3 and MSHT20nlo mbrange nf3) are at α S (M 2 Z ) = 0.118, matching the naming of the NNLO sets which are also at α S (M 2 Z ) = 0.118.
We note again for clarity that for sets with 3 and 4 flavours, the quoted α S (M 2 Z ) values correspond to evolving the coupling from Q 0 = 1 GeV at a value that for 5 active flavours corresponds to this α S (M 2 Z ), but with 3 or 4 active flavours, i.e. the values specified here and in the info files for α S (M 2 Z ) are 5-flavour scheme values which are then translated internally in generating the 3 and 4 flavour number scheme sets.

Conclusions
The PDFs determined from global fits to deep-inelastic and hard-scattering data are highly correlated to the value of α S (M 2 Z ) used, and any changes in the values of α S (M 2 Z ) must be accompanied by changes in the PDFs such that the optimum fit to data is still obtained. In [1] we produced updated PDFs and uncertainty eigenvector sets for specific values of α S (M 2 Z ), close to the best fit values. In this article we additionally present PDF sets and the global fit quality at NLO and NNLO for a wide variety of α S (M NNLO: α S (M 2 Z ) = 0.1174 ± 0.0013 (68% C.L.), already presented in [1], but here also determine the uncertainties. We show the variation of the fit quality with α S (M 2 Z ) of numerous data sets, within the context of the global fit, and see which are the more and less constraining sets, and which prefer higher and lower values. We see that, in common with the global fit, most data sets show a systematic trend of preferring a slightly lower α S (M 2 Z ) value at NNLO than at NLO, but note that whilst there are some trends, no particular type of data consistently prefers a high or low value of α S (M 2 Z ). There are examples of fixed target DIS data which prefer either high or low values and similarly for the collider data sets. Indeed, our best values of α S (M 2 Z ) are almost unchanged from the previous values [4] of α S (M 2 Z ) = 0.1201 (NLO) and α S (M 2 Z ) = 0.1172 (NNLO). They are also similar to the values obtained by NNPDF of α S (M 2 Z ) = 0.1185 (NNLO) and α S (M 2 Z ) = 0.1207 (NLO) [42], and by CT18 of α S (M 2 Z ) = 0.1164 (NNLO) [43]. However, our extraction disagrees with the recent value α S (M 2 Z ) = 0.1147 (NNLO) obtained by ABMP in [44], but agrees better with their value α S (M 2 Z ) = 0.1191 at NLO in [45]. We find agreement of our NNLO value at the level of less than one sigma with the world average value of α S (M 2 Z ) = 0.1179 ± 0.001. Hence, our NNLO best fit value of α S (M 2 Z ) is in excellent agreement with the default value, α S (M 2 Z ) = 0.118, for which eigenvector sets are made available. The PDF sets obtained at the 23 different values of α S (M 2 Z ) at NLO and NNLO are made publicly available. These will be useful in studies of α S (M 2 Z ) by other groups, but we warn against the naive extraction of the best-fit value of α S (M 2 Z ) using the fit quality to a particular data set, as discussed in [18]. Indeed, the explicit presentation of the χ 2 deterioration away from the best fit values of α S (M 2 Z ) in Table 1 provides additional information which should impact on studies of α S (M 2 Z ) variation while using our global PDFs. In order to calculate the PDF+α S (M 2 Z ) uncertainty we recommend the approach introduced in [21] of treating PDFs with α S (M 2 Z )±∆α S (M 2 Z ) as an extra eigenvector set. As shown in [21], provided certain conditions are met, the α S (M 2 Z ) uncertainty is correctly added to the PDF uncertainty by simply adding in quadrature the variation of any quantity under a change in coupling ∆α S (M 2 Z ) as long as the change in α S (M 2 Z ) is accompanied by the appropriate change in PDFs required by the global fit. As examples, we have calculated the total cross sections for the production of W , Z, top quark pairs and Higgs bosons at the Tevatron, LHC and FCC. For W and Z production, where the LO sub-process is O(α 0 S ) the combined "PDF+α S " uncertainty is not much larger than the PDF-only uncertainty with a fixed α S . The additional uncertainty due to α S is more important for top quark pair production and Higgs boson production via gluon-gluon fusion, since the LO sub-process is now O(α 2 S ), though the details depend significantly on the correlation between α S (M 2 Z ) and the contributing PDFs. For any particular process the details of the uncertainty can be explicitly calculated in a straightforward way using the PDFs we have provided in this paper, together with the procedure for combining PDF and α S (M 2 Z ) uncertainty discussed in Section 2.3. The additional purpose of this article is to present and make available PDF sets in the framework used to produce the MSHT20 PDFs, but with differing values of the charm, and of the bottom, quark masses. We do not strictly make a determination of the optimum values of these masses, but we do investigate and note the effect the mass variation has on the quality of the fits to the data, concentrating on the final combined HERA heavy flavour cross section data [32] in particular. We note that for both the charm and bottom quarks our default values of m c = 1.4 GeV, m b = 4.75 GeV (10) are close to the values preferred by the fit, more so than in the previous global PDFs [23], and that these are consistent with the values of pole masses one would expect by conversion from the values measured in the MS scheme. For instance, our NNLO studies indicate that m c 1.3 − 1.4 GeV, m b 4.25 − 4.75 GeV.
We also make PDFs available with a maximum of 3 or 4 active quark flavours. All publicly available sets can be found at [46] and will also be available from the LHAPDF library [47].
Finally, we investigate the variation of the PDFs and the predicted cross sections for standard processes corresponding to these variations in heavy-quark mass. For reasonable variations of m c the effects are small, but not insignificant, compared to PDF uncertainties. For variations in m b the effect is smaller. Changes in PDFs with the value of m b are much smaller than PDF uncertainties, except for the bottom distribution itself, which can vary more than its uncertainty at a fixed value of m b . In summary, while currently the uncertainties on PDFs due to quark masses are subleading in comparison to the PDF uncertainties, these are not negligible and we can expect them to become more important in the future, as precision requirements increase.