The Strangest Proton?

We present an improved determination of the strange quark and anti-quark parton distribution functions of the proton by means of a global QCD analysis that takes into account a comprehensive set of strangeness-sensitive measurements: charm-tagged cross sections for fixed-target neutrino-nucleus deep-inelastic scattering, and cross sections for inclusive gauge-boson production and $W$-boson production in association with light jets or charm quarks at hadron colliders. Our analysis is accurate to next-to-next-to-leading order in perturbative QCD and specifically includes charm-quark mass corrections to neutrino-nucleus structure functions. We find that a good overall description of the input dataset can be achieved and that a strangeness moderately suppressed in comparison to the rest of the light sea quarks is strongly favored by the global analysis.


Introduction.
An accurate determination of the strange quark and anti-quark parton distribution functions (PDFs) of the proton [1][2][3] is key to carrying out precision phenomenology at current and future colliders, specifically for measuring fundamental parameters of the Standard Model such as the mass of the W boson [4] and the Weinberg angle [5], see also [6]. Because of the limited experimental information available, however, the strange quark and anti-quark PDFs remain significantly more uncertain than the other light sea quark PDFs.
The strange quark and anti-quark PDFs have long been determined from neutrino-nucleus deep-inelastic scattering (DIS), specifically from measurements of dimuon cross sections, whereby the secondary muon originates from the decay of a charmed meson, $\nu_\mu + N \to \mu + c + X$ with $c \to D \to \mu + X$ [7][8][9][10]. When interpreted in terms of the ratio between strange and non-strange sea quark PDFs, $R_s \equiv (s + \bar{s})/(\bar{u} + \bar{d})$, these measurements favor values around $R_s \simeq 0.5$ when PDFs are evaluated at momentum fraction $x = 0.023$ and scale $Q = 1.6$ GeV. It therefore came as a surprise when a QCD analysis of the $W$- and $Z$-boson rapidity distributions measured by the ATLAS experiment in proton-proton collisions [11] suggested instead a ratio closer to $R_s \simeq 1$. This result was subsequently corroborated by an analysis based on an increased integrated luminosity [12]. Complementary information on the strange quark and anti-quark PDFs is provided by $W$-boson production in association with light jets [13] and charm quarks [14], the latter process being dominated by the partonic scattering $g + s \to W + c$. However, measurements of these processes performed by ATLAS [15,16] and CMS [17,18] point in opposite directions: the former to $R_s \simeq 1$; the latter to $R_s \simeq 0.5$.
This state of affairs has been a source of controversy in recent accurate determinations of PDFs. The NNPDF3.1 global analysis [19] found that, whereas the ATLAS $W$, $Z$ dataset does favour a larger total strangeness, its $\chi^2$ remains poor, and it has a moderate impact in the global fit. Conversely, the recent CT18 global analysis [20] presented fits with and without the ATLAS measurement of [12], with the resulting PDFs being significantly different. Dedicated studies of the strange quark and anti-quark PDFs have been presented [21][22][23][24][25]; however, these often focus on a restricted set of processes or datasets, or are based on theoretical and methodological assumptions that can potentially bias the results.
A global reinterpretation of all of the strangeness-sensitive measurements within an accurate theoretical and methodological framework therefore appears compelling in order to establish whether or not a proton strangeness crisis occurs. This paper fulfills this purpose: we present an improved determination of the strange quark and anti-quark PDFs, accurate to next-to-next-to-leading order (NNLO) in perturbative QCD, by expanding the NNPDF3.1 analysis [19] in two respects. First, we take into account an extended set of experimental information that is relevant to constrain the strange quark and anti-quark PDFs. Second, we improve the theoretical description of dimuon neutrino DIS data, by implementing NNLO charm-quark mass corrections, and of $W$+$c$ production data, by including a theoretical uncertainty that accounts for the unknown NNLO QCD corrections; we also explicitly enforce the positivity of the $F_2^c$ structure function. See [26] and references therein for a comprehensive description of the NNPDF methodology.
Input dataset. The bulk of the dataset included in our analysis corresponds to the one used in [27], which is in turn a variant of the dataset used in the NNPDF3.1 NNLO analysis [19]. It contains in particular measurements of dimuon neutrino-nucleus DIS cross sections from the NuTeV experiment [9], and of inclusive gauge boson production in proton-(anti)proton collisions from several Tevatron and LHC experiments [12,28,29,30,31]. These measurements represented the most constraining source of experimental information on the strange quark and anti-quark PDFs in the NNPDF3.1 analysis.
We supplement this dataset with a number of new measurements. Concerning neutrino-nucleus DIS, we include measurements of the ratio of dimuon to inclusive charged-current cross sections, $R_{\mu\mu}(\omega) = \sigma_{\mu\mu}(\omega)/\sigma_{\rm CC}(\omega)$, from the NOMAD experiment [10]. The data are presented for three kinematic variables $\omega$: the neutrino beam energy $E_\nu$, the momentum fraction $x$, and the square root of the final-state invariant mass $\sqrt{\hat{s}}$. Because experimental correlations are not provided amongst measurements in different kinematic variables, only one measurement can be included in the fit at a time: we select the $n_{\rm dat} = 19$ data points as a function of $E_\nu$, and verify that similar results can be obtained with the $\sqrt{\hat{s}}$ set, see the Supplementary Material (SM). The kinematic sensitivity of the NOMAD measurements is roughly $0.03 \lesssim x \lesssim 0.7$.
Concerning proton-proton collisions, we augment the inclusive gauge boson production measurement from the ATLAS experiment (7 TeV) [12] with the off-peak and forward rapidity bins (not included in NNPDF3.1) for a total of $n_{\rm dat} = 61$ data points. Furthermore, we include the $n_{\rm dat} = 37$ data points corresponding to the ATLAS (7 TeV) [16] and CMS (7 TeV and 13 TeV) [17,18] $W$+$c$ measurements; for ATLAS, we consider the charm-jet dataset, which is amenable to fixed-order calculations (instead of the $D$-meson dataset). Finally, we take into account the $n_{\rm dat} = 32$ data points corresponding to the ATLAS $W$+jets measurement (8 TeV) differential in the transverse momentum of the $W$ boson [15]. Overall, these LHC processes are sensitive to the proton strangeness in the region $10^{-3} \lesssim x \lesssim 0.1$. Our analysis contains a total of $n_{\rm dat} = 4096$ data points; experimental correlations within each dataset are available for all of the new measurements considered here and are therefore included in our analysis.
Theoretical calculations. The theoretical settings adopted in our analysis closely follow those described in [19,27] (whereby, in particular, the charm PDF is fitted), with two improvements. First, the positivity of the structure function $F_2^c$ is now enforced with a procedure similar to that described in [26] for light quarks. This additional constraint is required to prevent the fitted charm PDF from becoming unphysically negative once the new datasets are included in the fit. Second, we incorporate the recently computed NNLO charm-quark mass corrections [32,33] in the description of the NuTeV and NOMAD measurements. We do so by multiplying the next-to-leading order (NLO) theoretical prediction in the FONLL general-mass scheme [34,35] by a K-factor defined as the ratio between the NNLO result in the fixed-flavor number (FFN) scheme with and without the charm-mass correction. This approach provides a good approximation of the exact result, because theoretical predictions in the FFN and FONLL schemes are very close for the NuTeV and NOMAD kinematics (see the SM). These K-factors are in general smaller than unity, and thus enhance the (anti-)strange quark PDF when accounted for in the fit. Nuclear corrections to neutrino-nucleus data are not included: for NuTeV, they were demonstrated to be immaterial [36]; for NOMAD, they mostly cancel out in the ratio $R_{\mu\mu}(\omega)$, see the SM.
Additional considerations apply specifically to the theoretical treatment of the new datasets. The NOMAD observables are evaluated as two-dimensional integrals over the differential DIS cross sections, e.g. for $E_\nu$,
$$\sigma_i(E_\nu) = \int_{x_0}^{1} dx \int_{Q^2_{\min}}^{Q^2_{\max}(x)} dQ^2 \, \frac{d^2\sigma_i}{dx\,dQ^2}(x, Q^2, E_\nu) , \qquad (1)$$
where $Q^2_{\max}(x) = 2 m_p E_\nu x$ and $x_0 = Q^2_{\min}/(2 m_p E_\nu)$, with $m_p$ the proton mass. While the NOMAD measurements are reconstructed for $Q^2 \geq 1$ GeV$^2$, we assume $Q^2_{\min} = m_c^2$, where $m_c = 1.51$ GeV is the initial parametrization scale adopted in our analysis [19]. We explicitly verified that results are unaffected if $Q^2_{\min} = 1$ GeV$^2$ is chosen instead. The integrand in Eq. (1) is either the dimuon ($i = \mu\mu$) or the inclusive charged-current ($i = {\rm CC}$) cross section. We compute Eq. (1) and the ratio $R_{\mu\mu}$ with APFEL [37]; our results agree (at the permille level) with those of [33,38] (after the correction of a bug in APFEL), see the SM for the details of the numerical benchmark. Theoretical predictions for inclusive $W$- and $Z$-boson production and for $W$-boson production with charm quarks and light jets are evaluated at NLO using MCFM+APPLgrid [39,40], and are supplemented with NNLO QCD K-factors evaluated with FEWZ [41]. Because NNLO QCD corrections are unknown for $W$+$c$ production, in this case we accompany the data with an additional correlated uncertainty, estimated from the 9-point scale variations of the NLO calculation [42,43].
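The structure of the two-dimensional integral of Eq. (1), with the limits $x_0 \le x \le 1$ and $Q^2_{\min} \le Q^2 \le 2 m_p E_\nu x$ quoted above, can be sketched numerically as follows. This is only an illustration: the function names, the midpoint-rule quadrature, the rounded proton mass and the toy integrands are assumptions made here, and the sketch is not the APFEL implementation used in the analysis.

```python
M_P = 0.938  # proton mass in GeV (rounded value, assumed for this sketch)


def integrate_eq1(d2sigma, e_nu, q2_min, n_x=200, n_q2=200):
    """Midpoint-rule estimate of the integral in Eq. (1) at neutrino energy e_nu.

    d2sigma(x, q2, e_nu) is any callable returning the differential
    cross section; here it is a placeholder supplied by the caller.
    """
    x0 = q2_min / (2.0 * M_P * e_nu)
    if x0 >= 1.0:
        return 0.0  # no available phase space at this energy
    dx = (1.0 - x0) / n_x
    total = 0.0
    for i in range(n_x):
        x = x0 + (i + 0.5) * dx
        q2_max = 2.0 * M_P * e_nu * x  # upper limit depends on x
        dq2 = (q2_max - q2_min) / n_q2
        inner = 0.0
        for j in range(n_q2):
            q2 = q2_min + (j + 0.5) * dq2
            inner += d2sigma(x, q2, e_nu)
        total += inner * dq2 * dx
    return total


def ratio_mumu(d2sigma_mumu, d2sigma_cc, e_nu, q2_min):
    """Ratio of dimuon to inclusive charged-current integrated cross sections."""
    return (integrate_eq1(d2sigma_mumu, e_nu, q2_min)
            / integrate_eq1(d2sigma_cc, e_nu, q2_min))
```

In a real computation the two callables would return the dimuon and inclusive charged-current differential cross sections; any normalisation common to numerator and denominator drops out of the ratio.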
Procedure. We perform the three fits summarized in Table I. The first fit (str_base) is our baseline, and corresponds to the fit of [27] with the addition of the NNLO charm-mass K-factors for the NuTeV data and of the positivity constraint on $F_2^c$. See the SM for a comparison with the NNPDF3.1 PDF set. This fit is then supplemented with all the new LHC data to obtain the second fit (str_prior), for which we generate $N_{\rm rep} = 850$ Monte Carlo replicas. This second fit is finally supplemented with the NOMAD data, specifically the set that depends on $E_\nu$, to determine the third fit (str). Bayesian reweighting and unweighting [44,45] are used in this last step, because they allow one to evaluate the two-dimensional integral in Eq. (1) only once, a task that would otherwise be computationally very intensive in a fit. After reweighting, one ends up with $N_{\rm eff} = 105$ effective replicas, from which we construct a set of $N_{\rm rep} = 100$ replicas. Variants of the str fit based on a perturbatively generated charm PDF and on the inclusion of the alternative NOMAD $\sqrt{\hat{s}}$ dataset are discussed in the SM.
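The reweighting step can be illustrated with a short sketch. It assumes the weight and effective-replica formulae commonly associated with the Bayesian reweighting of Monte Carlo PDF replicas, $w_k \propto (\chi^2_k)^{(n-1)/2} e^{-\chi^2_k/2}$ normalised so that $\sum_k w_k = N_{\rm rep}$, and $N_{\rm eff} = \exp[\frac{1}{N_{\rm rep}} \sum_k w_k \ln(N_{\rm rep}/w_k)]$; the function names are hypothetical and the details should be checked against [44,45].

```python
import math


def reweight(chi2, n_dat):
    """Weights for each replica, given its total chi2 to the n_dat new points.

    Assumed form: w_k ~ chi2_k^((n_dat - 1)/2) * exp(-chi2_k / 2),
    normalised so that the weights sum to the number of replicas.
    """
    # Work with logarithms to avoid overflow for large chi2 values.
    logw = [0.5 * (n_dat - 1) * math.log(c) - 0.5 * c for c in chi2]
    shift = max(logw)
    w = [math.exp(lw - shift) for lw in logw]
    norm = sum(w) / len(w)
    return [wi / norm for wi in w]


def n_eff(weights):
    """Effective number of replicas (exponential of the Shannon entropy)."""
    n = len(weights)
    return math.exp(sum(w * math.log(n / w) for w in weights if w > 0.0) / n)
```

If all replicas describe the new data equally well, the weights are all equal to one and the effective number of replicas equals the size of the prior ensemble; a strongly constraining dataset concentrates the weight on few replicas and lowers it, which is why the prior is generated with many more replicas than the final set.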
Results. In Table II we summarize the values of the $\chi^2$ per data point obtained from each of the three fits and for the datasets discussed above: $\chi^2_{\rm bs}$ for str_base; $\chi^2_{\rm pr}$ for str_prior; and $\chi^2_{\rm str}$ for str. By comparing these values across the three fits, we observe that the description of the new datasets (which, in particular, is not optimal for the ATLAS $W$, $Z$ data in the str_base fit and for the NOMAD data in the str_base and str_prior fits) markedly improves as soon as they are included in subsequent fits. The largest effect is seen for the NOMAD data, whose $\chi^2$ decreases from about 9 in the str_base and str_prior fits to about 0.6 in the str fit. The $\chi^2$ for all of the other datasets is in general not affected upon addition of the NOMAD data in the str fit, except for the NuTeV dataset, whose $\chi^2$ is further reduced in comparison to the str_prior fit. We therefore conclude that the dataset is overall consistent and well described in the str fit. The NOMAD dataset is illustrated in Fig. 1, where we compare the corresponding theoretical predictions, obtained from the str_prior and str fits (that is, before and after the inclusion of the NOMAD data), and the experimental measurements. See the SM for similar comparisons for other strangeness-sensitive datasets.
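For readers less familiar with the figure of merit, the $\chi^2$ per data point can be sketched as follows, assuming the usual covariance-matrix definition $\chi^2 = (d - t)^T C^{-1} (d - t)/n_{\rm dat}$. The function name and the toy inputs are hypothetical, and the definition actually used in the fits (in particular the treatment of multiplicative uncertainties) may differ in detail.

```python
import math


def chi2_per_point(data, theory, cov):
    """chi2 / n_dat for data d, theory t and covariance matrix C.

    Uses a Cholesky factorisation C = L L^T, so that
    chi2 = |L^{-1} (d - t)|^2 without forming the inverse explicitly.
    """
    n = len(data)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = cov[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    # Forward substitution: solve L z = d - t.
    z = []
    for i in range(n):
        r = data[i] - theory[i] - sum(low[i][k] * z[k] for k in range(i))
        z.append(r / low[i][i])
    return sum(v * v for v in z) / n
```

With a diagonal covariance matrix this reduces to the familiar sum of squared pulls divided by the number of points; off-diagonal entries encode the experimental correlations within a dataset.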
We now turn to study the impact of the new datasets included in the fits on the ratio $R_s$. The corresponding assessment for individual PDF flavors is provided in the SM. In Fig. 2 we display the ratio $R_s$, and its relative PDF uncertainty $\delta R_s/R_s$, at $Q = 10$ GeV for the three fits discussed above. The impact of the new datasets is clearly visible. Concerning the central value, collider datasets do not alter its expectation (the results obtained from the str_base and str_prior fits are almost identical); the NOMAD dataset, instead, prefers a somewhat more suppressed strange sea for $x \gtrsim 0.1$. Concerning uncertainties, collider datasets lead to a reduction of the relative uncertainty on $R_s$ of about 4% for $x \lesssim 0.1$; the NOMAD dataset, instead, reduces it by about a factor of two for $x \gtrsim 0.1$. Overall, the impact of the new datasets depends on $x$, and is most significant at $x = 0.2$, where the uncertainty on $R_s$ is reduced from 20% to 8%. For $x \gtrsim 0.3$ no experimental constraints are available, therefore the PDF uncertainty blows up. These conclusions are unaffected if the NOMAD $\sqrt{\hat{s}}$ dataset is used instead of the $E_\nu$ one.
Fig. 3 compares the str fit with the results obtained from the CT18/CT18A [20] (CT18A is a variant of CT18 that includes the ATLAS $W$, $Z$ data), MMHT14 [46], and ABMP16 [47] fits. They all include only a subset of the data listed in Table II, in particular: the NuTeV dataset is part of all PDFs; the NOMAD dataset is only part of ABMP16; and the off-peak and forward ATLAS $W$, $Z$ bins, the $W$+$c$ and the $W$+jets datasets are not part of any of these PDF sets. The upper (lower) panel of Fig. 3 displays the absolute (normalised) central values of $R_s$ ($s^+$), with the insets displaying the relative PDF uncertainties for each case. Our str determination agrees with the CT18A and ABMP16 results within uncertainties in the data region; it, however, overshoots the CT18 and MMHT14 results.
Note that the very small PDF uncertainties of the ABMP16 result should be realistically rescaled by a tolerance factor $T = \sqrt{\Delta\chi^2} > 1$ [1], which is however not accounted for in their analysis. With this caveat, our results for $s^+$ and $R_s$ are also the most precise, in particular around $x \sim 0.1$, thanks to the wider dataset (and specifically to NOMAD) utilized to constrain the strange quark and anti-quark PDFs. Fig. 4 displays the values of $R_s$ for the three fits of Table I and the fits shown in Fig. 3; here $R_s$ is evaluated at $Q = 1.6$ GeV and $x = 0.023$ as in the ATLAS studies [11,12] claiming a symmetric strange quark sea. Fig. 4 makes clear the consistent effect of the new datasets included in our analysis: the value of $R_s = 0.78 \pm 0.20$ in the str_base fit is made more precise by the LHC datasets, which reduce its uncertainty by about 40% without altering much its central value, $R_s = 0.76 \pm 0.12$; then the neutrino-DIS NOMAD dataset shifts this number towards a lower value by half a sigma, bringing also a further moderate reduction of the uncertainty, $R_s = 0.71 \pm 0.10$. A larger reduction of the uncertainty on $R_s$, because of the NOMAD data, is obtained if $R_s$ is evaluated at a higher value of $x$, see Fig. 2 and the SM.
Our final str result indicates that the strange sea is neither highly suppressed ($R_s \simeq 0.5$), as suggested by neutrino data, nor fully symmetric ($R_s \simeq 1$), as allegedly preferred by the ATLAS $W$, $Z$ production measurements. Actually $R_s$ turns out to lie halfway between these two limiting scenarios. We find similar conclusions by studying the momentum fraction $K_s$, see the SM. Our result is in agreement with the determination of $R_s$ obtained from other recent PDF determinations within uncertainties. We conclude that the result $R_s = 1.13 \pm 0.11$, reported in [12] from an analysis of HERA and ATLAS $W$, $Z$ data within the xFitter framework [48], is not compatible with ours, possibly because it is affected by a restricted dataset and/or methodological limitations.
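The numbers quoted for $R_s$ at $x = 0.023$ and $Q = 1.6$ GeV can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are ours, and the pull with respect to the value of [12] treats the two determinations as uncorrelated, which is a crude assumption since both analyses include the ATLAS $W$, $Z$ data.

```python
import math

# (central value, uncertainty) of R_s as quoted in the text
rs_base = (0.78, 0.20)
rs_prior = (0.76, 0.12)
rs_str = (0.71, 0.10)
rs_ref12 = (1.13, 0.11)  # value reported in [12]


def pull(a, b):
    """Difference in units of the combined uncertainty (no correlation assumed)."""
    return (a[0] - b[0]) / math.hypot(a[1], b[1])


# Fractional reduction of the uncertainty due to the LHC datasets.
reduction_lhc = 1.0 - rs_prior[1] / rs_base[1]

# Shift induced by NOMAD, in units of the final uncertainty.
shift_nomad = (rs_prior[0] - rs_str[0]) / rs_str[1]

# Pull between the value of [12] and the str fit.
pull_ref12 = pull(rs_ref12, rs_str)

print(f"LHC uncertainty reduction: {reduction_lhc:.0%}")
print(f"NOMAD shift: {shift_nomad:.2f} sigma")
print(f"pull w.r.t. [12]: {pull_ref12:.1f} sigma")
```

Under these assumptions the LHC data reduce the uncertainty by 40%, the NOMAD shift amounts to half a sigma, and the two determinations differ by close to three standard deviations.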
We finally emphasize that, apart from the more extensive dataset, our analysis differs from all of the other PDF determinations shown in Figs. 3-4 in that the charm-quark PDF is fitted on the same footing as the other light-quark PDFs [49]. This feature was demonstrated to improve the description of DIS and LHC datasets, and in particular to partially relieve tensions between the NuTeV and the ATLAS $W$, $Z$ datasets [19]. The interplay between the charm- and the strange-quark PDFs is further addressed in the SM, where we find that revisiting our analysis with a perturbative charm PDF leads to a worse fit quality while affecting only marginally our conclusions on $R_s$.
Summary. By means of a state-of-the-art global analysis, which combines all the relevant experimental and theoretical inputs, we have achieved a precise determination of the strangeness content of the proton. We have demonstrated the compatibility of all strangeness-sensitive datasets; quantified their relative impact on the fit; compared our results to other recent global analyses; and assessed the robustness of our results with respect to methodological choices. Our str PDF set, available in the LHAPDF format [50] together with its perturbative charm counterpart, represents an important input for phenomenology. Our determination of the strange and anti-strange quark PDFs could be further stress-tested with more exclusive processes, e.g., measurements of kaon production in semi-inclusive DIS (SIDIS). Studies of the strange PDFs based on SIDIS [51][52][53] are known to prefer a suppressed strangeness, but are also subject to the potential bias coming from their sensitivity to the fragmentation of the strange quarks into kaons.
In short, our analysis demonstrates that no proton strangeness crisis occurs, that the strange PDF can be precisely determined, and that the proton is not as strange as it has often been thought to be.

Supplementary Material
Computation of the NOMAD observables. As explained in the main text, the NOMAD experiment measured the ratio of dimuon to inclusive charged-current cross sections, $R_{\mu\mu}$, as a function of the neutrino beam energy $E_\nu$, the momentum fraction $x$, and the square root of the partonic center-of-mass energy $\sqrt{\hat{s}}$. The ratio $R_{\mu\mu}$ as a function of $E_\nu$ is evaluated by means of Eq. (1), where the integrand is the double-differential cross section $d^2\sigma_i/dx\,dQ^2$ of Eq. (2), with the index $i = \mu\mu$, CC denoting either the dimuon or the inclusive charged-current quantities. They enter, respectively, the numerator and the denominator of $R_{\mu\mu}$. The kinematic factors $Y_\pm = 1 \pm (1 - y)^2$ are related to the inelasticity $y$, which can in turn be expressed as $y = Q^2/(2 m_p E_\nu x)$; $G_F$, $M_W$ and $m_p$ are respectively the Fermi constant, the mass of the $W$ boson and the mass of the proton. The factor $K_i$ is either the identity, for $i = {\rm CC}$, or the charm semileptonic branching ratio $B_\mu$, for $i = \mu\mu$. In the latter case we use the $E_\nu$-dependent parametrization of [10], with the values of the parameters $a$ and $b$ determined there, $a = 0.097 \pm 0.003$ and $b = 6.7 \pm 1.8$. The corresponding uncertainty is included in the covariance matrix of the measurement. Both the charm (for $i = \mu\mu$) and the total (for $i = {\rm CC}$) structure functions $F_p^i$ ($p = 2, L, 3$) are evaluated with APFEL [37]. For illustrative purposes, in the left panel of Fig. 5 we display the charm production cross section, Eq. (1) with $i = \mu\mu$, as a function of $E_\nu$ in the kinematic range measured by the NOMAD experiment. The cross section is obtained in the FFN scheme (with $n_f = 3$) at different perturbative orders using the NNPDF3.1 NNLO PDF set (consistently with $n_f = 3$). The inset displays the ratio to the leading order (LO) calculation. Higher-order corrections clearly suppress the cross section, in particular as $E_\nu$ increases. For instance, in the highest energy bin the NNLO cross section is about 10% smaller than the LO prediction.
The size of the NNLO correction is comparable to or larger than that of the NLO one, therefore its inclusion is mandatory to achieve a good description of the NOMAD data.
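For orientation, the ingredients listed above ($K_i$, $G_F$, $M_W$, $Y_\pm$, $y$ and the structure functions $F_2$, $F_L$, $F_3$) can be assembled into the standard expression for the neutrino charged-current DIS cross section, shown below together with a functional form for the branching-ratio parametrization. Both expressions are assumptions made here for illustration: the overall normalisation depends on the structure-function conventions of [37], the sign of the $F_3$ term is the one appropriate for a neutrino (rather than antineutrino) beam, and the form of $B_\mu(E_\nu)$ should be checked against [10].

```latex
\frac{d^2\sigma_i}{dx\,dQ^2}(x,Q^2,E_\nu)
  = K_i \, \frac{G_F^2 M_W^4}{4\pi x \left(Q^2 + M_W^2\right)^2}
    \left[ Y_+ F_2^i(x,Q^2) - y^2 F_L^i(x,Q^2) + Y_- \, x F_3^i(x,Q^2) \right] ,
\qquad
B_\mu(E_\nu) = \frac{a}{1 + b/E_\nu} .
```

With this form, $B_\mu$ rises with the neutrino energy and saturates at the value $a$ for $E_\nu \gg b$.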
While the comparison of Fig. 5 is presented in the FFN scheme, all the fits discussed in this work are based on the FONLL general-mass variable flavor number scheme. As discussed in [35], the theoretical predictions for neutrino-DIS charm production obtained in either the FFN or in the FONLL schemes turn out to be very close in the kinematic range accessed by the NuTeV experiment. We explicitly checked this statement, and that the same applies also to the NOMAD measurements. To this purpose, we computed the relative difference between the FONLL-A and FFN scheme predictions for the NuTeV and NOMAD datasets based on structure functions accurate to $\mathcal{O}(\alpha_s)$. We found that differences were less than 1% in the entire kinematic range for NuTeV, and of about 1.5% irrespective of the value of $E_\nu$ for NOMAD. These differences are well below the experimental and the PDF uncertainties. We therefore conclude that using an NNLO K-factor determined in the FFN scheme in fits that otherwise use the FONLL scheme is unlikely to affect our results.
We tested the robustness of our computation of Eqs. (1)-(2) in two further respects. First, we performed a benchmark against the independent computation based on [33]. In the process, a bug affecting the APFEL calculation of NLO charged-current coefficient functions was identified and corrected, after which our computation is in excellent agreement with that based on [33]. Second, we estimated the impact of nuclear corrections on our predictions. To this purpose, we recomputed them with the recently presented nNNPDF2.0 NLO Fe nuclear PDF set [54] and we compared the result with the predictions obtained with the NLO free proton PDF set consistently determined in [54]. The full set of correlations between Fe and proton PDFs was therefore appropriately taken into account. The relative difference between the two computations (with and without nuclear PDF corrections) turned out to range between 3%, in the lowest $E_\nu$ bin, and a fraction of a percent, in the bins at the highest $E_\nu$. These differences are smaller than both the data and PDF uncertainties, therefore neglecting nuclear PDF uncertainties is a justified approximation which should have a negligible impact on our results. We also note that the effects of nuclear corrections due to an Fe target were explicitly studied in [36] for the NuTeV case and found to be immaterial in a global fit of PDFs.
Stability with respect to the choice of NOMAD observable. The NOMAD cross section ratio $R_{\mu\mu}$ is presented in terms of three kinematic variables characteristic of the DIS process: the energy of the neutrino beam $E_\nu$, the momentum fraction $x$, and the square root of the invariant mass of the final state (or partonic center-of-mass energy) $\sqrt{\hat{s}}$. Because the three datasets are reconstructed from the same underlying measurement, and the associated experimental correlation matrix is unknown, it is not possible to include the three of them simultaneously in a fit without incurring double counting. The best-fit str presented in this work is based on the NOMAD cross section ratio as a function of the neutrino energy $E_\nu$. We selected this specific set because, among the three kinematic variables, $E_\nu$ is the one most directly accessed in the experimental setup.
Here we study the stability of our results when the NOMAD $E_\nu$ dataset is replaced by its $\sqrt{\hat{s}}$ counterpart in the fit. To this purpose, we reweight the str_prior fit with the NOMAD $\sqrt{\hat{s}}$ dataset, to obtain the str_s_hat fit. After reweighting, one ends up with $N_{\rm eff} = 135$ effective replicas (out of $N_{\rm rep} = 850$ initial replicas), from which we construct an ensemble of $N_{\rm rep} = 100$ replicas. The number of effective replicas is similar to that obtained after reweighting with the $E_\nu$ dataset. In Table III we report the values of the $\chi^2$ per data point computed from the str_s_hat fit for all of the strangeness-sensitive datasets considered in this work. The format is similar to that used in Table II; we report, for ease of comparison, the corresponding values of the $\chi^2$ obtained from our best-fit str. We observe that a very similar fit quality is achieved in the str_s_hat and str fits, not only for the NOMAD data, but also for all of the other datasets: the differences in the values of the $\chi^2$ between the two fits are smaller than statistical fluctuations. The impact of the NOMAD $\sqrt{\hat{s}}$ dataset is further displayed in Fig. 6. In the left panel, we compare the data and the corresponding theoretical predictions computed with the str_base and str_s_hat fits. We observe a reduction of the relative uncertainty similar to that reported in Fig. 1 for the NOMAD $E_\nu$ dataset. In the right panel, we compare the ratio $R_s$ computed from the str and str_s_hat fits: both the central values and the PDF uncertainties of $R_s$ are very similar. This fact confirms the robustness of our analysis, and the independence of the results of the choice of the NOMAD dataset included in the fit.
Impact of the treatment of the charm PDF. The description of the charm- and strange-quark PDFs is known to be correlated. For instance, charged-current weak boson production at colliders is sensitive at LO to both strange and charm quark PDFs via the $c\bar{s}$ and $s\bar{c}$ initial-state partonic channels. For this reason, it is interesting to assess the robustness of our results against the theoretical treatment of the charm-quark PDF in our fits. To this purpose, we repeated the analysis carried out in the str_prior and str fits assuming a purely perturbative charm-quark PDF (that is, generated from the gluon and light quarks via DGLAP evolution). This assumption is used in other global analyses discussed in this work, see Figs. 3-4. This way we obtained the corresponding str_prior_pch and str_pch fits. In this case we generated only $N_{\rm rep} = 500$ replicas in the str_prior_pch fit; after reweighting we are left with $N_{\rm eff} = 157$ effective replicas, from which we constructed an ensemble of $N_{\rm rep} = 100$ replicas in the str_pch fit.
We collect the values of the $\chi^2$ per data point obtained from the str_pch fit for the usual datasets in Table III. For ease of comparison, the corresponding values, obtained with the str fit, are also reported. We observe that the fitted charm fit (str) achieves a better description of the strangeness-sensitive datasets, and of the global dataset overall, than the perturbative charm fit (str_pch). We note in particular the $\chi^2$ values of the ATLAS $W$, $Z$ and of the total datasets, which increase respectively from 1.67 to 1.80 and from 1.17 to 1.20 when comparing the str and the str_pch fits. We therefore confirm that fitting charm achieves an improved description of the experimental data.
To further illustrate the impact of the perturbative charm assumption, in the left panel of Fig. 7 we reproduce Fig. 1 now with the str_prior_pch and str_pch fits. In comparison to the fitted charm case, see Fig. 1, the prediction obtained from the str_prior_pch fit is closer to the NOMAD data and has smaller uncertainties than its counterpart obtained with the str_prior fit. The agreement after reweighting is, however, comparably good in the two cases. Finally, the right panel of Fig. 7 displays the ratio $R_s$ obtained from our fitted and perturbative charm best fits, str and str_pch. While PDF uncertainties turn out to be very similar in the two cases, the perturbative charm fit prefers a central value which is systematically larger than the one obtained from the fitted charm fit. The size of the shift, however, is at most at the one-sigma level in units of the PDF uncertainty bands, in line with previous studies [19,49].
Comparison between experimental data and theory predictions. Here we extend the comparison between theoretical predictions and data to some of the measurements used in this work, other than those from the NOMAD experiment presented in Fig. 1. In Figs. 8 and 9 we display: the $W$+$c$ lepton rapidity distributions corresponding to the ATLAS measurement of [16] (for both $W^+$ and $W^-$) and to the CMS measurements (sum of $W^+$ and $W^-$) of [17,18] (respectively at 7 TeV and 13 TeV); the $Z$ dilepton rapidity distributions from the ATLAS measurement of [12] (for both the central and forward selection cuts); and the charm dimuon cross sections from the NuTeV measurement of [9] (for both neutrino and antineutrino beams). In the first three cases, the insets display the ratio of the theory to the central value of the experimental measurement. In the last case, only this ratio is shown; data points are sorted by their ID value, roughly corresponding to increasing $x$, $Q^2$, and $y$ values when the plot is read from left to right. Theoretical predictions are evaluated with the str_base and str fits. A fair agreement between data and theory is found in all cases, as expected from the pattern of $\chi^2$ values reported in Table II. However, we clearly see that the size of the PDF uncertainty relative to the size of the data uncertainty depends on the dataset. Concerning the ATLAS and CMS $W$+$c$ measurements, experimental uncertainties span the range between 10% and 20%, and are consistently larger than PDF uncertainties. We note that the PDF uncertainties in the theory predictions are markedly reduced in the str fit in comparison to the str_base fit, as highlighted by the ratios in the insets. Concerning the ATLAS $Z$ measurement, the total experimental uncertainty is much smaller, around 2% for the central rapidity bin, and is comparable to the PDF uncertainty. We therefore expect this measurement to be one of the most constraining amongst all of the LHC measurements considered in this work. Interestingly, once the NOMAD dataset is included in the fit, the central value of the theoretical prediction approaches the central value of the ATLAS data, and PDF uncertainties are slightly reduced. A similar trend can be observed for the forward selection data and for the NuTeV measurement. This behaviour is a further sign of the good overall compatibility of all of the datasets, and in particular of neutrino DIS and LHC gauge boson production measurements.
Implications for other PDF flavor combinations. In the main manuscript we focused on the impact of the new datasets on the ratio $R_s$ and the total quark-antiquark strange distribution $s^+$, see Figs. 2-3. We now briefly discuss the corresponding impact on other PDF flavor combinations. In Fig. 10 we compare the gluon, the quark singlet, and the charm, up, down, and strange sea PDFs resulting from the str_base, str_prior and str fits at $Q = 100$ GeV. The upper panels display the PDFs normalised to the str_base fit; the lower panels display the corresponding one-sigma uncertainty relative to each fit. In Fig. 11 we display a similar comparison now for the absolute PDFs and uncertainties of the up, down, and strange quark valence distributions. From these comparisons, we observe that the new datasets have little impact on the gluon and the quark singlet PDFs, both on central values and on uncertainties, as expected. A bigger effect, entirely due to the NOMAD dataset, is observed instead on the charm PDF: in the str fit, its central value is suppressed in comparison to the str_prior fit; uncertainties are reduced by up to a factor of 2 for $x \simeq 0.05$. Note that the NOMAD data is indirectly sensitive to the charm PDF through its interplay with the $s\bar{c}$ and $\bar{s}c$ contributions to $W$-boson production. Concerning valence distributions, the new datasets leave central values unchanged, but reduce the uncertainties in the entire kinematic range. The pattern of uncertainty reduction is the same for the three quark valence combinations: the electroweak LHC datasets constrain the distributions at low to mid values of $x$, $x \lesssim 0.1$, while the NOMAD datasets do so at larger values of $x$, $x \gtrsim 0.1$. The two datasets are therefore complementary, and together make all valence quark PDFs more precise. Finally, we explicitly verified that the impact of the new datasets on the up and down sea quark PDFs is minimal.
Additional metrics to quantify the strangeness suppression. In Fig. 4 we displayed the ratio $R_s$ for a single kinematic point, $x = 0.023$ and $Q = 1.6$ GeV, for ease of comparison with the ATLAS studies [11,12]. However, as illustrated in Fig. 3, $R_s$ is a function of $x$ and $Q$. We therefore repeat the comparison displayed in Fig. 4 for a different kinematic point, $x = 0.13$ and $Q = 100$ GeV, see Fig. 12. In this case one finds values of $R_s$ typically smaller than in the previous case. For example, for the str fit one can compare the value $R_s = 0.71 \pm 0.10$ for $(x, Q) = (0.023, 1.6\ {\rm GeV})$ with the value $R_s = 0.64 \pm 0.05$ for $(x, Q) = (0.13, 100\ {\rm GeV})$. We observe however that the qualitative features displayed in Fig. 4, namely the reduction of PDF uncertainties between the str_base and the str fits, and the relative size of predictions obtained from the various PDF sets, are recovered in Fig. 12.
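Another measure of the strange to light sea quark suppression is the ratio of the corresponding momentum fractions,
$$K_s(Q^2) = \frac{\int_0^1 dx \, x \left[ s(x,Q^2) + \bar{s}(x,Q^2) \right]}{\int_0^1 dx \, x \left[ \bar{u}(x,Q^2) + \bar{d}(x,Q^2) \right]} .$$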
The middle and right panels of Fig. 12 display $K_s(Q^2)$ for $Q = 1.6$ GeV and $Q = 100$ GeV, respectively. The qualitative interpretation of this quantity is consistent with that of $R_s$; in particular, PDF uncertainties are reduced by a factor of two in the str fit with respect to the str_base fit. The values of $K_s$ grow with the scale $Q$, as expected due to DGLAP evolution effects: for instance, using the str fit, one finds $K_s = 0.64 \pm 0.07$ at $Q = 1.6$ GeV, and $K_s = 0.81 \pm 0.04$ at $Q = 100$ GeV. We therefore conclude that our results are stable with respect to the choice of metric used to quantify the strangeness suppression relative to the rest of the light sea quarks.
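The difference between the two metrics, the point-wise ratio $R_s(x)$ and the integrated momentum-fraction ratio $K_s$, can be made concrete with a toy example. The PDF shapes below are hypothetical functional forms chosen only for illustration, with no relation to the fitted PDFs, and the quadrature is a plain midpoint rule.

```python
def toy_light_sea(x):
    """Hypothetical momentum density x * (ubar + dbar), for illustration only."""
    return 0.4 * x ** -0.2 * (1.0 - x) ** 7


def toy_strange(x):
    """Hypothetical momentum density x * (s + sbar), for illustration only."""
    return 0.25 * x ** -0.2 * (1.0 - x) ** 9


def r_s(x):
    """Point-wise ratio of strange to light sea at momentum fraction x."""
    return toy_strange(x) / toy_light_sea(x)


def k_s(n=20000):
    """Ratio of momentum fractions, integrating over 0 < x < 1 (midpoint rule)."""
    h = 1.0 / n
    num = sum(toy_strange((i + 0.5) * h) for i in range(n)) * h
    den = sum(toy_light_sea((i + 0.5) * h) for i in range(n)) * h
    return num / den


if __name__ == "__main__":
    print(r_s(0.023), r_s(0.13), k_s())
```

For these toy shapes $R_s$ decreases with $x$, while $K_s$ is a single number per scale that averages over the whole $x$ range weighted by momentum. In an actual study the toy functions would be replaced by PDFs read from an LHAPDF set.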
Comparison with NNPDF3.1. Our baseline fit str_base differs from NNPDF3.1 [19] in several respects. These include: the treatment of inclusive jet production from ATLAS and CMS with NNLO K-factors, see [27]; an updated treatment of non-isoscalarity effects and branching fractions in neutrino-DIS data, see [36]; the inclusion of the NNLO massive corrections to the NuTeV structure functions; the new $F_2^c$ positivity constraint; and the correction of the APFEL bug found in the benchmark reported in Fig. 5. For completeness, we compare the NNPDF3.1 and str_base parton sets in Fig. 13. Specifically we display the gluon, the quark singlet, the charm, up, down, and strange quark sea PDFs at $Q = 100$ GeV, normalized to the central value of NNPDF3.1. In comparison to NNPDF3.1, in the str_base fit we observe: an increase in the central value of the strange PDF for $x \gtrsim 10^{-3}$ (mostly due to the correction of the APFEL bug); a similar effect in the case of the charm PDF for $x \gtrsim 10^{-2}$ (mostly due to the new $F_2^c$ positivity constraint); a moderate rearrangement of the quark flavor separation at medium and large $x$; and a harder gluon at large $x$ (mostly due to the improved NNLO treatment of jet data). All in all, while the two fits agree within uncertainties, the improvements introduced in the str_base fit justify its adoption as the baseline for the present study.