Deuterium scattering experiments in CTEQ global QCD analyses: a comparative investigation

Experimental measurements in deep-inelastic scattering and lepton-pair production on deuterium targets play an important role in the flavor separation of $u$ and $d$ (anti)quarks in global QCD analyses of the parton distribution functions (PDFs) of the nucleon. We investigate the impact of theoretical corrections accounting for the light-nuclear structure of the deuteron upon the fitted $u, d$-quark, gluon, and other PDFs in the CJ15 and CT18 families of next-to-leading order CTEQ global analyses. The investigation is done using the $L_2$ sensitivity statistical method, which provides a common metric to quantify the strength of experimental constraints on various PDFs and ratios of PDFs in the two distinct fitting frameworks. Using the $L_2$ sensitivity and other approaches, we examine the compatibility of deuteron data sets with other fitted experiments under varied implementations of the deuteron corrections. We find that freely-fitted deuteron corrections modify the PDF uncertainty at large momentum fractions and will be relevant for future PDFs affecting electroweak precision measurements.

for the values of sin θ W extracted from Z boson production at the LHC 8 TeV [4]. Left: correlations with valence PDFs and PDF ratios at Q =81. 45 GeV, plotted as a function of x for CT14 NNLO PDFs. Right: the same, for correlations with PDFs of individual flavors.
energy facility such as the Relativistic Heavy Ion Collider, the Jefferson Lab CEBAF accelerator, or the future Electron-Ion Collider -this flavor composition is typically specified by helicity-averaged parton distribution functions (PDFs) of the proton. The PDFs, f (x, Q), have long been of strong interest from the perspective of both fundamental Quantum Chromo-Dynamics (QCD) as well as particle phenomenology, given that they quantify the probability of resolving a quark or gluon constituent of flavor f carrying a fraction x of the proton's longitudinal momentum in a scattering process with a squared energy scale Q 2 1 GeV 2 . For this reason, the PDFs play a central role in predicting cross sections for pp collisions at the LHC, and, in particular, their accuracy influences the ability of LHC measurements or other high-energy data to constrain the SM parameters, including in the electroweak sector.
Due to the challenge of reducing their uncertainties and empirically distinguishing among their parton flavors, PDFs have historically been determined most robustly through "global QCD fits" [5,6,7,8], now increasingly performed at next-to-next-to-leading order (NNLO) accuracy in α s , and drawing upon large collections of experimental measurements sensitive to QCD and different underlying PDF combinations. In spite of the growing number of LHC measurements, deeplyinelastic scattering (DIS) experiments involving fixed hadronic or nuclear targets at BCDMS, NMC, SLAC, and JLab continue to provide key information to disentangle the PDFs in recent global QCD analyses such as CJ15 [9], ABMP16 [10], CT18 [11], NNPDF3.1 [12], and MSHT20 [13]. The fixed-target experiments complement analogous DIS collisions at the HERA ep col-lider by extending the momentum fraction coverage to larger x values and adding measurements on deuterium targets. In fact, such experiments provide the leading constraints on the (anti)quark PDFs at low scales Q and large momentum fractions x 0.05, as well as on the gluon PDF by observing scaling violations over the same kinematic region [14,15,16].
In precision tests of the electroweak sector, the substantial PDF dependence of the involved theoretical calculations affects experimental determinations of SM parameters, such as the weak-mixing angle θ W extracted from the A FB forward-backward asymmetry measured in the production of Z bosons during Runs 1 and 2 of the LHC. Fig. 1 illustrates typical Hessian correlations [1,2,3] of PDF combinations (left) and PDFs (right) with the sin θ W values extracted from 8 TeV A FB measurements at the LHC. Here, the correlations are computed using the preliminary (unpublished) AT-LAS Run-1 data [4] on sin θ W extracted with individual PDF eigenvector sets of the CT14 NNLO ensemble [17].
In the left subfigure, we see that the values of the extracted sin θ W are strongly correlated with the valence combinations of light-quark PDFs at Q = 81. 45 GeV, d val (x, Q) ≡ d(x, Q) −d(x, Q) at x ≈ 0.008 − 0.05 and u val (x, Q) ≡ u(x, Q) −ū(x, Q) at x ≈ 0.008 − 0.1. In addition, significant correlations with the extracted sin θ W exist at higher x 0.3 as well, especially for the PDF ratio d/u, and again for d val . Remarkably, the correlations are weaker with the PDFs of individual parton flavors (shown at right) than with the valence combinations. An anti-correlation with the d-quark showing the changes in the χ 2 for all data sets and most sensitive experimental data sets in the CT18Z NNLO global QCD analysis [11].
via the valence-quark sum rule, is evident in this case, though not exceptionally strong.
The sizable correlations between fitted PDFs and sin θ W in Fig. 1 are consistent with the significant PDF uncertainties on these and similar BSM-sensitive quantities, including the W boson mass, M W , and Higgs cross section, σ H . For this reason, the realization of next-generation precision in the determination of these electroweak parameters is critically dependent on the reduction of their associated PDF uncertainties, including the high-x uncertainty of the d-quark and gluon (g) PDFs, as well as of d val and d/u.
We might ask where the experimental constraints on the relevant PDF combinations for LHC electroweak precision tests arise from. While direct measurements at the LHC will supply increasing information on the PDFs affecting A FB [18] and other observables [19], recent CTEQ studies [9,11,15,16,20] find that deepinelastic scattering experiments on nuclear targets will continue to provide strong constraints on the downquark PDFs in the nucleon in the near future. In a global QCD analysis, experimental measurements made solely on proton targets are at present insufficient for full separation of parton distributions for d, s, g, and anti-quark flavors. Assuming parton-level charge symmetry, d p (x, Q) ≈ u n (x, Q), between the PDFs in the proton p and the neutron n, and correcting for lowenergy nuclear effects [14], one can then use scattering processes on the deuteron or heavier nuclei to constrain the down-type PDFs in the proton.
We illustrate the importance of fixed-target data in the determination of the weak mixing angle with the help of Lagrange Multiple (LM) scans [21] in Fig. 2, in which we examine the dependence of the figure-ofmerit function χ 2 in the CT18Z NNLO analysis [11] on the values of the valence u val (left) and d val (right) quarks at x = 0.03 and Q = 85 GeV. We plot the change in χ 2 , as compared to the value in the best fit, for all data sets (labeled as "Total" in the figure) and for individual experiments with the highest sensitivity to this PDF combination. The curves for the experimental data sets are labeled according to the convention in Table 1. The LM scans show that a small group of DIS experiments -NMC ratio of d and p DIS cross sections [22], inclusive HERA I+II DIS [23], BCDMS p and d reduced cross sections [24,25,26,27], CCFR F 3 structure function [28] -contribute the largest variations in ∆χ 2 when the valence PDFs are varied, together with several lepton pair production experiments by ATLAS, LHCb, E605, and E866. In the case of d val in the right Fig. 2, the BCDMS measurements on p and d show somewhat different preferences, with the deuteron data clearly preferring a higher d val (0.03, 85 GeV). In these and other cases, the deuteron DIS measurements, with their large numbers of precise data points, have large statistical significance in the global fit. As this LM scan also illustrates, the proton and deuteron data sets in the CT18 fit sometimes prefer somewhat different values of d val (x, Q), which in turn may hamper the efforts for reducing the PDF uncertainty in the relevant EW precision measurements.
In this article, we will employ a statistical technique called the L 2 sensitivity [16] that can be viewed as a fast approximation to the LM scans that are usually very computing-intensive. With this technique, we will survey agreement between the constraints on the PDFs from the deuteron and other data sets in a wide range of x and for various treatments of deuteron corrections. While we have investigated sensitivities for various PDF flavors, our presentation will focus on the sensitivities to u, d, and g PDFs that are most affected.
Apart from its significance for the LHC and SM phenomenology, the physics of large-x quark PDFs is interesting in its own right. Nonperturbative QCD approaches [5,29,30] and lattice QCD [31,32] provide increasingly complete predictions for the flavor composition of unpolarized protons at x → 1, which in turn inform one on the role of color confinement in the binding of the valence quarks. These predictions can then be confronted with phenomenological determinations of Mellin moments and PDFs at large x. For example, Ref. [33] observes that the proton (BCDMS F p 2 , E866 pp DY, HERAI+II ep DIS) and deuteron (BCDMS F d 2 , NMC p/d ratio) experiments in the CT18 NNLO analysis have somewhat different preferences for the effective exponents β controlling the (1 − x) β falloff of the valence up and down quark PDFs at x → 1. In turn, these differences impact comparisons of phenomenological PDFs against large-x predictions from quark counting rules [34,35,36] and other nonperturbative approaches [29,30]. The analysis methods utilized in this paper can shed light on these issues.

The role of nuclear-medium effects
Extracting parton-level information from nuclear data sets involving the deuteron or heavier targets requires an understanding of the effects of the nuclear environment [8,14]. The trivial dependence on the nuclear atomic number A and charge Z is normally implemented by constructing a nuclear PDF as a linear combination of the PDFs on free protons and neutrons, as reviewed, e.g., in Sec. 5. A of Ref. [7]. On top of this trivial (A, Z) dependence, low-energy interactions in the nuclear medium may modify the quark and gluon distributions relatively to those in free nucleons. The nuclear corrections that account for these deviations can be computed with increasing sophistication and connection to the formal theory describing low-energy dynamics. One can, for example, utilize phenomenological, data-driven ratios to convert nuclear-target cross sections to free-nucleon ones [11,37]; parametrize and fit the nuclear deformation of the PDFs either inside the deuteron in a nucleon PDF fit [13,38] or in a heavy nucleus in a nuclear PDF fit [39,40,41,42,43,44]; or, finally, adopt a dynamical model of the lowenergy nucleon-nucleon interactions and calculate the hard cross-section as a double convolution of parton distributions inside the nucleons and nucleon wave functions inside the nuclear targets [9,45,46]. The resulting extractions of the nucleon PDFs from nuclear data then have a dependence on the assumed nuclear corrections.
The specific methods used typically differ when analyzing light nuclear targets (e.g., the deuteron) or heavy nuclei (e.g., Fe). Here, we concentrate on and summarize the techniques utilized in the CT18 and CJ15 fits which form the starting point for the study presented in this paper, and then briefly mention other approaches: 1. Deuteron corrections. A dynamic deuteron correction in the CJ15 next-to-leading order (NLO) PDF fit [9] was applied to any process involving interaction with a deuterium target, The CJ15 analysis also applies a phenomenological parametrization for the off-shell deformation of the scattered nucleon's structure function (in short, "off-shell corrections") with parameters fitted to data to increase the model flexibility. Care is taken that the valence quark number inside the nucleon is not modified; since the off-shell function is flavor independent and has no significant dependence on Q 2 , it must then change the sign in the interval of [0, 1], meaning that it is essentially given by a polynomial with one or more roots in this range. An analysis similar in spirit to the CJ15 fit is available from the AKP group [45], and differences in the extracted off-shell functions are currently being investigated by the two groups jointly. Alternatively, the deuteron correction can be fitted using a purely phenomenological parametrization as in the MSHT20 analysis [13], or the additional uncertainty associated with the deuteron effects can be learned from the global analysis data themselves, as done in the NNPDF study [37]. Other groups do not include deuteron corrections by selecting the fitted data in a {x, Q} region where the deuteron corrections are small compared to the precision of data, as is done, e.g., in the CT18 analysis [11]. 2. Heavy-nucleus effects. Nucleon PDF fits may include DIS experiments performed on heavy nuclear targets, such as CCFR [47] and NuTeV [48], involving 56 Fe, and CHORUS [49], with 82 Pb (which is not presently fitted in either CT or CJ packages studied here). It has been known empirically for some time that the structure functions of these heavier nuclear targets exhibit x-and A-dependent deviations from the structure function of the physical deuteron, owing to a variety of physical processes characterizing the nuclear medium [50,51,52,53], including the heavy-nucleus analogue of the EMC and Fermi-motion effects discussed for the deuteron at high x, and nuclear (anti)shadowing phenomena at lower x.
To address these effects, the CT group corrects DIS cross sections on iron (CCFR [47], CDHSW [54], and NuTeV [48]) and proton-copper Drell-Yan (E605 [55]) to the corresponding cross sections on deuterium using a phenomenological parametrization of the nuclear-to-deuteron cross section ratios based on results in [51] (see also [56], Fig. 2a), which depends on x but not on Q 2 in the fitted region. To include the heavy-nuclear data in the MSHT20 [13] and earlier MMHT fits, a nuclear correction factor, R f , [38] having the form f A (x, is defined to be the PDF of a proton bound in a nucleus of mass number A, was determined. This was assessed using the de Florian et al. nuclear PDF (nPDF) of Ref. [41], then weighted by a 3-parameter modification factor as in Ref. [38], which is actively refitted along with the PDF-associated parameters. NNPDF [57] determines the uncertainty due to heavy-nuclear effects using a similar statistical procedure as for the deuteron.
As can be seen from this summary, global analyses vary in their treatments of the nuclear effects, but with the frequent conclusion that the resulting differences are marginal in comparison to contemporary PDF uncertainties. Indeed, the experiments in question are fitted reasonably well, while the higher-order QCD and parametrization uncertainties on the PDFs are comparable for the most part to the nuclear ones at NLO or even NNLO. In the present study, however, we identify several areas where the deuteron corrections will play a prominent role in the near future, as the field advances toward higher accuracy in the determination of nucleon PDFs. We compare, in particular, the effects of deuteron corrections in two independent PDF global fits by the CTEQ-JLab (CJ) and CTEQ-TEA (CT groups which differ in their phenomenological focus, data selection and, crucially, the treatment of scattering process in nuclear targets. We find that the deuteron effects will have pronounced consequences for both the fitted PDFs in the large-x region and the correlations among the PDFs and quantities derived from them in an extended x range.
More specifically, this paper will elucidate constraints on the d-quark, gluon, and other PDFs arising in the CT18 and CJ15 global fits. We will accomplish this by analyzing the L 2 sensitivity to various PDFs [16], a simple informative figure of merit that allows us to look inside the CJ and CT fits and understand the constraints from the fitted experiments on various parton flavors in an expansive region of x and Q.

Paper organization
After this introduction, the remainder of the article is as follows. In Sec. 2, we briefly present the deuteronstructure corrections (Sec. 2.1) with which this investigation is primarily concerned, as well as power-suppressed QCD effects, also relevant to fits involving nuclear data and/or at lower Q and W (in Sec. 2.2). In Sec. 3, we summarize the relevant features of the CT and CJ fitting frameworks and the special modifications we made in the two analyses to allow their direct comparisons. In Sec. 3.3, we review the L 2 sensitivity method. In Sec. 4, we apply this method to analyze the impact of deuteron-structure corrections on the fit results, and examine the patterns of PDF pulls obtained in the several iterations of CT/CJ under different assumed treatments of the deuteron corrections. As representative cases, we will concentrate on the d/u ratio in Secs. 4.2 and 4.5, and on the gluon in Secs. 4.4 and 4.6. In Section 4.3 we also explore the implications of our analysis to precision measurements of the weak mixing angle. The conclusions in Sec. 5 are followed by a technical appendix describing a numerical procedure to reconcile the error analyses in the CJ and CT approaches.
2 Low-energy QCD effects

Deuteron-structure effects
The critical low-energy effect considered in this study, which arises due to MeV-scale dynamics characterizing nuclear bound states, is the modification of the parton-level substructure of nucleons embedded in the nuclear medium -in particular, those effects arising in the most weakly bound nuclear system, the two-body deuteron. We plot the nuclear correction ratio, F N 2 /F d 2 , calculated using the central CJ15 fit results for several selections of the Q 2 scale. Each of the four panels above highlights a given value of Q 2 , while graying out the curves for other scales in order to retain visual information on the scale dependence of the correction factor at large x. In the upper two panels, which focus on lower scales, Q 2 = 5, 10 GeV 2 , the dotted lines indicate the range of x that is only accessible to CJ (W 2 > 3 GeV 2 ) but not CT (W 2 > 12.25 GeV 2 ), due to the more conservative cut of the latter.
In the CJ framework, these corrections are treated as nuclear wave-function effects, and the deuteron parton distributions f d are calculated as a convolution of the bound nucleon's parton distributions, f N , with a suitable nucleonic "smearing function," S N/d : Here, z represents the momentum fraction of the (isoscalar) nucleon within the deuteron, defined as z ≡ (M d /M N )(p N · q/p d ·q); p d,N are the deuteron and nucleon four-momenta; and M d,N are their respective on-shell masses. This representation is founded on the so-called Weak Binding Approximation (WBA) to the calculation of nuclear structure functions [46,58], where the S N/d smearing function is calculable based on an assumed nuclear potential; as in Ref. [9], we assume the AV18 potential. Since p N is generically off-shell for a bound nucleon, but typically by only a small amount, one can further expand the bound-nucleon PDF, f N , in powers of its The first term, corresponding to p 2 N = M 2 N , gives the PDF of the free, on-shell nucleon. In the second term, the O(ω) coefficient (also known as "off-shell function") can be phenomenologically parametrized and determined in a global fit from the interplay of data involving deuterium targets and information involving free-nucleonbased observables like W boson production at the Tevatron, the Relativistic Heavy Ion Collider (RHIC) or the LHC. Like in Ref. [9], we assume the flavor-independent 3-parameter shape function with x 1 fixed by requiring the off-shell PDFs to satisfy the quark-number sum rule. Further technical details and a discussion of the fit results can be found in Ref. [9]. Section 4 considers three main scenarios for implementing the deuteron corrections (d.c.) discussed above: (i) an uncorrected scenario for which no nuclear effects are included for the deuteron; (ii) a fixed scenario in which the nuclear wave-function effects (on-and off-shell) are frozen to the AV18informed choice of Ref. [9], and the off-shellness correction, δf N (x), is set to the CJ15 central fit; and (iii) a free scenario particular to CJ, in which the parameters in Eq. (3) for the off-shell nucleon are allowed to vary.
The dynamical deuteron corrections are natively implemented in the CJ framework, and the off-shell parameters can be simultaneously fitted with the PDFs. So far, however, the CT code only supports deuteron corrections given in the form of analytic interpolations, such as the one obtained from the correction in [60].
To implement the fixed CJ15 deuteron correction in the CT framework and render it more directly comparable to CJ with respect to its treatment of deuteron target data, we instead multiply the experimental DIS deuteron structure function by the F N 2 /F D 2 nucleon-todeuteron ratio plotted in Fig. 3: The effective isoscalar combination of proton and neutron structure functions thus defined can then be directly compared to uncorrected theoretical calculations of the isoscalar deuteron DIS structure function. On this logic, the CT and CJ fits with a fixed CJ15 correction are placed on similar theoretical footing regarding the implementation of the deuteron effects, with the main difference being whether the correction is imposed within the theoretical structure function calculation or in the F d 2 experimental data -a fact which is immaterial for the sake of evaluating the χ 2function and allows us to compare the impact of the same fixed correction on the CJ and CT frameworks. While a full analysis of the nuclear correction uncertainties is outside the scope of this article, the effect of letting the nuclear off-shell parameters free to vary in the present analysis can be appreciated by comparing the CJ fits in the fixed and free nuclear corrections scenarios.
The size and x dependence of the deuteron corrections, as quantified by the isoscalar nucleon-to-deuteron structure function ratio F N 2 /F d 2 , are shown in Fig. 3 for several representative choices of Q 2 . One immediately notices that deuteron corrections depend on the DIS scale and, at large x, increase with Q 2 toward a fixed point in the Q 2 → ∞ Bjorken limit; as such, deuteron corrections become effectively scale independent for Q 2 50 GeV 2 . For each plotted value of Q 2 , the figure also indicates the maximum x values below which data are accepted in the CJ and CT fits according to their W 2 > 3 and W 2 > 12.25 GeV 2 kinematic cuts, respectively. For CJ, which extends the analyzed DIS data set to the low-Q 2 and large-x values shown in Fig. 4, it is imperative to correctly account for the Q 2 dependence of the deuteron correction in order to avoid conflicts with the leading-twist logarithmic Q 2 evolution that constrains the fitted gluon distribution in DIS experiments. For CT, with its larger W 2 cut, the deuteron corrections are small and nearly scale independent, as seen in Fig. 3, except for the less precise BCDMS deuteron points with x 0.6 (see the kinematical map in Fig. 4), where some influence from the deuteron correction is expected and indeed quantified in Sec. 4.

Power-suppressed effects
Due to their less conservative kinematical restrictions on Q 2 and W 2 , the CJ global fits extend into a region for which power-suppressed corrections are nonnegligible, as depicted in Fig. 4. On the one hand, dynamical higher-twist corrections of O(Λ 2 /Q 2 ) emerge because of the presence of multi-parton correlations within the soft portion of the factorized DIS process, for which the first subleading contribution to the twist expansion for unpolarized scattering are matrix elements of twist-4 operators [61,62]. As in CJ, these are often determined phenomenologically using forms like represents the leading-twist structure function, and a fitted coefficient, C(x) = αx β (1 + γx), parametrizes the power-suppressed twist-4 corrections. On the other hand, target-mass corrections of O(M 2 N /Q 2 ) are due to the non-negligible mass, M N , of the struck nucleon, and are implemented via the operator product expansion of Georgi and Politzer [63,64] or related prescriptions, as extensively reviewed in [65,66]. Both corrections are natively implemented in the CJ framework.
In contrast, CT imposes more restrictive kinematical cuts in W 2 , such that the standard CT data sets lie beyond the region for which the finite ∼ 1/Q 2 corrections are significant. In the past CT studies it mattered little whether the deuteron correction was included according to a specific model or not included at all (the default choice). While we expect some interplay between the deuteron and power-suppressed effects, we do not systematically isolate the latter and leave their investigation to future studies.

Selections of experimental data sets
The CT18 and CJ15 global data sets consist of both high-energy measurements as well as data down to the few-GeV region. Despite their somewhat differing phenomenological emphases -with CT generally aimed toward high-energy processes, and CJ toward the low-Q and large-x region probed at facilities like JLab and SLAC -there is a substantial overlap with respect to the key experiments they include. In Table 1, we provide a complete listing of the experiments included in each global analysis. Of particular importance for interpreting our results in Sec. 4, the leftmost column of Table 1 designates a process-based category label for each experiment, placing, e.g., the BCDMS F d 2 inclusive structure function data in Group 4: DIS Deuterium. We will investigate the agreement between these groups of experiments with the help of L 2 sensitivities. In contrast, previous studies [15,16,33] employing the same technique focused primarily on the individual experiments.
While the CJ and CT analyses share 10 experiments, Table 1 shows that the CJ fits include additional DIS data at fixed-target energies from SLAC, HERMES, JLab, and NMC. They also include Tevatron measurements of charge asymmetries reconstructed to the level of W bosons and cross sections for photon plus leading jet production. The CT PDF fits more extensively cover collider observables. They include HERA heavy-quark production, an assortment of cross sections and cross section asymmetries in electroweak boson production at the LHC, as well as LHC cross sections for inclusive jet and tt production. The CT fits include CCFR and NuTeV cross sections for both inclusive (Group #8) and semi-inclusive (Group #10, for opposite-sign dimuon production) charge-current DIS on iron. The CT fits, however, implement only the most direct measurements of Tevatron and LHC charge asymmetries in W → lepton decay, presented as a function of the rapidity and transverse momentum of the charged lepton. They do not include the CDF and DØ bosonlevel charge asymmetries fitted by CJ, which directly probe the large-x PDF ratios, while they also involve additional recursive unfolding of the data that utilizes a PDF-dependent calculation to reconstruct the weak boson's rapidity.

Modifications in the fitting methodologies
For the study presented in this article, we modified some default settings of the CJ15 and CT18 fits, fully  Table 1: A comprehensive listing of experiments included within the CT and CJ frameworks for this study, grouped according to the experimental process they represent. For each experiment, we give the number of experimental data points, N pt , included into the CT and CJ fits presented in this study, with blank entries indicating that the given experiment is omitted from the respective global fit.
described respectively in Ref. [9] and [11], to place the two fitting frameworks on a common footing and isolate the impact of various assumed treatments of the deuteron structure.
1. We match perturbative orders between the two fits at NLO in α s . In practice, this means that in the CT fits, performed by default at NNLO, we instead compute the hard cross sections, perturbative PDF evolution, and running of α s at O(α s ) accuracy to agree with the default NLO settings used in CJ. 2. We perform supplementary fits by excluding some data sets that appear in one fit only. While both CJ and CT fits include Tevatron lepton charge asymmetry measurements presented as a function of the charged lepton's rapidity, the CJ fit also includes the fixed-target low W 2 and Q 2 DIS data from SLAC [74] and JLab [73], as well as the CDF [85] and DØ [86] W boson charge asymmetry with reconstructed weak boson kinematics. On the other hand, CT makes use of neutrino-initiated DIS data sets on heavy nuclear targets (both inclusive and semiinclusive DIS [SIDIS] di-muon production in ν-A scattering). In CT, data on heavy-nuclear targets are fitted at the isoscalar level after being corrected in the fit using a phenomenological parametrization of the F A (x, Q 2 )/F d (x, Q 2 ) ratio from Ref. [56]. To isolate the impact of these extra experiments, we performed CJ fits without the W asymmetry and SLAC DIS data sets, and CT fits without the inclusive ν-A DIS data. 3. As in the original CJ and CT publications, we estimate the final PDF uncertainties using the Hessian method [1], but in this paper we fix the tolerance to be T 2 = 10 for both global analyses, in between the nominal T 2 = 2.71 in the CJ15 fit and the T 2 = 37 value (at the 68% probability level) used in the CT18 fits. Furthermore, we do not include the additional "Tier-2" tolerance contribution [11,105] that is applied in the CT18 fits to prevent the error PDFs from running into strong disagreements with individual experiments, but content ourselves with the "Tier-1" tolerance as defined in [106].
Regarding the lattermost point, in the CJ analysis it is additionally necessary to implement a numerical prescription at the level of individual eigenvector directions of the diagonalized Hessian matrix to guarantee ∆χ 2 = T 2 to the needed accuracy and to ensure the validity of the analysis methods utilized in this article. These technical details are reviewed in the appendix.

The L 2 sensitivity technique
In the next two sections, we will investigate the impact of the DIS deuteron data on the PDFs with the help of the "L 2 sensitivity" [16]. The method is easy to use and has already been applied to clarify the sensitivity of the global data sets to the CT18 NNLO PDFs [11], LHC parton luminosities [107] (Sec. II.2), and effective exponents of the high-x PDF falloff [33]. Here we give a quick summary of the L 2 technique and refer the interested reader to Ref. [16] for additional details. The L 2 sensitivity provides a fast approximation to the information contained in the LM scans typified by Fig. 2. It does so by quantifying the extent to which variations in the fitted PDFs drive the shifts in the loglikelihood χ 2 E for each experiment E. For example, the L 2 plots discussed below supply the change in an experiment's value of χ 2 given a defined increase in the value of a PDF of interest, as a function of x for fixed Q. That being the case, if two experiments are in relative tension with respect to the behavior of a given PDF -say, one favors a larger PDF value in a given x range, the other favors a smaller value -an increase in the PDF would result in a negative change, ∆χ 2 E1 < 0, in the χ 2 for the first experiment, and a positive shift, ∆χ 2 E2 > 0, for the second experiment. As shall be seen below, these competing pulls are visualized by the L 2 method as opposing deviations from zero in the negative/positive directions. In this fashion, the L 2 sensitivity can encapsulate, on a flavor-by-flavor basis, the patterns of pulls exerted on the PDFs (often called "PDF pulls" below for short) by various experiments or groups of experiments as a function of x and Q. In fact, the method is not limited to the influence of data sets on the PDFs, but can be extended to any observable that can be calculated starting from those.
More formally, the L 2 sensitivity yields an approximation of the shift ∆χ 2 E in the value of the log-likelihood for experiment E caused by an upward 1σ variation of the chosen PDF or PDF-dependent theoretical prediction. We evaluate the L 2 sensitivity in the Hessian approximation as where cos ϕ(f, χ 2 E ) represents the cosine of the correlation angle between a PDF of flavor f (or, indeed, any PDF derived quantity) and the experimental χ 2 E , evaluated over the 2N Hessian eigenvector sets. In the Hessian formalism adopted both by the CT and CJ frameworks to estimate PDF uncertainties, this correlation Fig. 5: A comparison of the PDF pulls of the deuterium DIS data computed according to the L 2 method at 2 GeV for the CT fixed d.c. (left) and CJ fixed d.c. (right) fits; both cases are for the scenario with fixed deuteron corrections. Here and elsewhere, we place on the vertical axis a self-explanatory label of ∆χ 2 as we plot the L 2 sensitivity which approximates this quantity.
cosine can be computed as indicated in Ref. [15]: where and "±" denote the PDF variations along the positive and negative direction of the j-th Hessian eigenvector. When L 2 sensitivities are summed over all experiments, the resulting sum should be close to zero by construction, assuming deviations from a symmetric Gaussian probability distribution are negligible (see 6 for further discussion). As an example, Fig. 5 shows the combined L 2 sensitivities, S f,L2 (E), of the experiments in the DIS deuteron group (#4 in Table 1) to each parton flavor separately, calculated according to Eq. (5) for the CT and CJ fixed d.c. fits, where the deuteron corrections were fixed to the central value determined in the CJ15 analysis. These figures can be interpreted as the statistical pulls at fixed Q = 2 GeV from this group of experiments on each PDF flavor, f (x, Q), at the x values specified on the horizontal axis. One can observe quite large deviations from zero, with S f,L2 values nearly reaching ±10 units in some regions of x. This non-negligible pull by the deuteron DIS experiments is ultimately offset by contributions from other groups of experiments to obtain a zero result (within about one unit of χ 2 ) when summing over all of these. It is therefore interesting to investigate which experimental groups pull significantly against the DIS deuteron data sets, as we do below in Sec. 4.5 and 4.6. Here and elsewhere, we compute L 2 sensitivities at a default scale of Q = 2 GeV as this is close to the initial scale, 1.3 GeV, for DGLAP evolution, as well as to the Q values accessed in DIS experiments and typical scales used to present predictions for PDFs in nonperturbative QCD models and lattice QCD [16]. We typically do not observe pronounced differences in the L 2 sensitivity plots for distinct scales, Q 1,2 , provided |Q 2 −Q 1 | O(100 GeV). A set of companion plots generated at a higher scale, Q = 100 GeV, may be viewed at Ref.
[108]. Fig. 5 contains a substantial amount of information. For example, looking at the left panel for CT fixed d.c., the negative S f,L2 for the d quark at x ≈ 0.25 indicates that the deuteron DIS data prefer a higher d quark at x ≈ 0.25 than the nominal d-quark PDF in the full fit. Similarly, the positive S f,L2 for the u quark at the same x indicates a preference for a relatively lower u-quark PDF. In totality, the deuteron DIS data prefer a higher d/u at x = 0.1−0.4 than that obtained in the full CT fit. (This preference for an enhanced d/u in this x interval is further confirmed in Fig. 9 of Sec. 4.5.) From the right panel of Fig. 5, we also read that the deuteron DIS data in the CJ fixed d.c. fit prefer an enhanced d/u over a slightly higher interval x = 0.3−0.7. Regarding other flavors and x ranges, in the left panel of CT fixed d.c. we observe a significant preference of the deuteron DIS group for lower u-and d-quark PDFs at x = 0.01 − 0.1, in the region relevant for LHC Wboson production. One also notices a preference for a larger gluon PDF in the interval, x = 0.02−0.1, relevant for Higgs-boson production at the LHC, and, at slightly lower x, for a larger perturbative charm-quark PDF, which is radiatively generated from the gluon.  Finally, we remark that the Hessian sensitivity is most effective in identifying the top 5-10 experiments or groups of experiments sensitive to variations in the chosen PDF, as has been verified by comparing the rankings obtained from Hessian sensitivities and LM scans [15,16]. However, especially for subleading experiments, detailed rankings depend on the chosen definition of the sensitivity indicator and deviations from the simple linear approximation that we assumed when using symmetric finite derivatives in S f,L2 (E) as in Eqs. (6) and (7).

Comparison of deuteron data impact in the CJ15 and CT18 fits
In accordance with the preceding discussion in Sec. 2 and 3, our analysis will be based on a series of fits named according to the following convention: In each enumerated case, the CT and CJ fits are methodologically comparable to each other. The differences between the fits in the first two categories will highlight the nontrivial effects of deuteron corrections. We consider several variations of the fixed d.c. fits. Firstly, the comparison of the CJ sensitivity plots in categories 2 and 3 demonstrates that freeing the d.c. parameters in the CJ fit tangibly improves the agreement among all categories of experiments. Secondly, we try to make the CT and CJ fits comparable not only methodologically but also (partially) with respect to data selections. We do this in category 4 by removing the indicated data sets from each fit.

Overall agreement of theory and data
We first review the overall quality of these fits by referring to Table 2, which lists values for the total χ 2 per point (χ 2 /N pt ) according to the experimental groups listed in Table 1. The χ 2 values for SLAC DIS, Tevatron W -boson asymmetry, and ν-A DIS data sets are shown in Table 2 in separate lines. The χ 2 values in Table 2 indicate that the deuteron data agree globally with the published fits of both groups. Deuteron corrections are essential to the CJ analysis, which includes deuteron DIS data at the largest values of x from SLAC as well as the very sensitive DØ  data on the reconstructed W -charge asymmetry [9]. Even if the total χ 2 /N pt may seem only marginally improved, the highlighted data sets require deuteron corrections to reconcile the SLAC DIS deuteron data with the deuteron-independent DØ W -asymmetry data.
In CT, the inclusion of deuteron corrections also improves the description of the DIS deuteron data (especially the BCDMS F d 2 measurements), producing a 14-unit reduction in χ 2 for the N pt = 373 points fitted in Group #4, with an additional 6-unit reduction once the inclusive ν-A data are removed. Mainly due to the absence of the SLAC DIS data in CT, this is a smaller relative reduction than that observed for CJ in the left columns of Table 2.
It is also interesting to note far more substantial shifts in χ 2 within Table 2 among the other CT experimental groups: the introduction of the fixed deuteron correction in CT improves the χ 2 of the DY data (#7) by more than 100 units, while at the same time, increasing the χ 2 of the LHC weak-boson production (#6) by an opposing ∆χ 2 change of 80 units. The χ 2 for group 5, "WZ Tevatron", also increases by 16 units. The inclusive ν-A DIS data set (#8) is fitted well globally, with χ 2 /N pt = 0.80 (0.83) in the CT no d.c. (CT fixed d.c.) fit. In sum, the fixed deuteron correction improves the CT total χ 2 by 18 units for 3670 data points. Since the ν-A DIS data are well-described, removing them from the CT fit actually increases the total χ 2 /N pt , but also seems to release some tension with the WZ LHC data, whose χ 2 value decreases by 32 units. While these look like modest changes, overall, we will see next that they do influence some PDF fla-vors and the local compatibility among select groups of experiments.
It is instructive to evaluate the shifts in χ 2 discussed above in the context of corresponding variations associated with NNLO effects. For this purpose, we compare in Table 3 values of χ 2 obtained with CT for the default NLO fits explored in this paper, CT no d.c. and CT fixed d.c., against two NNLO fits. These latter fits are the NNLO counterpart of CT no d.c., which we call no d.c. NNLO, and an alternative NNLO fit adopting the modified DIS scale choice of the CT18X NNLO fit discussed in Ref. [11], which we denote no d.c. NNLO-X. These additional fits allow us to quantify the impact on χ 2 of the NNLO correction itself, as well as the effect of perturbative scale variations at NNLO, respectively. In addition, in the trailing columns of Table 3 we also show χ 2 differences of each of these fits with respect to our base CT no d.c. fit at NLO. For example, These comparisons illustrate that inclusion of the fixed deuteron correction at NLO may impact the theoretical description of select experimental groups at a level comparable to NNLO corrections and scale variations. This is especially evident for the DIS deuteron data, for which the 14-unit reduction in χ 2 discussed above for CT fixed d.c. can be compared to small 5 and 4-unit increases with no d.c. NNLO and no d.c. NNLO-X, respectively. The above-mentioned χ 2 shift for the Drell-Yan data with CT fixed d.c. similarly surpasses the corresponding shifts obtained with the NNLO fits. In contrast, the tt data, for instance, are largely insensi-tive to the deuteron correction at NLO, but are significantly better-fitted with NNLO corrections. Ultimately, as future generations of QCD fits pursue higher accuracy at NNLO and beyond, it may therefore also be appropriate to consider deuteron corrections.

Impact of deuteron corrections on the d/u PDF ratio
As discussed in Sec. 1.1, experimental information involving deuterium targets, especially measurements of the DIS structure function of the deuteron (Group #4, "DIS Deuterium," in Table 1), has been pivotal for separating the d-quark content of the proton from other parton flavors. The leading impact of deuteron corrections thus primarily influences the extracted d-PDF at high x, where the deuteron most prominently differs from a superposition of a free proton and a free neutron. In contrast, the effect of the deuteron corrections on the u-type PDFs is comparatively mild, as these are most directly constrained by measurements of the proton's structure function (Group #3, "DIS Proton").
To gauge the leading impact of deuteron corrections, in this subsection we now examine the x dependence of the d/u ratio within the CT/CJ frameworks, before proceeding in Sec. 4.3 to an examination of the indirect effects on the lower-x d val PDF relevant for sin 2 θ W and on the gluon PDF in Sec. 4.4 (the PDF pulls will be considered in Secs. 4.5 and 4.6). Fig. 6 illustrates the impact of the F d 2 corrections, as introduced in Sec. 2.1, on d/u. The upper row shows the d/u ratio and its error band obtained in the discussed series of fits, normalized to the central value obtained in the fits with no deuteron corrections. The lower row shows the unnormalized d/u ratios themselves, using a linear horizontal scale to better visualize the x > 0.1 interval, and in particular the x → 1 behavior of this quantity. In both rows, the left and right panels give results for CT and CJ fits, respectively. We see that the deuteron corrections have a qualitatively similar impact on d/u in both CT and CJ, especially at x 0.1, with evidence of a mild, few-percent enhancement of the fitted d/u ratio over the no d.c. fits for 0.1 x 0.5 once the fixed deuteron correction is included. This enhancement turns into a suppression at still higher x 0.5, beyond which d/u is strongly affected by the 2-body nucleon-nucleon corrections included in the F d 2 calculation. For CJ, this suppression is larger than in the CT case, but compatible with the latter within the respective uncertainties of each fit.
The qualitative x dependence of the deuteron-corrected fit of d/u in the top rows of Fig. 6 closely follows the F N 2 /F d 2 ratio plotted in Fig. 3, in which F N 2 and F d 2 represent the deuteron structure function computed using the isoscalar and full nuclear predictions, respectively. Indeed, in the no d.c. fits, F d 2 is effectively fitted as an isoscalar target, but in the CT fixed d.c., for example, the F d 2 data for the physical deuteron are corrected to F N 2 , which leads to a relative suppression of the fitted d/u PDF ratio for x 0.5.
In the CJ and CT fixed d.c. fits (red curves), the d.c. parameters are held constant at their values obtained from the central CJ15 fit. If, on the other hand, the d.c. parameters in Eq. (3) are actively fitted with the PDF shape parameters, we obtain the same central PDFs but a narrower uncertainty band on d/u, as shown by the CJ free d.c. error band in the right panels of Fig. 6. This reduction in the PDF uncertainty of d/u, which at first sight is paradoxical because we have increased the number of fit parameters, is actually a consequence of the correlation between the treatment of F d 2 structure function data and extracted d-PDF. More specifically, when the nucleon off-shellness parameters are freed, the variations in the d-quark parameters that were necessary to encompass the F d 2 data in the CJ fixed d.c. fit are partly absorbed by the parameters of Eq. (3). In other words, releasing the offshell parameters reduces tensions in parameter space, and ultimately also diminishes the overall experimentby-experiment χ 2 E variations in the PDF analysis, as we shall note again below in Sec. 4.5 and 4.6. We have verified that, over the same x range, the relative uncertainty in the determination of other flavors, such as, e.g., the gluon, does correspondingly increase. This is an instance of the fact that, typically, the constraining power of a fit can be enhanced by a greater number of free parameters only in a limited sector of parameter space.
The different choices of data sets in the CT and CJ fits also affect the fitted d/u PDF ratio. For example, unlike CJ, the CT fit includes DIS data from inclusive ν-A collisions, multiplied by a phenomenological parametrization of the heavy-nuclear structure function relative to the physical deuteron. The green bands in Fig. 6 (left) are for the CT no nu-A variant of the CT fit. The removal of these data augments the shifts in the CT d/u ratio induced by including the fixed deuteron correction, which now is comparable to the CJ result. The reason for this may simply lie in the lack of further deuteron-to-isoscalar proton plus neutron corrections, or may also be related to possible discrepancies between neutral-and charge-current DIS data or interactions [52,109,110].
Conversely, the CT fits do not include the low-W SLAC data and the reconstructed W -boson asymmetry Fig. 6: Upper row: The PDF ratios d/u and their asymmetric error bands for T 2 = 10 at scale Q = 2 GeV within the CT (left) and CJ (right) fitting frameworks. We normalize all d/u error bands to the ratio from the central no d.c. fit (without any assumed deuteron correction). The left panel shows the CT no d.c., CT fixed d.c., and CT no nu-A error bands. The right panel shows the analogous CJ no d.c., CJ fixed d.c., CJ no-W slac, and the CJ free d.c. fits. The abscissas are scaled to highlight the impact of the deuteron corrections at large x, where the impact is most pronounced, as well as the modest enhancement in the d/u ratio for x 0.01 in CT at left. Lower row: now showing the absolute d/u ratios on a linear x-axis scale to highlight the behavior at high x.
from the DØ experiment that are influential upon the large-x d-quark fit in CJ [9]. When these are removed from the CJ fit as well -see the green CJ no-W slac bands in the right column of Fig. 6 -we obtain an enlarged uncertainty that includes both the deuteron corrected and uncorrected bands. The uncertainty on d/u at x → 1 in the CJ no-W slac fit is wider than in the CT no nu-A fit in part in reflection of different parametrization forms and selection of experiments between the CJ and CT analyses.

Impact on the valence PDFs in the LHC EW precision region
The effects of including deuteron structure corrections to F d 2 , while most pronounced at x > 0.1, have some consequences in the low-x region as well. In the CT result shown in Fig. 6(left), modest enhancements in d/u at about x ∼ 10 −3 can be seen for the CT fixed d.c. fit. While the sea-quark PDFs are relatively unaffected in this kinematic region, the valence component of the d-quark and, to a lesser extent, the u-quark PDFs in this small-x region are sensitive to the inclusion and theoretical treatment of both neutrino-nucleus and deuterium data at large x. This sensitivity is mainly a consequence of corresponding valence sum rules. Fig. 7 illustrates this feature. In the left panel, the fixed d.c. CT best-fit valence PDF (red dotted line), shown in this case at Q = 81 GeV ≈ M W , prefers a slightly lower d val at x ≈ 0.008 − 0.13 than in the no d.c. fit, and a slightly higher one at x ≈ 0.13 − 0.45. This deviation becomes still more pronounced if the neutrino-nucleus DIS data are removed (leading to the green dot-dashed line). In the right panel, we observe for CJ a qualitatively similar trend and associated x dependence over slightly shifted x regions of 0.01 − 0.1 and 0.1 − 0.53.
In Sec. 1.1, we illustrated in Figs. 1 and 2 that the weak-mixing angle measurements at the LHC using the forward-backward asymmetry, A FB , are sensitive to the valence d-and u-PDFs in an x-region about ∼ 0.03, i.e., where Fig. 7 indicates a dependence of these PDFs upon the treatment of the deuteron/heavy-nucleus data at higher-x values.

Impact of deuteron corrections on the gluon PDF
The deuteron data sets can also impact the gluon density through Q 2 -scaling violations; this is particularly true when these measurements cover a large range of the four-momentum squared, Q 2 , of the exchanged boson. Similarly to the case for d val in Sec. 1.1, the La-grange Multiplier scans [11] and PDF sensitivity techniques [15,16] in the CT18 analysis collectively demonstrate that the gluon at x > 0.1 receives significant constraints from the DIS deuterium data. Constraints from the extensive fixed-target DIS data are in fact competitive with the HERA DIS data, which probe the lowerx region, and also with LHC and Tevatron inclusive jet production, which cover a wide x range but involve complex arrays of systematic effects which remain under active study.
To address this point, in Fig. 8 we plot the error bands for the gluon PDF at Q = 2 GeV as determined in the series of fits discussed at the beginning of this Section, again normalized to the central value obtained in the no d.c. fits. As seen in the left panel of Fig. 8, the gluon PDF in the CT fits exhibits a modest sensitivity to the chosen deuteron correction treatment, with a dependence that is somewhat moderated by the adopted W 2 > 12.25 GeV 2 cut. Still, as with d/u, there is a qualitative tendency for the fixed deuteron correction to reduce the high-x gluon PDF, with this modification being enhanced by the exclusion of the inclusive ν-A DIS data. Like CT, the CJ fits seen in Fig. 8 (right), which include the SLAC DIS data, similarly display a relative suppression of the gluon for x 0.3 once deuteron corrections are taken into account. While this effect is of moderate size, it is nonetheless statistically significant in the context of the T 2 = 10 tolerance used to determine the uncertainty bands. For CJ, it will be interesting to confirm this effect by fitting the full JLab 6 GeV inclusive data set [111], and, even more so, the JLab 12 GeV data which will augment the precision of the available DIS measurements over a wide Q 2 range at large x, once available. 4.5 Valence-sector PDF pulls: the d/u ratio In Fig. 9, we plot the L 2 sensitivities of the groups of experiments to the d/u PDF ratio for varied implementations of the deuteron corrections. L 2 sensitivities for fits exploring data-set variations are shown in Fig. 10, which we also discuss below.
Starting with the no d.c. fits that either do not include (CT) or remove (CJ) deuteron corrections, we notice that, in both cases, the landscape of PDF pulls tends to be dominated by a few competing groups of experiments, which differ between the two fitting frameworks. For CT in the left panel, these are the LHC W/Z production (Group #6) and inclusive nuclear DIS (Group #8), which possess the sharpest opposing pulls at x ∼ 0.2, for example, in the direction of either favoring or disfavoring a larger d/u ratio, respectively. At slightly higher x > 0.3 values, which are of particular interest from the perspective of QCD-informed models of the d/u behavior as x → 1, these are joined by the DIS-deuterium (#4) and Drell-Yan (#7) groups of experiments. Turning to the CJ case, displayed in the top-right panel, it is the DIS deuterium (#4) and gamma-jet (#1) groups that dominate the landscape of PDF pulls -and quite strikingly at large x -with lesser but also significant pulls from the Tevatron W/Z production data (#5). The large-x pulls are expected, since the SLAC data are quite sensitive to nuclear dynamics in the deuteron target, as already noted.
Once fixed deuteron-structure corrections are introduced into the respective fixed d.c. fits, the relative patterns of PDF pulls experience an intriguing series of shifts, which we display in the middle row of Fig. 9. For CJ, the deuteron corrections largely resolve the huge tensions between the photon+jet and DIS deuteron data, because dynamical nuclear effects in the latter are now included in the theoretical calculation of the deuteron DIS cross section without forcing the dquark to deform to compensate for the missing nuclear effect. A residual tension between the DIS deuteron (#4) and W/Z Tevatron data (#5) data is still visible at x ≈ 0.5. While the large-x tensions are reduced, the small-x pulls visibly change in shape for several flavors.
For CT, the introduction of the fixed deuteron correction detailed in Sec. 2.1 instead preserves the qualitative x dependence of the L 2 pulls (i.e., the shapes) but softens their magnitude in a few cases, for example, for the Drell-Yan (#7) and the small-x LHC W/Z production (#6) data. Notably, the DIS-deuterium sensitivity shifts to closely resemble that of the LHC W/Z data (#6) in favoring a larger d/u for x ∼ 0.2. At the same time, the size of the competing pulls at high x 0.3 between the DIS-deuterium and inclusive ν-A data (#8) is enhanced, while the opposing pulls of other experiments are modestly reduced over the same range in x. This is especially clear for the CT Drell-Yan data (#7), which in the CT no d.c. fit had preferred a softer d/u ratio at low x < 0.01 and an enlarged value of d/u over 0.01 x 0.2. In CT fixed d.c., these preferences mostly vanish. At the same time, outside this interval of very high x, the opposing pulls of the inclusive ν-A data (Group #8) and both the DIS deuterium (#4) and LHC W/Z (#6) experiments increase and sharpen for x 0.01. In fact, this is the same collection of experiments for which in Table 2 we observed increases (in the case of Groups #6 and #8) in their respective values of χ 2 /N pt upon introducing deuteron corrections for F d 2 . Both the χ 2 values in Table 2 and the L 2 analysis for CT therefore indicate a noticeable rearrangement of the pulls of the inclusive neutrino-nucleus DIS data and select other experiments introduced by the correction to the deuteron DIS data. This rearrangement, being presently of similar order with respect to Fig. 8: Same as Fig. 7, but now for the gluon PDF. other contributing effects, will require attention in the future.
In the last row of Fig. 9 we present the L 2 pulls of the CJ free d.c. fit in which the deuteron off-shellness degrees-of-freedom in Eq. (3) are released. Comparing the vertical extents of the peaks with those of the CJ fixed d.c. fit in the middle row, we see that freeing the offshell parameters moderates the PDF pulls over the whole x range, especially those at x > 0.3 between the DIS deuteron (#4) and the WZ Tevatron CJ Drell-Yan (#5) data. This behavior can be generically understood as a consequence of increasing the number of free parameters, but is not guaranteed, for example, in the presence of incompatible data sets. The CJ free d.c. plot is thus an indication of a good level of consistency between the considered data sets, when the PDFs and deuteron corrections are fitted together.
The results discussed so far suggest a nontrivial relationship between the treatment of the DIS deuterium data and the description of other data sets in each fitting framework: the impact of deuteron-structure corrections in a global fit like CT and CJ cannot generally be expected to apply to deuterium data alone, but have secondary effects on the patterns of pulls of other data sets.
It is therefore interesting to study variations in the choices of experimental data sets in both fits, in particular, removing from each analysis those data sets that showed especially strong sensitivity to deuteron correc-tions or otherwise played a major role in the foregoing discussion.
For CT, we remove the entire collection of inclusive ν-A measurements (Group #8), and refit with the fixed deuteron corrections of Sec. 2.1 in place; the resulting L 2 sensitivity plot is displayed in the left panel of Fig. 10. Overall, the magnitude of the PDF pulls is slightly reduced, with the biggest change occurring for the DIS Deuterium Group (#4), which is now more closely aligned with the pulls exerted by the DIS proton data (#3) throughout the plotted domain in x. When considered in parallel with Fig. 9, and in the light of the previous discussion of Fig. 9, the left panel of Fig. 10 suggests a connection between the pulls of the DIS deuteron and ν-A data in fits with and without deuteron corrections. For both groups of experiments, the interplay between the theoretical description of deuterium and heavy-nuclear data is relevant. To that end, investigating the systematic treatment of nuclear effects for light and heavy nuclei is a critical subject for future global analyses that aim to use such data for constraining the nucleon PDFs to higher accuracy.
A similar consideration arises for CJ. As we have discussed, the combination of W -boson charge asymmetries and SLAC DIS data is strongly constraining on the d/u ratio at large x, and the d.c. treatment influences also the PDF pulls at smaller x values (as seen in the right panels of Fig. 9). We therefore remove these

WZ Tevatron
Drell-Yan   Drell-Yan Fig. 9: The L 2 sensitivities computed according to Eq. (5) for Q = 2 GeV, giving the pulls on the d/u PDF ratio of the process-dependent data sets fitted by CT (left) and CJ (right). Upper, middle, lower rows: results for the no d.c., fixed d.c., and free d.c. fits discussed at the beginning of Sec. 4.  data sets from the fit to obtain the CJ no-W slac fit shown in the right panel of Fig. 10. In this instance, the removal of the combined W and SLAC DIS data relieves tensions seen in CJ fixed d.c., for x < 0.1. However, the large-x tension between DIS deuteron and WZ Tevatron data (that now only include the W → decay lepton asymmetries) remains largely intact and can in fact also be seen in the CT panel on the left of Fig. 10. It remains to be seen whether this is of experimental origin, or due to an as yet incomplete treatment of nuclear corrections in the deuteron target.

PDF pulls in the gluon and light-quark sea sectors
At first sight, it might seem reasonable to suppose that deuteron-structure corrections, being most sizable at high x and more immediately connected to extractions of the d-quark, would be relatively inconsequential for determinations of the gluon PDFs. In actuality, constraining the gluon PDF through DIS data requires an adequate prediction of the scale dependence of both proton and deuteron DIS data sets, with the latter simultaneously sensitive to the (Q 2 dependent) deuteron corrections discussed in Sec. 2.1. Moreover, the momentum sum rule requires that the changes in the total momentum fraction from the large-x and small-x quark and gluon PDFs compensate one another. The practical implementation of the deuteron correction can therefore impact g(x, Q) over a still broader range beyond high x.
We therefore turn to an examination of the pulls on the gluon PDF in fits with and without deuteron corrections, presented in Fig. 11, before examining CT and CJ fits with the modified data sets in Fig. 12. Comparing the CJ no d.c. fit in the upper-right panel of Fig. 11 to the CJ fixed d.c. fit in the middle-right panel, one sees that adding a fixed deuteron correction clearly aligns the pulls of the DIS proton group (#3) and DIS deuteron group (#4) on the gluon. The x dependence of these pulls was effectively uncorrelated without the deuteron correction. In the presence of the fixed correction, however, they are aligned and pronounced over the whole x range and are opposed mostly by the strong pull of the W Z Tevatron data (Group #5). Furthermore, after the off-shell parameters in the CJ deuteron correction are freed (lowermost panel), the tension between Groups #3, #4, and #5 is relieved, resulting in a more consistent data set, with weak pulls ( 3 units) everywhere. It is also interesting to note that a similar effect arose in the d/u sector discussed in the previous subsection, and therefore seems to be a robust feature of fitting the deuteron corrections simultaneously with the PDF parameters.
In the two CT fits in the upper left and middle left panels of Fig. 11, a somewhat different pattern emerges. Inclusion of the deuteron correction in CT fixed d.c. does have the effect of partially aligning the pulls of     Fig. 9, for the PDF pulls on g(x, Q) at Q = 2 GeV.   the DIS Proton (Group #3) and DIS deuteron (Group #4) on the gluon, but this effect is restricted to a narrower interval, x ∈ [0.2, 0.5]. The pulls from the other groups are relatively unaffected by the deuteron correction, although we see some realignment of the pulls from LHC W Z and fixed-target Drell-Yan production experiments (Groups #6 and #7). The weaker dependence of the gluon pulls in the CT fit on the deuteron correction compared to the CJ case is most likely due to the absence of the SLAC DIS data from the former fit. One can therefore investigate the effect of the removal of the SLAC data on the CJ fit. Intriguingly, as shown in the right panel of Fig. 12, simultaneously removing the SLAC DIS data and the Tevatron W -boson asymmetry data largely alleviates the competing gluon pulls, which are now smaller than those observed in the CT fit, especially from the LHC sets not included in CJ.
Clearly, the gluon pulls in the CT fit are due to the data other than the large-x SLAC and W production data. In particular, in the absence of the large-x SLAC and W production data in the CT fit, we notice a strong preference for a harder gluon at x ≈ 0.3 from the ν-A DIS experiments (Group #8) both with and without the nuclear correction. In fact, the preference of the CDHSW ν-A DIS deuteron data for a harder gluon at large x had already been identified in the CT18 analysis [11], although the net effect of including this data set in the CT18 fit does not exceed the PDF uncertainty. However, as the left panel of Fig. 12 shows, removing these data from the fit does not substantially alter the pulls of the remaining experiments shown in the CT plots of Fig. 11, which are led by the jet and W/Z measurements from the LHC.

Conclusion
In this analysis, we have for the first time undertaken a comparative analysis of two global fitting frameworks, CTEQ-JLab (CJ) and CTEQ-TEA (CT), using the L 2 sensitivity statistical metric developed in Refs. [15,16]. This metric allowed our study to take advantage of the complementary strengths of the two frameworks: the extended experimental coverage and various theoretical developments implemented in the two approaches, as well as the flexible PDF parametrizations available in CT and the unique capabilities of CJ in describing low-energy and nuclear dynamics. In doing so, we made a number of technical adjustments to each framework (discussed in detail in the appendix) in order to reconcile the CT and CJ treatment of PDF uncertainties and thereby render them sufficiently similar to be meaningfully juxtaposed.
We have, in particular, concentrated on evaluating the impact on PDF determinations of nuclear corrections which take into account the two-baryon structure of the deuteron. In fact, as discussed in Sec. 1.1, DIS and Drell-Yan measurements on deuterium are very informative in providing flavor separation of d-type quarks from other parton flavors (under an assumption of nucleon charge symmetry). At the same time, the introduction of deuterium data into proton PDF fits brings along its own uncertainties associated with nuclear and power-suppressed effects. Global analyses take diverse approaches in handling the deuteron and heavy-nuclear effects, from selection of the least affected experimental data [11,12,13], to including some fixed [11] or free [9,13] nuclear corrections and performing Bayesian marginalization [37,57] with respect to the nuclear parameters. It is therefore important to understand the role of the deuteron corrections in a controlled setting, by isolating them from other factors that affect the existing PDF ensembles at comparable levels. 1 By examining the fitted PDFs and resulting PDF pulls of experimental data under several theoretical scenarios for the treatment of deuteron corrections, in Sec. 4 we have gathered a substantial number of results that clarify these questions. We reiterate here our overriding conclusions based on this investigation: of relevance to precision studies in the electroweak sector. -In both fitting frameworks, the modifications caused by the deuteron-structure corrections are moderated by the inclusion of some non-deuteron data sets; for CT, these are inclusive neutrino DIS data on heavy nuclear targets, while for CJ, a combination of high-x SLAC DIS data and reconstructed bosonlevel Tevatron charge asymmetry requires special attention. Disentangling the interplay among these fitted experiments will require a further study at NNLO accuracy, including additional investigation of the implementation of theoretical corrections for nuclear data sets (including both deuteron and heavier targets) and the treatment of W/Z data.
As the drive to realize next-generation accuracy in PDF analyses gains speed with preparations for the High-Luminosity LHC [112], Electron-Ion Collider [113], and Long-Baseline Neutrino Facility [114], we recommend consideration of deuteron corrections and broader nuclear effects in PDF analyses, as well as continued phenomenological and model-based studies [60,115,116,117] of deuteron structure in parallel. Deuteron effects will become particularly unavoidable with increasing PDF precision and in PDF-benchmarking studies, most obviously for the d-PDF at x 0.2 and beyond, but, ultimately, for consistency of the extracted gluon density and over a widening range of x. Consideration of the parton-level violation of the charge symmetry in the deuteron [118] may become relevant as precision goals advance still further. As emphasized in Sec. 1.1 and 4.3, the achievement of ultimate precision in tests of the SM in the electroweak sector will partly depend upon the successful treatment of the issues described in this work.

Appendix: Comparison procedure and technical details
To meaningfully compare two distinct global fits on a common footing as done in this article, it has been necessary to conciliate their methodologies, cf. Sec. 3.2. Part of this consisted in making a few technical adjustments to ensure the mutual compatibility of the CJ and CT computations of the L 2 sensitivity introduced in Sec. 3.3 and employed in Sec. 4. In this Appendix, we provide more detail about these adjustments.
The L 2 sensitivity can be interpreted as a fast approximation, based on the Hessian error formalism [1], of the ∆χ 2 E shifts contained in the LM scans shown in the panels of Fig. 2. For that reason, we expect the sum of L 2 sensitivities over all fitted experiments E to vanish for each parton flavor f ; i.e., This desired result -that the "total" L tot 2 sensitivities approximately sum to zero for all flavors within a given fit -represents an ideal scenario in which all data sets demonstrate full mutual compatibility, and uncertainties are small enough to validate a quadratic approximation for the χ 2 function around the central PDF parameters, a 0 . In practice, however, neither condition was perfectly realized when generating the Hessian eigenvector sets, causing L tot 2 to deviate from zero for both the CJ15 and CT18 eigenvector ensembles. We illustrate the graphs of L tot 2 obtained with the standard eigenvector computations, for tolerance T 2 = 10, in the left panels of Fig. 13. In the upper figure, the total sensitivities in the standard CJ free d.c. show very pronounced deviations from zero. In the lower figure, we see milder but not entirely negligible deviations from zero in the counterpart default CT no d.c. fit.
We have identified the sources of this behavior and remedied this situation as follows.
-CJ: The standard CJ PDF error sets are obtained by the Hessian analysis around the best-fit PDF parameters, a 0 , with each eigenvector scaled by a given tolerance factor T to nominally produce an increase of T 2 above the minimum in the χ 2 function (T = 1.646 in the CJ15 analysis [9], T = √ 10 in this paper). For the computation of the sensitivities, the χ 2 function is instead scanned along every eigenvector starting from the best-fit parameters, a 0 , until parameters a i are found in the plusand minus-directions such that ∆χ 2 ( a 2i+1 ) = ∆χ 2 ( a 2i ) = T 2 ∀ i = 1, . . . , N par .
This way, the error PDFs correspond exactly to a given likelihood L ∝ e −T 2 /2 , and, by construction, L tot 2 = 0 except for numerical uncertainties, see Eq. (6). The total sensitivities after this adjustment of the eigenvector excursions are shown in the upper right panel of Fig. 13. All L tot 2 values are now well within ±1 unit from zero.
-CT: The published general-purpose CT fits [11, 17, 119] apply a two-tier procedure to determine the published error bands. Excursions along each eigenvector are constrained both by the Tier-1 penalty imposed by the increase of the global χ 2 above T 2 units, and by the dynamical Tier-2 penalty based on effective Gaussian variables that ensure that no single-experiment χ 2 E value increases above its bestfit value by more than its uncertainty at the requested confidence level. The Tier-2 procedure may lead to unequal excursions in the plus-and minusdirections along a given eigenvector, therefore introducing a departure from the condition for L tot 2 = 0. For this reason, we revert to the Tier-1 PDF error determination, which nominally satisfies the desired condition on L tot 2 . The resulting total sensitivities are compared to the standard CT18 calculation in the lower row of Fig. 13. 2 All total sensitivities with the Tier-1 error sets are within ±1 unit from zero, with a somewhat enhanced deviation observed for the gluon PDF and for the charm PDF that follows it.
The very bad CJ L tot 2 obtained before the adjustment in the upper left panel of Fig. 13 was traced primarily to a substantial deviation from Gaussianity observed along a small number of eigenvector directions. This can happen in a fit where certain linear combinations of parameters are poorly constrained by the data, especially when exploiting an extended parametrization such as in the CJ case, where additional parameters are included to allow for a non-vanishing d/u quark at large x, to describe off-shell deformation of bound nucleons in a deuteron target, and to fit higher-twist corrections to the standard twist-2 calculation of electron-nucleon DIS. In fact, we have verified that the very large values of the total sensitivity are primarily driven by the combined HERA data set, which accounts for most of the calculated sensitivities across all parton flavors, with a secondary large contribution provided by the DØ Wasymmetry measurements.
In the CT analysis, both the Tier-1+2 and Tier-1 total sensitivities in the lower row of Fig. 13 are of a natural order, being generally ≈ 0, especially in the Tier-1 calculation at lower-right. This is aside from the high-x L tot 2 values for the g-and c-PDFs, which are somewhat larger than in the CJ15 adjusted analysis, but still 1 in this case. The Tier-1 calculation is less affected by non-Gaussianities and produces a smaller total sensitivity than the default Tier-1+2 analysis. As with the CJ analysis, it is interesting to identify the key experiments that affect the differences between the CT Tier-1+2 and Tier-1 total sensitivities. Since these . The top left panel shows L tot 2 for the CJ free d.c. PDF ensemble with the standard CJ15 computation of the Hessian eigenvectors. The top right panel illustrates the same result after adjusting each CJ free d.c. eigenvector to have ∆χ 2 = T 2 = 10 in both the positive and negative direction, thereby correcting the departures from Gaussianity observed for poorly constrained eigenvectors in the standard CJ Hessian analysis. In the bottom row, the left panel shows L tot 2 for the CT no d.c. PDF ensemble computed with the CT18 two-tier uncertainty constraints with T 2 = 10. On the right, the same but including only the Tier-1 uncertainty constraint that do not introduce a priori deviations from the Gaussianity condition.
differences are comparatively small, we use a different procedure than in the CJ case, where it was sufficient to analyze the size of the experiment-by-experiment sensitivities before parameter adjustment. We notice that the departure from L tot 2 ≈ 0 is determined by the χ 2 imbalance, between the PDF error sets in each positive and negative eigenvector direction, see Eq. (6). We can therefore quantify the impact of a given experiment on the observed differences between the total sensitivities for the Tier-1+2 and Tier-1 CT analysis by calculating Note that we could also have considered other metrics, such as the quadratic distance between the respective χ 2 imbalances. However, our goal here is to identify the experiments with the largest impact on the reduction of the total sensitivity, rather than performing a detailed quantitative analysis and ranking of each experiment.
We collect the largest values of the δτ E metric in Table 4, where experiments have been ordered from the highest to lowest impact. Intriguingly, the most affected experiment in the CT case is also the HERA I+II combined data set. We believe that the underlying reason(s) for this commonality with the CJ analysis are the high precision and large kinematic coverage of the HERA data, which are therefore sensitive to small variations in the PDFs and subject to secondary constraints, such as those imposed on the gluon distribution by PDF sum rules.  Table 4: Impact of a given experiment on the difference between the Tier-1+2 and Tier-1 CT total sensitivities, quantified by the δτ E distance defined in Eq. (12). The first 3 experiments are well separated from the remaining ones, which are more closely spaced.