Probing the strange content of the proton with charm production in charged current at LHeC

We study charm production in charged-current deep-inelastic scattering (DIS) using the xFitter framework. Recent results from the LHC have focused renewed attention on the determination of the strange-quark parton distribution function (PDF) and the DIS charm process provides important complementary constraints on this quantity. We examine the current PDF uncertainty, and use LHeC pseudodata to estimate the potential improvement from this proposed facility. As xFitter implements both fixed-flavor- and variable-flavor-number schemes, we can compare the impact of these different theoretical choices; this highlights some interesting aspects of multi-scale calculations. We find that the high-statistics LHeC data covering a wide kinematic range could substantially reduce the strange PDF uncertainty.


Introduction
Deep-inelastic-scattering (DIS) experiments have traditionally provided important tests of perturbative QCD (pQCD) and are essential to precisely determine the parton distribution functions (PDFs) of the nucleon. In addition to the numerous dedicated fixed-target DIS experiments that have been performed so far, the HERA accelerator used colliding beams of leptons (electrons and positrons) and protons to investigate the nucleon structure. The broad kinematic coverage of the HERA charged-current (CC) and neutral-current (NC) DIS data in terms of the negative virtuality Q 2 of the exchanged vector boson and the Bjorken variable x Bj is such that these data have significant impact on the determinations of the PDFs [1][2][3][4][5].
In the Standard Model (SM), the charm quark plays an important role in the investigation of the nucleon structure [6][7][8][9]. In the NC case, the photon-gluon fusion process for charm production was calculated at O(α 2 s ) with the full heavy-quark mass dependence included in the DIS hard cross sections [10,11]. The heavy-quark mass effects in the CC process have been calculated to O(α s ) in Refs. [12][13][14][15][16], and the recent work of Ref. [17] provides results up to O(α 2 s ). The large-Q 2 contributions of heavy flavors to the xF 3 structure function had already been computed in Ref. [18]. In many of the posited models which extend the SM, the coupling to "new physics" is proportional to the particle mass; hence, the heavy quarks will have an enhanced coupling and provide an optimal testing ground for these searches.
Heavy quarks also play a critical role in helping us fully characterize the SM, and the charm quark is especially useful in this respect as it can provide us direct access to the strange-sea quark distribution. The strange sea has been extensively investigated in a number of experiments including the associated production of a W boson with a charm-jet final state, which (at LO) arises from strange-gluon initial states [19][20][21][22][23][24][25][26]. Additionally, charm production in neutrino/antineutrino-nucleon DIS has been studied by a number of experiments including: CCFR [27], NuTeV [28], CHORUS [29], CDHSW [30] and NOMAD [31]. With a sign-selected beam (ν/ν̄), these experiments can separately probe the strange s(x) and anti-strange s̄(x) distributions. While the neutrino DIS experiments provide detailed information on the shape of the strange distribution, the normalization is a challenge, as that is tied to the beam flux. Separately, the HERMES collaboration used charged-lepton DIS production of charged kaons to provide a complementary extraction of s(x) + s̄(x) at LO [32]. Recently, charm production in CC DIS was measured for the first time in e ± p collisions by ZEUS [33].
Additionally, charm production mediated by electroweak gauge bosons at hadron colliders provides important information on the strange- and charm-quark distributions, and is complementary to the DIS final-state charm-quark experiments [34]. The Tevatron measured the charm-quark cross section in association with a W boson at CDF [19,35,36] and D0 [20], but these results were limited by low statistics.
In lieu of significant experimental constraints, many global QCD analyses tie the strange distribution to the light-sea quarks via the relation s = s̄ = r_s d̄. While in principle r_s depends on both x Bj and Q 2 , it is often set to a fixed value [37,38].
Using inclusive leptonic decays of W and Z bosons, the ATLAS experiment has obtained a value of r s = 1.19 ± 0.16 at x = 0.023 and Q 2 0 = 1.9 GeV 2 [39]. Additionally, using the cross section ratio for W ± + c final states they also find a comparably large value for r s [23]. In contrast, CMS results generally prefer lower r s values [22,24]. However, a recent analysis using both ATLAS and CMS data suggests that the LHC data support unsuppressed strangeness in the proton. While the result is dominated by ATLAS, this is not in contradiction with the CMS data [22,23,39,40].
Looking to the future, it is clearly important to reduce the uncertainty of the strange-quark PDF as we strive to make increasingly precise tests of the SM and search for what might lie beyond. The proposed Large Hadron Electron Collider (LHeC) program has the ability to provide high statistics measurements of electrons on both protons and nuclei across a broad kinematic range to address many of these outstanding questions.
In this investigation, we make use of the XFITTER tools [41] (version 2.0.0) to study the present constraints on the strange-quark PDFs, and then use LHeC pseudodata [42] to infer how these might improve. Furthermore, as XFITTER implements both fixed-flavor- and variable-flavor-number schemes, we can examine the impact of these different theoretical choices. This paper is organized as follows. In Sect. 2 we outline the theoretical details of the different heavy-flavor schemes. In Sect. 3 we compare the theoretical predictions of the different schemes across the kinematic range, and examine the individual partonic contributions. In Sect. 4 we study the impact of the LHeC pseudodata on the PDFs using a profiling technique. In Sect. 5 we provide some discussion and summarize the results. Finally, in Appendix A we discuss some of the more subtle theoretical issues that we encounter at higher orders.

Theoretical predictions for CC charm production at the LHeC
The proposed Large Hadron Electron Collider (LHeC) [42] would collide a newly built electron beam with the LHC hadron beam at a center-of-mass energy of √s = √(4 E e E p ); thus the 7 TeV proton beam colliding with a 60 GeV electron beam provides √s ∼ 1.3 TeV. Compared to HERA, the LHeC extends the covered kinematic range by an order of magnitude in both x Bj and Q 2 with a nominal design luminosity of 10 33 cm −2 s −1 .
Theoretical predictions are calculated for electroweak charged-current (CC) charm production in ep collisions at the LHeC at centre-of-mass energy √ s = 1.3 TeV, using a variety of heavy-flavor schemes. The predictions are provided for unpolarized beams in the kinematic range 100 < Q 2 < 100000 GeV 2 , 0.0001 < x Bj < 0.25. They are calculated as reduced cross sections at different Q 2 , x Bj and inelasticity (y) points. The covered y range is 0.0024 < y < 0.76.
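As a quick numerical check of the kinematics quoted above, the following minimal sketch evaluates √s for the stated beam energies and the inelasticity y from the standard massless DIS relation Q² = s·x·y. The function names and the example point (x Bj = 0.01, Q² = 1000 GeV²) are illustrative choices and are not taken from the analysis.

```python
import math


def sqrt_s(e_lepton_gev, e_proton_gev):
    """Centre-of-mass energy for head-on beams, neglecting the beam-particle masses."""
    return math.sqrt(4.0 * e_lepton_gev * e_proton_gev)


def inelasticity(x_bj, q2_gev2, s_gev2):
    """Inelasticity y from Q^2 = s * x * y (masses neglected)."""
    return q2_gev2 / (s_gev2 * x_bj)


# Beam energies quoted in the text: 60 GeV electrons on 7 TeV protons.
root_s = sqrt_s(60.0, 7000.0)
s = root_s ** 2

# Illustrative kinematic point (not one of the pseudodata bins).
y_example = inelasticity(0.01, 1000.0, s)

print(f"sqrt(s) = {root_s:.0f} GeV")
print(f"y(x=0.01, Q2=1000 GeV^2) = {y_example:.4f}")
```

For the quoted beam energies this gives √s close to 1.3 TeV, in line with the value used in the text.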
Experimentally, however, not charm quarks but charmed hadrons (or rather their decay products) are registered in the detectors. Therefore, extrapolation to the inclusive charm-production cross section has to be carried out in a model-dependent way. Furthermore, CC production of charm quarks in the final state can happen via both electroweak and QCD processes. The former leads to an odd number of charm quarks in the final state with the W boson having the same electric charge as the sum of the electric charges of final-state charm quarks, while the latter creates an even number of charm quarks with total electric charge equal to zero. If the electric charge of the tagged charm quark can be accessed experimentally (e.g. when reconstructing D mesons), the QCD contribution can be subtracted by taking the difference of the yields in the events with odd and even numbers of charm quarks, otherwise the QCD contribution can be estimated only in a model-dependent way.
The CC charm process directly depends on the CKM matrix [50]. Here, the CKM matrix elements V cd and V cs are particularly relevant and we use the values V cd = 0.2252 and V cs = 0.9734. Three different heavy-flavor schemes are employed, all including a full treatment of charm-mass effects up to NLO, i.e. O(α s ); in the following we describe them in detail for the particular application to CC electron-proton reactions.
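To make the role of the CKM elements concrete: at leading order the down- and strange-initiated channels enter weighted by |V cd |² and |V cs |², respectively. The short sketch below simply squares the values quoted above; the variable names are illustrative.

```python
# CKM matrix elements quoted in the text.
V_CD = 0.2252
V_CS = 0.9734

# Leading-order weights of the d- and s-initiated charm-production channels.
weight_d = V_CD ** 2
weight_s = V_CS ** 2

# Relative CKM enhancement of the strange channel over the down channel.
ratio_s_over_d = weight_s / weight_d

print(f"|V_cd|^2 = {weight_d:.4f}")
print(f"|V_cs|^2 = {weight_s:.4f}")
print(f"|V_cs|^2 / |V_cd|^2 = {ratio_s_over_d:.1f}")
```

The strange channel is CKM-favoured by more than an order of magnitude, which is why CC charm production is sensitive to the strange PDF even though the down-quark density is larger.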

The heavy-flavor schemes
The standard "A" variant of the fixed-flavor number scheme (FFNS), which we identify as FFNS A, uses three light flavors in both PDFs and α s evolution for all scales, while heavy flavors (here, charm) are produced exclusively in the matrix-element part of the calculation. This scheme has been used for the PDF determinations and cross section predictions of the ABM(P) group [4,43-45], as well as in the FF3A variant of the HERAPDF analysis [2], and implemented in XFITTER through the OPENQCDRAD package [46].
Next, the "B" variant of the FFNS (FFNS B), known as the "mixed" or "hybrid" scheme [6] is also used. In this scheme, the number of active flavors is still fixed to three in the PDFs, relying exclusively on O(α s ) fully massive matrix elements for charm production, while the number of flavors is allowed to vary in the virtual corrections of the α s evolution. Corrections to the α s evolution involving heavy-flavor loops are thus included and resummed to all orders, while no resummation is applied to other higher order corrections. This procedure will catch a fraction of the "large logs" which might spoil the fixed-flavor scheme convergence at very high scales, and is possible since the masses of the charm and beauty quarks provide natural cutoffs for infrared and collinear divergences. This scheme was used in the HERAPDF FF3B variant [2] and in applications of the HVQDIS program [6]. In general, the transition from the FFNS A to the FFNS B requires a readjustment of the treatment of matrix elements involving heavy-flavor loops. In the specific case of CC production, no such loops occur up to NLO (at NNLO they do), so that the same matrix elements can be used for both schemes; thus the only difference is in the α s evolution.
Finally, for the variable-flavor-number scheme (VFNS) we use the "B" variant of the fixed-order-next-to-leading-log scheme (FONLL-B) [47] which combines the NLO O(α s ) massive matrix elements of the FFNS with the O(α s ) massless results of the zero-mass variable-flavor-number scheme (ZM-VFNS), allowing the number of active flavors to vary with scale, and all-order next-to-leading log resummation of (massless) terms beyond NLO. It thus explicitly includes charm and beauty both in the PDFs and in the evolution of the strong coupling constant. Whenever terms would be double-counted in the merging of the two schemes, the massless terms are eliminated in favour of the massive ones. The FONLL scheme is commonly used by the NNPDF group [5] and implemented in XFITTER through the APFEL package [48].
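In summary, the schemes used are:
• FFNS A: an NLO FFNS with n f = 3 at all scales, used with the ABMP16 [45] or HERAPDF2.0 FF3A [2] NLO PDF sets.
• FFNS B: an NLO FFNS with n f = 3 for the PDFs and variable n f for α s , used with the HERAPDF2.0 FF3B [2] NLO PDF set.
• FONLL-B: an NLO VFNS with a variable number of active flavors in both the PDFs and α s , used with the NNPDF3.1 [5] NLO PDF set.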
The PDF sets are available via the LHAPDF interface (version 6.1.5) [49].

The reduced cross section
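The reduced CC charm-production cross sections can be expressed as a linear combination of structure functions:
$$\sigma^{\pm}_{\mathrm{charm,CC}} = \frac{1}{2}\left(Y_{+} F_2^{\pm} \mp Y_{-}\, xF_3^{\pm} - y^2 F_L^{\pm}\right), \tag{1}$$
with
$$Y_{\pm} = 1 \pm (1-y)^2. \tag{2}$$
In the quark-parton model, when we neglect the gluons, the structure functions become:
$$F_2^{+} = xD + x\bar{U}, \quad F_2^{-} = xU + x\bar{D}, \quad F_L^{\pm} = 0, \quad xF_3^{+} = xD - x\bar{U}, \quad xF_3^{-} = xU - x\bar{D}. \tag{3}$$
The terms $xU$, $xD$, $x\bar{U}$ and $x\bar{D}$ denote the sum of parton distributions for up-type and down-type quarks and antiquarks, respectively.¹ The ± superscript on σ and F corresponds to the sign of W ± . Below the b-quark mass threshold, these sums are related to the quark distributions as follows:
$$xU = xu + xc, \quad x\bar{U} = x\bar{u} + x\bar{c}, \quad xD = xd + xs, \quad x\bar{D} = x\bar{d} + x\bar{s}. \tag{4}$$
In the FFNS the charm-quark densities are zero. In the phase-space corners y → 0 and y → 1 and using the same quark-parton model approximation, we have the following asymptotic relations:
$$y \to 0:\ \ \sigma^{+} \to xD + x\bar{U}, \quad \sigma^{-} \to xU + x\bar{D}; \qquad y \to 1:\ \ \sigma^{+} \to x\bar{U}, \quad \sigma^{-} \to xU. \tag{5}$$
Thus the contribution from the strange-quark PDF, which enters through the down-type sums, is suppressed at high y.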
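The high-y suppression of the down-type (and hence strange) contribution can be illustrated numerically. The sketch below assumes the standard form of the reduced CC cross section, σ± = ½(Y₊ F₂± ∓ Y₋ xF₃± − y² F_L±) with Y± = 1 ± (1 − y)², together with the quark-parton-model expressions F₂⁺ = xD + xŪ, xF₃⁺ = xD − xŪ and F_L = 0. The momentum densities passed in are hypothetical toy numbers, not values from any PDF set.

```python
def y_plus(y):
    """Helicity factor Y+ = 1 + (1 - y)^2."""
    return 1.0 + (1.0 - y) ** 2


def y_minus(y):
    """Helicity factor Y- = 1 - (1 - y)^2."""
    return 1.0 - (1.0 - y) ** 2


def sigma_reduced(y, f2, xf3, fl, w_charge):
    """Reduced CC cross section; the xF3 term enters with -(+) sign for W+ (W-)."""
    xf3_sign = -1.0 if w_charge > 0 else 1.0
    return 0.5 * (y_plus(y) * f2 + xf3_sign * y_minus(y) * xf3 - y * y * fl)


def sigma_plus_qpm(y, x_d_type, x_ubar_type):
    """Quark-parton-model sigma+ built from the sums xD and xUbar (F_L = 0)."""
    f2 = x_d_type + x_ubar_type
    xf3 = x_d_type - x_ubar_type
    return sigma_reduced(y, f2, xf3, 0.0, +1)


# Hypothetical toy momentum densities at some fixed (x, Q^2).
X_D = 0.30      # down-type sum, contains the strange quark
X_UBAR = 0.05   # up-type antiquark sum, contains the anti-charm quark

for y in (0.0, 0.5, 1.0):
    print(f"y = {y:.1f}: sigma+ = {sigma_plus_qpm(y, X_D, X_UBAR):.4f}")
```

Algebraically the combination reduces to σ⁺ = xŪ + (1 − y)² xD, so the down-type term is switched off as y → 1 while the up-type antiquark term survives.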

XFITTER implementation
All calculations are interfaced in XFITTER and available with $\overline{\mathrm{MS}}$ heavy-quark masses. The reference value of the $\overline{\mathrm{MS}}$ charm mass is set to m c (m c ) = 1.27 GeV [50], and α s is set to the value used for the corresponding PDF extraction: α s (M Z ) = 0.1191 for ABMP16 and α s (M Z ) = 0.118 for NNPDF3.1. The renormalization and factorization scales are chosen to be µ 2 r = µ 2 f = Q 2 . To estimate theoretical scale uncertainties, µ r and µ f are simultaneously varied up and down by a factor of two. In the case of the FONLL-B calculations, the independent µ r and µ f variations are also checked. Furthermore, the PDF uncertainties are propagated to the calculated theoretical predictions, while the uncertainties arising from varying the charm mass m c (m c ) = 1.27 ± 0.03 GeV by one standard deviation are smaller than 1% and therefore neglected. In the FONLL-B scheme, as a cross check, the calculation was performed with the pole charm mass m pole c = 1.51 GeV which is consistent with the conditions of the NNPDF3.1 extraction [5]. The obtained theoretical predictions differ from the ones calculated with m c (m c ) = 1.27 GeV by less than 1%. The total theoretical uncertainties are obtained by adding in quadrature scale and PDF uncertainties.
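The uncertainty combination described above can be sketched in a few lines. The quadrature sum follows the text; taking the scale uncertainty as the largest deviation among the varied-scale predictions is an assumption made here for illustration, and all numbers are hypothetical.

```python
import math


def scale_envelope(central, varied):
    """Largest absolute deviation of the varied-scale predictions from the central one.

    Symmetrizing via the maximum is an assumption of this sketch.
    """
    return max(abs(v - central) for v in varied)


def total_uncertainty(scale_unc, pdf_unc):
    """Quadrature sum of scale and PDF uncertainties."""
    return math.hypot(scale_unc, pdf_unc)


# Hypothetical reduced cross section at mu = Q and at mu = 2Q, Q/2.
central = 1.000
varied = [1.030, 0.975]

scale_unc = scale_envelope(central, varied)
pdf_unc = 0.040  # hypothetical PDF uncertainty on the same prediction
total = total_uncertainty(scale_unc, pdf_unc)

print(f"scale: {scale_unc:.3f}, PDF: {pdf_unc:.3f}, total: {total:.3f}")
```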

Comparison of theoretical predictions
We now provide some numerical comparisons of the heavy-flavor schemes using their separate input conditions and associated PDF sets. Caution is necessary in these comparisons as the PDF sets are extracted with different input assumptions, data sets, and tolerance criteria; this is, in part, why we shall separately display the µ r , µ f and PDF uncertainties in the following.

Comparison of theoretical predictions in the FFNS A and FONLL-B schemes
Figs. 1, 2 and 3 show theoretical predictions for the FFNS A and FONLL-B schemes calculated as described in the previous sections with their total uncertainties. The FFNS A and FONLL-B results agree reasonably well within uncertainties in the bulk of the phase space. However, in phase-space corners such as Q 2 ≳ 10000 GeV 2 or small y the predictions in the two schemes differ by more than 50%, exceeding the theoretical uncertainties.
To examine these differences further, in Fig. 4 we separately compute PDF and scale uncertainties (setting µ r = µ f = µ) of the charm CC cross section as a function of Q 2 for different values of x Bj calculated in the FFNS A and FONLL-B schemes.
Comparing the two schemes, the larger variation of the FONLL-B scheme reflects the larger PDF uncertainty of the underlying PDF sets used: ABMP16 for FFNS A and NNPDF3.1 for FONLL-B. This difference is most evident in Fig. 4 which specifically separates out the PDF uncertainty, and reflects the independent inputs and assumptions used in the different PDF extractions.
Examining the results of Fig. 4, we also observe some other interesting features. For both of the calculations, the PDF uncertainties are relatively stable across the Q 2 range for fixed x Bj , but tend to increase at larger x Bj values. As is well known, in pQCD calculations the effect of scale variations is indicative of the convergence of the series. We observe that the scale uncertainties for the FONLL-B scheme uniformly decrease with increasing Q 2 . For the FFNS A scheme, the scale uncertainties decrease for small x Bj values but increase with Q 2 at intermediate values of x Bj . Additional details are shown in Fig. 5 where we separately vary µ r and µ f for the FONLL-B scheme. Here we note that the uncertainty associated with µ r is very small and the total scale uncertainty is dominated by the variations of µ f which is tied to the PDFs, f i (x, µ f ). For the FFNS A in XFITTER, it is not possible to separately vary µ r and µ f in the current implementation, so the separate uncertainties can only be inferred by comparison to the FONLL-B case.

Additional comparisons
Figure 5: The impact of separate scale variations on charm CC predictions for the LHeC as a function of Q 2 for different values of x Bj calculated in the FFNS A and FONLL-B schemes.
To further explore whether the differences between the two sets of theoretical predictions are due to the different treatment of heavy quarks or to the different PDF sets, theoretical calculations in FFNS A and FONLL-B are repeated with the HERAPDF2.0 PDF sets extracted from the HERA DIS data [2]. Predictions in the FFNS B scheme are also produced using the HERAPDF2.0 FF3B PDF set and the FFNS B matrix elements, which are equivalent to the FFNS A matrix elements at NLO for CC charm production. The results are displayed in Fig. 6. The differences between FFNS A and FONLL-B are similar to those displayed in Figs. 1-3 and demonstrate that these differences arise from the different treatment of the heavy quarks in the two schemes. The FFNS B predictions lie between the FFNS A and FONLL-B predictions, indicating that a large part of the difference is due to the different treatment of heavy quarks in the running of α s at high x Bj or low y.
Furthermore, to investigate the impact of the NNLO corrections available at Q ≫ m c for the FFNS calculation, approximate NNLO predictions are obtained using the ABMP16 NNLO PDF set [4]. The results for the cross section as a function of Q 2 for different values of x Bj are shown in Fig
To better understand the differences between the FFNS and VFNS calculations, Fig. 6 is particularly instructive. We see that at low Q 2 the FFNS (FFNS A and FFNS B) and VFNS (FONLL-B) results agree within uncertainties (as demonstrated in Fig. 2). When the scale µ is below the charm-threshold scale µ c (typically taken to be equal to m c (m c )) the charm PDFs vanish and the FFNS and VFNS reduce to the same result.² For increasing scales, the VFNS resums the α s ln(µ 2 /µ 2 c ) contributions via the DGLAP evolution equations, and the FFNS and VFNS results will slowly diverge logarithmically. This behavior is observed in Fig. 6 and is consistent with the characteristics demonstrated in Ref. [52].
More precisely, Ref. [52] used a matched set of n f = 3 and n f = 5 PDFs to study the impact of the scheme choice at large scales. They found that the resummed contributions in the VFNS yielded a larger cross section than the FFNS (the specific magnitude was x-dependent), and that for Q 2 scales more than a few times the quark mass, the differences due to scheme choice exceeded the differences due to (estimated) higher-order contributions. Thus, we have identified the source of the scheme differences at large Q 2 .
The source of the scheme differences at large x Bj is a bit more subtle. The VFNS includes a resummation of higher-order logarithms of the form α s ln(µ 2 /µ 2 c ). In Fig. 18 of the Appendix we display the separate contributions of the VFNS for a choice of {x Bj , Q 2 }; the difference between the LO and SUB curves is indicative of the additional contribution of the resummed logarithms. This contribution depends on the particular x Bj value and we find (cf. Fig. 11 of Ref. [52]) that these terms increase for larger x Bj values. Figure 6 indicates that a large fraction of these seems to be caught by the FFNS B scheme. Thus, we have identified the source of the scheme differences at large x Bj .
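To give a feeling for the size of the logarithms that the VFNS resums, the sketch below evaluates α s (Q²) ln(Q²/m c ²) at the two ends of the Q² range considered here. It is only a rough illustration: the coupling is run at one loop with a fixed n f = 5, which is an assumption of this sketch and not the evolution used in any of the schemes discussed; M Z = 91.1876 GeV is the usual reference value, and α s (M Z ) = 0.118 and m c = 1.27 GeV are the values quoted earlier in the text.

```python
import math

M_Z = 91.1876  # GeV, reference scale for alpha_s


def alpha_s_one_loop(q2, alpha_mz=0.118, nf=5):
    """One-loop running coupling with a fixed number of flavors (crude assumption)."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    return alpha_mz / (1.0 + b0 * alpha_mz * math.log(q2 / M_Z ** 2))


def collinear_log_term(q2, m_c=1.27):
    """Size of alpha_s * ln(Q^2 / m_c^2), the combination resummed by DGLAP evolution."""
    return alpha_s_one_loop(q2) * math.log(q2 / m_c ** 2)


low_q2_term = collinear_log_term(100.0)
high_q2_term = collinear_log_term(100000.0)

for q2, term in ((100.0, low_q2_term), (100000.0, high_q2_term)):
    print(f"Q2 = {q2:>8.0f} GeV^2: alpha_s * ln(Q2/mc2) = {term:.2f}")
```

In this approximation the combination is already of order one at Q² = 100 GeV² and keeps growing with Q², which is the regime in which a fixed-order treatment and a resummed one can be expected to drift apart.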

Contributions from different partonic subprocesses
The fundamental difference between the FFNS and the VFNS is the treatment of the heavy partons, the charm in particular. In the FFNS the charm is not included in the PDFs as an active parton, so charm quarks only arise from gluon splitting, g → c c̄. In contrast, the VFNS does include the charm as an active partonic flavor, and thus allows for charm-initiated subprocesses. To better appreciate these differences, we will study the individual partonic contributions to the cross section as functions of the kinematic variables x Bj , Q 2 , and y.
Figs. 8, 9 and 10 show the contributions from separate partonic subprocesses to the CC charm production cross section in the FFNS A and FONLL-B schemes as a function of: x Bj for different values of Q 2 , Q 2 for different values of x Bj , and y for different values of Q 2 , respectively.
In these figures we observe that the gluon contribution to the FFNS is strikingly similar to the charm contribution to the VFNS. This is explained by the fact that in the FFNS the charm is present only in the final state and produced predominantly in the hard process γg → c c̄. In contrast, in the VFNS the charm is present also in the initial state and mainly produced by g → c c̄ collinear splitting through DGLAP evolution. The fundamental underlying process is (and has to be) the same in both the FFNS and VFNS, but the factorization boundary between PDFs and hard-scattering cross section, σ̂ ⊗ f, (determined by the scale µ and the scheme choice) is different.³
These figures highlight another interesting feature of the QCD theory; we observe that for the VFNS the gluon contribution (green curves) can become negative in particular kinematic regions. This is because in the VFNS we combine the gluon-boson fusion process (the NLO terms of Figs. 16 and 17) with the counter-term (the SUB terms), and this combination can be negative. This behavior underscores the fact that the renormalization scale µ is simply "shuffling" contributions among the separate sub-pieces, but the total physical cross section remains positive and stable, cf. Fig. 18 and Ref. [53]. This is a triumph of the QCD theory.
Next, turning our attention to the strange PDF contribution, it is notable that the FFNS and VFNS behave qualitatively very similarly as functions of Q 2 , x Bj , and y. In particular, we observe that the strange fraction increases with x Bj and decreases with Q 2 and y. At high y the strange PDF contribution drops to zero in favor of the gluon or charm-quark PDFs (see Fig. 10 and Eq. (5)). Similar phenomena (although less pronounced) are observed at low x Bj and/or high Q 2 . In these phase-space regions, the dominant contributions to the cross section are proportional to the gluon PDF in the FFNS or to the charm-quark PDFs in the VFNS.

PDF constraints from charm CC pseudodata
Now we turn to examine how the LHeC can reduce the PDF uncertainties and thus improve our predictive power.  The impact of charm CC cross section measurements at the LHeC on the PDFs is quantitatively estimated using the profiling technique [54]. This technique is based on minimizing the χ 2 between data and theoretical predictions taking into account both experimental and theoretical uncertainties arising from PDF variations. Two NLO PDF sets were chosen for this study: ABMP16 [45] and NNPDF3.1 [5]. All PDF sets are provided with uncertainties in the format of eigenvectors. In the presence of strong constraints (the LHeC data is very precise), it is preferable to use the eigenvector representation as only a few MC replicas would survive the Bayesian reweighting.

The CC charm pseudodata
For this study, pseudodata for the charm CC production cross section differential in Q 2 and x Bj and corresponding to an
In the χ 2 function minimized in the profiling, D and T are the column vectors of the measured (data) and predicted (theory) values, respectively. The correlated theoretical PDF uncertainties are included using the nuisance parameters b β ,th with their influence on the theory predictions described by Γ β ,th , where the index β runs over all PDF eigenvectors. For each nuisance parameter a penalty term is added to the χ 2 , representing the prior knowledge of the parameter. No theoretical uncertainties except the PDF uncertainties are considered. The full covariance matrix Cov representing the statistical and systematic uncertainties of the data is used in the fit. The statistical and systematic uncertainties are treated as additive, i.e. they do not change in the fit. The systematic uncertainties are assumed uncorrelated between bins.
The values of the nuisance parameters at the minimum, b min β ,th , are interpreted as optimized, or profiled, PDFs, while uncertainties of b min β ,th determined using the tolerance criterion of ∆ χ 2 = 1 correspond to the new PDF uncertainties. The profiling approach assumes that the new data are compatible with the theoretical predictions using the existing PDFs, such that no modification of the PDF fitting procedure is needed. Under this assumption, the central values of the measured cross sections are set to the central values of the theoretical predictions.
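The logic of the profiling can be illustrated with a deliberately simplified sketch: a single PDF eigenvector (one nuisance parameter b) and uncorrelated data uncertainties, so that χ²(b) = Σ_i (D_i − T_i − Γ_i b)²/σ_i² + b². In this case the minimum and the Δχ² = 1 uncertainty have closed forms. This is a toy version under these stated assumptions, not the XFITTER implementation; all names and numbers below are hypothetical.

```python
import math


def profile_single_eigenvector(data, theory, gamma, sigma):
    """Profile one PDF nuisance parameter b against data with diagonal covariance.

    chi2(b) = sum_i ((D_i - T_i - Gamma_i * b) / sigma_i)^2 + b^2
    Returns (b_min, b_unc), where b_unc follows from Delta chi2 = 1 and the
    prior uncertainty of b is 1 by construction.
    """
    curvature = 1.0 + sum((g / s) ** 2 for g, s in zip(gamma, sigma))
    pull = sum(g * (d - t) / s ** 2
               for d, t, g, s in zip(data, theory, gamma, sigma))
    b_min = pull / curvature
    b_unc = 1.0 / math.sqrt(curvature)
    return b_min, b_unc


# Hypothetical two-bin example: pseudodata central values are set to the theory
# predictions, as in the profiling setup described in the text.
theory = [1.0, 2.0]
data = list(theory)
gamma = [0.10, 0.20]   # shift of each prediction for a one-sigma PDF variation
sigma = [0.05, 0.10]   # total (uncorrelated) pseudodata uncertainties

b_min, b_unc = profile_single_eigenvector(data, theory, gamma, sigma)
print(f"b_min = {b_min:.3f}, profiled uncertainty = {b_unc:.3f} (prior: 1.000)")
```

Because the pseudodata coincide with the predictions, the central value is not shifted (b_min = 0), while the uncertainty along this eigenvector shrinks from 1 to about one third of its prior size; the more precise the data relative to the PDF-induced shift, the stronger the reduction.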

The profiled PDFs
The profiling study is performed using two sets of LHeC charm CC pseudodata: the full set, and a restricted set with data points for which the difference between the FFNS A and FONLL-B predictions is smaller than the present PDF uncertainties. The latter is taken for simplicity as the sum of the ABMP16 and NNPDF3.1 uncertainties, but for most data points it is dominated by the NNPDF3.1 uncertainties (see Fig. 4).
Given the sizable differences observed between the FFNS A and FONLL-B predictions, the study with the restricted data set (also referred to as 'with cuts') aims to check whether or not model-independent constraints on the strange PDF can be extracted using the charm CC reaction at LHeC. The two sets of data points are shown in Fig. 11 as functions of Q 2 and x Bj .
Figure 11: The full (∆ scheme < ∆ PDF , ∆ scheme > ∆ PDF ) and restricted (∆ scheme < ∆ PDF ) sets of data points which are used for PDF profiling.
The original and profiled ABMP16 and NNPDF3.1 PDF uncertainties are shown in Figs. 12-15. The uncertainties of the PDFs are presented at the scales µ 2 f = 100 GeV 2 and µ 2 f = 100000 GeV 2 . A strong impact of the charm CC pseudodata on the PDFs is observed for both PDF sets. In particular, the uncertainties of the strange PDF are strongly reduced once the pseudodata are included in the fit. Also the gluon PDF uncertainties are decreased. Furthermore, in the case of the NNPDF3.1 set, the charm PDF uncertainties are reduced significantly. For all PDF sets, only small differences can be noticed between the PDF constraints obtained using the full or restricted set because the whole x Bj range is covered in both cases (see Fig. 11) despite the fact that the number of data points in the restricted set is roughly half of the total number of data points.
Additionally, in the case of the NNPDF3.1 set, it is possible to check the constraints on the strange quark and anti-quark distributions separately, because no assumption s = s̄ is used in NNPDF3.1. The LHeC e − p pseudodata provide direct constraints only on s̄. Nevertheless, due to the apparently strong correlation between s and s̄ in the NNPDF3.1 fit, quite strong constraints are present on both the s and s̄ distributions once the direct constraints on s̄ are provided by the LHeC pseudodata. However, only mild constraints are put on the ratio s/s̄. This indicates that for a precise determination of s/s̄ both e − p and e + p data will be needed.
Comparing the results of profiled PDFs in the FFNS and the VFNS, we find both analyses are able to significantly improve the constraints on the strange quark PDF. This result gives us confidence that the general features we observe here are independent of the details of the heavy flavor scheme.

Discussion and summary
The recent performance of the LHC has exceeded expectations and produced an unprecedented number of precision measurements to be analyzed; thus, it is essential to improve the theoretical calculations to match. The uncertainty for many of these precision measurements stems primarily from the PDFs. Hence, our ability to measure fundamental parameters of the Standard Model (SM), such as the W boson mass and sin 2 θ W , ultimately comes down to how accurately we determine the underlying PDFs [56]. Additionally, our ability to characterize and constrain SM processes can indirectly impact beyond-standard-model (BSM) signatures.
We have focused on the strange-quark distribution which, at the LHC, can have a significant impact on the W /Z cross section: one of the "standard candle" measurements. If we can reduce the uncertainty for these predictions, we can set stringent limits on any admixture of physics at higher scales. Unfortunately, at present the strange PDF has a comparably large uncertainty because measurements from the LHC and HERA, as well as older fixed-target
This situation has prompted us to examine the CC DIS charm production at the LHeC to determine the impact of this data set on the PDF uncertainty. We considered the LHeC as this high-energy ep/A facility could potentially run in parallel with the LHC and provide insights into these issues at low x and high Q 2 in advance of an FCC program.
This case study of the CC DIS charm production at the LHeC provides a practical illustration of the many features of XFITTER. As the XFITTER framework is designed to be a versatile open-source software framework for the determination of PDFs and the analysis of QCD physics, we can readily adapt this tool to address the impact and influence of new data sets. Furthermore, as both FFNS and VFNS calculations are implemented, we can use XFITTER as a theoretical "laboratory" to study the resummation of large logarithms and multi-scale issues. We have outlined some of these issues in the Appendix. In particular, because the CC DIS charm production involves a flavor-changing W ± boson, multiple quark masses enter the calculation; this introduces some subtle theoretical issues in properly addressing the disparate mass and energy scales.
Using the XFITTER framework, we find that the LHeC can provide strong constraints on the strange-quark PDF, especially in the previously unexplored small-x Bj region.⁴ A large reduction of uncertainties is observed also when restricting the input data to the kinematic range where the differences between the FFNS A and FONLL-B schemes are not larger than the present PDF uncertainties, indicating that the obtained PDF constraints are stable and independent of the particular heavy-flavor scheme. As noted above, a reduction of the strange-PDF uncertainties influences the W /Z production, and thus the Higgs production; hence, the LHeC CC DIS charm production data represent a valuable addition for the future global PDF fits.
However, since charm CC production in e − p collisions mostly probes s̄, only mild constraints are put on the ratio s/s̄ using the NNPDF3.1 PDF set as reference; therefore for a precise determination of this ratio, both e − p and e + p data will be needed.
In conclusion, we find that CC DIS charm production at the LHeC can provide strong constraints on the strange PDF which are complementary to the current data sets. As the PDF uncertainty is the dominant factor for many precision analyses, a reduction of these uncertainties will allow for more accurate predictions which can be used to constrain both SM and BSM physics processes.
Appendix A: F c 2 beyond leading order
The multi-scale problem:
The CC DIS charm production process involves some interesting issues that we will explore here in detail. In particular, there are multiple mass and energy scales which span a wide kinematic range, and it becomes an intricate puzzle to treat them all properly.
For this current illustration, we will focus on the contribution to the DIS F c 2 structure function from the process involving the strange and charm quark; other quark combinations can be addressed in a similar manner. The fully inclusive F 2 can be studied using the energy and angle of the outgoing lepton; in contrast, F c 2 also requires information about the final hadronic state, and this introduces some subtleties. In particular, we will show that as we go to higher orders the F c 2 structure function must be defined carefully so that: i) theoretically it is free of divergences and independent of the renormalization scales when calculated to all orders, and ii) experimentally it matches what is measured by the detector.

The mass scales:
What makes this process complex is that we encounter a number of different mass scales. Furthermore, there is no fixed hierarchy for the mass scales, and we will need to compute both in the low-Q region, where Q ≲ m c , as well as in the high-Q region, Q ≫ m c .
The Q scale is related to the invariant mass of the virtual-boson probe (W + in this case), and can be expressed in terms of the energy and angle of the lepton; this is a physically measurable kinematic variable.
In contrast, the scale µ is an unphysical scale which implements the separation between the PDF and the hard-scattering cross section, and the scale at which α s is evaluated; thus, the physics should be insensitive to a variation of µ. As our calculations typically involve the dimensionless combination ln(µ/Q), we generally choose µ ∼ Q to avoid large logarithms.
The strange quark is a "light" active parton with an associated PDF s(x) and mass m s < Λ QCD . The strange-quark mass is comparable to or less than other hadronic scales which are neglected; as such, it serves only as a regulator and plays no physical role. Effectively, we can take m s → 0 if we choose. We treat the up- and down-quark masses m u,d in a similar manner.
The charm quark is a "heavy" object; its associated mass m c > Λ QCD does play a physical role and cannot generally be neglected. There may or may not be a PDF associated with the charm. In an n f = 3 FFNS scheme, we will assume the charm PDF to be zero. 5 In a VFNS there is a charm PDF only when the µ scale is above the scale where the charm PDF is activated; we call this the matching scale, µ c . It is common 6 to set µ c = m c , but this is not required. 7 In this study, however, we will adopt this common choice.
5 It is possible to extend this to incorporate an intrinsic-charm PDF.
Because there are two different quark masses involved (m s and m c ) in the CC DIS process, we can examine the mass singularities of the t-channel and u-channel separately. This separation is particularly useful to understand how the individual mass singularities are addressed, and how the FFNS and the VFNS organize the contributions to the total structure function.
The n f = 3 FFNS: To be specific, we will consider CC DIS production of a charm quark. We first compute this in the n f = 3 FFNS where {u, d, s} are light "active" partons in the proton, and the charm c is considered an external "heavy" particle. This can be implemented in the ACOT scheme [53], for example, by using a CWZ renormalization [61] where the light "active" partons are renormalized with normal $\overline{\mathrm{MS}}$, and the "heavy" quarks use a zero-momentum subtraction. In this scheme, the leading-order (LO) process is $sW^+ \to c$ as illustrated in Fig. 16. At next-to-leading-order (NLO), we then include $gW^+ \to c\bar{s}$ which has both t-channel (Fig. 16) and u-channel (Fig. 17) contributions. 8

t-channel:
The t-channel process has an intermediate s-quark exchanged, and if we use the strange-quark mass m s to regulate the singularities, this will yield a contribution proportional to ln(Q/m s ). This mass singularity arises from the region of phase space where the exchanged s-quark becomes collinear and close to the mass shell; that is, when the phase space of the $gW^+ \to c\bar{s}$ process begins to overlap with that of the $sW^+ \to c$ process. This "double counting" is resolved by a subtraction (SUB) counter-term given by:
$$\mathrm{SUB} = f_g \otimes f_{g\to s} \otimes \sigma_{sW^+\to c} .$$
Here, f g→s is the perturbative splitting of the gluon into an $s\bar{s}$ pair; the leading term is proportional to: 9
$$f_{g\to s}(x,\mu) \sim \frac{\alpha_s(\mu)}{2\pi}\, P^{(1)}_{g\to s}(x)\, \ln\!\left(\frac{\mu^2}{m_s^2}\right),$$
where P (1) g→s (x) is the O(α s ) DGLAP splitting kernel for g → s.
6 [59] and references therein.
7 By displacing the matching scale to larger values µ c > m c , one can have the advantage of avoiding delicate cancellations in the region µ ∼ m c ; this flexibility was explored in Refs. [51,60].
8 Note, there are also corresponding quark-initiated processes; we will focus on the gluon-initiated processes as this is sufficient to illustrate our points. Both the gluon- and quark-initiated contributions are included in our calculations.
9 The scale of the SUB term is µ as the relevant scale here is the renormalization scale of the PDF: $f(x,\mu) \otimes \hat{\sigma}(x, Q, \mu)$.
The complete contribution to the structure function is given by:
$$\mathrm{TOT} = \mathrm{LO} + (\mathrm{NLO} - \mathrm{SUB}) .$$
The complete O(α s ) contribution is the combination (NLO − SUB); our separation into NLO and SUB is simply to illustrate the interplay of these components. Both the NLO and SUB terms have ln(m s ) divergences, but these precisely cancel and yield a well-defined result even if we take the m s → 0 limit. 10

u-channel:
We next examine the u-channel NLO contribution to the $gW^+ \to c\bar{s}$ process. This has an intermediate c-quark exchanged and is proportional to ln(Q/m c ). In the FFNS where the charm is a "heavy" non-parton, there is no counter-term for this graph, and the resulting observables will retain the ln(Q/m c ) dependence. In principle, this means that when we go to large Q scales, these terms will begin to degrade the convergence of the perturbative series. In practice, while this degradation only grows logarithmically, at large scales (such as at the LHC energies) we do find it convenient to treat the charm on an equal footing with the u, d, s partons.
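The contrast between the two channels can be illustrated with a purely schematic sketch that keeps only the logarithms and drops all coefficients, splitting kernels and convolutions. The function names and the numerical values below are hypothetical and chosen only for illustration; this is not the calculation performed in the paper.

```python
import math

def log_nlo(Q, m):
    """Collinear logarithm of the NLO term, ln(Q^2/m^2), with the coefficient stripped."""
    return math.log(Q**2 / m**2)

def log_sub(mu, m):
    """Logarithm of the SUB counter-term, ln(mu^2/m^2), with the coefficient stripped."""
    return math.log(mu**2 / m**2)

def t_channel_residual(Q, mu, m_s):
    """Schematic (NLO - SUB) in the t-channel: the ln(m_s) pieces cancel,
    leaving ln(Q^2/mu^2) however small the regulator mass m_s is."""
    return log_nlo(Q, m_s) - log_sub(mu, m_s)

def u_channel_ffns(Q, m_c):
    """Schematic u-channel term in the FFNS: there is no counter-term,
    so ln(Q^2/m_c^2) survives and grows with Q."""
    return log_nlo(Q, m_c)
```

In this toy picture the t-channel residual is the same for m s = 0.1 GeV and m s = 10^-6 GeV, while the u-channel FFNS logarithm keeps growing as Q increases at fixed m c .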
The VFNS: We now turn to the VFNS scheme where we include the charm quark as an "active" parton and compute its associated PDF. In this case, there is a u-channel counter-term (SUB) given by $f_g \otimes f_{g\to c} \otimes \sigma_{\bar{c}W^+\to\bar{s}}$ which is proportional to ln(µ/m c ). The NLO u-channel contribution will have a ln(Q/m c ) factor, so the combination (NLO − SUB) is also free of mass singularities. 11 What is less obvious is that we must also include the LO process $\bar{c}W^+ \to \bar{s}$. There are two ways we can understand why this is necessary.
Explanation #1: matching of LO and SUB: Recall that in the t-channel case, the subtraction term SUB removed the double counting between the LO $sW^+ \to c$ and NLO $gW^+ \to c\bar{s}$ subprocesses.
The u-channel case is analogous in that this subtraction term removes the double counting between the LO $\bar{c}W^+ \to \bar{s}$ and NLO $gW^+ \to c\bar{s}$ subprocesses; both contributions are required to ensure that the resulting cross section is insensitive to the scale µ. This is apparent in Fig. 18 where we plot the individual terms versus µ for fixed values of x Bj and Q. In the region µ ∼ m c , the charm PDF f c (x, µ) (and hence, the LO contribution) rises very quickly as the DGLAP evolution is driven by the very large gluon distribution via $g \to c\bar{c}$ splitting, combined with a large α s (µ). The SUB subtraction also rises quickly as this is driven by the logarithmic term ln(µ 2 /m 2 c ). The difference (LO − SUB) is the physical contribution to the total [TOT = LO + NLO − SUB], and it is this combination that is smooth across the "turn on" of the charm PDF at the matching scale µ c = m c . We now see that if we neglect the LO ($\bar{c}W^+ \to \bar{s}$) contribution, we lose the cancellation between LO and SUB in the region µ ∼ m c , and our structure function (or cross section) would have an anomalous shift at the arbitrary location (µ c ) where we turn on the charm PDF.
11 Specifically, the combination (NLO − SUB) is free of mass singularities and finite in the limit m c → 0. Note that the VFNS fully retains the charm quark mass m c and (in contrast to some claims in the literature) the factorization holds up to O(Λ 2 /Q 2 ) corrections; all terms of order (m 2 c /Q 2 ) are fully included [62].
As we vary the unphysical scale µ, we are simply shifting contributions between the separate {LO, NLO, SUB} terms which individually exhibit a large µ-dependence. However, the total combination (TOT), which represents the physical observable, is relatively insensitive to µ (up to higher orders), and this property is evident in Fig. 18.
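The rapid rise of the SUB term just above µ = m c can be sketched numerically. The example below evaluates only the leading-log "perturbative charm" density f g ⊗ f g→c (the hard cross section is omitted), assuming the standard leading-order kernel P(z) = T_R [z^2 + (1 − z)^2], a fixed α s , and a toy gluon shape. The gluon parameterization, the values m c = 1.3 GeV and α s = 0.3, and all function names are illustrative assumptions, not inputs taken from the paper or from xFitter.

```python
import math

T_R = 0.5  # colour factor for g -> q qbar

def p_qg(z):
    """Leading-order DGLAP kernel for g -> q (single flavour)."""
    return T_R * (z**2 + (1.0 - z)**2)

def toy_gluon(y):
    """Hypothetical gluon number density g(y); NOT a fitted PDF."""
    return y**-1.2 * (1.0 - y)**5

def convolve(f, g, x, n=2000):
    """Mellin convolution (f (x) g)(x) = int_x^1 dz/z f(z) g(x/z), midpoint rule."""
    h = (1.0 - x) / n
    total = 0.0
    for i in range(n):
        z = x + (i + 0.5) * h
        total += f(z) * g(x / z) / z
    return total * h

def sub_charm(x, mu, m_c=1.3, alpha_s=0.3):
    """Leading-log f_g (x) f_{g->c} at fixed alpha_s.
    Set to zero at and below the matching scale mu = m_c."""
    if mu <= m_c:
        return 0.0
    log = math.log(mu**2 / m_c**2)
    return alpha_s / (2.0 * math.pi) * log * convolve(p_qg, toy_gluon, x)
```

With α s held fixed, this quantity vanishes at µ = m c and then grows linearly in ln(µ 2 /m 2 c ). A DGLAP-evolved charm PDF starts from the same first-order term, which is the reason the difference (LO − SUB) stays small near the matching scale.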
Explanation #2: removing "double counting": A second way to understand why we require the LO process $\bar{c}W^+ \to \bar{s}$ is to consider the regions of phase space covered by each of the subprocesses. The singularity of the u-channel NLO $gW^+ \to c\bar{s}$ process arises from the phase-space region where the intermediate $\bar{c}$-quark becomes collinear and close to the mass shell. 12 This is precisely the phase-space region of the LO process $\bar{c}W^+ \to \bar{s}$ where the partonic $\bar{c}$-quark is collinear to the hadron. The SUB term then removes the "double counting" between the LO and NLO contributions; hence, all three contributions {LO, NLO, SUB} are necessary to cover the full phase space.
This is also apparent if we consider the transverse momentum (p T ) of the final-state charm in the Breit frame. For the LO $\bar{c}W^+ \to \bar{s}$ process in the Breit frame, the incoming W + and $\bar{c}$ are collinear, and the produced $\bar{s}$ must have zero p T in this frame.
For the NLO $gW^+ \to c\bar{s}$ process, we integrate over the complete phase space for the exchanged $\bar{c}$-quark, and this will include the region where the $\bar{c}$-quark is emitted nearly collinear to the gluon and nearly on-shell; in this region the $\bar{c}$-quark will have p T ∼ 0 and we encounter a singularity from the internal $\bar{c}$-quark propagator. The p T ∼ 0 region is precisely that subtracted by the SUB counter term 13 and this ensures that the combination (NLO − SUB) is free of divergences.

Recap:
To recap, i) the combination of the LO and SUB terms ensures a minimal µ variation at low µ, and ii) the combination of SUB and NLO ensures that the mass singularities are cancelled at high µ.
This interplay of terms illustrates some of the intricacies of QCD, especially since this exchange is across different orders of α s . Furthermore, note that in the u-channel for both the LO and SUB contributions, the charm quark is collinear to the incoming hadron, and thus exits in the hadron remnants. While this may be experimentally difficult to observe, because we are asking for a "fully inclusive" F c 2 , these contributions cannot be simply ignored. We will discuss this further in the following section.
Defining F c 2 : The LO u-channel $\bar{c}W^+ \to \bar{s}$ process foreshadows difficulties that we encounter if we try to extend the concept of "fully inclusive" F c 2 to higher orders. We note that in Ref. [62] Collins extended the proof of factorization to include heavy quarks such as charm and bottom for an inclusive structure function F 2 ; analysis of a "fully inclusive" F c 2 is more complex for a number of reasons. Whereas F 2 only requires measurement of the outgoing lepton energy and angle, F c 2 also requires information on the hadronic final state. At the parton level, this introduces complications, including the case where the charm is in the hadronic remnants, and brings in both fragmentation and fracture functions.
To characterize the theoretical issues involved in constructing F c 2 , we can imagine starting from the (well-defined) inclusive F 2 , and then dividing the contributions into two sets: one for F c 2 for the "heavy" charm quark, and the rest into F u,d,s 2 for the "light" quarks. We will show that this theoretical procedure encounters ambiguities.
The LO u-channel $\bar{c}W^+ \to \bar{s}$ process does not have any "apparent" charm quark in the final state, but this contribution is essential to balance with the SUB process $f_g \otimes f_{g\to c} \otimes \sigma_{\bar{c}W^+\to\bar{s}}$. Note that for the SUB process the charm quark arises from a gluon splitting into a collinear $c\bar{c}$ pair which is then part of the hadron remnants. For the LO process, presumably our $\bar{c}$ quark also came from a gluon splitting into a collinear $c\bar{c}$ pair. Thus, our F c 2 must include those cases where the charm is contained in the hadron remnants.
This issue touches on the fact that, because the charm parton ultimately fragments into a charmed hadron (typically a D meson), we must introduce a set of fragmentation functions (FFs) which are scale-dependent and will factorize final-state singularities in a similar manner as the PDFs factor the initial-state singularities. 14 Specifically, we may also allow for the possibility that a gluon or a light quark fragments into a charmed hadron.
The bubble diagram: Some of the theoretical intricacies of defining a "fully inclusive" F c 2 are illustrated in Fig. 19 which shows a higher-order DIS process with a quark-antiquark loop.
Let us compute this diagram in the n f = 3 FFNS where the internal loop is a massive $c\bar{c}$ pair and the external quark is a light quark {u, d, s}. If the final state is represented by Cut-A, then we have charm quarks in the final state, and this should be included in F c 2 . However, if we instead use Cut-B as a final state, there is no charm in the final state, so this should not be included in F c 2 . [More precisely, when we renormalize the charm loop with zero-momentum subtraction, this contribution effectively decouples.] Thus, the contribution from Cut-A will be included in F c 2 , but the contribution from Cut-B will not.
14 For the NLO quark-initiated contributions (not shown) we will have final state singularities from processes such as c → cg which will be factorized into the FFs.
Figure 19: A higher-order Feynman graph illustrating the complications in defining a "fully inclusive" F charm 2 . A light quark (q) scatters from a vector boson (V) with a $c\bar{c}$ pair in the internal loop. If we cut the amplitude at "A" we have charm in the final state and this must be included in F charm 2 . If we cut the amplitude with cut "B" there is no charm in the final state. Additionally, since this diagram contributes to the beta function, this highlights the complications of using an α S and a hard-scattering $\hat{\sigma}$ with differing N eff .
This diagram generates additional complications in that multiple quark flavors are involved. For example, the bubble diagram involves quarks of both q = {u, d, s} and c flavors, so this contribution cannot be uniquely assigned to F q 2 or F c 2 . We can introduce theoretical definitions to make the choice, but then we have to be careful about double-counting contributions and introducing uncancelled singularities. For example, the bubble diagram of Fig. 19 is encountered in the F c 2 heavy-quark calculations of Refs. [63,64]; here, an additional scale ∆ is introduced to subdivide the contributions.
The running of α s in the FFNS: The bubble diagram of Fig. 19 also highlights the difficulty of using an n f = 3 FFNS with a VFNS running of α s . In an n f = 3 FFNS, internal $c\bar{c}$ loops decouple from the theory and are not included in the calculation; 15 however, the β-function with n f = 4 requires precisely these $c\bar{c}$ loop contributions. This deficiency can be patched order by order by expanding the β-function and inserting the required terms at each order [65][66][67]. Once again, we cannot unambiguously divide the inclusive F 2 into separate "light" and "heavy" quantities.
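The size of the mismatch between an n f = 3 and an n f = 4 running of α s can be gauged with a minimal one-loop sketch. It uses the textbook one-loop coefficient β 0 = (33 − 2 n f )/(12π). The starting values µ 0 = m c = 1.3 GeV and α s (m c ) = 0.35, as well as the function names, are illustrative assumptions and are not the settings used in this study.

```python
import math

def beta0(n_f):
    """One-loop beta-function coefficient in the convention
    d(alpha_s)/d ln(mu^2) = -beta0 * alpha_s^2."""
    return (33.0 - 2.0 * n_f) / (12.0 * math.pi)

def alpha_s_one_loop(mu, n_f, mu0=1.3, alpha0=0.35):
    """Run alpha_s from mu0 to mu at one loop with n_f active flavours.
    mu0 and alpha0 are hypothetical starting values."""
    return alpha0 / (1.0 + beta0(n_f) * alpha0 * math.log(mu**2 / mu0**2))
```

Both choices agree at µ = m c by construction. Above it, the n f = 4 coupling falls more slowly because the extra $c\bar{c}$ loop reduces β 0 , so combining an n f = 4 coupling with n f = 3 hard-scattering coefficients leaves exactly the kind of mismatch that the order-by-order expansion described above is meant to repair.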

Extensions to bottom and top:
While we have used the charm quark to illustrate these features, the same properties can, in principle, be applied to both the bottom and top quark. 16 For the case of the bottom quark, the larger mass m b yields a smaller α s (µ) for µ ∼ m b and the evolution of f b (x, µ) is thus reduced. Nevertheless, for large-scale processes (such as at the LHC) we often find it convenient to make use of f b (x, µ) and treat the bottom on an equal footing with the other light quarks. For the case of the top quark, the very large mass m t yields a much smaller α s (µ) for µ ∼ m t and the evolution of f t (x, µ) is comparatively reduced.

Summary
To properly define F c 2 at higher orders, we encounter the theoretical issues discussed above: as the charm quark fragments into a charmed meson, we must be careful to ensure that the theoretical quantity matches what is actually measured experimentally. This is more complex than simply asking for the portion of F 2 that has a charm in the final state, and is an issue for both the FFNS and VFNS as we move to higher orders. We can perform the computation in the FFNS but in the large energy limit we encounter ln(Q 2 /m 2 c ) divergences and this, in part, contributes to the observed differences at large Q.
The VFNS includes the charm quark as an active parton for µ scales above a matching scale µ c . For large Q scales, the mass singularities of NLO and SUB terms will cancel to yield a result free of divergences. For scales µ ∼ m c , cancellation between the LO and SUB contributions ensures a minimal µ dependence; however, as this can be delicate to implement numerically, we have the option of displacing the matching scale µ c to a larger scale where the cancellation is more stable [51,60].