PDF Profiling Using the Forward-Backward Asymmetry in Neutral Current Drell-Yan Production

Non-perturbative QCD effects from Parton Distribution Functions (PDFs) may be constrained by using high-statistics Large Hadron Collider (LHC) data. Drell-Yan (DY) measurements in the Charged Current (CC) case provide one of the primary means to do this, in the form of the lepton charge asymmetry. We investigate here the impact of measurements in Neutral Current (NC) DY data mapped onto the Forward-Backward Asymmetry ($A_{\rm FB}$) on PDF determinations, by using the open source fit platform {\tt{xFitter}}. We demonstrate the potential impact of $A_{\rm FB}$ data on PDF determinations and perform a thorough analysis of related uncertainties.

which is related to the single-lepton pseudorapidity and, once combined with di-lepton mass and rapidity, would qualitatively correspond to triple-differential cross sections. Precision measurements of triple-differential observables have been presented in [19], while a recent study of DY differential cross sections in the context of PDFs has been presented in [20]. Furthermore, the ATLAS and CMS Collaborations recently determined the weak mixing angle in Refs. [16,18] through their DY measurements, using methods which constrain PDF uncertainties. The CMS paper [18] uses the Bayesian $\chi^2$ reweighting technique [15,21,22] to constrain PDF uncertainties, while profiling of PDF error eigenvectors is used as a cross check. In the ATLAS note [16] the PDF uncertainties are included in the likelihood fit and are thus constrained.
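The DY triple-differential cross section for di-lepton production at LO is given by:
\begin{equation}
\frac{d^3\sigma}{dM_{\ell\ell}\, dy_{\ell\ell}\, d\cos\theta^*} = \frac{\pi\alpha^2}{3 M_{\ell\ell}\, s} \sum_q P_q \left[ f_q(x_1,Q^2)\, f_{\bar q}(x_2,Q^2) + f_{\bar q}(x_1,Q^2)\, f_q(x_2,Q^2) \right], \tag{1.1}
\end{equation}
where $s$ is the square of the centre-of-mass energy of the colliding protons and $x_{1,2} = M_{\ell\ell}\, e^{\pm y_{\ell\ell}}/\sqrt{s}$ are the parton momentum fractions, $f_{q,\bar q}(x_i, Q^2)$ are the PDFs of the involved partons (either quark or anti-quark), $Q^2$ is the squared factorization scale (in our analysis always set equal to the di-lepton centre-of-mass energy), and $M_{\ell\ell}$ and $y_{\ell\ell}$ are the invariant mass and rapidity of the final-state di-lepton system. The function $P_q$ contains the propagators of the neutral SM gauge bosons and their couplings to the involved fermions:
\begin{equation}
\begin{aligned}
P_q ={}& e_\ell^2 e_q^2 \left(1+\cos^2\theta^*\right) \\
&+ \frac{2 M_{\ell\ell}^2 \left(M_{\ell\ell}^2 - M_Z^2\right)}{\sin^2\theta_W \cos^2\theta_W \left[\left(M_{\ell\ell}^2 - M_Z^2\right)^2 + \Gamma_Z^2 M_Z^2\right]}\, e_\ell e_q \left[ v_\ell v_q \left(1+\cos^2\theta^*\right) + 2 a_\ell a_q \cos\theta^* \right] \\
&+ \frac{M_{\ell\ell}^4}{\sin^4\theta_W \cos^4\theta_W \left[\left(M_{\ell\ell}^2 - M_Z^2\right)^2 + \Gamma_Z^2 M_Z^2\right]} \left[ \left(a_\ell^2 + v_\ell^2\right)\left(a_q^2 + v_q^2\right)\left(1+\cos^2\theta^*\right) + 8 a_\ell v_\ell a_q v_q \cos\theta^* \right],
\end{aligned} \tag{1.2}
\end{equation}
where $\theta_W$ is the Weinberg angle, $M_Z$ and $\Gamma_Z$ are the mass and the width of the Z boson, $e_\ell$ and $e_q$ are the lepton and quark electric charges, $v_\ell = -\frac{1}{4} + \sin^2\theta_W$, $a_\ell = -\frac{1}{4}$, $v_q = \frac{1}{2} I^3_q - e_q \sin^2\theta_W$, $a_q = \frac{1}{2} I^3_q$ are the vector and axial couplings of leptons and quarks respectively, with $I^3_q$ the third component of the weak isospin; the angle $\theta^*$ is the lepton decay angle in the partonic centre-of-mass frame. The first and third terms in Eq. (1.2) are the squares of the s-channel diagrams with photon and Z boson mediators respectively, while the second term is the interference between the two.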
The $A^*_{\rm FB}$ is defined as:
\begin{equation}
A^*_{\rm FB} = \frac{d\sigma/dM_{\ell\ell}\,(\cos\theta^* > 0) - d\sigma/dM_{\ell\ell}\,(\cos\theta^* < 0)}{d\sigma/dM_{\ell\ell}\,(\cos\theta^* > 0) + d\sigma/dM_{\ell\ell}\,(\cos\theta^* < 0)}. \tag{1.3}
\end{equation}
From this expression it follows that the dominant contribution is given by the interference term, and in particular by the term linear in $\cos\theta^*$ [23], which does not cancel in the numerator of Eq. (1.3). The contribution of up-type and down-type quarks varies with the invariant mass and with the rapidity of the system, as shown in Ref. [11]. The $A^*_{\rm FB}$ is sensitive to the combination of chiral couplings $v_\ell a_\ell v_q a_q$ and is proportional to the valence quark PDFs. In particular, we expect the $A^*_{\rm FB}$ to be sensitive to the linear combination:
\[
\tfrac{2}{3}\, u_V + \tfrac{1}{3}\, d_V .
\]
However, when constraining valence quark PDFs we obtain constraints on sea PDFs too, since other data are sensitive to the sum of the valence and sea quark PDFs. In particular, we note a strong complementarity with the constraints coming from the DY CC asymmetry, which is sensitive to the combination $u_V - d_V$ at LO [6].
This paper is devoted to investigating the impact of the $A_{\rm FB}$ data on PDF extractions by using the open-source QCD fit platform xFitter [24]. We consider three different luminosity scenarios, ranging from Runs 2 and 3 to the HL-LHC stage [25]. We perform PDF profiling [26] with xFitter and present results for several PDF sets, i.e., we quantitatively estimate the impact of the $A_{\rm FB}$ data on the uncertainties of these PDF sets, including different scenarios corresponding to different selection cuts on the di-lepton rapidity.
The paper is organised as follows. In Sec. 2 we describe technical aspects of the xFitter implementation and the treatment of $A_{\rm FB}$ pseudodata, while in Sec. 3 we describe the inclusion of NLO QCD corrections in the analysis. In Sec. 4 we present the results of the PDF profiling. In Sec. 5 we discuss theoretical and systematic uncertainties affecting the $A_{\rm FB}$ observable. We give our conclusions in Sec. 6.

$A_{\rm FB}$ in xFitter and pseudodata generation
In this section we describe the implementation of the $A_{\rm FB}$ observable in xFitter [24], the generation of the pseudodata and the fitting procedure.
A suitable C++ code has been developed and integrated in the xFitter environment for the analysis of the reconstructed forward-backward asymmetry ($A^*_{\rm FB}$) of two leptons with opposite charges in the final state from DY production in the NC channel.
Initially we implemented the observable at LO, where the initial-state interaction occurs between a quark and an anti-quark of the same flavour ($q\bar q$) and the angle $\theta^*$ is defined with respect to the direction of the incoming quark. The latter is reconstructed according to the direction of the boost of the di-lepton system, as discussed in Refs. [23,27-30].
Using the analytical expression in Eq. (1.1) for the hadronic triple-differential cross section, numerical integrations for the calculation of the $A^*_{\rm FB}$ in different invariant mass bins and rapidity regions are performed using the GSL public library, adopting the "Adaptive Gauss-Kronrod" rule with 61 points within each integration interval [31,32]. This choice provides sufficient precision in all integration intervals, including the more problematic high-rapidity regions and the neighbourhood of the Z-peak resonance. Adaptive methods could in principle be problematic for fits using numerical estimation of derivatives; however, there are no issues for profiling purposes. Adjustable parameters of the analysis, such as collider energy, acceptance and rapidity cuts, have been implemented in the associated parameter card. The mass effects of charm and bottom quarks in the matrix element are neglected, as appropriate for a high-scale process, and the calculation is performed in the $n_f = 5$ flavour scheme [33]. Acceptance cuts reflect the usual ATLAS and CMS detector fiducial region, defined by $|\eta_\ell| < 2.5$ and $p_T^\ell > 20$ GeV. The input theoretical parameters have been chosen to be the ones from the Electro-Weak (EW) $G_\mu$ scheme [34]. The explicit values for the relevant parameters in our analysis are the following: $M_Z = 91.188$ GeV, $\Gamma_Z = 2.441$ GeV, $M_W = 80.149$ GeV, $\alpha_{\rm em} = 1/132.507$ and $\sin^2\theta_W = 0.222246$ (the last one does not matter for this specific profiling exercise).
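As an illustration of how the coupling structure drives the sign and size of the asymmetry, the following Python sketch evaluates the parton-level LO asymmetry for a single quark flavour, with the $\cos\theta^*$ integration done analytically. It is not the xFitter C++ implementation: it assumes the standard LO form of $P_q$ (photon, interference and Z terms), ignores PDFs, rapidity and acceptance cuts, and the function names are hypothetical. Only the numerical inputs are taken from the text.

```python
# Electroweak inputs quoted in the text.
M_Z = 91.188            # GeV
GAMMA_Z = 2.441         # GeV
SIN2_THETA_W = 0.222246

# Lepton charge, vector and axial couplings.
E_LEP = -1.0
V_LEP = -0.25 + SIN2_THETA_W
A_LEP = -0.25

# Quark electric charge and third component of the weak isospin.
QUARKS = {"u": (2.0 / 3.0, 0.5), "d": (-1.0 / 3.0, -0.5)}


def pq_coefficients(mass, flavour):
    """Return (even, odd) with P_q = even * (1 + cos^2) + odd * cos.

    Assumption: standard LO photon + interference + Z structure.
    """
    e_q, i3 = QUARKS[flavour]
    v_q = 0.5 * i3 - e_q * SIN2_THETA_W
    a_q = 0.5 * i3
    s2c2 = SIN2_THETA_W * (1.0 - SIN2_THETA_W)
    m2 = mass * mass
    denom = (m2 - M_Z**2) ** 2 + (GAMMA_Z * M_Z) ** 2
    interf = 2.0 * m2 * (m2 - M_Z**2) / (s2c2 * denom) * E_LEP * e_q
    z_sq = m2 * m2 / (s2c2**2 * denom)
    even = ((E_LEP * e_q) ** 2
            + interf * V_LEP * v_q
            + z_sq * (A_LEP**2 + V_LEP**2) * (a_q**2 + v_q**2))
    odd = (interf * 2.0 * A_LEP * a_q
           + z_sq * 8.0 * A_LEP * V_LEP * a_q * v_q)
    return even, odd


def partonic_afb(mass, flavour):
    """Parton-level asymmetry for one flavour at invariant mass `mass` (GeV).

    Integrating over cos(theta*) in [0, 1] and [-1, 0] gives
    forward = 4/3 * even + odd / 2 and backward = 4/3 * even - odd / 2,
    so that (F - B) / (F + B) = 3 * odd / (8 * even).
    """
    even, odd = pq_coefficients(mass, flavour)
    return 3.0 * odd / (8.0 * even)


if __name__ == "__main__":
    for m in (60.0, 80.0, M_Z, 100.0, 150.0):
        print(m, partonic_afb(m, "u"), partonic_afb(m, "d"))
```

In this simplified setting the asymmetry changes sign close to the Z peak, where the interference term vanishes. The hadronic observable additionally weights each flavour by the parton luminosities and depends on the reconstruction of the quark direction.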
Suitable datafiles with pseudodata have been generated for the analysis. An important component contained in the datafiles is the statistical precision associated with the $A^*_{\rm FB}$ experimental measurements in each invariant mass bin. The statistical error on the observable is given by:
\begin{equation}
\Delta A^*_{\rm FB} = \sqrt{\frac{1 - {A^*_{\rm FB}}^2}{N}}, \tag{2.1}
\end{equation}
where $N$ is the total number of events in a specific invariant mass interval. In order to obtain estimates as close as possible to the projected experimental accuracy, we have computed the number of events by folding the LO cross section, evaluated without any acceptance cut, with an acceptance-times-efficiency factor of typical value ∼20%, corresponding to a realistic detector response [35], and with a mass-dependent k-factor reproducing the NNLO QCD corrections [36,37]. We stress that the latter is used in the evaluation of the number of events in Eq. (2.1), not in the evaluation of the observable itself. The pseudodata have been generated according to this procedure, fixing the collider centre-of-mass energy to 13 TeV, for three projected integrated luminosities:
1. 30 fb$^{-1}$, a subset of the currently available LHC data after the end of Run 2;
2. 300 fb$^{-1}$, the designed integrated luminosity at the end of the LHC Run 3;
3. 3000 fb$^{-1}$, the designed integrated luminosity at the end of the HL-LHC stage [38].
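The event-count estimate and the resulting statistical error can be sketched as follows. This is a minimal illustration under stated assumptions: the statistical error is taken to be the usual binomial expression $\sqrt{(1 - A^2)/N}$, the cross-section value in the usage example is a placeholder rather than a number from this analysis, and the function names are hypothetical. Only the ∼20% acceptance-times-efficiency factor and the three luminosities come from the text.

```python
import math


def expected_events(sigma_fb, lumi_inv_fb, acc_times_eff=0.20, k_factor=1.0):
    """Expected number of events in one invariant-mass bin.

    sigma_fb      : LO cross section in the bin, in fb (no acceptance cuts)
    lumi_inv_fb   : integrated luminosity in fb^-1
    acc_times_eff : acceptance times efficiency (about 20% in the text)
    k_factor      : mass-dependent factor mimicking NNLO QCD corrections
    """
    return sigma_fb * lumi_inv_fb * acc_times_eff * k_factor


def afb_stat_error(afb, n_events):
    """Statistical error on the asymmetry, assumed to be sqrt((1 - A^2) / N)."""
    return math.sqrt((1.0 - afb**2) / n_events)


if __name__ == "__main__":
    sigma_bin_fb = 1000.0  # placeholder value, not taken from the paper
    for lumi in (30.0, 300.0, 3000.0):
        n = expected_events(sigma_bin_fb, lumi, k_factor=1.2)
        print(lumi, n, afb_stat_error(0.1, n))
```

Since the error scales as $1/\sqrt{N}$, each tenfold increase in integrated luminosity reduces the statistical error by a factor of $\sqrt{10}$.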
In order to study the effects of data in the high di-electron rapidity region, the pseudodata have also been generated imposing various lower cuts on the rapidity: $|y_{\ell\ell}| > 0$ (no rapidity cut), $|y_{\ell\ell}| > 1.5$ and $|y_{\ell\ell}| > 4.0$ (the last one requires the extension of the detector acceptance region up to pseudorapidities $|\eta_\ell| < 5$). Although the impact of the $A^*_{\rm FB}$ could be explored in rapidity bins instead of with rapidity cuts, we opted for the latter in order to have data with larger statistics, which benefits the profiling of the PDFs.
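To see which momentum fractions these rapidity cuts select, the LO relation $x_{1,2} = M_{\ell\ell}\, e^{\pm y_{\ell\ell}}/\sqrt{s}$ can be evaluated directly. The short Python sketch below does this for a 13 TeV collider; the function name is hypothetical.

```python
import math


def momentum_fractions(mass_gev, rapidity, sqrt_s_gev=13000.0):
    """LO parton momentum fractions x1, x2 for a di-lepton system.

    Uses x_{1,2} = M * exp(+-y) / sqrt(s); values above 1 are unphysical
    and signal that the (M, y) point lies outside the kinematic limit.
    """
    ratio = mass_gev / sqrt_s_gev
    return ratio * math.exp(rapidity), ratio * math.exp(-rapidity)


if __name__ == "__main__":
    for y in (0.0, 1.5, 4.0):
        x1, x2 = momentum_fractions(91.188, y)
        print(f"|y| = {y}: x1 = {x1:.3g}, x2 = {x2:.3g}")
```

At the Z peak, $|y_{\ell\ell}| = 4$ corresponds to $x_1 \approx 0.38$ and $x_2 \approx 1.3 \times 10^{-4}$, which shows why a high rapidity cut pairs one high-$x$ parton with one low-$x$ parton.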

NLO study
For the calculation of the NLO $A^*_{\rm FB}$, the MadGraph5_aMC@NLO [39] program was used, interfaced to APPLgrid [40] through aMCfast [41]. These NLO theoretical predictions correspond to the analysis cuts of the ATLAS data from Ref. [19]. These NLO calculations are not supplemented by any k-factors to match higher-order accuracy.
The asymmetry distribution is provided in 62 bins, each 2.5 GeV wide, between $M_{\ell\ell} = 45$ and 200 GeV* (the pseudodata are prepared for the same invariant mass interval and bin size) for 5 different regions of the di-lepton rapidity $|y_{\ell\ell}|$: $0.0 < |y_{\ell\ell}| < 0.5$, $0.5 < |y_{\ell\ell}| < 1.0$, $1.0 < |y_{\ell\ell}| < 1.5$, $1.5 < |y_{\ell\ell}| < 2.0$, and $2.0 < |y_{\ell\ell}| < 2.5$. The asymmetry distribution is defined as a function of the angular variable $\cos\theta^*$ between the outgoing lepton and the incoming quark in the Collins-Soper (CS) frame [43], in which the decay angle is measured from an axis symmetric with respect to the two incoming partons. The decay angle $\theta^*$ in the CS frame is given by:
\begin{equation}
\cos\theta^* = \frac{p_{Z,\ell\ell}}{M_{\ell\ell}\, |p_{Z,\ell\ell}|}\, \frac{p_1^+ p_2^- - p_1^- p_2^+}{\sqrt{M_{\ell\ell}^2 + p_{T,\ell\ell}^2}},
\end{equation}
where $p_i^\pm = E_i \pm p_{Z,i}$ and the index $i = 1, 2$ corresponds to the positive and negative charged lepton respectively. Here, $E$ and $p_Z$ are the energy and the z-component of the leptonic four-momentum, respectively; $p_{Z,\ell\ell}$ is the z-component of the di-lepton momentum and $p_{T,\ell\ell}$ is the di-lepton transverse momentum. The experimental measurement of the $A^*_{\rm FB}$ is then obtained differentially in $M_{\ell\ell}$ according to Eq. (1.3) for the five aforementioned di-lepton rapidity regions.
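The Collins-Soper angle can be computed from the two lepton four-momenta as in the Python sketch below. It assumes the commonly used form $\cos\theta^* = \mathrm{sign}(p_{Z,\ell\ell})\,(p_1^+ p_2^- - p_1^- p_2^+)/(M_{\ell\ell}\sqrt{M_{\ell\ell}^2 + p_{T,\ell\ell}^2})$ with $p_i^\pm = E_i \pm p_{Z,i}$. The function names and the `(E, px, py, pz)` tuple layout are illustrative choices, and which lepton is labelled 1 should follow the convention of the analysis.

```python
import math


def boost_z(p, rapidity):
    """Boost a four-vector (E, px, py, pz) along z by the given rapidity."""
    e, px, py, pz = p
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return (e * ch + pz * sh, px, py, pz * ch + e * sh)


def cos_theta_cs(lep1, lep2):
    """cos(theta*) in the Collins-Soper frame from lab-frame four-vectors.

    The sign of the di-lepton longitudinal momentum is used as a proxy
    for the direction of the incoming quark.
    """
    e1, px1, py1, pz1 = lep1
    e2, px2, py2, pz2 = lep2
    p1_plus, p1_minus = e1 + pz1, e1 - pz1
    p2_plus, p2_minus = e2 + pz2, e2 - pz2
    e, px, py, pz = e1 + e2, px1 + px2, py1 + py2, pz1 + pz2
    pt2 = px * px + py * py
    m2 = e * e - pz * pz - pt2
    sign = math.copysign(1.0, pz)
    numerator = p1_plus * p2_minus - p1_minus * p2_plus
    return sign * numerator / (math.sqrt(m2) * math.sqrt(m2 + pt2))


if __name__ == "__main__":
    # Back-to-back massless leptons in the di-lepton rest frame, cos(theta) = 0.6.
    rest1 = (50.0, 40.0, 0.0, 30.0)
    rest2 = (50.0, -40.0, 0.0, -30.0)
    print(cos_theta_cs(boost_z(rest1, 1.0), boost_z(rest2, 1.0)))
    print(cos_theta_cs(boost_z(rest1, -1.0), boost_z(rest2, -1.0)))
```

The combination $p_1^+ p_2^- - p_1^- p_2^+$ is invariant under longitudinal boosts, so only the sign factor changes when the di-lepton system is boosted in the opposite direction.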
Because of the definition of the $A^*_{\rm FB}$ observable, NLO corrections largely cancel in the ratio of cross sections, so there is no significant difference between the observable calculated at LO and at NLO. In Fig. 1 we show the $A^*_{\rm FB}$ curves from xFitter obtained with the LO analytical code and when employing the LO and NLO grids. As visible in the lower panel, the results obtained with the LO analytical code and with the LO grids agree very well, up to purely statistical fluctuations, while NLO corrections slightly dilute the $A^*_{\rm FB}$ shape, being positive (negative) in the region below (above) the Z peak where the $A^*_{\rm FB}$ is negative (positive). We have verified that no differences are visible when comparing the profiled curves obtained using either LO or NLO calculations. The results that follow have been obtained by means of the described NLO grids, unless stated otherwise.

PDF profiling and numerical results
In this section we present the results of the profiling on the aforementioned PDF sets, using various combinations of $A^*_{\rm FB}$ pseudodata, varying the integrated luminosity and the rapidity cut. The qualitative behaviour of the profiled distributions does not change when varying the $Q^2$ scale; thus, unless otherwise stated, in the following the results will be shown for a reference scale $Q^2 = M_Z^2$. A more extensive discussion on the effects of the choice of the scales (both factorisation and renormalisation) is presented in Sect. 5.
* In this paper we work in the region near the Z boson mass and assume this region to be free of BSM effects. See [42] for a recent study of cross-contamination effects between BSM and PDF analyses.

PDF profiling
The profiling technique [26] is based on minimizing the $\chi^2$ between data and theoretical predictions. The PDF uncertainties are included in the $\chi^2$ using nuisance parameters. The values of the PDF nuisance parameters at the minimum are interpreted as optimised, or profiled, PDFs, while their uncertainties, determined using the tolerance criterion $\Delta\chi^2 = 1$, correspond to the new PDF uncertainties. The profiling approach assumes that the new data are compatible with the theoretical predictions using the existing PDF set and, under this assumption, the central values of the data points are set to the central values of the theoretical predictions. No theoretical uncertainties except the PDF uncertainties are considered when calculating the $\chi^2$.
Fig. 2 shows the impact of the profiling on the CT14nnlo PDF set. For this specific PDF set we also rescale the error bands to 68% Confidence Level (CL), for a better comparison with the results obtained with the other PDF sets. As visible, the largest reduction of the uncertainty band is obtained for the u-valence quark distribution. As the luminosity grows, the distribution of the d-valence quark also displays a visible improvement. The main effects are concentrated in the region of intermediate and small momentum fraction $x$. The sea quark distributions show a moderate improvement, progressively increasing with the integrated luminosity. While the contraction of the error band in the u-sea distribution seems to saturate above 300 fb$^{-1}$, the d-sea quark distribution appears to show continued improvement with an integrated luminosity of 3000 fb$^{-1}$. For the sea quark distributions, these effects are concentrated in the region of intermediate $x$.
[Figure caption fragment: ... HERAPDF2.0nnlo sets, obtained using pseudodata with an integrated luminosity of 300 fb$^{-1}$.]
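The profiling fit described in this section can be illustrated with a minimal Python sketch. It assumes a simplified $\chi^2$ with uncorrelated data uncertainties $\delta_i$ and PDF nuisance parameters $\beta_k$ only, $\chi^2(\beta) = \sum_i (d_i - t_i - \sum_k \Gamma_{ik}\beta_k)^2/\delta_i^2 + \sum_k \beta_k^2$, where $\Gamma_{ik}$ is the shift of prediction $i$ under a one-sigma variation of eigenvector $k$. This is not the xFitter code; the function names and the toy numbers are hypothetical.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def profile(data, theory, stat_err, gamma):
    """Minimise the simplified chi^2 analytically.

    gamma[i][k] is the shift of prediction i for eigenvector k.
    Returns (beta, cov): the nuisance parameters at the minimum and their
    covariance for Delta chi^2 = 1 (the identity before profiling).
    """
    n_par = len(gamma[0])
    weights = [1.0 / err**2 for err in stat_err]
    hess = [[sum(w * g[k] * g[l] for w, g in zip(weights, gamma)) + float(k == l)
             for l in range(n_par)] for k in range(n_par)]
    rhs = [sum(w * g[k] * (d - t)
               for w, g, d, t in zip(weights, gamma, data, theory))
           for k in range(n_par)]
    cov = invert(hess)
    beta = [sum(cov[k][l] * rhs[l] for l in range(n_par)) for k in range(n_par)]
    return beta, cov


if __name__ == "__main__":
    theory = [0.10, 0.20]                 # toy predictions
    pseudodata = list(theory)             # central values set to the predictions
    stat_err = [0.01, 0.01]               # toy statistical errors
    gamma = [[0.01, 0.0], [0.0, 0.02]]    # toy eigenvector shifts
    beta, cov = profile(pseudodata, theory, stat_err, gamma)
    print(beta, [cov[k][k] ** 0.5 for k in range(2)])
```

With pseudodata placed on the central predictions the nuisance parameters stay at zero and only the uncertainties shrink, which is the situation studied here: the diagonal of `cov` falls below one wherever the data are sensitive to the corresponding eigenvector.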
The NNPDF3.1nnlo set shows an intermediate sensitivity to the $A^*_{\rm FB}$ data. The distributions that are most affected are those of the u-valence and d-valence quarks in the intermediate and small $x$ regions. The u-sea distribution also displays some sensitivity in the region of intermediate $x$.
The MMHT2014nnlo set appears to be the least sensitive to the new data. A mild improvement of the error bands is visible in the u-valence, d-valence and u-sea quark distributions in the small $x$ region. The ABMP16nnlo set is the most sensitive to $A^*_{\rm FB}$ data. A remarkable improvement is visible especially in the distribution of the d-valence quark in the region of small to intermediate $x$. A visible improvement is also obtained in the distribution of the u-valence quark, while the sea quarks are less affected. In the HERAPDF2.0nnlo set, a noticeable reduction of the error bands is obtained for the valence quarks in the small and intermediate $x$ regions, while the sea quarks appear less sensitive to the new data.
In the following we study the effects on the profiling of applying lower rapidity cuts to the data. Since this procedure in general reduces the amount of data, thus increasing the statistical uncertainty of the measurements, we carry out the following analysis adopting an integrated luminosity of 3000 fb$^{-1}$, and we select a PDF set which showed an intermediate sensitivity to the $A^*_{\rm FB}$ data, such as the HERAPDF2.0nnlo set. For an exhaustive discussion on the differences between the various PDF sets, we refer to the PDF reviews in Refs. [44,45].
Figs. 4 and 5 show the effects on the profiling of imposing rapidity cuts on the pseudodata. Comparing those profiled error bands, we note some improvement in the distribution of the d-valence quark, especially in the region of small $x$. A visible reduction of the error bands can also be appreciated in the distributions of both u-sea and d-sea quarks in the region of intermediate $x$.
In Fig. 5 we instead consider the profiling obtained when imposing a rapidity cut $|y_{\ell\ell}| > 4.0$ on the data. In order to analyse this scenario, which probes the very high rapidity region, we need to enlarge the acceptance region of the detector. Experimentally it is possible to explore pseudorapidity regions up to $|\eta_\ell| < 5$ in the di-electron channel. However, in this case the experimental analysis requires that at least one lepton falls in the usual acceptance region $|\eta_\ell| < 2.5$ [19]. We drop this requirement and impose instead a symmetric acceptance cut $|\eta_\ell| < 5$ on both leptons. The profiled curves in this case have been obtained by means of the LO code implemented in xFitter, while the pseudodata contain 120 bins of 1 GeV covering the invariant mass region 80 GeV $< M_{\ell\ell} <$ 200 GeV.
In the curves obtained in this scenario we notice that the reduced statistics due to the phase space cut lead to an overall poorer profiling compared to the previous cases. On the other hand, in this setup the reduction of uncertainty is concentrated in the region of high $x$, which was not accessible before. The high rapidity cut indeed forces a more asymmetric combination of $x_1$ and $x_2$ for the incoming interacting partons, such that one parton has to lie in the high $x$ region and the other in the small $x$ region, as already pointed out in Refs. [10,11]. In particular, we observe a remarkable improvement on the distribution of d-valence and

Eigenvectors rotation
In this section we want to determine the PDFs (and their combinations) which are most sensitive to the $A^*_{\rm FB}$ data. We perform a reparameterisation of the eigenvectors of selected PDF sets [46]. The new set of eigenvectors is the result of a rotation of the original set, and the new eigenvectors are sorted according to their sensitivity to the new data. We have performed this exercise on two sets with Hessian eigenvectors: the CT14nnlo and HERAPDF2.0nnlo PDFs.
In Tab  for the two PDF sets. We observe that the first two eigenvectors almost completely determine the error bands for the distribution of the u-valence and d-valence quarks and their sum.
In particular, we observe that the u-valence and d-valence eigenvectors are strongly correlated and that the $A^*_{\rm FB}$ data will constrain their charge-weighted sum, $(2/3)u_V + (1/3)d_V$. This is in contrast to CC lepton asymmetry data, which at LO constrain instead the combination $u_V - d_V$ [6]. In the light of these results, we conclude that $A^*_{\rm FB}$ data will mostly constrain the distributions of the valence quarks, and this outcome is in agreement with the results presented in the previous section.
We conclude this section by noting that, following the observations made in Refs. [10,11], in which detailed comparisons were made between statistical errors and PDF errors for various scenarios with different selection cuts and luminosities, in this paper we have obtained for the first time quantitative results for the reduction of PDF uncertainties from using the $A_{\rm FB}$ asymmetry, and we have identified the charge-weighted sum of u-valence and d-valence PDFs as the distribution which is most sensitive to $A_{\rm FB}$. We arrived at this result by analyzing the structure of the axial and vector couplings in the part of the differential DY cross section which contributes to the asymmetry, and this has been confirmed by the explicit numerical exercise of eigenvector rotation carried out in this section.
We present further new results analyzing theoretical and systematic uncertainties in the next section: in particular, the correlation between different choices of factorization and renormalization scales in the forward and backward regions, and the impact on PDFs from the most accurate LEP/SLD $\theta_W$ measurement and from the global fit of EW parameters.

Theoretical and systematic uncertainties on the $A^*_{\rm FB}$ predictions
In this section we discuss the dependence of the $A^*_{\rm FB}$ observable on the most important sources of theoretical uncertainty. We first check the theoretical uncertainty from the choice of factorisation ($\mu_F$) and renormalisation ($\mu_R$) scales. For this purpose we employed the "seven points" method, which considers the predictions obtained for all combinations of the two scales that differ by a relative factor no larger than two, with each scale varied from $\mu_{F,R}/M_{\ell\ell} = 0.5$ to $\mu_{F,R}/M_{\ell\ell} = 2.0$.
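The scale combinations entering the "seven points" method can be enumerated as in the following Python sketch. Taking the grid of scale factors to be {0.5, 1, 2} is an assumption based on the range quoted above, and the function name is hypothetical.

```python
from itertools import product


def seven_point_variations(factors=(0.5, 1.0, 2.0)):
    """Return the (mu_F/M, mu_R/M) pairs whose ratio does not exceed two.

    Of the nine combinations on the grid, the two with opposite extreme
    variations, (0.5, 2.0) and (2.0, 0.5), are discarded.
    """
    return [(mu_f, mu_r)
            for mu_f, mu_r in product(factors, repeat=2)
            if max(mu_f, mu_r) / min(mu_f, mu_r) <= 2.0]


if __name__ == "__main__":
    for mu_f, mu_r in seven_point_variations():
        print(f"mu_F/M = {mu_f}, mu_R/M = {mu_r}")
```

The scale uncertainty is then usually quoted as the envelope of the predictions obtained with these seven choices around the central one, $\mu_{F,R}/M_{\ell\ell} = 1$.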
For this exercise, we have employed the HERAPDF2.0nnlo PDF set. The predictions for the $A^*_{\rm FB}$ and their deviations with respect to the baseline, represented by the "central" choice ($\mu_{F,R}/M_{\ell\ell} = 1.0$), are visible in Fig. 7(a). Here we have omitted the curves with $\mu_{F,R}/M_{\ell\ell} = 0.5$ and $\mu_{F,R}/M_{\ell\ell} = 2.0$ since they produced the smallest variations with respect to the baseline.
Fig. 7(b) shows instead the predictions for the $A_{\rm FB}$ with factorisation scale $\mu_F/M_{\ell\ell} = 2.0$ and renormalisation scale $\mu_R/M_{\ell\ell} = 1.0$, and the corresponding curves when the factorisation scale is chosen differently in the phase space regions with $\cos\theta^* > 0$ and $\cos\theta^* < 0$. Their differences are shown in the bottom panel.
It is worth mentioning that a dedicated study of the errors in the PDFs propagating from the missing higher order uncertainty, together with an extensive discussion of the analysis of scale variations to quantify their weight, has recently been presented in Ref. [47].
Another source of uncertainty lies in the employed value of the Weinberg mixing angle. The most accurate measurement comes from LEP and SLD data [48] and gives an absolute error $\Delta\sin^2\theta_W = 16 \times 10^{-5}$, while the most precise estimate is obtained from a global fit of EW parameters [49], resulting in the uncertainty $\Delta\sin^2\theta_W = 6 \times 10^{-5}$. The deviations of the $A^*_{\rm FB}$ observable due to the variations of $\sin^2\theta_W$ are generally small when compared to the statistical or the other systematic uncertainties; however, they lead to visible differences in the PDF central values. Again, we use the HERAPDF2.0nnlo PDF set to estimate this effect. In the invariant mass region under analysis, using predictions obtained at LO, we obtain $|\Delta A_{\rm FB}| < 10^{-4}$ when including the error from LEP and SLD measurements, or $|\Delta A_{\rm FB}| < 4 \times 10^{-5}$ when employing the uncertainty from the global EW fit. When adopting values for $\sin^2\theta_W$ at the extremes of the LEP-SLD confidence interval, we obtain some differences in the profiled curves, due to the shift of the central value predictions. We show the results of the profiling in the two cases in Fig. 8, adopting the upper limit of the value of $\sin^2\theta_W$. Using instead the lower limit of the value of $\sin^2\theta_W$, one obtains profiled curves that are mirrored with respect to the longitudinal axis. The deviations are clearly more visible in the first case, with LEP and SLD accuracy, while we observe smaller differences when employing the EW global fit estimates. It is important to mention that historically measurements of the $A_{\rm FB}$ have been used to set constraints on the $\theta_W$ angle [16-18]. One very interesting proposal, to which the results of this work provide strong support, is the implementation of a simultaneous fit of both PDFs and $\sin^2\theta_W$.
In the analysis carried out so far, we have neglected any EW radiative corrections to the considered process. Terms of O(α) have nowadays been included in the Dokshitzer-Gribov-Lipatov-Altarelli-Parisi (DGLAP) evolution equations, and Quantum Electro-Dynamics (QED) PDF sets, which consistently account for a photon component within the proton, are well established. In this work we do not include QED or EW corrections, and we limit ourselves to estimating the impact on our analysis when going from a PDF set which includes QED PDFs to a set which does not.
More precisely, we want to check whether in these sets we would obtain substantial differences when importing $A^*_{\rm FB}$ data in the profiling (while no QED corrections are taken into account in the matrix elements in both cases). The NNPDF collaboration has recently released a QED PDF set, compatible with the NNPDF3.1 fit, adopting the LUXqed prescription [50] (NNPDF31_nnlo_as_0118_luxqed [51]). We have checked that the differences in the $A^*_{\rm FB}$ predictions obtained between the QED and non-QED sets are small, $|\Delta A^*_{\rm FB}| < 2 \times 10^{-4}$. Furthermore, as the LUXqed prescription has been widely accepted, it has been shown that the contribution of photon-initiated processes to the Drell-Yan spectrum is negligible [30].
The profiling of the NNPDF31_nnlo_as_0118_luxqed set unfortunately cannot be done within the profiling technique implemented in xFitter, because of the "replicas" error method employed in this set, whilst no equivalent Hessian PDF set is available. For this reason we have chosen to study the variations from the QED PDF set in the form of a k-factor that was used to rescale the $A^*_{\rm FB}$ central value obtained with the NNPDF3.1nnlo set, and we found that the impact on the profiled PDFs is very small. The results of the profiling are visible in Fig. 9.
Figure 9. Profiled curves obtained with the NNPDF3.1nnlo set and with its central value predictions rescaled with a K-factor to match the NNPDF31_nnlo_as_0118_luxqed predictions. The pseudodata correspond to an integrated luminosity of 3000 fb$^{-1}$.
Higher order EW corrections have been shown to be relevant in the TeV region [52-58]; however, they could also have an impact in the region around the Z peak, where the high statistics allow for very precise measurements, as well as for $WW$ production [16]. Since they are not included in the current analysis, we want to study the impact of these specific subsets of data in the profiling. For this purpose, we employ again the HERAPDF2.0nnlo PDF set. In the top row of Fig. 10 we show the profiled curves removing the data in the invariant mass interval 84 GeV $< M_{\ell\ell} <$ 98 GeV, corresponding to $M_Z \pm 3\Gamma_Z$, while in the bottom row we repeat the same exercise removing the data above the $WW$ production threshold, that is, $M_{\ell\ell} > 161$ GeV.
In the first case there is a small enlargement of the error bands in the u-valence and d-valence quark distributions, showing some impact of the Z peak data, which is expected because of the large statistics in this invariant mass interval. In the second case, instead, only the error band of the u-valence quark distribution shows a small increase, meaning that the high invariant mass data, which have a worse statistical precision, have a smaller impact on the profiling.

Conclusions
High-statistics measurements from the LHC Runs 2 and 3 and the HL-LHC stage can be exploited to place constraints on the PDFs. DY processes yielding di-lepton production are a primary channel which may be used to this end. Both cross section and asymmetry distributions can be used for such a purpose.
Concerning the latter, as a counterpart to the lepton charge asymmetry of the CC channel of DY production, in this work we have studied the Forward-Backward Asymmetry $A^*_{\rm FB}$, which can be defined in the NC channel of DY production, and we have performed PDF profiling calculations in the xFitter framework to investigate the impact of $A^*_{\rm FB}$ pseudodata on PDF determinations. We have found that new PDF sensitivity arises from the di-lepton mass and rapidity spectra of the $A^*_{\rm FB}$, which encodes information on the lepton polar angle, or pseudorapidity.
Figure 10. Profiled curves obtained with the HERAPDF2.0nnlo set using the full set of data, when removing the data in the invariant mass region around the Z peak (top row), and when removing the data in the invariant mass region above the $WW$ production threshold (bottom row). The pseudodata correspond to an integrated luminosity of 3000 fb$^{-1}$.
With the partial Run 2 integrated luminosity that we have used in this paper ($L = 30$ fb$^{-1}$) we observe a significant reduction in PDF uncertainties on the u-valence and d-valence distributions in the intermediate $x$ region, which can be further improved by exploiting the full Run 2 data set ($L = 150$ fb$^{-1}$). Adopting the luminosity of Run 3 ($L = 300$ fb$^{-1}$), we predict the observation of a moderate reduction in PDF uncertainties also on the sea quark distributions. Above this threshold we observe a saturation effect, such that when adopting the projected HL stage luminosity ($L = 3000$ fb$^{-1}$) we notice a smaller reduction of the uncertainty bands compared to the previous cases. Furthermore, we have shown that we obtain very different levels of improvement on each PDF, both in magnitude and in range of $x$, depending on the specific PDF set under analysis. We have also studied the impact of applying cuts on the di-lepton rapidity. By increasing the rapidity cut, we obtain enhanced sensitivity to quark distributions in the high $x$ region. In this case the high statistics collected during the HL stage will be crucial in order to achieve sufficient precision in the measurement of the $A^*_{\rm FB}$. Performing a rotation of the eigenvectors and sorting them according to their sensitivity to the $A^*_{\rm FB}$ data, we noted a strong correlation between u-valence and d-valence eigenvectors, and that the new data are most sensitive to their charge-weighted sum, $(2/3)u_V + (1/3)d_V$, in contrast to the CC lepton asymmetry data, which are instead mostly used to constrain $u_V - d_V$.
In summary, the $A^*_{\rm FB}$ has proven to be a powerful new handle in the quest to contain the systematics associated with PDF determination and exploitation in both SM and BSM studies.