Vector boson production in proton-lead and lead-lead collisions at the LHC and its impact on nCTEQ15 PDFs

We provide a comprehensive comparison of W/Z vector boson production data in proton-lead and lead-lead collisions at the LHC with predictions obtained using the nCTEQ15 PDFs. We identify the measurements which have the largest potential impact on the PDFs, and estimate the effect of including these data using a Monte Carlo reweighting method. We find this data set can provide information about both the nuclear corrections and the heavy flavor (strange) PDF components. As the proton flavor determination is dependent on nuclear corrections (from heavy target DIS, for example), this information can also help improve the proton PDFs.

The kinematic (x 1 , x 2 ) space explored by the measurements in this study. We display lines of constant τ = M V / √ s where M V is the invariant mass of the produced W ± /Z vector boson, as well as the center of mass (CM) rapidity y. In case of pPb collisions, we use the standard convention where x 1 corresponds to the proton and x 2 to the Pb momentum fraction.

Introduction
Vector boson production in hadron collisions is a well understood process and serves as one of the "standard candle" measurements at the LHC. W ± and Z bosons are numerously produced in heavy ion pPb and PbPb collisions at the LHC and can be used to gain insight into the structure of nuclear parton distribution functions (nPDFs). As the W ± and Z bosons couple weakly, their interaction with the nuclear medium is negligible which makes these processes one of the cleanest probes of the nuclear structure available at the LHC. The possibility of using vector boson production data to constrain nPDFs was previously considered [11], and this demonstrated the strong potential for the proton-lead data (especially the asymmetries) to constrain the nuclear PDFs. The current LHC measurements for W ± and Z production include rapidity and transverse momentum distributions for both proton-lead (pPb) and lead-lead (PbPb) collisions [1][2][3][4][5][6][7][8][9][10]. Some of these data were already used (along with jet and charged particle production data) in a recent analysis [12,13] employing a reweighting method to estimate the impact of these data on EPS09 [14] and DSSZ [15] nPDFs. 1 The LHC heavy ion W ± /Z data set is especially interesting as it can help to resolve the long-standing dilemma regarding the heavy flavor components of the proton PDFs. Historically, this has been an important issue as nuclear target data (especially ν-DIS) have been essential in identifying the individual parton flavors [17][18][19][20]; however, this means that the uncertainties of the heavy flavors are intimately tied to the (large) nuclear uncertainties. The LHC heavy ion W ± /Z data has the potential to improve this situation due to the following two key features. First, this data is in a kinematic regime where the heavier quark flavors (such as strange and charm) contribute substantially. Second, by comparing the proton W ± /Z data with the heavy ion results we have an ideal environment to precisely characterize the nuclear corrections. The combination of the above can not only improve the nuclear PDFs, but also the proton PDFs which are essential for any LHC study.
In this work we present predictions for vector boson production in pPb and PbPb collisions at the LHC obtained using nCTEQ15 nuclear parton distributions, and perform a comprehensive comparison to the available LHC data. We also identify the measurements which have the largest potential to constrain the nPDFs, and perform a reweighting study which allows us to estimate the effects of including these data in an nPDF fit.
The rest of the paper is organized as follows. Sec. 2 is devoted to predictions of vector boson production at the LHC in nuclear collisions. In particular, we provide an overview of the kinematic range probed by the W ± /Z data and discuss the tools we will use for the calculation. Then we present our predictions for pPb and PbPb collisions at the LHC and compare them with the experimental data and other theoretical predictions. In Sec. 3 we perform a reweighting using nCTEQ15 distributions to assess the impact of the nuclear data on the nPDFs. Finally, Sec. 4 summarizes our results and observations.

W ± /Z Production at the LHC
We begin by presenting our predictions for W ± and Z boson production in nuclear collisions at the LHC using the recently published nCTEQ15 PDFs [21].

Experimental data and theoretical setup
For the theoretical calculations in our study we use the FEWZ (Fully Exclusive W, Z production) [22,23] program version 2.1. Even though FEWZ can compute W and Z production with decays up to next-tonext-to-leading order, we work at next-to-leading order (NLO) to be consistent with the order of evolution of the nPDFs. 2 As FEWZ is designed to handle pp or pp collisions, we have extended it so that two different PDF sets can 2 The CT10 proton PDFs used in the theoretical calculations are also at NLO. be used for the two incoming beams as required for the pPb collisions.
For the lead PDFs we use the nCTEQ15 nPDFs [21], while we use the CT10 distributions [24] for the free protons; the only exception is the use of MSTW2008 PDFs [25] for the LHCb Z boson measurement [5] in order to match the original LHCb publication. Additionally, we compare these results with predictions calculated using nuclei made out of free proton PDFs, and in some cases free proton PDFs supplemented with EPS09 nuclear corrections [14].
We will consider LHC data on W ± and Z boson production from the ALICE, ATLAS, CMS, and LHCb experiments. The exhaustive list of data sets that we use is provided in Table 1 along with the experimental kinematical cuts implemented in the analysis. While there are measurements for both the rapidity and transverse momentum distributions, for this study we will focus only on the rapidity measurements.
Using the transverse momentum (p T ) distributions to study the PDFs is more intricate as it requires resummations in the low p T region where the cross section is maximal; we reserve this for a future study.
In Fig. 1 we display the kinematic space probed by the W ± /Z production process [26]. We translate between the {x 1 , x 2 } and the {y, τ } variables for three values of the collider center of mass (CM) energy, √ s.  Fig. 2: Range of the pPb data used for reweighting. y is rapidity in the CM frame and x 2 is momentum of the parton from the lead beam. Table 2 lists the CM energy per nucleon as a function of the nominal proton beam energy which is determined from the relation: where in case of lead we have A = 208 and Z = 82. Additionally for asymmetric collisions there is a rapidity shift, δy, between the CM and the laboratory (LAB) frame: and in particular for the case of pPb collisions, For the asymmetric case of pPb, we use the convention where x 1 is the proton momentum fraction, and x 2 is the lead momentum fraction. Thus, for pPb at large y CM we have a large proton x 1 and a small lead x 2 ; conversely, at small y CM we have a small proton x 1 and a large lead x 2 .
In Fig. 1, the pair of lines with √ s=2.76 TeV corresponds to PbPb collisions with a beam energy of 3.5 TeV per proton, and √ s=5.02 TeV corresponds to pPb collisions with a beam energy of 4 TeV per proton.

Comparison to Proton-Lead (pPb) data
We first consider the LHC pPb collisions at √ s = 5.02 TeV. The distributions are shown in the CM frame, and include the appropriate rapidity shift according to Eq. (2). In Fig. 2, we display the kinematic range of the pPb  data bins (central values) in the plane (y, x 2 ) where y is the rapidity in the CM frame of the relevant vector boson or lepton, and x 2 the lead parton momentum fraction. As expected, there is little data below x ∼ 10 −3 and most of the constraints from these LHC data are in the low-to mid-x region. Figs. 3, 4 and 5 show our predictions for the AT-LAS [1], CMS [3] and LHCb [5] Z boson production measurements, respectively. In all three cases, results obtained with the nCTEQ15 nPDFs are shown along with those obtained with a lead nucleus composed by Z protons and A − Z neutrons, assuming isospin symmetry and using CT10 PDFs; the ratio of predictions over the data is shown in the lower panel. Note that the errors shown for the nCTEQ15 predictions are for nuclear uncertainties only (and only for the beam with momentum fraction x 2 ) which means that the PDF error of the proton beam is not accounted for. 3 Furthermore, the errors shown for the pPb predictions using lead nuclei constructed from CT10 and MSTW2008 proton PDFs are only for the beam with momentum fraction x 2 . By comparing the proton uncertainties (CT10 and MSTW2008) to the nuclear uncertainties, we see that the nuclear uncertainties are much larger. Examining Figs. 3, 4 and 5, it is interesting to note the following.
1. The data and theory are generally compatible (without significant tension) both with and without nuclear corrections; this situation may change as the experimental errors and nuclear uncertainties are reduced.
2. Focusing on the ATLAS and CMS comparison of Figs. 3 and 4, we observe that the distributions peak at negative rapidities y Z ∼ −1. Referring to Fig. 1, this corresponds to an enhancement of the qq proton-lead luminosity over the pure proton one in the x 2 region ∼ 0.05.
3. Focusing on the LHCb data of Fig. 5, we find good agreement for negative y, but large differences at positive y. Despite these differences, the large uncertainties will yield a reduced impact in our subsequent reweighting procedure.
We now turn our attention to W + and W − production at the LHC. In Figs. 6, 7 and 8 we compare the data obtained by CMS [4], ATLAS [2] and ALICE [6] for W ± production with theoretical predictions obtained with nCTEQ15 and CT10 PDFs. 3 For the symmetric case of PbPb collisions the errors on both beams are taken into account.
We find the W − CMS and ATLAS data are adequately described in the negative rapidity range (y − < 0), but the tensions grow as we move to larger rapidity. This effect is magnified for the case of W + where we see substantive deviations at large rapidity (y + > 1). Referring to Fig. 1, these deviations are in the smaller x 2 region (∼ 3 × 10 −3 ) where we might expect nuclear shadowing of the ud and dū luminosities. 4 However, this low x 2 range is unconstrained by the data currently used in nPDF fits, so these results come from an extrapolation of the larger x 2 region. It is interesting to observe that a delayed shadowing (which shifts the shadowing down to smaller x 2 values) would improve the comparison of the data with the theory in the larger y ± region; this type of behavior was observed in the nuclear corrections extracted from the neutrino-DIS charged current data [27][28][29]. Taking into account the errors from both the experimental data and the theoretical predictions, no definitive conclusions can be drawn at the present time. Notwithstanding, this data has the potential to strongly influence the nPDF fits, especially in the small x 2 region, if the uncertainties could be reduced.
Finally, the ALICE data ( Fig. 8) currently have large uncertainties, and we expect they will have a minimal impact on the reweighting.

Comparison to Lead-Lead data
We now consider the LHC PbPb collisions at √ s = 2.76 TeV. As these beams are symmetric, we now have y CM = y lab . Again, we will use nCTEQ15 [21] and CT10 [24] PDFs for the theoretical predictions. Results from AT-LAS and CMS collaborations are available in the form of either event yields (Z boson production) or charge asymmetries (A ).
In Fig. 9a and 9b we present the comparison of the ATLAS [7] and CMS [9] data with theoretical predictions with nCTEQ15 and CT10 PDFs. Note that the differential cross sections have been normalized to the total cross section. The PbPb data generally exhibits no tension as the distributions are well described across the kinematical range; however, this is in part due to the large uncertainties due to two nuclei in the initial state.  The measurement of charge asymmetries can provide strong constraints on the PDF fits as many of the systematic uncertainties cancel in such ratios. In Fig. 10 we compute the lepton ( = [µ, e]) charge asymmetry A (y ): for W + and W − bosons as measured by the ATLAS [8] and CMS [10] experiments. Unfortunately, it appears that the dependence on the nuclear corrections largely cancels out in the ratio as the nuclear nCTEQ15 result is indistinguishable from the CT10 proton result. Hence, these charge asymmetry ratios cannot constrain the nuclear corrections at the present time.

W ± /Z Cross Section Correlations
In order to analyze our results more quantitatively, it is very useful to look at PDF correlations. In particular, we are interested in assessing the importance of the strange quark in our results. We first review some standard definitions before presenting our analysis.  The definition of the correlation cosine of two PDFdependent observables X and Y is [30] cos φ = ∇X · ∇Y ∆X∆Y where ∆X is the PDF error of the corresponding observable. For the nCTEQ15 PDFs this corresponds to the symmetric error given by is the observable evaluated along the ± error PDF eigenvector i, and the summation runs over all eigenvector directions.
In our case we are interested in observables X, Y ∈ {σ Z , σ W + , σ W − }. Here, we focus on the planes formed by the (W + , W − ) and the (Z, W ± ) boson production cross sections to visualize the correlations. Fig. 11 shows the correlations of the W + and W − production cross sections for pPb collisions at the LHC in comparison with the CMS and ATLAS measurements. Similarly, in Fig. 12 we display the results for Z and W ± bosons. The results are shown for three different rapidity regions, y < −1, |y| < 1, y > 1, and for several PDFs sets. For the proton side we always use the CT10 PDFs and for the lead side we examine three results: i) nCTEQ15, ii) CT10, and iii) CT10 PDFs supplemented by the nuclear corrections from EPS09 (CT10+EPS09). Finally, the central predictions are supplemented with uncertainty ellipses illustrating correlations between the cross sections. The ellipses are calculated in the following way [30], where X, Y represent PDF-dependent observables, X 0 (Y 0 ) is the observable calculated with the central PDF, ∆X (∆Y ) is defined in Eq. (5), φ is the correlation angle defined in Eq. (4), and θ is a parameter ranging between 0 and 2π. From Figs. 11 and 12 one can generally observe that the ellipses for the different PDF sets overlap. Furthermore, the central predictions for all three PDF sets lie in the overlapping area of the three ellipses. However, a trend can be observed as a function of the rapidity: 1. For negative rapidities (y < −1), the central predictions from the nuclear PDFs (nCTEQ15, EPS09) are closer to the experimental data as they yield larger cross sections than the uncorrected (proton) CT10 PDFs. This can be understood because the lead x 2 values probed in this rapidity bin lie in the region x 2 ∼ 10 −1 where the nPDFs are enhanced due to anti-shadowing (cf., Fig. 9 in Ref. [21]). Due to the larger uncertainties associated with the nCTEQ15 predictions, the ATLAS and CMS cross sections lie within the 1σ ellipse. Conversely, the measured data lie outside the uncorrected (proton) CT10 error ellipsis.
2. For the central rapidity bin (|y| < 1), the predictions from all three PDF sets lie generally very close together. In this case, the probed x 2 values lie in the range 0.007 ≤ x 2 ≤ 0.05 which is in the transition zone from the anti-shadowing to the shadowing region. We find the LHC W + and W − cross sections in Fig. 11 tend to lie above the theory predictions. Examining the Z cross section of Fig. 12, we find the CMS data agrees closely with the theory predictions, while the ATLAS data is larger by approximately 1σ.
3. For the positive rapidity bin (y > 1), we find the central predictions from CT10 match the W ± data very closely, but slightly overshoot the Z data. The nuclear PDFs (nCTEQ15, EPS09) undershoot the W ± data by a bit more than 1σ, but agree with the Z cross section within 1σ. Here, the probed x 2 values are 0.007; in this region the lead PDFs are poorly constrained and the corresponding cross sec- tions are dependent on extrapolations of the PDF parameterization in this region.
Interpreting the above set of results appears complicated, so we will try and break the problem down in to smaller components. We now compute the same results as above, but using only 2 flavors (one family) of quarks: {u, d}; specifically, these plots are produced by zeroing the heavy flavor components (s, c, b), but keeping (u, d) and the gluon. For the Z production this eliminates the ss and (the smaller) cc contributions, while for W + /W − production it is thesc/sc contribution which drives the change. While the charm PDF does play a role in the above (the bottom contribution is minimal), c(x) is generated radiatively by the process g → cc (we assume no intrinsic component); thus, it is essentially determined by the charm mass value and the gluon PDF. In contrast, the "intrinsic" nature of the strange PDF leads to its comparably large uncertainties. For example, if we compare the free-proton PDF baselines (CTEQ6.1, CT10), the strange quark exhibits substantial differences while the charm (and bottom) distributions are quite similar; this pattern then feeds into the nPDFs. Therefore, the strange quark PDF will be the primary focus of the following discussion.
Examining Fig. 13, the shift of the 2 flavor results compared to the 5 flavor results can be as large as ∼30% and reflects, the contributions of the strange and charm quarks.
For the 5 flavor case ( ), the calculations are scattered to the low side of the data in both W + and W − . The CT10 result is the closest to the data, but due to the larger uncertainties of nCTEQ15, the data point is within range of both of their ellipses. We also observe that the CT10+EPS09 and CTEQ6.1+EPS09 results bracket the nCTEQ15 value; again, this is due to the very different strange PDF associated with CT10 and CTEQ6.1.
For the 2 flavor case (•), all the nuclear results (nCTEQ15, CT10+EPS09, CTEQ6.1+EPS09) coalesce, and they are distinct from the non-nuclear result (CT10). This pattern suggests that the nuclear corrections of nCTEQ15 and EPS09 for the {u, d} flavors are quite similar, and the spread observed in the 5 flavor case comes from differences of s(x) in the underlying base PDF. Thus we infer that the difference between the nuclear results and the proton result accurately represents the nuclear corrections for the 2 flavor case (for {u, d}), but for the 5 flavor case it is a mix of nuclear corrections and variations of the underlying sea quarks. except it is divided into rapidity bins. As we move from negative y to positive y we move from high x where the nPDFs are well constrained to small x where the nPDFs have large uncertainties (cf., Fig. 2). Thus, it is encouraging that at y < −1 we uniformly find the nuclear predictions yield larger cross sections than the proton results (without nuclear corrections) and thus lie closer to the LHC data.
Conversely, for y > 1 we find the nuclear predictions yield smaller cross sections than the proton results. The comparison with the LHC data varies across the figures, but this situation suggests a number of possibilities.
First, the large nPDF uncertainties in this small x 2 region could be improved using the LHC data.
Second, the lower nPDF cross sections are partly due to the nuclear shadowing in the small x region; if, for example, this shadowing region were shifted to even lower x values, this would increase the nuclear results. Such a shift was observed in Refs. [27][28][29] using charged current neutrino-DIS data, and this would move the nuclear predictions of Fig. 11     Finally, we note that measurements of the strange quark asymmetry [31] indicate that s(x) =s(s) which is unlike what is used in the current nPDFs; this would influence the W ± /Z cross sections separately as (at leading-order) [26] W + ∼sc, W − ∼ sc, and Z ∼ss. As the strange PDF has a large impact on the W ± /Z measurements, this observation could provide incisive information on the individual s ands distributions.
These points are further exemplified in Fig. 15 which displays W ± production for both 2 and 5 flavors as a function of lepton rapidity y ± . For large y ± , (small lead x 2 ) the CT10 proton result separates from the collective nuclear results; presumably, this is due to the nuclear shadowing at small x 2 . Again, we note that in this small x 2 region there are minimal experimental constraints and the nPDFs come largely from extrapolation at higher x 2 values. Additionally, by comparing the 2 and 5 flavor results, we clearly see the impact of the heavier flavors, predominantly the strange quark PDF.
Furthermore, different strange quark PDFs in the baseline PDFs compared in Figs. 11 and 12, make it challenging to distinguish nuclear effects from different strange quark distributions. Thus, we find that the extraction of the nuclear corrections is intimately intertwined with the extraction of the proton strange PDF, and we must be careful to separately distinguish each of these effects. Fortunately, the above observations can help us disentangle these two effects.

Reweighting
In this section we perform a reweighting study to estimate the possible impact of the W ± /Z data on nCTEQ15 lead PDFs. For this purpose we will use only the pPb data sets.
We refrain from using PbPb data as typically the agreement of these data with current nPDFs is much better (in part due to the large uncertainties), so the impact in the reweighting analysis will be minimal. Secondly the factorization in lead-lead collisions is not firmly established theoretically [32] such that the interpretation may be complicated.

Basics of PDF reweighting
In this section we summarize the PDF reweighting technique and provide formulas for our specific implementation of this method. Additional details can be found in the literature [33][34][35][36][37].
In preparation for the reweighting, we need to convert the nCTEQ15 set of Hessian error PDFs into a set of PDF replicas [12,38] which serve as a representation of the underlying probability distribution. The PDF replicas can be defined by a simple formula, 5 where f 0 represents the best fit (central) PDF, f are the plus and minus error PDFs corresponding to the eigenvector direction i, and N is the number of eigenvectors defining the Hessian error PDFs. Finally, R ki is a random number from a Gaussian distribution centered at 0 with standard deviation of 1, which is different for each replica (k) and each eigen-direction (i).
After producing the replicas, we can calculate the average and variance of any PDF-dependent observable as moments of the probability distribution: In particular, it can be done for the PDFs themselves; we should be able to reproduce our central PDF f 0 by the average f , and the (68% c.l.) Hessian error bands ) 2 by the corresponding variance δ f . Of course, the precision at which we are able to reproduce Hessian central PDFs and corresponding uncertainties depends on how well we reproduce the underlying probability distribution, and this will depend on the number of replicas, N rep , we use. In the following we use N rep = 10 4 which allows for a very good reproduction of both central and error PDFs (within ∼ 0.1% or better).
We note here that since the nCTEQ15 error PDFs correspond to the 90% confidence level (c.l.) we need to convert the obtained uncertainties such that they correspond to the 68% c.l. 6 The conversion is done using the following approximate relation between the 68% c.l. and 90% c.l. Hessian uncertainties: Fig. 16 we perform the above exercise and determine if our procedure is self consistent. Specifically, in Fig. 16a we display the central value and uncertainty bands for the original gluon PDF and those generated from the replicas; they are indistinguishable. Additionally, in Fig. 16b we demonstrate the convergence of   Having defined the replicas we can apply the reweighting technique to estimate the importance of a given data set on our current PDFs. The idea is based on Bayes theorem which states that the posterior distribution representing the probability of a hypothesis (new probability distribution representing the PDFs if we would perform a fit including the new data set we are using in the reweighting) is a product of the prior probability (PDFs without the new data set) and an appropriate likelihood function. This allows us to assign a weight to each of the replicas generated earlier according to eq. (7).
In the context of Hessian PDFs using a global tolerance criterion the appropriate weight definition is given by a modified Giele-Keller expression [12,33,36,37] where T is the tolerance criterion used when defining Hessian error PDFs 8 and χ 2 k represents the χ 2 of the data sets considered in the reweighting procedure for a given replica k. The pPb W ± and Z data do not provide correlated errors (the published errors are a sum of statistical and systematic errors in quadrature) 9 so it is sufficient for our analysis to use a basic definition of the χ 2 function given by where index j runs over all data points in the data set(s), N data is the total number of data points, D j is the experimental measurement at point j, σ j is the corresponding experimental uncertainty, and T k j is the corresponding theoretical prediction calculated with PDFs given by replica k.
With the above prescription we can now calculate the weights needed for the reweighting procedure. The expectation value and variance of any PDF-dependent observable can now be computed in terms of weighted sums: For our reweighting analysis we will only use the pPb data sets. Because the uncertainty of the nuclear PDFs dominates the proton PDFs, it is sufficient to only vary the lead PDFs. Consequently, the pPb cross sections are linear in the lead uncertainties, and we can compute the reweighting by evaluating cross sections only on the Hessian error PDFs (32+1 in case of nCTEQ15) instead of the individual replicas (N rep = 10 4 ) A similar decomposition can be used for pp or PbPb data to reduce the number of necessary evaluations. However, because of the quadratic dependence on the PDFs, the reduction is smaller and does not necessarily lead to lower computational costs. We will compare the χ 2 for each experiment calculated with the initial PDFs (before reweighting) and with the PDFs after the reweighting procedure; this will allow us to estimate the impact of each individual data set. We do this using the following formula where T j is a theory prediction calculated as an average over the (reweighted or not-reweighted) replicas according to eq. (11) (with or without weights). Finally, the effectiveness of the reweighting procedure can be (qualitatively) estimated by computing the effective number of replicas defined as [35]: N eff provides a measure of how many of the replica sets are effectively contributing to the reweighting procedure. By definition, N eff is restricted to be smaller than N rep . However, when N eff N rep it indicates that there are many replicas whose new weight (after the reweighting procedure) is sufficiently small that they provide a negligible contribution to the updated probability density. This typically happens when the new data is not compatible with the data used in the original fit, or if the new data introduces substantial new information; in both cases, the procedure becomes ineffective and a new global fit is recommended.

Reweighting using CMS W ± rapidity distributions
As an example, we consider the reweighting using the CMS W ± production data from pPb collisions [4]. In this example we use rapidity distributions of charged leptons originating from the decay of both W + and W − bosons with N rep = 10 4 replicas leading to N eff = 5913. In Fig. 17 we display the distribution of the weights obtained from the reweighting procedure. We see that the magnitudes of the weights are reasonable; they extend up to ∼ 9 with a peak at the lowest bin. It will be useful to compare this distribution with later results using different observables and data sets.
In Fig. 18 we show the comparison of the data to theory before and after the reweighting procedure. 10 As expected, we see that after the reweighting procedure the description of the data is improved. This is true for both the W + (left figure) and W − (right figure) Fig. 18: Comparison of data and theory before and after the reweighting procedure for the rapidity distributions of charged leptons in W ± production measured by CMS [4].    by examining the χ 2 /N data for the individual distributions. For the W + case, the χ 2 /N data is improved from 5.07 before reweighting to 3.23 after reweighting. Similarly, for W − the χ 2 /N data is improved from 4.57 to 3.44. The amount of change due to the reweighting procedure should be proportional to the experimental uncertainties of the incorporated data; this is the same as we would expect from a global fit. For W ± production investigated here, the uncertainties are quite substantial, and the effects are compounded by the lack of correlated errors. Finally, we show the effect of the reweighting on the PDFs themselves. In Fig. 19, we display PDFs for the up quark and gluon at a scale of Q = 10 GeV. We can see that the reweighting has the largest effects in the low x region, and this holds also for the other flavors as well. Generally the effects at intermediate and large x values are limited, with the exception of the gluon which is poorly constrained and exhibits a substantial change for large x.
In Figs. 18 and 19, in addition to the reweighting results, we also show results calculated using the Hessian profiling method [37]. The Hessian profiling should agree precisely with our reweighting calculations, and this can serve as an independent cross-check of our results. Indeed, in the figures we observe that the profiling exactly matches the reweighted results. In the following figures we will display only the reweighting results, but in all presented cases we have checked that these two methods agree.

Using Asymmetries instead of differential cross sections
In this section we will re-investigate the reweighting analysis from the previous section employing the CMS W ± production data. Instead of using rapidity distributions (as in the previous section), we will use two types of asymmetries which are constructed with the charged leptons. The lepton charge asymmetry is and is defined per bin in the rapidity of the charged lepton where N l ± represents the corresponding number of observed events in a given bin. For the purpose of the theory calculation, N l ± will be replaced by the corresponding cross-section in a given bin. It is useful to consider the expression for the charge asymmetry at leading order in the parton model assuming a diagonal CKM matrix: Here, the partons with momentum fraction x 1 are in the proton, and those with momentum fraction x 2 are inside the lead. At large negative rapidities (small x 1 , large x 2 ), we have f (x 1 ) =f (x 1 ) for all parton flavors (f = u, d, s, c) and the expression for the asymmetry simplifies to the following form Assuming as it is the case in all the existing nPDF sets, the expression further simplifies In the last equation, we have used the fact that the small x 1 up and down PDFs are very similar [40]. Since the d Pb v (x 2 ) > u Pb v (x 2 ), we expect the asymmetry to be negative at large negative rapidities. One can also observe that the asymmetry calculated with either n f = 2 or n f = 5 will be the same. A non-zero strange asymmetry (s(x 2 ) >s(x 2 )) would lead to a decrease of the A asymmetry, thereby improving the description of the CMS data.
Conversely, at large positive rapidities (small x 2 , large x 1 ), we have f (x 2 ) =f (x 2 ) for all parton flavors (f = u, d, s, c) and the expression for the asymmetry becomes Again, assuming c(x 1 ) =c(x 1 ) and s(x 1 ) =s(x 1 ), this expression further simplifies to where we have again used u(x 2 ) d(x 2 ) at small x 2 . Since u v (x 1 ) > d v (x 1 ) in the proton, we expect a positive asymmetry in the kinematic region of large positive rapidities. Furthermore, the reweighting of the nuclear PDFs will have very little impact on the charge asymmetry in this limit even if the precision of the data will increase in the future.
Another asymmetry used by CMS is the forwardbackward asymmetry. This is defined as a ratio of the number of events in the forward and backward region in a given rapidity bin: This asymmetry is defined separately for the W + and W − cases. It can also be combined into a single quantity, the forward-backward asymmetry of charge-summed W bosons: This is the quantity we will use for our analysis in this section.
We now use the asymmetries of Eqs. (15) and (22) to perform a reweighting of the nCTEQ15 lead PDFs. These asymmetries are just combinations of the rapidity distributions used in Sec. 3.2, and if both are employed at the same time they should encode similar information to the rapidity distributions themselves. In the literature it is sometimes argued that the asymmetries are more sensitive to the PDFs and in turn are better suited to performing PDF fits [4,11,12]. We will empirically check this statement by comparing reweighting predictions using rapidity distributions and the above mentioned asymmetries.
In the following, we present the results of the reweighting using the lepton charge asymmetry and forwardbackward asymmetry of charge-summed W bosons. In this case, the effective number of replicas is N eff = 7382.
The distribution of weights is displayed in Fig. 20, and we can see that compared to the reweighting using directly the rapidity distributions (Fig. 17), the weights are smaller extending only to around ∼ 2.7 and more evenly distributed.
In Fig. 21 we show a comparison of data and theory before and after the reweighting procedure. In the case of the charge asymmetry we do not see a large improvement, but this is not surprising as there is already good agreement between the data and theory before the reweighting. We note that the χ 2 /N data before the reweighting is 1.44 and 1.27 after the reweighting.
In the case of the forward-backward asymmetry the initial agreement between data and theory is not as good and the corresponding improvement is much larger; χ 2 /N data changes from 4.03 to 1.31.
We now show the effect of the reweighting procedure on the PDFs. In Fig. 22 we display the PDFs for the up quark and gluon at a scale of Q = 10 GeV. We can see that in both cases the effect is limited to the low x region and does not exceed few percent. The results for other flavors are similar, and overall the asymmetries with the current experimental uncertainties seem to have rather small effect on the nPDFs.
In particular it seems that using asymmetry ratios yields a reduced impact, at least compared to the rapidity distributions of Sec. 3.2. This is possibly due to the fact that much of the information on the nuclear corrections is lost when constructing the ratios. However, asymmetries can be still useful to explore the very forward and backward regions of the rapidity distributions (corresponding to higher/lower x values) where experimental uncertainties are typically large but can cancel in the ratios.

Including all the data sets
Due to large experimental uncertainties, the effect of individual data sets presented in Sec. 2 on the lead PDFs is rather limited. The largest constraint is obtained from the CMS W ± data [4] (Secs. 3.2), and from (preliminary) ATLAS W ± data [2]. In order to maximize the effects on the PDFs, we now employ all protonlead data sets from Tab. 1 to perform the reweighting of the nCTEQ15 lead PDFs. Note that we use both the rapidity distributions and the asymmetries; although this can be regarded as "double counting", it is a common practice in proton PDF analyses, e.g. [19].
As the impact of the reweighting on the theory predictions for ALICE W ± production data [6], LHCb Z data [5] and both ATLAS [1] and CMS [3] Z production data is very small, we will not show the corresponding comparisons of theory predictions before and after the reweighting. We do note that in the majority of these cases the χ 2 /N data has improved indicating that the data sets are compatible, cf. Fig. 24. However, the initial χ 2 for these data sets was already very small which reflects the large experimental uncertainties of these data sets and their limited constraining power on the nPDFs.
We start by examining the distribution of weights of the new replicas which is displayed in Fig. 23. We see that the distribution is steeply falling in a similar manner to the one from Fig. 17 obtained using only CMS W ± rapidity distributions, but it extends to higher values of ∼ 17. These results are not very surprising as the CMS W ± data set is the one introducing the most constraints. We also note that the reweighting procedure results in the effective number of replicas N eff = 3603 which is around 40% of the number of initial replicas. This suggests that the reweighting procedure should still yield reliable results. Now we turn to the comparison of data with the theory predictions before and after the reweighting procedure. In Fig. 25 we show the predictions for the CMS W ± data [4], and in Fig. 26 we show the corresponding predictions for the ATLAS W ± data [2]. We can see that in both cases we observe an improvement in the data description that is confirmed by the corresponding χ 2 /N data values (see figures). The χ 2 values tell us also that the largest effect comes from the CMS data which has smaller errors and for which the initial description (before the reweighting) was worse than in the ATLAS case.
Furthermore, comparing the values of χ 2 /N data for the CMS W ± data after the reweighting using all data sets and using only CMS data (Sec. 3.2) we see further improvement of χ 2 /N data when more data is included. This shows that the different data sets are compatible with each other and that they pull the results in the same direction.
In addition, we show in Fig. 24 the χ 2 /N data before and after the reweighting for each of the experiments, as well as the χ 2 /N data combining all 102 data points from the different experiments. This highlights the fact that the CMS W ± measurement yields the largest impact on the PDFs out of all the considered data sets.        When considering the ratios of PDFs, the effects of the reweighting appear to be quite substantial at large x, especially for the gluon; however, as is evident from looking at the plots of the PDFs directly, they are approaching zero at large x so the impact for physical observables is minimal.
Furthermore, when interpreting the results of the reweighting analysis it is important to remember that this method can only estimate the effects a given data set might have on the PDFs; it is not equivalent to a full fit. For example, a reweighting analysis cannot be used to explore new parameters or other dimensions that are not already spanned by the original PDF uncertainty basis. In particular, this study has shown us that the strange quark PDF can play an important role in the LHC pPb production of W/Z. As our current s(x) is parameterized proportional toū(x) +d(x), this restricts our ability to vary the strange PDF independently; 11 hence, an independent fit (in progress) is needed to better the impact of this data on the nPDFs. 11 This point was explored in more detail in ref. [41].

Comparison with EPPS16
During the course of our analysis, a new global fit including LHC data (EPPS16 [16]) has been released. This gives us an opportunity to compare the results of our reweighting study with these new PDFs. We note here that this is a qualitative comparison as the data sets used in these two studies are different. Another important difference is that the EPPS16 fit has more parameters to describe the sea-quark PDFs as compared to the nCTEQ15 analysis; this provides EPPS16 additional flexibility to accommodate all the considered data. As mentioned earlier, our reweighting of nCTEQ15 cannot compensate for our more restrictive parametrization, so this must be considered when evaluating these comparisons.
In Figs. 30 and 31 we present a comparison of u, d, u,d, g and s for the nCTEQ15 PDFs before and after the reweighting, with the EPPS16 distributions at the scale of 80 GeV. There are a number of trends which emerge.  Fig. 30: Comparison of the nCTEQ15 PDFs before and after the reweighting using all pPb data sets with the EPPS16 PDFs including LHC data. The EPPS16 error bands include only the nuclear errors (unlike what is provided in LHAPDF where also the proton baseline errors are included) and they are calculated using symmetric formula.
data is able to significantly adjust the PDFs in this region.
2. In the intermediate x range (∼ 3 × 10 −2 ), the central values of the EPPS16 and both reweighted and initial nCTEQ15 PDFs coincide, and their uncertainty bands are also similar (except for the strange quark). This region was previously constrained by pre-LHC data, and we observe minimal changes in this region.
3. On the contrary, where x is large, the differences are more important with no consistent pattern. This is a challenging region as the absolute value of the PDFs is small, and the nCTEQ15 parameterization may not be sufficiently flexible to accommodate the new data. Additionally, the inclusion of certain data sets in the EPPS16 analysis (such as the CHORUS ν-Pb data [42]) can have a significant impact.
Finally, we also see that the EPPS16 PDFs have consistently larger uncertainty bands (especially at low x). As the nCTEQ15 uncertainty bands in this region are essentially extrapolated from larger x results, the EPPS16 uncertainties are probably a more realistic assessment. The issue of PDF parameterization is a perennial challenge for the nuclear PDFs as there is less data and more degrees of freedom as compared to the proton PDFs. The common solution is to impose assumptions on the nPDF parameters, or to limit the flexibility of the parameterization, and thereby underestimate the uncertainty. These issues highlight the importance of including this new LHC data in the nPDF analyses as they not only will help determine the central fits, but also provide for more reliable error estimation.

Conclusions
We have presented a comprehensive study of vector boson production (W ± , Z) from lead collisions at the LHC. This LHC lead data is of particular interest for a number of reasons.
1. Comparisons with LHC proton data can determine nuclear corrections for large A values; this is a kinematic {x, Q 2 } range very different from nuclear corrections provided by fixed-target measurements.
2. The W ± , Z lead data are sensitive to the heavier quark flavors (especially the strange PDF), so this provides important information on the nuclear flavor decomposition.
3. Improved information on the nuclear corrections from the LHC lead data can also help reduce proton PDF uncertainties as fixed-target nuclear data is essential for distinguishing the individual flavors.
Predictions from the recent nCTEQ15 nPDFs are generally compatible with the LHC experimental data; however, this is partially due to the large uncertainties from both the nuclear corrections and the data. We do see suggestive trends (for example W ± production in pPb at large y + ) which may impose influential constraints on the nPDF fits as the experimental uncertainties are reduced. Intriguingly, the large rapidity W/Z data seem to prefer nuclear PDFs with no or delayed shadowing at small x, similar to what has been observed in ν-Fe DIS. This observation was validated by our reweighting study that demonstrated the impact of the W/Z pPb data on nPDFs.
The uncertainties of the currently available data are relatively large, and correlated errors are not yet available. Fortunately, we can look forward to more data (with improved statistics) in the near future as additional heavy ion runs are scheduled.
While the above reweighting technique provides a powerful method to quickly assess the impact of new data, there are limitations. For example, the reweighting method cannot introduce or explore new degrees of freedom. Thus, if the original fit imposes artificial constraints (such as linking the strange PDF to the up and down sea distributions), this limitation persists for the reweighted PDF [41].
Most importantly, our correlation study (Sec. 2.4) demonstrated the importance of the strange distribution for the vector boson (W/Z) production at the LHC, possibly even pointing to a nuclear strangeness asymmetry (s(x) >s(x)). The comparison of the 2 flavor and 5 flavor results illustrates how flavor decomposition and nuclear corrections can become entangled. Therefore, it is imperative to separately control the strange PDF and the nuclear correction factor if we are to obtain unambiguous results. The investigations performed in this paper provide a foundation for improving our determination of the PDFs in lead, especially the strange quark component. Combining this information in a new nCTEQ fit across the full A range can produce improved nPDFs, and thus yield improved nuclear correction factors. These improved nuclear correction factors, together with the LHC W/Z production data for pp, can refine our knowledge of the strange PDF in the proton.