A QCD analysis of LHCb D-meson data in p+Pb collisions

We scrutinize the recent LHCb data for D$^0$-meson production in p+Pb collisions within a next-to-leading order QCD framework. Our calculations are performed in the SACOT-$m_{\rm T}$ variant of the general-mass variable-flavour-number scheme (GM-VFNS), which has previously been shown to provide a realistic description of the LHC p+p data. Using the EPPS16 and nCTEQ15 nuclear parton distribution functions (PDFs) we show that a very good agreement is obtained also in the p+Pb case both for cross sections and nuclear-modification ratios in the wide rapidity range covered by the LHCb data. Encouraged by the good correspondence, we quantify the impact of these data on the nuclear PDFs by the Hessian reweighting technique. We find compelling direct evidence of gluon shadowing at small momentum fractions $x$, with no signs of parton dynamics beyond the collinear factorization. We also compare our theoretical framework to other approaches and are led to conclude that a full GM-VFNS calculation is most essential in constraining general-purpose PDFs with D-meson data.


Introduction
In the collinear-factorization approach to describe scattering of protons and heavier nuclei in Quantum Chromodynamics (QCD), the non-perturbative structure of the hadrons -parton distribution functions (PDFs) -is factorized from the perturbatively calculable coefficient functions [1,2]. The PDFs are typically extracted from experimental data via global analysis and their accurate determination has been a long-standing effort in the community [2,3]. For the free proton PDF fits there are plenty of accurate data available and the most recent global analyses [4][5][6][7][8] result with PDFs that are reasonably well constrained within the typical kinematics probed at the Large Hadron Collider (LHC).
For PDFs in heavier nuclei, nuclear PDFs (nPDFs), the available data have been rather sparse until very lately [9]. Indeed, even some recent analyses still rely only on older fixedtarget deep inelastic scattering (DIS) and Drell-Yan (DY) data [10,11]. Due to the relatively low center-of-mass (c.m.) energy √ s, these data provide constraints only for momentum fractions x 0.01, and the gluons are constrained only indirectly via scale-evolution effects and momentum sum rule [12]. To obtain better gluon constraints, the potential of inclusive pion production in d+Au collisions at RHIC [13][14][15][16] was first discussed in ref. [17] and eventually the data were incorporated into the global fits [18][19][20][21]. The x reach was still, however, rather similar to the available DIS data. The currently most comprehensive nPDF analysis, EPPS16 [22], includes also LHC Run-I data for electroweak-boson (W ± and Z 0 ) [23][24][25] and dijet production [26] in p+Pb collisions. Because of the large masses of the W ± and Z 0 bosons, the interaction scale is high and a significant sensitivity to gluons via evolution effects will eventually set constraints on gluons, as has been shown in ref. [27] (sect. 10.4.2). However, the Run-I W ± and Z 0 data have still a rather limited impact due to the low statistics. The dijet production, on the other hand, probes the gluon density much more directly and already the Run-I data clearly helps to narrow down the gluons in the x 0.002 region [28]. All this still leaves the small-x region only weakly constrained.
To probe gluons at small x, almost any conceivable observable at lowish interaction scales and forward rapidity y 0 would do. Good candidates at hadron colliders include e.g. low-mass Drell-Yan dilepton and isolated-photon production at low transverse momentum p T [29][30][31][32][33][34][35]. Isolated photons in p+Pb collisions have already been measured at central rapidities [36], and the large-y measurements appear to be within the capabilities of the LHCb collaboration [37]. In further future, measurements of isolated-photon production would be a central goal of the ALICE FoCal upgrade [38].
Another promising observable for gluon constraints is the inclusive D-and B-meson production where the heavy-quark mass provides the hard scale even at zero p T . In fact, the LHCb collaboration has published low-p T data on D-meson production at forward kinematics in p+p collisions at different √ s [39][40][41], and recently also in the p+Pb case at √ s = 5 TeV [42]. The use of these D-meson data as a free proton and nuclear PDF constraint has been advocated e.g. in refs. [43][44][45][46][47][48] and studied otherwise [49], but for the moment the default sets of globally fitted general-purpose PDFs [4-8, 21, 22] do not include any D-meson data. Here, our purpose is to provide a first estimate of the impact the recent LHCb p+Pb data have on globally fitted nPDFs within a rigorous next-to-leading order (NLO) perturbative-QCD framework. We will focus only on the LHCb measurements [42], as the the central-rapidity ALICE [50] data are not as precise and as the ATLAS central-rapidity data [51] are only preliminary. The cross sections are calculated in the SACOT-m T general-mass variable-flavour-number scheme (GM-VFNS) presented in ref. [52]. Our framework takes fully into account the D mesons produced by gluon fragmentation -something that has been overlooked in the above-mentioned works [43][44][45][46][47][48] -and thus provides a realistic estimate of the data impact. To quantify the impact on the EPPS16 [22] and nCTEQ15 [21] nPDFs, we will use the Hessian reweighting technique [28,[53][54][55] that facilitates an estimate of the data impact without re-doing the complete global analysis. The paper will now continue as follows: In section 2, we introduce our theoretical setup, including the GM-VFNS framework and the applied reweighting machinery. Then, in section 3, we compare the resulting cross sections and nuclear modification ratios with the LHCb data, demonstrate the impact these data have on nPDFs, and discuss their sensitivity to small-x gluons. We summarize our findings in section 4.

SACOT-m T scheme for heavy-quark production
The general idea of D-meson hadroproduction in the GM-VFNS approach [52,56] is to reproduce the results of (3-flavour) fixed flavour-number scheme (FFNS) at the small p T limit and match to the massless calculation at high values of p T . Let us first discuss the FFNS limit, in which the cross section for inclusive production of a heavy-flavoured hadron h 3 at a given transverse momentum P T and rapidity Y in a collision of two hadrons, h 1 and h 2 , can be written as dσ ij→Q+X dp T dy (τ 1 , τ 2 , m, µ 2 ren , µ 2 fact ) . (2.1) In this expression, f h 1,2 i,j are the PDFs (in 3-flavour scheme) for partons i and j in hadrons h 1 and h 2 with momentum fractions x 1 and x 2 , and dσ ij→Q+X /dp T dy denote the perturbatively calculable coefficient functions for inclusive heavy-quark Q (here charm) production [57] with fixed rapidity y and transverse momentum p T of Q. The renormalization and factorization scales are denoted by µ 2 ren , µ 2 fact and m is the heavy-quark (here charm) mass. The fragmentation of a heavy-quark to hadron h 3 is described by a scale-independent fragmentation function (FF) D Q→h 3 (such as in ref. [58]). The invariants τ i can be calculated from the partonic transverse mass m T = p 2 T + m 2 and rapidity y as where p 1 and p 2 are the momenta of the incoming massless partons, and p 3 is the final-state heavy-quark momentum. When masses are neglected, the relation between partonic and hadronic variables is simply y = Y and P T = zp T . However, when the masses of the heavy quark and the final-state hadron are taken into account, the definition of z becomes ambiguous [59]. Adopting the choice made in [52], where P i is the momentum of hadron h i , the z variable can be interpreted as the fraction of partonic energy carried by the outgoing hadron in the c.m. frame of the initial-state hadrons h 1 and h 2 . The relations between partonic and hadronic variables become somewhat more involved, but eq. (2.1) stays intact. When the transverse momentum of the produced hadron h 3 is large, P T m, the heavy-quark mass can be neglected and thus the zero-mass description becomes the most relevant. In this limit, the cross section can be written as [60], dσ ij→k+X dp T dy (τ 0 1 , τ 0 2 , µ 2 ren , µ 2 fact , µ 2 frag ) .

(2.4)
The formal difference with respect to eq. (2.1) is that now the FFs are fragmentation-scale µ 2 frag dependent, and a summation over all partonic channels is included. For massless partons the invariants τ 0 i are obtained as The GM-VFNS technique [52,56] provides a general framework to match the two extremes of eq. (2.1) and eq. (2.4) in a way that is consistent with collinear factorization. If we start from the FFNS description and increase P T , the cross sections will quickly be dominated by log(p T /m) terms whose origin is in the initial-and final-state partons' collinear splittings into QQ pairs. In GM-VFNS these logarithms are resummed to the scale-dependent heavy-quark PDFs and scale-dependent FFs. Because the FFNS expressions already contain the first of the resummed logarithmic terms, subtraction terms are needed to avoid double counting and ensure the correct zero-mass limit of eq. (2.4). For example, the inclusion of the gluon production channel gg → gg, on top of eq. (2.1), must be accompanied by a subtraction term which has otherwise the same expression as eq. (2.6) but where the gluon-to-h 3 FF is replaced by which is the first term in the definition of scale-dependent FFs with massive quarks. In an NLO-accurate O(α 3 s ) calculation, only the leading-order part of dσ gg→g+X is included in the subtraction term. However, the exact form of dσ gg→g+X in the equation above is not fixed by this construction. The only condition is that we recover the standard zero-mass MS expression at p T → ∞ to meet eq. (2.4). This means that we can include mass-dependent terms in dσ gg→g+X as we like, and a specific choice defines a scheme. The difference between the added and subtracted contributions discussed above is formally of order O(α 4 s ), so that different schemes are formally equivalent up to O(α 3 s ). Here we adopt the so-called SACOT-m T scheme [52]. It is rooted in a simple observation that in order to make a heavy-flavoured hadron in QCD, a QQ pair must be first produced. That is, the relevant invariants to describe the process are the massive ones,τ 1,2 = τ 1,2 , even for seemingly massless partonic contribution (like the gg → gg channel). Importantly, the mass then prevents the partonic cross sections from diverging towards small p T exactly in the same way as the FFNS cross section are finite at p T = 0. In the previous GM-VFNS approach [56] such a physical behaviour is obtained only by a particular choice of QCD scales [61,62].
The final differential cross sections are then calculated by using the FFNS expressions for the explicit QQ production, and for all other channels zero-mass expressions with the mentioned massive kinematics. The subtraction terms discussed above are included to avoid double counting and to ensure proper matching between α s and PDFs in 3-and 4-flavour schemes. The switch from 3-to 4-flavour scheme is done at the charm-mass threshold. The bottom decays to D 0 are an order of magnitude smaller [63] than the "direct" charm fragmentation to D 0 . Thus, the treatment of the bottom mass is not as critical, and in our present setup we switch from 4-to 5-flavour scheme at the bottom-mass threshold with no matching conditions and ignoring the bottom mass. For the numerical implementation of the described SACOT-m T scheme the massless NLO matrix elements are obtained from the incnlo [60] code and the FFNS part with explicit heavy-quark production is obtained from the mnr code [64]. As presented in refs. [52,63], this framework is in a very good agreement with the ALICE [50,63] and LHCb [39][40][41] data for inclusive D-meson production in p+p collisions in a broad rapidity range.

Powheg+Pythia approach
We will also contrast our results in the SACOT-m T framework with a Monte-Carlo based NLO computation that is often applied to heavy-meson phenomenology at the LHC in the context of PDFs [44,45,65]. This approach is based on the Powheg method [66] to combine NLO matrix elements with a parton shower and hadronization from a general-purpose Monte-Carlo event generator. The underlying idea is to generate the partonic 2 → 2 and 2 → 3 events with the NLO-correct matrix elements. These events are then passed to any parton shower generator that provides the rest of the partonic branchings, accounting for the fact that the first one may already have occurred. The parton shower can be considered as being analogous to the scale evolution of FFs and PDFs as the splitting probabilities are based on the DGLAP evolution equations in both cases.
We generate the partonic events with the heavy-quark pair production (hvq) scenario [67] of the Powheg Box framework [68] which we pass on to Pythia 8 [69] for showering and hadronization. As Powheg generates only events where the heavy-quark pair is produced in the Born-level process or in the first (hardest) splitting, it ignores the component where the QQ would be created only later on in the shower e.g. starting from a hard gg → gg process. Such contributions are, however, effectively included in any GM-VFNS framework via the scale-dependent PDFs and FFs. Since charm quarks are abundantly produced in parton showers at the LHC energies [70], truncating the resummation of the splittings to the first one will underestimate the charmed-meson cross section as pointed out in ref. [52]. Similarly the results in ref. [44] show that the cross sections obtained with this method are below the D-meson data measured by LHCb and a compatibility can be concluded only due to large scale uncertainties. Furthermore, this is bound to result as an overestimate of the sensitivity to low-x PDFs as the neglected contributions with several emissions would always require a higher value of x to produce a heavy meson at a fixed P T and Y . Therefore these comparisons should be taken as an estimation for the effect of truncating the chain of partonic splittings.

Reweighting machinery
We will quantify the impact of the single inclusive D 0 -meson production data in p+Pb collisions on nuclear PDFs by the Hessian reweighting method [28,[53][54][55]. The method has recently been discussed at length e.g. in ref. [28] so here we only outline the basic underlying idea. Let us consider a global PDF analysis whose fit parameters a i are tuned to minimize a global χ 2 function, The χ 2 function is expanded around the best fit as where H ij is the Hessian matrix, H ij = 1 2 ∂ 2 χ 2 /(∂a i ∂a j ). Denoting by O the orthogonal matrix that diagonalizes the Hessian matrix, OHO T = I, the z i variables are linear . We refer to the best-fit as S 0 , and it corresponds to the point z = 0. The Hessian error sets S ± k can then be defined by z i (S ± k ) = ± ∆χ 2 δ ik , where ∆χ 2 is the estimated tolerance. It follows [71] that for any PDF-dependent quantity X there are unique points in the z space that extremize its positive and negative deviations from the central value X(S 0 ). These deviations, ∆X ± , are given by This, or its asymmetric version (see later), is normally quoted as the uncertainty in Hessian PDF fits. In a global analysis, the χ 2 contributions of individual data sets are simply summed in the overall χ 2 . Thus, if we wish to include a new set of data into our global fit, we just add its contribution to eq. (2.8), where y data i denote the new data points with a covariance matrix C ij . The PDF-dependent values y i {z} can now be approximated linearly as and by substituting this into eq. (2.10), we see that χ 2 new is still quadratic in variables z k and has therefore a unique minimum which we denote by z k = z min k . Note that we do not need to know the value of χ 2 0 . The PDFs f new i (x, Q 2 ) that correspond to this new minimum are obtained by replacing y i in eq. (2.11) by PDFs, Since we now know χ 2 new analytically, we can repeat the original treatment by computing the new Hessian matrix and diagonalizing it exactly the same way as outlined above. As a result, we have an approximation of how a new set of data has affected a set of PDFs and its errors. In comparison to a full global analysis, the advantage of the reweighting technique is that it avoids the time-consuming fitting procedure which, in practice, is only available to the people that performed the PDF analysis itself. In addition, and also importantly, there is no need to implement a potentially CPU-expensive cross-section computation as a part of the fitting framework or to compute partial cross sections to form three dimensional (x 1 ,x 2 ,µ 2 fact ) grids to facilitate a rapid cross-section evaluation. The downside is that since the reweighting method relies completely on the assumptions made in the prior PDF analysis, including e.g. a specific parametrization which may artificially overestimate the impact in a kinematic region beyond the reach of a given observable.
The Hessian reweighting method sketched above relied on a linear approximation for the PDFs and observables in the z space, and on a quadratic expansion of the original χ 2 function. These are not always good approximations and, as described in ref. [28], the results can be refined by taking into account higher order terms in z. The results presented in this paper (section 3.3) have been obtained using a quadratic extension of the approximation made in eq. (2.11). In the case of EPPS16 we also take into account cubic terms in the original χ 2 profile, eq. (2.8). See ref. [28] for further technical details.

Results
Throughout this section, we will use two recent globally-fitted nPDF sets, EPPS16 [22] and nCTEQ15 [21] in our calculations. In the case of EPPS16 we use CT14NLO [5] as the free proton PDF set and with nCTEQ15 we use its own proton PDF (with no uncertainties on it). As a default setup for the GM-VFNS calculation we adopt the KKKS08 [72] parton-to-hadron FFs and set the renormalization and factorization scales as µ ren = µ fact = P 2 T + m 2 c with m c = 1.3 GeV for the charm quark mass. For the fragmentation scale we set µ frag = P 2 T + (1.5 GeV) 2 as the KKKS08 analysis assumed this slightly higher value for the charm-quark mass. In the matrix elements we always use m c = 1.3 GeV. For the D 0 mass, relevant for transforming the partonic kinematics to hadronic ones, we adopt the value M D 0 = 1.87 GeV [73]. With the Powheg approach, we use the same nuclear and proton PDFs and the same value for the charm mass but the renormalization and factorization scales are fixed to transverse mass of the produced charm quark, p 2 T + m 2 c . At the time of generating the partonic events with Powheg it is not yet known which P T the D meson will have (if formed at all), so relating the scales to the partonic variables is the only reasonable option. The parton shower and hadronization for the Powheg events are generated with the Pythia version 8.235 [69] using parameters from the default Monash tune [74].

Double-differential cross section for D 0 production in p+Pb collisions
To benchmark our GM-VFNS framework in p+Pb collisions we first compare our calculations with the double-differential single-inclusive D 0 production cross section measured by LHCb [42]. This comparison is important since a good agreement with the measured cross sections would indicate that the framework includes e.g. all the relevant partonic processes. In this way we ensure that the framework is realistic.
In figure 1 we compare the calculated cross sections with the LHCb data at backward rapidities (Pb-going direction) in five different rapidity bins spanning −5.0 < Y < −2.5 in the nucleon-nucleon (NN) c.m. frame. The resulting cross sections with the default setup are shown for both the EPPS16 and nCTEQ15 nPDFs, whereas the theoretical uncertainties are quantified with EPPS16 only. These include now scale variations and PDF uncertainties. The former are calculated by varying the three QCD scales independently by a factor of two around the default choice. In addition, ratios µ fact /µ ren and µ frag /µ ren are required to stay within [0. 5,2] and the mass of the charm quark is used as a lower limit for all scales. For the PDF uncertainties the error bands from proton and nuclear PDFs are added in quadrature as they are approximately independent in the EPPS16 global analysis. Here, we use the asymmetric error prescription where the sum now runs over both the EPPS16 and CT14NLO error sets. Uncertainties due to the mentioned ambiguity in defining the fragmentation variable z, FFs, or e.g. variation in charm-quark mass are not considered. In addition to the GM-VFNS results, comparison with the Powheg+Pythia setup is shown. The correspondence between the data and the GM-VFNS calculation with both EPPS16 and nCTEQ15 is found to be very good, though the theoretical uncertainties become large at P T < 3 GeV. Interestingly the PDF uncertainty at small P T is large above the central result but small below it. This can be traced back to the parametrization applied in the CT14 analysis where the requirement for positive-definite PDFs limits the small-x behaviour as already the central set for gluons near the initial scale Q 2 0 at small x is close to zero. Since similar positivity restriction was not applied in NNPDF3.1 [7], the PDF uncertainties shown in ref. [52] behave in a different manner at small values of P T . As in the p+p case [52], the cross sections obtained from the Powheg+Pythia setup fall below the data. As discussed in ref. [52] and mentioned in the preceding section, this likely follows from truncating the collinear splittings producing QQ pairs after the hardest one. The corresponding cross sections at forward rapidities (p-going direction) are shown in figure 2. Here the five rapidity bins cover the range 1.5 < Y < 4.0. The conclusions are very similar as at backwards rapidities, the agreement between the GM-VFNS calculation and the data being very good, particularly at P T 3 GeV where the theoretical uncertainties are in control. The comparisons with the absolute cross sections lead us to conclude that the SACOT-m T framework [52] works very well also for p+Pb collisions and can be faithfully applied to study the nPDF constraints -at least for P T 3 GeV.

Nuclear modification ratio for D 0 production in p+Pb collisions
To constrain nPDFs with D mesons, it is useful to consider an observable in which theoretical uncertainties related to scale variations, free proton PDFs, and FFs cancel out to a large extent. In the case of single-inclusive hadron production a suitable observable is the nuclear modification factor R h 3 AB , defined for inclusive D 0 meson production in p+Pb collisions at  the LHC as We compare our calculations with the measured R D 0 pPb in figures 3 and 4 at backward and forward rapidities, respectively. The LHCb data span over four Y bins in a range −4.5 < Y < −2.5 at backward rapidities and 2.0 < Y < 4.0 at forward rapidities. Comparisons with the EPPS16 and nCTEQ15 nPDFs using the GM-VFNS framework and Powheg+Pythia setup are separately shown in each panel, and the uncertainty bands correspond to the nPDF errors calculated in the GM-VFNS approach. Furthermore, also the GM-VFNS result using the zero-mass definition for the fragmentation variable, and the scale variation band, are shown in each kinematic bin.
First observation is that the data uncertainties are in most of the cases smaller than the nPDF-originating ones with both nPDF sets considered. Especially at forward rapidities the EPPS16 nPDF uncertainty bands are much larger than the experimental uncertainties due to the poorly-constrained small-x nuclear gluon distributions. This demonstrates the potential of these data to significantly constrain the current nPDFs at small-x where no other data currently exist. Also, the good overall agreement with the calculated and measured R D 0 pPb over the wide rapidity range provides a strong indication of the applicability of factorization-based approach in this previously unconstrained kinematic region. The large uncertainties from scale variations observed for the differential cross sections largely cancel out in the nuclear modification ratio. However, at P T < 3 GeV they start to grow and the downward uncertainty is limited by the minimum scale Q = 1.3 GeV of EPPS16 and nCTEQ15. If the PDF parametrizations would extend to lower values, the downward uncertainty would probably be much larger. Similarly, the use of massless definition for the fragmentation variable z -taken here as an indicator of the associated uncertaintycan lead to a significant variation in the calculated R D 0 pPb at small values of P T at backward rapidities. The reason is that the definition of z provides the link between hadronic and partonic kinematics and therefore the probed x regions are slightly different from one definition to another. In backward direction we are sensitive to the mid-x region where the slope in both EPPS16 and nCTEQ15 nuclear gluon modifications is somewhat steepish (see figures 9 and 11 ahead), and changes in the probed x regions matter. To make sure that we stay in a region where these theoretical uncertainties are in control, it seems sufficient to discard the data points below P T = 3 GeV.
Since many theoretical uncertainties get suppressed in R D 0 pPb , we might expect that the Powheg+Pythia results would be very close to GM-VFNS ones. While the two are indeed very similar, we find that the Powheg+Pythia results tend to lie systematically below the GM-VFNS calculations. In part, the differences can be explained by the different scale choices (p T instead of P T ) but since the differences persist even at the largest P T bins, this cannot be the full explanation. Indeed, the main factor seems to be, as argued also in ref. [52], that Powheg+Pythia framework misses the contributions in which the cc pair would be produced only at later stages of the shower and therefore biases the kinematics to lower values of x 2 compared to the GM-VFNS calculation. Thus, the nuclear effects in the Powheg+Pythia predictions at a given P T come from smaller x 2 than in GM-VFNS. This explains why, when compared to the GM-VFNS results, the nuclear effects in Powheg+Pythia predictions are seemingly shifted towards higher values of P T in all rapidity bins, apart from the very lowest P T bins where the impact of the scale choice becomes important. We again emphasize that the difference between the two frameworks should not be taken as an additional theoretical uncertainty but as a measure of the effect arising from truncating the series of collinear partonic splittings.

Impact of the LHCb data on nPDFs
The observed consistency between the measured and calculated R D 0 pPb indicates that these data could be used in a global nPDF analysis. As a preparation for this, we now estimate the impact of the LHCb data for R D 0 pPb on the EPPS16 and nCTEQ15 nPDFs by applying the reweighting method outlined in section 2.3. By excluding the data points at P T < 3 GeV we are left with N data = 48 data points. The level of agreement is quantified by calculating the standard figure-of-merit χ 2 before and after reweighting. The numbers are presented in table 1. Before the reweighting, the central nCTEQ15 value is somewhat high, but upon performing the reweighting both the EPPS16 and nCTEQ15 values are close to unity, indicating a good agreement with the data. To further study the statistical properties of our results, histograms of the data residuals are shown in figure 5. The residuals are calculated (for uncorrelated errors) as a difference between the theory value T i and corresponding data point D i normalised with the experimental uncertainty δ i . Ideally the distribution of the residuals should follow a Gaussian distribution with standard deviation of one and zero mean to which the calculated values are compared to. In addition, Gaussian fits are performed for the residuals obtained after reweighting to ease the comparison with the ideal distributions. With the original central EPPS16 and nCTEQ15 results the distributions show a behaviour diverting from the ideal Gaussian, but after reweighting a closer resemblance to that is obtained. With both nPDF sets the resulting distributions are slightly narrower than the ideal distribution but the mean is close to zero, confirming a reasonable statistical behaviour.
The results for R D 0 pPb after reweighting, compared with the data and original predictions, are shown in figures 6 and 7. As expected, the reweighted results are in an excellent agreement with the data across the wide rapidity range covered by the data, the only exception being the most backward bin where the data show a stronger enhancement than the reweighted PDF predictions. The new nPDF uncertainties computed from the reweighted nPDFs are significantly reduced in comparison to the original error bands. This holds especially at forward rapidities where the small-x region with no previous data    constraints, is probed. For the EPPS16 nPDFs an improvement of a factor of three is observed whereas for nCTEQ15 the improvement is somewhat more modest. This difference follows from a bit more rigid functional form of the nCTEQ15 parametrization which leads to smaller errors to begin with. Interestingly, even though the lowest-P T bins were not included in the analysis, the agreement remains very good also with the data points in the P T < 3 GeV region. We can thus conclude that to describe these data, no physics outside collinear factorization is needed.
In figures 8 -11 we finally compare the EPPS16 and nCTEQ15 nuclear modifications in bound protons, R , before and after reweighting. We present the results at two different scales: the initial scale of the original analyses, Q 2 = 1.69 GeV 2 , and a somewhat higher scale Q 2 = 10 GeV 2 directly probed by the considered observable when reweighting to the P T > 3 GeV subset of data.  quark flavours but for gluons a somewhat stronger shadowing and slightly weaker EMC suppression are preferred by the data. At the parametrization scale Q 2 = 1.69 GeV 2 the uncertainty bands remain practically unchanged for quarks but a drastic reduction is observed for small-x gluons. At Q 2 = 10 GeV 2 also the sea-quark uncertainties are slightly reduced due to the DGLAP evolution which correlates sea quarks with gluons. For gluons the strong shadowing at the initial scale is reduced to around 0.7 at x 0.01 due to the evolution effects. Incidentally, the changes in the EPPS16 gluon PDFs are remarkably similar as found in ref. [28] based on the recent CMS dijet data [75]. In addition, since the central values are only slightly modified, the good agreement with the recent W ± data at √ s NN = 8.16 TeV [76] is expected to persist. We should also mention that the gluon errors at Q 2 = 1.69 GeV 2 dropping negative is of no concern. Indeed, a backward evolution by the DGLAP equations will make any gluon PDF negative at sufficiently low scales, and demanding a positive-definite gluon distribution at any arbitrary scale would be an unphysical requirement. At a deeper level, the resummation of log(1/x) terms in the DGLAP splitting functions [77] may slow down the evolution speed particularly at low Q 2 and thereby better retain the gluons positive. For nCTEQ15 the original and D-meson updated nuclear modifications are plotted in figures 10 and 11. As was the case with EPPS16, the quark nuclear modifications remain more or less the same after reweighting with the LHCb data. The originally strong shadowing for small-x gluons becomes slightly weaker after reweighting and is now rather similar to the gluon shadowing obtained with the reweighted EPPS16. The resulting uncertainties for the gluon shadowing are also on the same ballpark with with EPPS16. In addition, the reweighted nCTEQ15 nuclear modifications for gluons tend to have somewhat less anti-shadowing (the bump around x ∼ 0.1) than in the original analysis and the uncertainties are significantly reduced also in this regime.

Impact without the lower cut on P T
The agreement between the measured and calculated R D 0 pPb was found to be very good also at P T < 3 GeV which we excluded from the reweighting due to theoretical concerns. To check how much potential constraints we threw away, we have repeated the reweighting procedure this time including all the LHCb data. The resulting gluon nPDFs at Q 2 = 1.69 GeV 2 and Q 2 = 10 GeV 2 are shown in figure 12 for EPPS16 and nCTEQ15. Effect for quark nPDFs was found negligible at Q 2 = 1.69 GeV 2 . In both cases the reweighted central results remain practically unchanged but the uncertainties are further reduced at small x in the case of EPPS16 and also at larger x in the case of nCTEQ15. However, the bulk part of the uncertainty reduction still comes from the data in the "safe region" P T > 3 GeV such that inclusion of the P T < 3 GeV data is not critical. As we will argue next, including the lower P T data would not even increase the sensitivity to the small x region significantly.

Sensitivity to small-x region
The x values probed by a given P T and Y are often in the literature estimated with simplified leading-order kinematics, see e.g. ref. [46]. To get a more complete understanding on the small-x sensitivity of D 0 production at forward rapidities we show the contributions from    rw with P T cut rw no P T cut nCTEQ15 rw with P T cut rw no P T cut Figure 12. The EPPS16 (left) and nCTEQ15 (right) nuclear modifications for bound-proton PDFs in Pb nucleus before (EPPS16 blue, nCTEQ15 purple), after reweighting with the LHCb data with P T > 3 GeV (EPPS16 red, nCTEQ15 blue), and including all data points (dotted curves). The results are shown at Q 2 = 1.69 GeV 2 (upper panels) and at Q 2 = 10 GeV 2 (lower panels). different values of x 2 (momentum fraction in nucleus) to the D 0 cross section in figure 13. These distributions are based on full NLO GM-VFNS calculation with EPPS16 including the convolution with fragmentation functions. The results are compared to distributions from a "matrix-element fitting" approach similar to the one introduced in ref. [78] and applied in ref. [48] to study the impact of the LHCb data on nPDFs. In the latter method the squared matrix element |M| 2 for D-meson production is parametrized and the parameters are fitted to data from p+p collisions assuming that the only contribution is gluon-gluon initiated 2 → 2 scattering. The parameters used for the result in figure 13 are obtained from ref. [78] but the correspondence is not guaranteed to be exact since the details of the applied two-body phase space are not explicitly defined in the reference. However, the main point here is that the assumed x 1,2 dependence which, together with PDFs, dictates the shape of the x distributions is rather trivial, of the form |M| 2 ∝ x 1 x 2 . The x distributions from the full NLO GM-VFNS calculation are shown for P T -integrated case with and without the lower cut of P T > 3 GeV. As expected, the D 0 meson production at forward rapidities is indeed sensitive to small-x region reaching down to 10 −5 in the  Figure 13. Contributions to differential D 0 cross section from different values of x 2 at 3.0 < Y < 3.5 from the GM-VFNS in P T ranges of [0, 10] GeV (solid green) and [3,10] GeV (short-dashed blue) and from matrix-element fitting approach for same P T ranges (long-dashed red and dot-dashed purple). considered 3.0 < Y < 3.5 bin. However, there is still a significant contribution from larger x. These large-x tails mainly arise from the convolutions with the fragmentation functions which smears the connection between partonic and hadronic kinematics. Also the NLO corrections contribute to the tail as discussed in ref. [52]. Maybe a bit surprisingly, the tail extends to higher values of x when no lower cut on P T is applied. A very similar behaviour has been seen in the case of inclusive photon production [33]. In part, this can be explained by the valence-like gluons at low scales which shift the cross section to higher x region. In addition, the nuclear effects in EPPS16 are most pronounced at low scales and the shadowing further suppresses the contributions from small x, whereas anti-shadowing tends to increase the larger-x tail. All this dilutes the extra small-x constraints that could be obtained by releasing the P T > 3 GeV cut. Thus, a significant part of the reduced small-x uncertainties in figure 12 can be explained just by the increased statistics (24 data points more) rather than pushing to smaller x. These long large-x tails are not visible in the distributions obtained with the matrix-element fitting approach as it assumes leading-order partonic kinematics and, in particular, a naive |M| 2 ∝ x 1 x 2 behaviour of the coefficient function. Thus the matrix-element fitting approach would overestimate the sensitivity of the LHCb data on the small-x PDFs and would lead to an overly optimistic impact at small x if used in a global analysis. This underlines the importance of using a proper calculation in order to realistically estimate the impact of D-meson data on PDFs.

Summary
We have presented the first direct QCD analysis of the recent LHCb data [42] for D 0 meson production in p+Pb collisions and their impact on nuclear PDFs. To accomplish this we have used the Hessian reweighting method and the cross sections calculated within GM-VFNS using the recently introduced SACOT-m T scheme at NLO [52]. The advantage of the new scheme over the previous GM-VFNS implementations is that by explicitly including the heavy-quark masses in the kinematics also for processes where the QQ pair is produced from light-flavour fragmentation, a sensible behaviour in the P T → 0 limit is always obtained. The resulting cross sections are in a very good agreement with the singleinclusive D-meson P T spectra in the wide rapidity range covered by the LHCb measurement. We also computed predictions by a frequently used Powheg approach in which the heavy quarks are first produced in the partonic 2 → 2 and 2 → 3 scattering events, and then showered and hadronized with Pythia. This approach undershoots the absolute differential cross sections roughly by a factor 2. We attribute this to the omission of contributions in which the heavy-quark is produced in 2 → 4 processes and beyond. These are resummed in GM-VFNS.
A very good agreement with the R D 0 pPb data is found with both of the considered nPDF analyses, EPPS16 and nCTEQ15, and the data are accurate enough to set significant further constraints. For quark PDFs the modifications in the central values are weak but for gluons a somewhat stronger (weaker) small-x shadowing than originally in EPPS16 (nCTEQ15) is preferred by the data. The reweighting also brings the gluon shadowing in these two nPDF sets into a better mutual agreement. The main impact of the data is, however, the substantial reduction of the uncertainties for gluon nuclear modifications at x < 0.01. In fact, these are the first data directly sensitive to small-x gluons in heavy nuclei at clearly perturbative scales, and therefore provide the first unambiguous direct evidence for nuclear gluon shadowing in the context of a global analysis. The backward data seem to confirm the presence of a moderate gluon antishadowing at large x. We note that the effect of these data on EPPS16 are remarkably similar as recently found from dijet data at significantly higher interaction scales, though there the region x < 0.002 is not directly probed [28].
By studying how the cross section builds up from different values of nuclear x we have shown that the LHCb D 0 data constrain nPDFs down to x ∼ 10 −5 but, due to the convolution with FFs, there is still a notable contribution from the high-x region. The importance of using a full QCD calculation to quantify the impact of D-meson data was also underlined. Indeed, a simplified framework can lead to an apparent increase in the sensitivity to the small-x region and would therefore not provide a realistic estimation of the constraints. The good agreement between the nPDF calculation and the data down to P T = 0 GeV -even when rejecting data points at P T < 3 GeV from the fit -implies that the pure collinear-factorization approach is valid also in the small-x region. All in all, we conclude that the LHCb D-meson data can be included in future updates of global nPDF analyses without causing conflicts with the other existing data. To more deeply test the factorization and the universality of nPDFs, data with similar x-reach but for a different observable would be crucial.