A QCD analysis of LHCb D-meson data in p+Pb collisions

We scrutinize the recent LHCb data for D0-meson production in p+Pb collisions within a next-to-leading order QCD framework. Our calculations are performed in the SACOT-mT variant of the general-mass variable-flavour-number scheme (GM-VFNS), which has previously been shown to provide a realistic description of the LHC p+p data. Using the EPPS16 and nCTEQ15 nuclear parton distribution functions (PDFs) we show that a very good agreement is obtained also in the p+Pb case both for cross sections and nuclear modification ratios in the wide rapidity range covered by the LHCb data. Encouraged by the good correspondence, we quantify the impact of these data on the nuclear PDFs by the Hessian reweighting technique. We find compelling direct evidence of gluon shadowing at small momentum fractions x, with no signs of parton dynamics beyond the collinear factorization. We also compare our theoretical framework to a fixed-order calculation supplemented with a parton shower. While the two frameworks differ in the absolute cross sections, these differences largely cancel in the nuclear modification ratios. Thus, the constraints for nuclear PDFs appear solid.


Introduction
In the collinear-factorization approach to describe scattering of protons and heavier nuclei in Quantum Chromodynamics (QCD), the non-perturbative structure of the hadronsparton distribution functions (PDFs) -is factorized from the perturbatively calculable coefficient functions [1,2]. The PDFs are typically extracted from experimental data via global analysis and their accurate determination has been a long-standing effort in the community [2,3]. For the free proton PDF fits there are plenty of accurate data available and the most recent global analyses [4][5][6][7][8] result with PDFs that are reasonably well constrained within the typical kinematics probed at the Large Hadron Collider (LHC).
For PDFs in heavier nuclei, nuclear PDFs (nPDFs), the available data have been rather sparse until very lately [9]. Indeed, even some recent analyses still rely only on older fixed-target deep inelastic scattering (DIS) and Drell-Yan (DY) data [10,11]. Due to the relatively low center-of-mass (c.m.) energy √ s, these data provide constraints only for momentum fractions x 0.01, and the gluons are constrained only indirectly via scale-evolution effects and momentum sum rule [12]. To obtain better gluon constraints, the potential of inclusive pion production in d+Au collisions at RHIC [13][14][15][16] was first discussed in ref. [17] and eventually the data were incorporated into the global fits [18][19][20][21]. The x reach was still, however, rather similar to the available DIS data. The currently most comprehensive nPDF analysis, EPPS16 [22], includes also LHC Run-I data for electroweakboson (W ± and Z 0 ) [23][24][25] and dijet production [26] in p+Pb collisions. Because of the -1 -JHEP05(2020)037 large masses of the W ± and Z 0 bosons, the interaction scale is high and a significant sensitivity to gluons via evolution effects will eventually set constraints on gluons, as has been shown in ref. [27] (section 10.4.2). However, the Run-I W ± and Z 0 data have still a rather limited impact due to the low statistics. The dijet production, on the other hand, probes the gluon density much more directly and already the Run-I data clearly helps to narrow down the gluons in the x 0.002 region [28]. All this still leaves the small-x region only weakly constrained. To probe gluons at small x, almost any conceivable observable at lowish interaction scales and forward rapidity y 0 would do. Good candidates at hadron colliders include e.g. low-mass Drell-Yan dilepton and isolated-photon production at low transverse momentum p T [29][30][31][32][33][34][35]. Isolated photons in p+Pb collisions have already been measured at central rapidities [36], and the large-y measurements appear to be within the capabilities of the LHCb collaboration [37]. In further future, measurements of isolatedphoton production would be a central goal of the ALICE FoCal upgrade [38].
Another promising observable for gluon constraints is the inclusive D-and B-meson production where the heavy-quark mass provides the hard scale even at zero p T . In fact, the LHCb collaboration has published low-p T data on D-meson production at forward kinematics in p+p collisions at different √ s [39][40][41], and recently also in the p+Pb case at √ s = 5 TeV [42]. The use of these D-meson data as a free proton and nuclear PDF constraint has been advocated e.g. in refs. [43][44][45][46][47][48] and studied otherwise [49], but for the moment the default sets of globally fitted general-purpose PDFs [4-8, 21, 22] do not include any D-meson data. Here, our purpose is to provide a first estimate of the impact the recent LHCb p+Pb data have on globally fitted nPDFs within a rigorous next-to-leading order (NLO) perturbative-QCD framework. We will focus only on the LHCb measurements [42], as the central-rapidity ALICE [50] data are not as precise and as the ATLAS centralrapidity data [51] are only preliminary.
As the nPDF sets we consider in this work, EPPS16 [22] and nCTEQ15 [21], are of a variable-flavour type, where the charm and bottom quarks are "active" partons above their mass thresholds, our default setup for the heavy-meson cross section calculations is based on the general-mass variable-flavour-number scheme (GM-VFNS) approach. The concept of this formalism is to match the fixed-flavour-number scheme (FFNS) valid at very low p T with a massless variable-flavour calculation valid at high p T . Such an approach was first developed for leptoproduction of heavy-quarks [52][53][54][55][56][57] and has also been applied to heavy-quark hadroproduction [58][59][60][61]. In this framework the mass-dependent logarithms arising from collinear emissions are resummed into scale-dependent PDFs and fragmentation functions (FFs). Similar parton-level resummation of collinear emissions is also achieved within the FONLL formalism [62,63], which essentially constitutes a particular GM-VFNS scheme. Also parton showers in general-purpose Monte Carlo event generators, such as Pythia [64], provide an effective (leading-logarithm) resummation. In this work, we will use the SACOT-m T variant of the GM-VFNS formalism, introduced in ref. [65]. This framework takes fully into account the D mesons produced by gluon fragmentation -something that is neglected in the FFNS approach. However, when the partonic p T scale is less or close to the heavy-quark mass, p T m, the inherent uncertainties of the GM-VFNS approach grow and somewhere close to this region the pure FFNS approach -2 -JHEP05(2020)037 becomes arguably more reliable. For this and other reasons discussed later on, in our main results we restrict to region with minimum p T = 3 GeV for the produced D mesons, though also the lower p T regime is explored. To decide with confidence which p T scale sets the borderline between the two approaches is a question that would probably require calculations at next-to-NLO level which are not yet available. Thus, in parallel to the GM-VFNS calculations, we perform the cross-section calculations also in the FFNS-based approach to further chart the uncertainties. To quantify the impact on the nPDFs, we will use the Hessian reweighting technique [28,[66][67][68] that facilitates an estimate of the data impact without re-doing the complete global analysis.
The paper will now continue as follows: In section 2, we introduce our theoretical setup, including the GM-VFNS framework and the applied reweighting machinery. Then, in section 3, we compare the resulting cross sections and nuclear modification ratios with the LHCb data, demonstrate the impact these data have on nPDFs, and discuss their sensitivity to small-x gluons. We summarize our findings in section 4.

SACOT-m T scheme for heavy-quark production
The general idea of D-meson hadroproduction in the GM-VFNS approach [59,65] is to reproduce the results of (3-flavour) fixed flavour-number scheme (FFNS) at the small p T limit and match to the massless calculation at high values of p T . Let us first discuss the FFNS limit, in which the cross section for inclusive production of a heavy-flavoured hadron h 3 at a given transverse momentum P T and rapidity Y in a collision of two hadrons, h 1 and h 2 , can be written as dσ ij→Q+X dp T dy (τ 1 , τ 2 , m, µ 2 ren , µ 2 fact ) . (2.1) In this expression, f h 1,2 i,j are the PDFs (in 3-flavour scheme) for partons i and j in hadrons h 1 and h 2 with momentum fractions x 1 and x 2 , and dσ ij→Q+X /dp T dy denote the perturbatively calculable coefficient functions for inclusive heavy-quark Q (here charm) production [69] with fixed rapidity y and transverse momentum p T of Q. The renormalization and factorization scales are denoted by µ 2 ren , µ 2 fact and m is the heavy-quark (here charm) mass. The fragmentation of a heavy-quark to hadron h 3 is described by a scale-independent fragmentation function (FF) D Q→h 3 (such as in ref. [70]). The invariants τ i can be calculated from the partonic transverse mass m T = p 2 T + m 2 and rapidity y as where p 1 and p 2 are the momenta of the incoming massless partons, and p 3 is the finalstate heavy-quark momentum. When masses are neglected, the relation between partonic -3 -

JHEP05(2020)037
and hadronic variables is simply y = Y and P T = zp T . However, when the masses of the heavy quark and the final-state hadron are taken into account, the definition of z becomes ambiguous [71]. Adopting the choice made in [65], where P i is the momentum of hadron h i , the z variable can be interpreted as the fraction of partonic energy carried by the outgoing hadron in the c.m. frame of the initial-state hadrons h 1 and h 2 . The relations between partonic and hadronic variables become somewhat more involved, but eq. (2.1) stays intact. When the transverse momentum of the produced hadron h 3 is large, P T m, the heavy-quark mass can be neglected and thus the zero-mass description becomes the most relevant. In this limit, the cross section can be written as [72], dσ ij→k+X dp T dy (τ 0 1 , τ 0 2 , µ 2 ren , µ 2 fact , µ 2 frag ) .

(2.4)
The formal difference with respect to eq. (2.1) is that now the FFs are fragmentation-scale µ 2 frag dependent, and a summation over all partonic channels is included. For massless partons the invariants τ 0 i are obtained as The GM-VFNS technique [59,65] provides a general framework to match the two extremes of eq. (2.1) and eq. (2.4) in a way that is consistent with collinear factorization. If we start from the FFNS description and increase P T , the cross sections will quickly be dominated by log(p T /m) terms whose origin is in the initial-and final-state partons' collinear splittings into QQ pairs. In GM-VFNS these logarithms are resummed to the scale-dependent heavy-quark PDFs and scale-dependent FFs. Because the FFNS expressions already contain the first of the resummed logarithmic terms, subtraction terms are needed to avoid double counting and ensure the correct zero-mass limit of eq. (2.4). For example, the inclusion of the gluon production channel gg → gg, dσ gg→g+X dp T dy (τ 1 ,τ 2 , µ 2 ren , µ 2 fact , µ 2 frag ) (2.6) on top of eq. (2.1), must be accompanied by a subtraction term which has otherwise the same expression as eq. (2.6) but where the gluon-to-h 3 FF is replaced by -4 -

JHEP05(2020)037
which is the first term in the definition of scale-dependent FFs with massive quarks. In an NLO-accurate O(α 3 s ) calculation, only the leading-order part of dσ gg→g+X is included in the subtraction term. However, the exact form of dσ gg→g+X in the equation above is not fixed by this construction. The only condition is that we recover the standard zero-mass MS expression at p T → ∞ to meet eq. (2.4). This means that we can include mass-dependent terms in dσ gg→g+X as we like, and a specific choice defines a scheme. The difference between the added and subtracted contributions discussed above is formally of order O(α 4 s ), so that different schemes are formally equivalent up to O(α 3 s ). Here we adopt the so-called SACOT-m T scheme [65]. It is rooted in a simple observation that in order to make a heavy-flavoured hadron in QCD, a QQ pair must be first produced. That is, the relevant invariants to describe the process are the massive ones,τ 1,2 = τ 1,2 , even for seemingly massless partonic contribution (like the gg → gg channel). Importantly, the mass then prevents the partonic cross sections from diverging towards small p T exactly in the same way as the FFNS cross section are finite at p T = 0. In the previous GM-VFNS approach [59] such a physical behaviour is obtained only by a particular choice of QCD scales [61,73]. However, we still stress that when the partonic p T scale is less or similar to the heavy-quark mass m, the arbitrariness related to the GM-VFNS scheme choice reduces the reliability of the predictions. The arbitrariness related to the choice of the fragmentation variable z is also most prominent at low p T . For these two reasons in our main results we will concentrate on the region P T > 3 GeV where the associated uncertainties are smaller.
The final differential cross sections are then calculated by using the FFNS expressions for the explicit QQ production, and for all other channels zero-mass expressions with the mentioned massive kinematics. The subtraction terms discussed above are included to avoid double counting and to ensure proper matching between α s and PDFs in 3-and 4-flavour schemes. The switch from 3-to 4-flavour scheme is done at the charm-mass threshold. The bottom decays to D 0 are an order of magnitude smaller [74] than the "direct" charm fragmentation to D 0 . Thus, the treatment of the bottom mass is not as critical, and in our present setup we switch from 4-to 5-flavour scheme at the bottom-mass threshold with no matching conditions and ignoring the bottom mass. For the numerical implementation of the described SACOT-m T scheme the massless NLO matrix elements are obtained from the incnlo [72] code and the FFNS part with explicit heavy-quark production is obtained from the mnr code [75]. As presented in refs. [65,74], this framework is in a very good agreement with the ALICE [50,74] and LHCb [39][40][41] data for inclusive D-meson production in p+p collisions in a broad rapidity range. The GM-VFNS approach also broadly reproduces the LHCb data on double D-meson production [76].

Powheg+Pythia approach
We will also contrast our results in the SACOT-m T framework with a Monte-Carlo based NLO computation that is often applied to heavy-meson phenomenology at the LHC in the context of PDFs [44,45,77]. This approach is based on the Powheg method [78] to combine NLO matrix elements with a parton shower and hadronization from a generalpurpose Monte-Carlo event generator. The underlying idea is to generate the partonic -5 -JHEP05(2020)037 2 → 2 and 2 → 3 events with the NLO-correct matrix elements. These events are then passed to any parton shower generator that provides the rest of the partonic branchings, accounting for the fact that the first one may already have occurred. The parton shower can be considered as being analogous to the scale evolution of FFs and PDFs as the splitting probabilities are based on the DGLAP evolution equations in both cases.
We generate the partonic events with the heavy-quark pair production (hvq) scenario [79] of the Powheg Box framework [80] which we pass on to Pythia 8 [64] for showering and hadronization. As Powheg generates only events where the heavy-quark pair is produced in the Born-level process or in the first (hardest) splitting, it ignores the component where the QQ would be created only later on in the shower e.g. starting from a hard gg → gg process. Such contributions are, however, effectively included in any GM-VFNS framework via the scale-dependent PDFs and FFs. Since charm quarks are abundantly produced in parton showers at the LHC energies [81], truncating the resummation of the splittings to the first one may miss a significant source of heavy quarks, as was pointed out in ref. [65]. This interpretation is supported by noting that within the GM-VFNS framework, the fixed-order production channels (the ones included in the hvq scenario of Powheg) were observed to constitute less than 10% of the full cross section at P T 3 GeV [65] once the subtraction terms were included. In addition, as demonstrated in refs. [40,41,65], the uncertainties arising from scale variations within the Powheg+Pythia setup become considerably larger at P T 3 GeV than what they are in GM-VFNS implementations. It is thus conceivable that the logarithmic terms resummed in GM-VFNS are significant already at P T ∼ 3 GeV. However, as mentioned before, the uncertainties related to the choice of the GM-VFNS scheme are large at low P T and it is thus impossible to draw a decisive conclusion.
At high enough P T , the truncation of the chain of partonic splittings the Powheg+Pythia method potentially overestimates the sensitivity to low-x PDFs as the neglected contributions with several emissions would always require a higher value of x to produce a heavy meson at a fixed P T and Y . Within its large scale uncertainties the Powheg+Pythia method nevertheless agrees with the D-meson data measured by LHCb even at P T m charm , though the central predictions are generally below the data [44].

Reweighting machinery
We will quantify the impact of the single inclusive D 0 -meson production data in p+Pb collisions on nuclear PDFs by the Hessian reweighting method [28,[66][67][68]. The method has recently been discussed at length e.g. in ref. [28] so here we only outline the basic underlying idea. Let us consider a global PDF analysis whose fit parameters a i are tuned to minimize a global χ 2 function, The χ 2 function is expanded around the best fit as where H ij is the Hessian matrix, H ij = 1 2 ∂ 2 χ 2 /(∂a i ∂a j ). Denoting by O the orthogonal matrix that diagonalizes the Hessian matrix, OHO T = I, the z i variables are linear com- . We refer to the best-fit as S 0 , and it corresponds to the point z = 0. The Hessian error sets S ± k can then be defined by z i (S ± k ) = ± ∆χ 2 δ ik , where ∆χ 2 is the estimated tolerance. It follows [82] that for any PDF-dependent quantity X there are unique points in the z space that extremize its positive and negative deviations from the central value X(S 0 ). These deviations, ∆X ± , are given by This, or its asymmetric version (see later), is normally quoted as the uncertainty in Hessian PDF fits. In a global analysis, the χ 2 contributions of individual data sets are simply summed in the overall χ 2 . Thus, if we wish to include a new set of data into our global fit, we just add its contribution to eq. (2.8), where y data i denote the new data points with a covariance matrix C ij . The PDF-dependent values y i {z} can now be approximated linearly as and by substituting this into eq. (2.10), we see that χ 2 new is still quadratic in variables z k and has therefore a unique minimum which we denote by z k = z min k . Note that we do not need to know the value of χ 2 0 . The PDFs f new i (x, Q 2 ) that correspond to this new minimum are obtained by replacing y i in eq. (2.11) by PDFs, Since we now know χ 2 new analytically, we can repeat the original treatment by computing the new Hessian matrix and diagonalizing it exactly the same way as outlined above. As a result, we have an approximation of how a new set of data has affected a set of PDFs and its errors. In comparison to a full global analysis, the advantage of the reweighting technique is that it avoids the time-consuming fitting procedure which, in practice, is only available to the people that performed the PDF analysis itself. In addition, and also importantly, there is no need to implement a potentially CPU-expensive cross-section computation as a part of the fitting framework or to compute partial cross sections to form three dimensional (x 1 ,x 2 ,µ 2 fact ) grids to facilitate a rapid cross-section evaluation. The downside is that since the reweighting method relies completely on the assumptions made in the prior PDF analysis, including e.g. a specific parametrization which may artificially overestimate the impact in a kinematic region beyond the reach of a given observable.
The Hessian reweighting method sketched above relied on a linear approximation for the PDFs and observables in the z space, and on a quadratic expansion of the original -7 -JHEP05(2020)037 χ 2 function. These are not always good approximations and, as described in ref. [28], the results can be refined by taking into account higher order terms in z. The results presented in this paper (section 3.3) have been obtained using a quadratic extension of the approximation made in eq. (2.11). In the case of EPPS16 we also take into account cubic terms in the original χ 2 profile, eq. (2.8). See ref. [28] for further technical details.

Results
Throughout this section, we will use two recent globally-fitted nPDF sets, EPPS16 [22] and nCTEQ15 [21], in our calculations. In the case of EPPS16 we use CT14NLO [5] as the free proton PDF set and with nCTEQ15 we use its own proton PDF (with no uncertainties on it). As a default setup for the GM-VFNS calculation we adopt the KKKS08 [83] partonto-hadron FFs and set the renormalization and factorization scales as µ ren = µ fact = 3 GeV for the charm quark mass. For the fragmentation scale we set µ frag = P 2 T + (1.5 GeV) 2 as the KKKS08 analysis assumed this slightly higher value for the charm-quark mass. In the matrix elements we always use m c = 1.3 GeV. For the D 0 mass, relevant for transforming the partonic kinematics to hadronic ones, we adopt the value M D 0 = 1.87 GeV [84]. With the Powheg approach, we use the same nuclear and proton PDFs and the same value for the charm mass but the renormalization and factorization scales are fixed to transverse mass of the produced charm quark, p 2 T + m 2 c . At the time of generating the partonic events with Powheg it is not yet known which P T the D meson will have (if formed at all), so relating the scales to the partonic variables is the only reasonable option. The parton shower and hadronization for the Powheg events are generated with the Pythia version 8.235 [64] using parameters from the default Monash tune [85].

Double-differential cross section for D 0 production in p+Pb collisions
To benchmark our GM-VFNS framework in p+Pb collisions we first compare our calculations with the double-differential single-inclusive D 0 production cross section measured by LHCb [42]. This comparison is important since a good agreement with the measured cross sections would indicate that the framework includes e.g. all the relevant partonic processes. In this way we ensure that the framework is realistic.
In figure 1 we compare the calculated cross sections with the LHCb data at backward rapidities (Pb-going direction) in five different rapidity bins spanning −5.0 < Y < −2.5 in the nucleon-nucleon (NN) c.m. frame. The resulting cross sections with the default setup are shown for both the EPPS16 and nCTEQ15 nPDFs, whereas the theoretical uncertainties are quantified with EPPS16 only. These include now scale variations and PDF uncertainties. The former are calculated by varying the three QCD scales independently by a factor of two around the default choice. In addition, ratios µ fact /µ ren and µ frag /µ ren are required to stay within [0.5, 2] and the mass of the charm quark is used as a lower limit for all scales. For the PDF uncertainties the error bands from proton and nuclear PDFs are added in quadrature as they are approximately independent in the EPPS16 global analysis.

JHEP05(2020)037
Here, we use the asymmetric error prescription where the sum now runs over both the EPPS16 and CT14NLO error sets. Uncertainties due to the mentioned ambiguity in defining the fragmentation variable z, FFs, or e.g. variation in charm-quark mass are not considered. In addition to the GM-VFNS results, comparison with the Powheg+Pythia setup is shown. The correspondence between the data and the GM-VFNS calculation with both EPPS16 and nCTEQ15 is found to be very good, though the theoretical uncertainties become large at P T < 3 GeV. Interestingly the PDF uncertainty at small P T is large above the central result but small below it. This can be traced back to the parametrization applied in the CT14 analysis where the requirement for positive-definite PDFs limits the small-x behaviour as already the central set for gluons near the initial scale Q 2 0 at small x is close to zero. Since similar positivity restriction was not applied in NNPDF3.1 [7], the PDF uncertainties shown in ref. [65] behave in a different manner at small values of P T . As in the p+p case [65], the cross sections obtained with the Powheg+Pythia approach fall below the GM-VFNS results, albeit the spread is of the same order as the theoretical uncertainties in the applied GM-VFNS formalism.
The corresponding cross sections at forward rapidities (p-going direction) are shown in figure 2. Here the five rapidity bins cover the range 1.5 < Y < 4.0. The conclusions are very similar as at backwards rapidities, the agreement between the GM-VFNS calculation and the data being very good, particularly at P T 3 GeV where the theoretical uncertainties are in control. The comparisons with the absolute cross sections lead us to conclude that the SACOT-m T framework [65] works very well also for p+Pb collisions and can be faithfully applied to study the nPDF constraints -at least for P T 3 GeV.

Nuclear modification ratio for D 0 production in p+Pb collisions
To constrain nPDFs with D mesons, it is useful to consider an observable in which theoretical uncertainties related to scale variations, free proton PDFs, and FFs cancel out to a large extent. In the case of single-inclusive hadron production a suitable observable is the nuclear modification factor R h 3 AB , defined for inclusive D 0 meson production in p+Pb collisions at the LHC as We compare our calculations with the measured R D 0 pPb in figures 3 and 4 at backward and forward rapidities, respectively. The LHCb data span over four Y bins in a range correspond to the nPDF errors calculated in the GM-VFNS approach. Furthermore, also the GM-VFNS result using the zero-mass definition for the fragmentation variable, and the scale variation band, are shown in each kinematic bin. First observation is that the data uncertainties are in most of the cases smaller than the nPDF-originating ones with both nPDF sets considered. Especially at forward rapidities the EPPS16 nPDF uncertainty bands are much larger than the experimental uncertainties due to the poorly-constrained small-x nuclear gluon distributions. This demonstrates the potential of these data to significantly constrain the current nPDFs at small-x where no other data currently exist. Also, the good overall agreement with the calculated and measured R D 0 pPb over the wide rapidity range provides a strong indication of the applicability of factorization-based approach in this previously unconstrained kinematic region. The large uncertainties from scale variations observed for the differential cross sections largely cancel out in the nuclear modification ratio. However, at P T < 3 GeV they start to grow and the downward uncertainty is limited by the minimum scale Q = 1.3 GeV of EPPS16 and nCTEQ15. If the PDF parametrizations would extend to lower values, the downward uncertainty would probably be much larger. Similarly, the use of massless definition for the fragmentation variable z -taken here as an indicator of the associated uncertaintycan lead to a significant variation in the calculated R D 0 pPb at small values of P T at backward rapidities. The reason is that the definition of z provides the link between hadronic and partonic kinematics and therefore the probed x regions are slightly different from one definition to another. In backward direction we are sensitive to the mid-x region where the slope in both EPPS16 and nCTEQ15 nuclear gluon modifications is somewhat steepish (see figures 9 and 11 ahead), and changes in the probed x regions matter. To make sure that we stay in a region where these theoretical uncertainties are in control, it seems sufficient to discard the data points below P T = 3 GeV.
Since many theoretical uncertainties get suppressed in R D 0 pPb , we might expect that the Powheg+Pythia results would be very close to GM-VFNS ones. While the two are indeed very similar, we find that the Powheg+Pythia results tend to lie systematically below the GM-VFNS calculations. In part, the differences can be explained by the different scale choices (p T instead of P T ) but since the differences persist even at the largest P T bins, this cannot be the full explanation. Indeed, the main factor seems to be, as argued also in ref. [65], that Powheg+Pythia framework misses the contributions in which the cc pair would be produced only at later stages of the shower and therefore biases the kinematics to lower values of x 2 compared to the GM-VFNS calculation. Thus, the nuclear effects in the Powheg+Pythia predictions at a given P T come from smaller x 2 than in GM-VFNS. This explains why, when compared to the GM-VFNS results, the nuclear effects in Powheg+Pythia predictions are seemingly shifted towards higher values of P T in all rapidity bins, apart from the very lowest P T bins where the impact of the scale choice becomes important.

Impact of the LHCb data on nPDFs
The observed consistency between the measured and calculated R D 0 pPb indicates that these data could be used in a global nPDF analysis. As a preparation for this, we now esti- before reweighting 1.56 2.09 after reweighting 1.02 1.12 Table 1. Values of χ 2 /N data for the EPPS16 and nCTEQ15 nPDFs before and after reweighting.
mate the impact of the LHCb data for R D 0 pPb on the EPPS16 and nCTEQ15 nPDFs by applying the reweighting method outlined in section 2.3. By excluding the data points at P T < 3 GeV we are left with N data = 48 data points. The level of agreement is quantified by calculating the standard figure-of-merit χ 2 before and after reweighting. The numbers are presented in table 1. Before the reweighting, the central nCTEQ15 value is somewhat high, but upon performing the reweighting both the EPPS16 and nCTEQ15 values are close to unity, indicating a good agreement with the data. To further study the statistical properties of our results, histograms of the data residuals are shown in figure 5. The residuals are calculated (for uncorrelated errors) as a difference between the theory value T i and corresponding data point D i normalised with the experimental uncertainty δ i . Ideally the distribution of the residuals should follow a Gaussian distribution with standard deviation of one and zero mean to which the calculated values are compared to. In addition, Gaussian fits are performed for the residuals obtained after reweighting to ease the comparison with the ideal distributions. With the original central EPPS16 and nCTEQ15 results the distributions show a behaviour diverting from the ideal Gaussian, but after reweighting a closer resemblance to that is obtained. With both nPDF sets the resulting distributions are slightly narrower than the ideal distribution but the mean is close to zero, confirming a reasonable statistical behaviour.
The results for R D 0 pPb after reweighting, compared with the data and original predictions, are shown in figures 6 and 7. As expected, the reweighted results are in an excellent agreement with the data across the wide rapidity range covered by the data, the only exception being the most backward bin where the data show a stronger enhancement than the reweighted PDF predictions. The new nPDF uncertainties computed from the reweighted nPDFs are significantly reduced in comparison to the original error bands. This holds especially at forward rapidities where the small-x region with no previous data constraints, is probed. For the EPPS16 nPDFs an improvement of a factor of three is observed whereas for nCTEQ15 the improvement is somewhat more modest. This difference follows from a bit more rigid functional form of the nCTEQ15 parametrization which leads to smaller errors to begin with. Interestingly, even though the lowest-P T bins were not included in the analysis, the agreement remains very good also with the data points in the P T < 3 GeV region. We can thus conclude that to describe these data, no physics outside collinear factorization is needed.
In figures 8-11 we finally compare the EPPS16 and nCTEQ15 nuclear modifications in bound protons, R , before and after reweighting. We present the results at two different scales: the initial scale of the original analyses, Q 2 = 1.69 GeV 2 , and a somewhat higher scale Q 2 = 10 GeV 2 directly probed by the considered observable when reweighting to the P T > 3 GeV subset of data. The valence and sea quark distributions are shown separately for each partonic flavour. For the EPPS16 analysis these are plotted in figures 8 and 9. The central values remain unchanged for all quark flavours but for gluons a somewhat stronger shadowing and slightly weaker EMC suppression are preferred by the data. At the parametrization scale Q 2 = 1.69 GeV 2 the uncertainty bands remain practically unchanged for quarks but a drastic reduction is observed for small-x gluons. At Q 2 = 10 GeV 2 also the sea-quark uncertainties are slightly reduced due to the DGLAP evolution which correlates sea quarks with gluons. For gluons the strong shadowing at the initial scale is reduced to around 0.7 at x 0.01 due to the evolution effects. Incidentally, the changes in the EPPS16 gluon PDFs are remarkably similar as found in ref. [28] based on the recent CMS dijet data [86]. In addition, since the central values are only slightly modified, the good agreement with the recent W ± data at √ s NN = 8.16 TeV [87] is expected to persist. We should also mention that the gluon errors at Q 2 = 1.69 GeV 2 dropping negative is of no concern. Indeed, a backward evolution by the DGLAP equations will make any gluon PDF negative at sufficiently low scales, and demanding a positive-definite gluon distribution at any arbitrary scale would   be an unphysical requirement. At a deeper level, the resummation of log(1/x) terms in the DGLAP splitting functions [88] may slow down the evolution speed particularly at low Q 2 and thereby better retain the gluons positive.
For nCTEQ15 the original and D-meson updated nuclear modifications are plotted in figures 10 and 11. As was the case with EPPS16, the quark nuclear modifications remain more or less the same after reweighting with the LHCb data. The originally strong shadowing for small-x gluons becomes slightly weaker after reweighting and is now rather similar to the gluon shadowing obtained with the reweighted EPPS16. The resulting uncertainties for the gluon shadowing are also on the same ballpark with EPPS16. In addition, the reweighted nCTEQ15 nuclear modifications for gluons tend to have somewhat less antishadowing (the bump around x ∼ 0.1) than in the original analysis and the uncertainties are significantly reduced also in this regime.

Impact without the lower cut on P T
The agreement between the measured and calculated R D 0 pPb was found to be very good also at P T < 3 GeV which we excluded from the reweighting due to theoretical concerns. To check how much potential constraints we threw away, we have repeated the reweighting procedure this time including all the LHCb data. The resulting gluon nPDFs at Q 2 = 1.69 GeV 2 and Q 2 = 10 GeV 2 are shown in figure 12 for EPPS16 and nCTEQ15. Effect for quark nPDFs was found negligible at Q 2 = 1.69 GeV 2 . In both cases the reweighted central results remain practically unchanged but the uncertainties are further reduced at small x in the   case of EPPS16 and also at larger x in the case of nCTEQ15. However, the bulk part of the uncertainty reduction still comes from the data in the "safe region" P T > 3 GeV such that inclusion of the P T < 3 GeV data is not critical. As we will argue next, including the lower P T data would not even increase the sensitivity to the small x region significantly.

Sensitivity to small-x region
The x values probed by a given P T and Y are often in the literature estimated with simplified leading-order kinematics, see e.g. ref. [46]. To get a more complete understanding on the small-x sensitivity of D 0 production at forward rapidities we show the contributions from different values of x 2 (momentum fraction in nucleus) to the D 0 cross section in figure 13. These distributions are based on full NLO GM-VFNS calculation with EPPS16 including the convolution with fragmentation functions. The results are compared to distributions from a "matrix-element fitting" approach similar to the one introduced in ref. [89] and applied in ref. [48] to study the impact of the LHCb data on nPDFs. In the latter method the squared matrix element |M| 2 for D-meson production is parametrized and the parameters are fitted to data from p+p collisions assuming that the only contribution is gluon-gluon initiated 2 → 2 scattering. The parameters used for the result in figure 13 are obtained from ref. [89] but the correspondence is not guaranteed to be exact since the details of the applied two-body phase space are not explicitly defined in the reference. However, the main point here is that the assumed x 1,2 dependence which, together with PDFs, dictates the shape of the x distributions is rather trivial, of the form |M| 2 ∝ x 1 x 2 . rw with P T cut rw no P T cut nCTEQ15 rw with P T cut rw no P T cut Figure 12. The EPPS16 (left) and nCTEQ15 (right) nuclear modifications for bound-proton PDFs in Pb nucleus before (EPPS16 blue, nCTEQ15 purple), after reweighting with the LHCb data with P T > 3 GeV (EPPS16 red, nCTEQ15 blue), and including all data points (dotted curves). The results are shown at Q 2 = 1.69 GeV 2 (upper panels) and at Q 2 = 10 GeV 2 (lower panels). The x distributions from the full NLO GM-VFNS calculation are shown for P T -integrated case with and without the lower cut of P T > 3 GeV. As expected, the D 0 meson production at forward rapidities is indeed sensitive to small-x region reaching down to 10 −5 in the considered 3.0 < Y < 3.5 bin. However, there is still a significant contribution from larger x. These large-x tails mainly arise from the convolutions with the fragmentation functions which smears the connection between partonic and hadronic kinematics. Also the NLO corrections contribute to the tail as discussed in ref. [65]. Maybe a bit surprisingly, the tail extends to higher values of x when no lower cut on P T is applied. A very similar behaviour has been seen in the case of inclusive photon production [33]. In part, this can be explained by the valence-like gluons at low scales which shift the cross section to higher x region. In addition, the nuclear effects in EPPS16 are most pronounced at low scales and the shadowing further suppresses the contributions from small x, whereas anti-shadowing tends to increase the larger-x tail. All this dilutes the extra small-x constraints that could be obtained by releasing the P T > 3 GeV cut in our GM-VFNS scheme. Thus, a significant part of the reduced small-x uncertainties in figure 12 can be explained just by the increased statistics (24 data points more) rather than pushing to smaller x. These long large-x tails  Figure 13. Contributions to differential D 0 cross section from different values of x 2 at 3.0 < Y < 3.5 from the GM-VFNS in P T ranges of [0, 10] GeV (solid green) and [3,10] GeV (short-dashed blue) and from matrix-element fitting approach for same P T ranges (long-dashed red and dot-dashed purple).
are not visible in the distributions obtained with the matrix-element fitting approach as it assumes leading-order partonic kinematics and, in particular, a naive |M| 2 ∝ x 1 x 2 behaviour of the coefficient function. Thus, in comparison to the GM-VFNS approach, the matrix-element fitting approach would overestimate the sensitivity of the LHCb data on the small-x PDFs and would lead to an overly optimistic impact at small x if used in a global analysis.

Reweighting with Powheg
To study the impact of the terms resummed in SACOT-m T we have performed the nPDF reweighting with the LHCb data also using the Powheg+Pythia approach introduced in section 2.2. The resulting gluon distributions are compared to the results obtained within the SACOT-m T scheme in figure 14 for EPPS16 and nCTEQ15. To avoid statistical fluctuations the cross sections with the nPDF error sets were calculated from the original events by calculating a weight for each event and each error set using the event-reweighting machinery introduced in Powheg Box V2. In both cases, EPPS16 and nCTEQ15, reweighting the nPDFs with the LHCb data using the Powheg framework leads to somewhat reduced shadowing compared to SACOT-m T result. This can be explained by the fact that a FFNS calculation lacks the large-x contribution which is generated by gluon fragmentation in GM-VFNS as demonstrated in figure 5 in ref. [65] and discussed in section 3.5. Therefore in the SACOT-m T scheme a stronger shadowing is required as it needs to compensate for the enhancement arising from the contribution from the anti-shadowing regime. The separation is slightly more pronounced with the EPPS16 nPDFs but in both cases the differences are within the estimated uncertainties and the  Figure 14. The EPPS16 (left) and nCTEQ15 (right) nuclear modifications for bound-proton PDFs in Pb nucleus before (EPPS16 blue, nCTEQ15 purple) and after reweighting with the LHCb data with SACOT-m T (EPPS16 red, nCTEQ15 blue) and with Powheg (green with dashed error-band limits) frameworks with a cut P T > 3 GeV. The results are shown for gluons at Q 2 = 1.69 GeV 2 (upper panels) and at Q 2 = 10 GeV 2 (lower panels).
reduction of the small-x gluon uncertainties is similar with both theoretical setups. We can thus conclude that the constraints obtained for the nuclear PDFs are even surprisingly stable against varying theoretical approaches. We stress, however, that since the absolute cross sections are quite different, the rough agreement between the applied frameworks is due to fact that we consider data for the ratio R pPb where many effects cancel out.

Summary
We have presented the first direct QCD analysis of the recent LHCb data [42] for D 0 meson production in p+Pb collisions and their impact on nuclear PDFs. To accomplish this we have used the Hessian reweighting method and the cross sections calculated within GM-VFNS using the recently introduced SACOT-m T scheme at NLO [65]. The advantage of the new scheme over the previous GM-VFNS implementations is that by explicitly including the heavy-quark masses in the kinematics also for processes where the QQ pair is produced from light-flavour fragmentation, a sensible behaviour in the P T → 0 limit -24 -JHEP05(2020)037 is always obtained. However, the description of the very low-P T regime is still somewhat arbitrary within GM-VFNS and this is one of the reasons we have concentrated mainly on the P T ≥ 3 GeV region. The resulting cross sections are in a very good agreement with the single-inclusive D-meson P T spectra in the wide rapidity range covered by the LHCb measurement. We also computed predictions by a frequently used Powheg approach in which the heavy quarks are first produced in the partonic 2 → 2 and 2 → 3 scattering events, and then showered and hadronized with Pythia. This approach generally yields smaller differential cross sections than what we obtain with the GM-VFNS formalism. At very low P T m charm this is hardly significant due to the large scale uncertainties and scheme dependence of the GM-VFNS calculations. At large P T 3 GeV it is possible that the observed differences are due to the omission of contributions in which the heavy quark is produced in 2 → 4 processes and beyond, though within the scale uncertainties the Powheg and GM-VFNS approaches agree also there (see figure 11 of ref. [65]). Thus, higher-order calculations would be needed to improve our understanding of whether this is the principal cause for the observed differences.
A very good agreement with the R D 0 pPb data is found with both of the considered nPDF analyses, EPPS16 and nCTEQ15, and the data are accurate enough to set significant further constraints. For quark PDFs the modifications in the central values are weak but for gluons a somewhat stronger (weaker) small-x shadowing than originally in EPPS16 (nCTEQ15) is preferred by the data. The reweighting also brings the gluon shadowing in these two nPDF sets into a better mutual agreement. The main impact of the data is, however, the substantial reduction of the uncertainties for gluon nuclear modifications at x < 0.01. In fact, these are the first data directly sensitive to small-x gluons in heavy nuclei at clearly perturbative scales, and therefore provide the first unambiguous direct evidence for nuclear gluon shadowing in the context of a global analysis. The backward data seem to confirm the presence of a moderate gluon antishadowing at large x. We note that the effect of these data on EPPS16 are remarkably similar as recently found from dijet data at significantly higher interaction scales, though there the region x < 0.002 is not directly probed [28]. The nPDF reweighting was repeated also with the Powheg setup resulting in a slightly reduced gluon shadowing but otherwize very similar results are obtained as with the SACOT-m T scheme. It thus appears that our main results -constraints on nuclear PDFs -are robust against theoretical uncertainties.
By studying how the cross section builds up from different values of nuclear x we have shown that the LHCb D 0 data constrain nPDFs down to x ∼ 10 −5 but, due to the convolution with FFs, there is still a notable contribution from the high-x region. The importance of using a full QCD calculation to quantify the impact of D-meson data was also underlined. Indeed, a simplified framework can lead to an apparent increase in the sensitivity to the small-x region and would therefore not provide a realistic estimation of the constraints. The good agreement between the nPDF calculation and the data down to P T = 0 GeV -even when rejecting data points at P T < 3 GeV from the fit -implies that the pure collinear-factorization approach is valid also in the small-x region. All in all, we conclude that the LHCb D-meson data can be included in future updates of global nPDF analyses without causing conflicts with the other existing data. To more deeply test the -25 -JHEP05(2020)037 factorization and the universality of nPDFs, data with similar x-reach but for a different observable would be crucial.