1 Introduction

Accurate knowledge of the parton distribution functions (PDFs) is essential for predictions at hadron colliders. The primary source of information on the proton PDFs is deep-inelastic scattering (DIS). Measurements at fixed-target experiments and at the HERA \(e^{\pm } p\) collider provide constraints on the quark and gluon densities, and discrimination of the quark flavours. The DIS proton data mostly constrain the u-type quark density, owing to its larger coupling to the photon at low absolute four-momentum transfer squared, \(Q^2\), whereas the d-type quark densities are only constrained at high \(Q^2\) with limited precision. Even more challenging is the separation of the d-valence quark density, which relies on the HERA \(e^+\) charged-current data; these are statistically limited in the published HERA I combined data [1]. A better flavour separation is needed to challenge the limits of precision physics at the LHC.

Drell–Yan production of W and Z bosons in proton–antiproton and proton–proton collisions can provide additional information on the d-quark PDFs. At leading order (LO) in QCD, the Drell–Yan processes probe the PDFs at energy scales Q corresponding to the boson masses, \(m_V = m_W\) and \(m_V=m_Z\), and momentum fractions carried by the interacting partons of \(x_{1,2} = m_V/\sqrt{S} e^{\pm y}\), where \(\sqrt{S}\) is the centre-of-mass energy and y is the boson rapidity.
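As a rough illustration of the LO kinematics quoted above, the following minimal sketch evaluates \(x_{1,2}\) for a given boson mass and rapidity. The centre-of-mass energy of 1.96 TeV (Tevatron Run II) and the approximate Z-boson mass are inputs assumed here for the example; they are not taken from the text, and the function name is hypothetical.

```python
import math

def momentum_fractions(m_v, sqrt_s, y):
    """LO momentum fractions x_{1,2} = (m_V / sqrt(S)) * exp(+-y)."""
    ratio = m_v / sqrt_s
    return ratio * math.exp(y), ratio * math.exp(-y)

# Illustrative inputs (assumed, not quoted in the text above)
SQRT_S_TEVATRON = 1960.0  # GeV, Tevatron Run II centre-of-mass energy
M_Z = 91.19               # GeV, approximate Z-boson mass

x1, x2 = momentum_fractions(M_Z, SQRT_S_TEVATRON, 2.0)
print(f"x1 = {x1:.4f}, x2 = {x2:.4f}")
```

At central rapidity both fractions equal \(m_V/\sqrt{S}\), while at large \(|y|\) one parton is probed at high x and the other at low x.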

At the Tevatron proton–antiproton collider, the production of W and Z bosons is dominated by valence-quark interactions. The Z-boson production has similar couplings for \(u\bar{u}\) and \(d\bar{d}\) fusion processes, whereas W bosons are produced predominantly by \(u\bar{d}\) and \(d\bar{u}\) fusions for \(W^+\) and \(W^-\) bosons, respectively. Various measurements of Z-boson inclusive production and of W-boson charge asymmetry have been reported by the D0 and CDF collaborations [2–7]. Some of these data samples were included in previous PDF studies [8–11]. The addition of the Tevatron data resulted in improved PDFs, but some tensions were observed with global PDF fits [12–15].

In this paper the data collected at the Tevatron collider in Run II are analysed to assess their impact on the PDFs. The assumptions of the correlation model of the experimental systematic uncertainties are revised with respect to the recommendation of the experiments, leading to improved agreement with the theoretical predictions. The analysis is performed using the HERAFitter framework [1, 16–18] at next-to-leading order (NLO) QCD. The Tevatron W- and Z-boson measurements are also compared to predictions evaluated with the recent PDF sets CT10nlo [8], MMHT2014 [9] and NNPDF3.0 [10]. The impact of the Tevatron data on PDFs is studied using Hessian profiling [19] and Bayesian reweighting [20–22] techniques. The profiling of PDF uncertainties is generalised to the case of asymmetric PDF uncertainties.

This paper is organised as follows: the data samples are introduced in Sect. 2 and the theoretical predictions are discussed in Sect. 3. The QCD analysis settings and the methods for comparing data with predictions based on existing PDFs are discussed in Sect. 4. Section 5 reports the results of the PDF analysis. The results obtained in the paper are summarised in Sect. 6.

2 Experimental measurements

2.1 Data sets

The most recent measurements of W-boson charge asymmetry and Z-boson inclusive production performed in Run II of the Tevatron collider are considered in this study. They include the Z-boson differential cross section as a function of rapidity, measured by the D0 collaboration with 0.4 fb\(^{-1}\) of integrated luminosity in the \(Z \rightarrow ee\) channel [2]; the Z-boson differential cross section as a function of rapidity, measured by the CDF collaboration with 2.1 fb\(^{-1}\) of integrated luminosity in the \(Z \rightarrow ee\) channel [3]; the charge asymmetry of muons as a function of rapidity in \(W \rightarrow \mu \nu \) decays, measured by the D0 collaboration with 7.3 fb\(^{-1}\) of integrated luminosity [4]; the W-boson charge asymmetry as a function of rapidity in the \(W \rightarrow e \nu \) decay channel, measured by the CDF collaboration with 1 fb\(^{-1}\) of integrated luminosity [5]; the W-boson charge asymmetry as a function of rapidity in the \(W \rightarrow e \nu \) decay channel, measured by the D0 collaboration with 9.7 fb\(^{-1}\) of integrated luminosity [6]. These measurements supersede the previous Run II Tevatron measurements of W-boson charge asymmetry and Z-boson inclusive production. Recently, the D0 collaboration has also released a measurement of the charge asymmetry of electrons as a function of rapidity in \(W \rightarrow e \nu \) decays [7]. However, this measurement is performed with the same data set and event selection as the measurement of Ref. [6], and it cannot be included simultaneously in a PDF fit without provision of the correlation information. The Tevatron W- and Z-boson measurements considered in this study are summarised in Table 1.

Table 1 Summary of the Tevatron W- and Z-boson measurements. For each measurement the observable, the experiment, the integrated luminosity, the phase-space definition, the inclusion in the nominal fit, and the corresponding reference are shown

Besides the Tevatron W- and Z-boson measurements, the HERA I combined measurements of the inclusive DIS neutral- and charged-current cross sections measured by the H1 and ZEUS experiments [1] are used in this study. The neutral-current measurements cover a wide range in Bjorken-x and \(Q^2\), which is essential for the determination of PDFs, whereas the charged-current measurements provide further information to disentangle the contributions in PDFs from u-type and d-type quarks and anti-quarks at \(x>0.01\). The DIS data are required to be in the kinematic region \(Q^2>Q^2_{\text {min}}=7.5\) GeV\(^2\), where perturbative QCD calculations are reliable.

2.2 Experimental uncertainties

Statistical uncertainties are considered to be uncorrelated between bins, with the exception of the D0 measurement of W-boson charge asymmetry of Ref. [6], for which bin-to-bin statistical correlations are provided.

In general, the correlation model of the experimental uncertainties recommended by the Tevatron experiments is adopted in the QCD analysis, with the exception of the experimental systematic uncertainties related to trigger and lepton identification efficiencies. These uncertainties are provided by the D0 and CDF experiments in the form of total uncertainties in each bin of the measurements. However, the trigger and lepton identification corrections are estimated from data, and they are influenced, among other effects, by statistically uncorrelated bin-to-bin fluctuations. Since the exact bin-to-bin correlation pattern of these uncertainties is not provided by the experiments, a conservative approach is followed in this study, and the uncertainties related to trigger and lepton identification efficiencies are treated as uncorrelated bin-to-bin for the nominal fit. According to this prescription, the following uncertainties are treated as uncorrelated bin-to-bin: the central- and forward-electron identification efficiencies of Ref. [3], the trigger isolation efficiency of Ref. [4], the trigger and electron identification efficiencies of Ref. [5], and the electron identification, charge misidentification and positron-to-electron efficiency corrections of Ref. [7].

All the other experimental systematic uncertainties are considered fully correlated bin-to-bin, with the exception of the D0 measurement of W-boson charge asymmetry of Ref. [6], where the total experimental systematic uncertainty is treated as bin-to-bin uncorrelated, as recommended by the D0 experiment, and for the electron charge asymmetry in \(W \rightarrow e \nu \) decays of Ref. [7], where the uncertainty of the unfolding procedure due to the limited statistics of the Monte Carlo (MC) sample is treated as uncorrelated bin-to-bin. The dependence of the measured asymmetry on the PDF set used to reconstruct the W-boson rapidity was studied in the D0 measurement of W-boson charge asymmetry of Ref. [6]. In this paper, the W-boson charge asymmetry extracted with the CTEQ6.6 PDF set is used as the central value, and the 22 CTEQ6.6 positive and negative PDF eigenvector variations are considered as bin-to-bin correlated systematic uncertainties.

For the two measurements of Z-boson differential cross section as a function of rapidity, statistical uncertainties are scaled to the expected number of events assuming they are Poisson distributed. The experimental systematic uncertainties are treated as multiplicative and linearly scaled to the expected cross sections, except for the background uncertainties, which are treated as additive and are not scaled. For the measurements of W-boson and lepton charge asymmetry, all the uncertainties are treated as additive, and they are not scaled.
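The scaling rules above can be sketched for a single bin as follows. This is a minimal illustration, assuming that all uncertainties are given as absolute values, that Poisson scaling means multiplying the statistical uncertainty by \(\sqrt{\text{expected}/\text{measured}}\), and that linear scaling means multiplying by \(\text{expected}/\text{measured}\); the function and argument names are hypothetical and are not HERAFitter identifiers.

```python
import math

def scale_uncertainties(measured, expected, stat, mult_sys, add_sys):
    """Rescale the absolute uncertainties of one bin to the expected value.

    stat     : statistical uncertainty, scaled as sqrt(N) (Poisson assumption)
    mult_sys : multiplicative systematics, scaled linearly with the prediction
    add_sys  : additive systematics (e.g. background), left unscaled
    """
    ratio = expected / measured
    stat_scaled = stat * math.sqrt(ratio)
    mult_scaled = [s * ratio for s in mult_sys]
    add_scaled = list(add_sys)
    return stat_scaled, mult_scaled, add_scaled

# Toy numbers, for illustration only
stat_s, mult_s, add_s = scale_uncertainties(100.0, 121.0, 10.0, [2.0], [1.0])
print(stat_s, mult_s, add_s)
```

For the asymmetry measurements, where all uncertainties are treated as additive, every source would be passed through the `add_sys` argument and left unchanged.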

The statistical uncertainties of the HERA I data are treated as uncorrelated and scaled to the expected number of events assuming Poisson distribution, whereas the experimental systematic uncertainties are fully correlated and are scaled linearly to the expected cross sections.

3 Theoretical predictions

The theoretical predictions corresponding to the Tevatron measurements of W-boson charge asymmetry and Z-boson inclusive production are included in the fits using APPLGRID [23, 24] files. These predictions have been evaluated with MCFM [25, 26] at NLO QCD according to the phase-space definitions of each measurement, which are as follows: the D0 and CDF measurements of the Z-boson differential cross section as a function of rapidity are defined in the full kinematic range of the decay leptons, without any requirements on the rapidity and \(p_T\) of the leptons. In the D0 measurement, the invariant mass of the dielectron system is defined in the range \(71 < m_{ee} < 111\) GeV, whereas in the CDF measurement, it is defined in the range \(66 < m_{ee} < 116\) GeV. The charge asymmetry of muons as a function of rapidity in \(W \rightarrow \mu \nu \) decays, and the charge asymmetry of electrons in \(W \rightarrow e \nu \) decays, measured by the D0 experiment, are defined with \(p_T^{\ell } > 25\) GeV and \(p_T^{\nu } > 25\) GeV. The W-boson charge asymmetry as a function of rapidity in the \(W \rightarrow e \nu \) decay channel, measured by the CDF collaboration, is defined in the full kinematic range, without any requirements on the lepton rapidity and \(p_T\). The corresponding D0 measurement of the W-boson charge asymmetry in the \(W \rightarrow e \nu \) decay channel is defined in a kinematic region where the charged lepton and the neutrino are required to have \(p_T> 25\) GeV without further requirements on the lepton rapidity. The kinematic requirements of the Tevatron W- and Z-boson measurements are summarised in Table 1. Notice that the CDF and D0 measurements of W-boson charge asymmetry in the \(W \rightarrow e \nu \) decay channel of Refs. [5] and [6] are defined in different kinematic regions and they should not be compared without extrapolating them to a common phase space. 
Tables of the Tevatron measurements, with updated correlation model, and corresponding APPLGRID theoretical predictions are publicly available at http://herafitter.org.

The QCD predictions for the DIS cross sections are evaluated by solving the DGLAP evolution equations [27–32] at NLO in the \(\overline{MS}\) scheme [33] using the QCDNUM program [34] with the renormalisation and factorisation scales set to \(Q^2\). The light-quark coefficient functions are calculated in QCDNUM. The heavy c- and b-quark distributions are dynamically generated, and the corresponding coefficient functions for the neutral-current processes with \(\gamma ^*\) exchange are calculated in the general-mass variable-flavour-number scheme [35–37], with up to five active quark flavours. For the charged-current processes and the neutral-current processes with a Z contribution, the heavy quarks are treated as massless.

4 QCD analysis

The impact of the Tevatron data on the PDFs is studied with two different approaches: in the first, the data are included in a PDF fit together with the HERA I data; in the second, the impact on existing PDF sets is studied using Hessian profiling [19] and Bayesian reweighting techniques [20–22]. The PDF fit allows a detailed study of the settings of the QCD analysis, and in particular of various parametrisation forms. The Hessian profiling and Bayesian reweighting methods allow one to verify the consistency and constraining power of the Tevatron data with respect to PDF sets which include a larger set of experimental data as input. However, the profiling and reweighting methods are approximations with respect to a PDF fit. In particular, they cannot account for variations in the QCD analysis settings other than those which are included in the uncertainty of the already existing PDF sets.

4.1 PDF fit settings

The QCD analysis and PDF extraction are performed with the open-source framework HERAFitter. The charm mass is set to \(m_c = 1.38\) GeV, as estimated from the HERA charm production cross section [38], and the bottom mass to \(m_b = 4.75\) GeV [39]. The strong-interaction coupling constant at the Z-boson mass, \(\alpha _s(M_Z)\), is set to 0.118, and two-loop order is used for the running of \(\alpha _s\).

The PDFs for the gluon, u-valence, d-valence, \(\bar{u}\), \(\bar{d}\) quark densities are parametrised at the input scale \(Q^2_0=1.7\) GeV\(^2\) as follows:

$$\begin{aligned}&xf(x) = A_f x^{B_f} (1-x)^{C_f} (1 + D_fx + E_fx^2) e^{F_fx};\\ \nonumber&f = u_v, d_v, g, \bar{u}, \bar{d}. \end{aligned}$$
(1)

The contribution of the s-quark density is taken to be proportional to the \(\bar{d}\)-quark density by setting \(x\bar{s}(x) = r_s x\bar{d}(x)\), with \(r_s=1.0\), as suggested in Ref. [40]. As a cross-check, the alternative choice \(r_s=0.5\) is considered. The strange and anti-strange quark densities are taken to be equal: \(x\bar{s}(x) = x s(x)\). The normalisation of the \(x u_v(x)\) (\(x d_v(x)\)) valence-quark density, \(A_{u_v}\) (\(A_{d_v}\)), is determined by the quark-counting sum rule, whereas the normalisation of the gluon density, \(A_g\), is determined by the momentum sum rule. The \(x\rightarrow 0\) limit of the u- to d-sea quark densities is fixed to unity by setting \(B_{\bar{u}} = B_{\bar{d}}\) and \(A_{\bar{u}} = A_{\bar{d}}\).
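The functional form of Eq. 1 and the quark-counting constraint can be illustrated with a short sketch. The closed-form normalisation below uses the Euler beta function and is valid only for the basic form with \(D_f = E_f = F_f = 0\); the parameter values are arbitrary placeholders rather than fitted values, and the function names are hypothetical.

```python
import math

def xf(x, A, B, C, D=0.0, E=0.0, F=0.0):
    """Generic input-scale parametrisation x f(x) of Eq. 1."""
    return A * x**B * (1.0 - x)**C * (1.0 + D * x + E * x * x) * math.exp(F * x)

def valence_normalisation(n_quarks, B, C):
    """A_f from the quark-counting sum rule, int_0^1 f(x) dx = n_quarks.

    Valid only for D = E = F = 0, where the integral of x^(B-1) (1-x)^C
    is the beta function Beta(B, C + 1).
    """
    beta = math.gamma(B) * math.gamma(C + 1.0) / math.gamma(B + C + 1.0)
    return n_quarks / beta

# Placeholder shape parameters (not fit results)
A_uv = valence_normalisation(2, B=0.7, C=4.0)   # two valence u quarks
A_dv = valence_normalisation(1, B=0.7, C=4.0)   # one valence d quark

r_s = 1.0
x = 0.1
x_dbar = xf(x, A=0.2, B=-0.1, C=5.0)
x_sbar = r_s * x_dbar                            # x sbar(x) = r_s * x dbar(x)
print(A_uv, A_dv, x_sbar)
```

When the polynomial or exponential factors are switched on, the normalisation integral would have to be evaluated numerically instead.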

The \(\chi ^2\) function used for the comparison of data and theory is defined as in Ref. [1], with an additional penalty term as described in Ref. [41], and is minimised with MINUIT [42] to extract the PDFs from the data.

4.2 PDF profiling

The impact of a new data set on a given PDF set can be quantitatively estimated with a profiling procedure [19]. The profiling is performed using a \(\chi ^2\) function which includes both the experimental uncertainties and the theoretical uncertainties arising from PDF variations:

$$\begin{aligned} {\chi ^2(\varvec{\beta _\mathrm{exp}},\varvec{\beta _\mathrm{th}}) }= & {} \sum _{i=1}^{N_\mathrm{data}} \frac{\textstyle \left( \sigma ^\mathrm{exp}_i + \sum _j \varGamma ^\mathrm{exp}_{ij} \beta _{j,\mathrm{exp}} - \sigma ^\mathrm{th}_i - \sum _k \varGamma ^\mathrm{th}_{ik}\beta _{k,\mathrm{th}} \right) ^2}{\varDelta _i^2}\nonumber \\&+ \sum _j \beta _{j,\mathrm{exp}}^2 + \sum _k \beta _{k,\mathrm{th}}^2. \end{aligned}$$
(2)

The correlated experimental and theoretical uncertainties are included using the nuisance parameter vectors \(\varvec{\beta _\mathrm{exp}}\) and \(\varvec{\beta _\mathrm{th}}\), respectively. Their influence on the data and theory predictions is described by the \(\varGamma ^\mathrm{exp}_{ij}\) and \(\varGamma ^\mathrm{th}_{ik}\) matrices. The index i runs over all \(N_\mathrm{data}\) data points, whereas the index j(k) corresponds to the experimental (theoretical) uncertainty nuisance parameters. The measurements and the uncorrelated experimental uncertainties are given by \(\sigma ^\mathrm{exp}_i\) and \(\varDelta _i\), respectively, and the theory predictions are \(\sigma _i^\mathrm{th}\). The \(\chi ^2\) function of Eq. 2 can be generalised to account for asymmetric PDF uncertainties:

$$\begin{aligned} \varGamma ^\mathrm{th}_{ik} \rightarrow \varGamma ^\mathrm{th}_{ik} + \Omega ^\mathrm{th}_{ik}\beta _{k, \mathrm{th}}, \end{aligned}$$
(3)

where \(\varGamma ^\mathrm{th}_{ik} = 0.5(\varGamma ^\mathrm{th+}_{ik} - \varGamma ^\mathrm{th-}_{ik})\) and \(\Omega ^\mathrm{th}_{ik} = 0.5(\varGamma ^\mathrm{th+}_{ik} + \varGamma ^\mathrm{th-}_{ik})\) are determined from the shifts of predictions corresponding to up (\(\varGamma ^\mathrm{th+}_{ik}\)) and down (\( \varGamma ^\mathrm{th-}_{ik}\)) PDF uncertainty eigenvectors.

The minimisation of Eq. 2 in its original form leads to a system of linear equations. The generalised function, with asymmetric PDF uncertainties, is minimised iteratively: the values of \(\varGamma ^\mathrm{th}_{ik}\) are updated using \(\beta _{k, \mathrm{th}}\) from the previous iteration, following the substitution of Eq. 3. Several iterations are required to converge, and the procedure is verified using the MINUIT program, which yields identical results.
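For the symmetric case, the linear system can be written explicitly. The sketch below stacks the experimental and theoretical nuisance parameters into a single vector and solves the normal equations obtained by setting the gradient of Eq. 2 to zero. It is an illustration under these assumptions, not the HERAFitter implementation, and it does not include the iterative treatment of asymmetric uncertainties; all names are hypothetical.

```python
import numpy as np

def profile_symmetric(data, theory, delta, gamma_exp, gamma_th):
    """Minimise the chi2 of Eq. 2 for symmetric uncertainties.

    data, theory, delta : arrays of length N_data
    gamma_exp, gamma_th : arrays of shape (N_data, n_exp) and (N_data, n_th)
    Returns (chi2_min, beta_exp, beta_th).
    """
    resid0 = data - theory
    # chi2 = sum_i (resid0_i + (M b)_i)^2 / delta_i^2 + b.b
    M = np.hstack([gamma_exp, -gamma_th])
    w = 1.0 / delta**2
    # Normal equations: (M^T W M + 1) b = -M^T W resid0
    lhs = M.T @ (w[:, None] * M) + np.eye(M.shape[1])
    rhs = -M.T @ (w * resid0)
    b = np.linalg.solve(lhs, rhs)
    resid = resid0 + M @ b
    chi2 = float(np.sum(w * resid**2) + b @ b)
    n_exp = gamma_exp.shape[1]
    return chi2, b[:n_exp], b[n_exp:]

# Toy case: one data point, one PDF nuisance parameter, no experimental ones
chi2_min, beta_exp, beta_th = profile_symmetric(
    data=np.array([1.0]), theory=np.array([0.0]), delta=np.array([1.0]),
    gamma_exp=np.zeros((1, 0)), gamma_th=np.array([[1.0]]))
print(chi2_min, beta_th)
```

In the toy case the function being minimised is \((1-\beta )^2 + \beta ^2\), so the nuisance parameter absorbs half of the data-theory difference.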

The value at the minimum of the \(\chi ^2\) function provides a compatibility test of the data and theory. In addition, the values at the minimum of the nuisance parameters \(\beta ^\mathrm{min}_{k,\mathrm{th}}\) can be interpreted as optimisation (“profiling”) of PDFs to describe the data [19]. Explicitly, the profiled central PDF set \(f'_0\) is given by

$$\begin{aligned} f'_0 = f_0 + \sum _k \beta ^\mathrm{min}_{k, \mathrm{th}} \left( \frac{f^{+}_k - f^{-}_k}{2} - \beta ^\mathrm{min}_{k, \mathrm{th}} \frac{f^{+}_k + f^{-}_k - 2f_0}{2} \right) , \end{aligned}$$
(4)

where \(f_0\) is the original central PDF set and \(f^{\pm }_k\) represents the eigenvector sets corresponding to up and down variations.
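Equation 4 translates directly into array operations. The following sketch assumes that the central PDF and its eigenvector variations have been tabulated on a common grid of x values; the array layout and the function name are choices made for this example only.

```python
import numpy as np

def profiled_central(f0, f_up, f_dn, beta_min):
    """Profiled central PDF f'_0 of Eq. 4.

    f0         : central PDF on a grid, shape (n_x,)
    f_up, f_dn : up/down eigenvector PDFs, shape (n_eig, n_x)
    beta_min   : nuisance parameters at the chi2 minimum, shape (n_eig,)
    """
    sym = 0.5 * (f_up - f_dn)               # symmetric part of the shift
    asym = 0.5 * (f_up + f_dn - 2.0 * f0)   # asymmetric part of the shift
    b = beta_min[:, None]
    return f0 + np.sum(b * (sym - b * asym), axis=0)

# Toy example with one eigenvector and two grid points
f0 = np.array([1.0, 2.0])
f_up = np.array([[1.2, 2.2]])
f_dn = np.array([[0.8, 1.8]])
f_new = profiled_central(f0, f_up, f_dn, np.array([0.5]))
print(f_new)
```

For symmetric eigenvectors the second term vanishes and the central value simply moves by \(\beta ^\mathrm{min}\) times half the up-down difference.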

The shifted PDFs have reduced uncertainties. In general, the shifted eigenvectors are no longer orthogonal, but can be transformed to an orthogonal representation using a standard diagonalisation procedure, as in Ref. [43]. In this method the covariance matrix C of the PDF nuisance parameters is diagonalised as

$$\begin{aligned} \varvec{\beta _\mathrm{th}^T} C \varvec{\beta _\mathrm{th}}= & {} \varvec{\beta _\mathrm{th}^T} G^T D G \varvec{\beta _\mathrm{th}} = \varvec{\beta _\mathrm{th}^T} (\sqrt{D} G)^T \sqrt{D}G \varvec{\beta _\mathrm{th}} \nonumber \\= & {} (G' \varvec{\beta _\mathrm{th}})^T G'\varvec{\beta _\mathrm{th}} = \varvec{(\beta '_\mathrm{th})^T} \varvec{\beta '_\mathrm{th}}, \end{aligned}$$
(5)

where G is an orthogonal matrix, D is a positive definite diagonal matrix, and \(\sqrt{D}\) is a diagonal matrix built of \(\sqrt{D_{ii}}\). The matrices G and D can be constructed using the eigenvectors and eigenvalues of the matrix C. The transformation \(G'\) can be adjusted, using orthogonal transformations, to keep the new eigenvector basis aligned along the original as much as possible. As a result of this adjustment, the transformation matrix can take a triangular form with all diagonal elements greater than zero.
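The decomposition of Eq. 5 can be checked numerically. In the sketch below the matrix \(G'\) is first built from the eigen-decomposition of C, and then an alternative triangular \(G'\) with positive diagonal is obtained from a Cholesky factorisation. Using the Cholesky factor as the triangular form is an assumption made here for illustration; the text does not state which algorithm is used for the adjustment.

```python
import numpy as np

def transformation_from_eigh(C):
    """G' = sqrt(D) G, with C = G^T D G from the eigen-decomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(C)   # C = V diag(w) V^T
    G = eigenvectors.T
    return np.sqrt(eigenvalues)[:, None] * G

def transformation_triangular(C):
    """Upper-triangular G' with positive diagonal (Cholesky: C = L L^T)."""
    return np.linalg.cholesky(C).T

# Toy covariance matrix of two PDF nuisance parameters
C = np.array([[2.0, 0.5],
              [0.5, 1.0]])
G1 = transformation_from_eigh(C)
G2 = transformation_triangular(C)
print(G1.T @ G1)
print(G2)
```

Both matrices satisfy \(G'^T G' = C\); they differ by an orthogonal rotation, which is the freedom exploited to keep the new basis aligned with the original one.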

The method can be extended to PDF sets with asymmetric uncertainties: the transformation matrix is determined using symmetrised uncertainties as in Eq. 5, and the orthogonal up and down PDF eigenvectors \(f^{+'}_i\) and \(f^{-'}_i\) are calculated as

$$\begin{aligned} f^{+'}_i =&f'_0 + \sum \limits _j G'_{ji} \left( \frac{f^{+}_j - f^{-}_j}{2} + G'_{ji} \frac{f^{+}_j + f^{-}_j - 2 f_0}{2}\right) , \\ \nonumber f^{-'}_i =&f'_0 - \sum \limits _j G'_{ji} \left( \frac{f^{+}_j - f^{-}_j}{2} - G'_{ji} \frac{f^{+}_j + f^{-}_j - 2 f_0}{2}\right) . \end{aligned}$$

4.3 Bayesian reweighting

An alternative approach to assess the impact of new data on PDFs is the Bayesian reweighting technique, first proposed in Ref. [20] and further developed by the NNPDF collaboration [21, 22]. The Bayesian reweighting can be applied to PDF sets provided in the form of MC replicas, such as the NNPDF3.0 set [10]. Recently, a variant of the method which can be used with PDFs provided in the eigenvector representation has been developed [44] and is also available in HERAFitter.

The Bayesian reweighting is based on the assumption that an ensemble of MC replicas provides a representation of the probability distribution in the space of PDFs. For a given PDF set with \(N_\mathrm{rep}\) replicas \(\{f_k\}\), with \(k=1,2,\ldots ,N_{\mathrm {rep}}\), the central value for a general observable, \(\mathcal {O}(\{f_k\})\), is estimated as the average of the predictions obtained from the ensemble:

$$\begin{aligned} \langle \mathcal {O}\rangle = \frac{1}{N_{\mathrm {rep}}} \sum _{k=1}^{N_{\mathrm {rep}}} \mathcal {O}(f_{k}). \end{aligned}$$
(6)

With the inclusion of new data, the probability distribution associated with the original PDF set is modified according to Bayes' theorem. For each replica k, a weight \(w_k\) is obtained from the \(\chi ^2\) function according to:

$$\begin{aligned} w_k = \frac{(\chi ^2_k)^{\frac{1}{2} (N_{\mathrm {data}}-1) } e^{-\frac{1}{2}\chi ^2_k}}{ \frac{1}{N_{\mathrm {rep}}} \sum ^{N_{\mathrm {rep}}}_{k=1}(\chi ^2_k)^{\frac{1}{2}(N_{\mathrm {data}}-1)} e^{-\frac{1}{2}\chi ^2_k} }, \end{aligned}$$
(7)

where \(N_{\mathrm {data}}\) is the number of new data points and \(\chi ^2_k\) is the \(\chi ^2\) value between data and predictions corresponding to the kth PDF replica.

The prediction for a given observable, after the inclusion of the new data, is evaluated as the weighted average of predictions obtained from the ensemble:

$$\begin{aligned} \langle \mathcal {O}\rangle = \frac{1}{N_{\mathrm {rep}}} \sum _{k=1}^{N_{\mathrm {rep}}} w_k \mathcal {O}(f_{k}). \end{aligned}$$
(8)

The reweighting procedure is very fast and results in a new, updated, MC PDF set. Some of the replicas of the PDF set may have very small weights (typically those which do not describe the new data), and they do not contribute to the ensemble any longer. The number of effective replicas, \(N_\mathrm {eff}\), of a reweighted set is quantified by the Shannon entropy

$$\begin{aligned} N_\mathrm {eff}\equiv \exp \left\{ \frac{1}{N_\mathrm {rep}}\sum _{k=1}^{N_\mathrm {rep}}w_k\ln (N_\mathrm {rep}/w_k)\right\} . \end{aligned}$$
(9)
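Equations 7 to 9 can be sketched in a few lines. The weights are computed in logarithmic form to avoid numerical underflow for large \(\chi ^2\) values, which is an implementation choice of this example; the function names are hypothetical.

```python
import numpy as np

def reweight(chi2, n_data):
    """Weights of Eq. 7, normalised so that they sum to N_rep."""
    chi2 = np.asarray(chi2, dtype=float)
    log_w = 0.5 * (n_data - 1) * np.log(chi2) - 0.5 * chi2
    log_w -= log_w.max()                 # guard against underflow
    w = np.exp(log_w)
    return w * len(w) / w.sum()

def weighted_average(weights, observable):
    """Reweighted prediction of Eq. 8."""
    return float(np.sum(np.asarray(weights) * np.asarray(observable)) / len(weights))

def effective_replicas(weights):
    """Effective number of replicas of Eq. 9 (Shannon entropy)."""
    w = np.asarray(weights, dtype=float)
    n_rep = len(w)
    nonzero = w > 0                      # terms with w = 0 contribute zero
    entropy = np.sum(w[nonzero] * np.log(n_rep / w[nonzero])) / n_rep
    return float(np.exp(entropy))

# Toy example: four replicas with identical chi2 keep unit weights
w_equal = reweight([5.0, 5.0, 5.0, 5.0], n_data=10)
print(w_equal, effective_replicas(w_equal))
```

When all replicas describe the new data equally well, \(N_\mathrm {eff} = N_\mathrm {rep}\); when a single replica dominates, \(N_\mathrm {eff}\) approaches one.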

An un-weighting procedure can be performed on the MC set such that PDFs with small weights are suppressed and a new set is produced, which has unit weight for all PDF replicas in addition to statistically reproducing the averages from Eq. 8.
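One possible way to realise such an un-weighting is a deterministic selection on the cumulative weights, in which each replica is copied a number of times approximately proportional to its weight. The scheme below, with equally spaced sampling points placed at bin centres, is an assumption of this sketch and is not necessarily the procedure used by the authors.

```python
import numpy as np

def unweight(weights, n_new):
    """Return the multiplicity of each replica in an unweighted set.

    The cumulative normalised weights partition [0, 1]; replica k is kept
    once for every sampling point that falls into its interval.
    """
    w = np.asarray(weights, dtype=float)
    cumulative = np.cumsum(w) / w.sum()
    cumulative[-1] = 1.0                             # guard against rounding
    points = (np.arange(n_new) + 0.5) / n_new        # equally spaced in (0, 1)
    index = np.searchsorted(cumulative, points, side="left")
    return np.bincount(index, minlength=len(w))

# Toy example: replicas 1 and 3 carry all the weight
counts = unweight([2.0, 0.0, 2.0, 0.0], n_new=4)
print(counts)
```

Replicas with zero multiplicity drop out of the new set, and averages over the copied replicas approximate the weighted averages of Eq. 8.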

5 Results

The QCD fit analysis described in Sect. 4 is performed on the Tevatron W- and Z-boson data, together with the HERA I data. The fit is used to study the compatibility of the data with NLO QCD predictions, and to assess the impact of the Tevatron data on PDFs. The profiling and the reweighting techniques are used to assess the impact of the Tevatron data on various PDF sets.

The optimal parametrisation for the PDF fit is found through a parametrisation scan, a procedure first introduced in Ref. [1]. The scan is performed by starting from a parametrisation with a basic polynomial form, where \(D_f\), \(E_f\), and the exponential parameters \(F_f\) of Eq. 1 are set to zero. After application of the quark-counting and momentum sum rules, and of the \(x \rightarrow 0\) constraints on \(\bar{u}\) and \(\bar{d}\), the initial PDF parametrisation has 10 free parameters. The 15 additional parameters \(D_f\), \(E_f\) and \(F_f\) are allowed to vary, one parameter at a time, and the parameter which induces the largest reduction of \(\chi ^2_{\text {min}}\) is added as a free parameter for the next iteration of the scan. The PDF fits which lead to solutions with negative high-x PDFs are discarded. For each PDF, the exponential term \(e^{F_fx}\) and the polynomial term \((1 + D_fx + E_fx^2)\) are considered as mutually exclusive, that is, when the exponential term is preferred, the polynomial term is no longer considered in the scan, and vice versa. The procedure is stopped when the reduction in the \(\chi ^2_{\text {min}}\) value, \(\Delta \chi ^2_{\text {min}}\), is less than unity.
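The scan described above is a greedy forward selection, which can be sketched as follows. The callable `fit_chi2` stands in for a full PDF fit and is purely hypothetical: it is assumed to return \(\chi ^2_{\text {min}}\) for a given list of free parameters, or `None` when the fit is discarded (for example because of negative high-x PDFs). Parameters are labelled by tuples such as `("F", "dv")`.

```python
def parametrisation_scan(fit_chi2, base_params, candidates, min_gain=1.0):
    """Greedy scan: add one parameter at a time while chi2 drops by >= min_gain."""
    free = list(base_params)
    remaining = list(candidates)
    chi2 = fit_chi2(free)
    while remaining:
        trials = []
        for cand in remaining:
            trial_chi2 = fit_chi2(free + [cand])
            if trial_chi2 is not None:           # skip discarded fits
                trials.append((chi2 - trial_chi2, cand, trial_chi2))
        if not trials:
            break
        gain, best, best_chi2 = max(trials, key=lambda t: t[0])
        if gain < min_gain:                      # stopping criterion
            break
        free.append(best)
        chi2 = best_chi2
        kind, pdf = best
        # Exponential and polynomial terms are mutually exclusive per PDF
        excluded = ("D", "E") if kind == "F" else ("F",)
        remaining = [c for c in remaining
                     if c != best and not (c[1] == pdf and c[0] in excluded)]
    return free, chi2

# Toy stand-in for the fit: each parameter lowers chi2 by a fixed amount
toy_gains = {("F", "dv"): 10.0, ("D", "dv"): 8.0, ("D", "g"): 5.0, ("E", "g"): 0.5}

def toy_fit(params):
    return 100.0 - sum(toy_gains.get(p, 0.0) for p in params)

selected, final_chi2 = parametrisation_scan(toy_fit, [], list(toy_gains))
print(selected, final_chi2)
```

In the toy example the exponential term of \(d_v\) is selected first, which removes \(D_{d_v}\) from the scan, and the procedure stops once the best remaining gain falls below unity.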

Table 2 shows the results of the parametrisation study: the parameters which induce the largest \(\Delta \chi ^2_{\text {min}}\) are, in order, \(F_{d_v}\), \(F_{u_v}\), \(D_g\), \(D_{\bar{d}}\), and \(D_{\bar{u}}\). The optimal parametrisation found with this procedure has 15 free parameters, and the PDFs are expressed as:

$$\begin{aligned} xg(x)= & {} A_g x^{B_g} (1-x)^{C_g} (1 + D_gx); \end{aligned}$$
(10)
$$\begin{aligned} x u_v(x)= & {} A_{u_v} x^{B_{u_v}}(1-x)^{C_{u_v}} e^{F_{u_v}x}; \end{aligned}$$
(11)
$$\begin{aligned} x d_v(x)= & {} A_{d_v} x^{B_{d_v}}(1-x)^{C_{d_v}} e^{F_{d_v}x}; \end{aligned}$$
(12)
$$\begin{aligned} x\bar{u}(x)= & {} A_{\bar{u}} x^{B_{\bar{u}}}(1-x)^{C_{\bar{u}}}(1 + D_{\bar{u}}x); \end{aligned}$$
(13)
$$\begin{aligned} x\bar{d}(x)= & {} A_{\bar{d}} x^{B_{\bar{d}}}(1-x)^{C_{\bar{d}}}(1 + D_{\bar{d}}x). \end{aligned}$$
(14)

The parametrisation of Eqs. (10)–(14) is used for a fit to the HERA I data, and for a combined fit to the HERA I and Tevatron W- and Z-boson data. Table 3 shows the \(\chi ^2_{\text {min}}\) per degrees of freedom (dof) of the two fits. The contribution to the total \(\chi ^2_{\text {min}}\) of each data set, henceforth referred to as partial \(\chi ^2\), is also shown. The inclusion of the Tevatron W- and Z-boson data in the fit, which corresponds to 93 additional points, results in an increase of about 110 in the overall \(\chi ^2_{\text {min}}\) of the fit, and the partial \(\chi ^2\) per number of points of each of the Tevatron and HERA I data sets is close to unity.

Table 2 Results of the parametrisation study. For each additional free parameter D, E, and F, of the \(d_v\), \(u_v\), gluon, \(\bar{u}\), and \(\bar{d}\) PDF, the reduction of \(\chi ^2_{\text {min}}\) of a fit to the Tevatron and HERA I data, \(\Delta \chi ^2_{\text {min}}\), is shown. For each fit with n free parameters, with \(n=10,11,12,13,14\), the largest \(\Delta \chi ^2_{\text {min}}\) is shown in bold, and the corresponding parameter is added as a free parameter for the \((n+1)\)-parameter fit. The fits which lead to negative high-x PDFs are shown in parentheses, and they are not considered in the parametrisation study

Table 3 Results of a 15-parameter fit to the HERA I data and to the HERA I and Tevatron W- and Z-boson data. The contribution to the total \(\chi ^2_{\text {min}}\) of each data set and the corresponding number of points are shown

An alternative PDF fit is performed with a value of \(r_s=0.5\) instead of the nominal value of \(r_s=1.0\). The results are similar to the nominal fit. Among the Tevatron data sets, the most significant difference is observed for the CDF Z-boson differential cross section as a function of rapidity, for which the partial \(\chi ^2\) is smaller by about two, indicating that the Tevatron data have small sensitivity to the strange quark PDF.

Figures 1 and 2 show the Tevatron Z- and W-boson measurements, respectively, compared to the theoretical predictions evaluated with the PDFs extracted from the combined fit to the HERA I and Tevatron data.

Fig. 1 Theoretical predictions evaluated with the PDFs extracted from a fit to the HERA I and Tevatron data are compared to (a) the Z-boson differential cross section as a function of rapidity, measured by the D0 collaboration, and (b) the Z-boson differential cross section as a function of rapidity, measured by the CDF collaboration. The red continuous lines correspond to the theoretical predictions, the red dashed lines are the theoretical predictions shifted by the experimental shift terms \(\sum _j \varGamma ^\mathrm{exp}_{ij} \beta _{j,\mathrm{exp}}\) of Eq. (2). The yellow bands show the total experimental uncertainty, the black vertical bars show the quadratic sum of statistical and systematic uncorrelated uncertainties. Note that the theoretical predictions and the theoretical predictions shifted by the experimental shift terms are nearly identical and virtually indistinguishable in these plots

Fig. 2 Theoretical predictions evaluated with the PDFs extracted from a fit to the HERA I and Tevatron data are compared to (a) the charge asymmetry of muons as a function of rapidity in \(W \rightarrow \mu \nu \) decays, measured by the D0 collaboration, (b) the W-boson charge asymmetry as a function of rapidity in the \(W \rightarrow e \nu \) decay channel, measured by the CDF collaboration, and (c) the W-boson charge asymmetry as a function of rapidity in the \(W \rightarrow e \nu \) decay channel, measured by the D0 collaboration. The red continuous lines correspond to the theoretical predictions, the red dashed lines are the theoretical predictions shifted by the experimental shift terms \(\sum _j \varGamma ^\mathrm{exp}_{ij} \beta _{j,\mathrm{exp}}\) of Eq. (2). The yellow bands show the total experimental uncertainty, the black vertical bars show the quadratic sum of statistical and systematic uncorrelated uncertainties

The central value and the uncertainties of the PDFs are evaluated with MC replicas [45]: the data points are smeared using Gaussian distributions, according to their experimental uncertainties, and the PDF fit is repeated 1000 times, using different random seeds for the smearing. The central PDFs are calculated as the average of the replicas and the PDF uncertainties are calculated from their standard deviation. Figure 3 shows the comparison of the PDFs extracted with the MC-replica method by fitting the HERA I data, and by fitting the HERA I and Tevatron W- and Z-boson data. Figure 4 shows the comparison of the relative uncertainty of the two PDFs. A significant reduction of the PDF uncertainties is observed in the fit which includes the Tevatron W- and Z-boson measurements, in particular for the valence quarks and \(\bar{d}\) quarks.
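The MC-replica method can be illustrated with a toy example in which the PDF fit is replaced by a simple estimator. The sketch assumes uncorrelated Gaussian uncertainties only, whereas the analysis also involves correlated systematic uncertainties; the callable `fit` is a hypothetical stand-in for the full QCD fit.

```python
import numpy as np

def mc_replicas(data, sigma, fit, n_rep=1000, seed=0):
    """Central values and uncertainties from fits to smeared pseudo-data.

    data, sigma : measured values and their (uncorrelated) uncertainties
    fit         : callable mapping a pseudo-data array to fitted quantities
    """
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_rep):
        pseudo_data = data + sigma * rng.standard_normal(data.shape)
        results.append(fit(pseudo_data))
    results = np.asarray(results)
    central = results.mean(axis=0)             # average of the replicas
    uncertainty = results.std(axis=0, ddof=1)  # standard deviation
    return central, uncertainty

# Toy "fit": estimate a constant from 100 measurements with 0.1 uncertainty
toy_data = np.full(100, 2.0)
toy_sigma = np.full(100, 0.1)
central, uncertainty = mc_replicas(toy_data, toy_sigma,
                                   fit=lambda d: np.array([d.mean()]))
print(central, uncertainty)
```

In this toy case the spread of the replicas reproduces the expected uncertainty of the mean, \(0.1/\sqrt{100} = 0.01\), up to statistical fluctuations from the finite number of replicas.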

Fig. 3 PDFs at the starting scale \(Q^2 =1.7\) GeV\(^2\) as a function of Bjorken-x for (a) \(u_v\), (b) \(d_v\), (c) \(\bar{u}\), and (d) \(\bar{d}\), determined with a fit to the HERA I data (blue), and with a fit to the HERA I and Tevatron W- and Z-boson data (yellow). The bands represent the PDF uncertainty evaluated with the MC-replica method

Fig. 4 Relative PDF uncertainties at the starting scale \(Q^2 =1.7\) GeV\(^2\) as a function of Bjorken-x for (a) \(u_v\), (b) \(d_v\), (c) \(\bar{u}\), and (d) \(\bar{d}\), determined with a fit to the HERA I data (blue), and with a fit to the HERA I and Tevatron W- and Z-boson data (yellow). The bands represent the PDF uncertainty evaluated with the MC-replica method

A fit of the HERA I and Tevatron W- and Z-boson data with the same settings, but with a correlation model in which trigger and identification uncertainties are treated as correlated bin-to-bin, yields very similar central PDFs and PDF uncertainties (not shown). The \(\chi ^2\) values of all the data sets are also very similar, except for the \(\chi ^2\) of the CDF W-boson asymmetry measurement, which is about twice as large.

The W-boson charge asymmetries rely on the reconstruction of the W-boson rapidity, which is measured assuming a fixed W-boson mass, and inferring the unmeasured longitudinal momentum of the neutrino on a statistical basis [5, 6]. The reconstruction of the W-boson rapidity introduces a model dependence in the measurement. To study the possible bias due to the W-boson rapidity reconstruction, an alternative fit is performed in which the W-boson charge asymmetries measured by CDF and D0 are excluded, and the latest D0 measurement of the electron asymmetry is included. The \(\chi ^2_{\text {min}}\)/dof and the partial \(\chi ^2\) of the fit are shown in Table 4. For this fit, too, the partial \(\chi ^2\) per number of points of each of the Tevatron and HERA I data sets is close to unity. The \(d_v\) PDF determined from the fit is shown in Fig. 5, and compared to the nominal fit. The fit to the lepton asymmetry data yields compatible results, but the uncertainties on the \(d_v\) PDF are up to twice as large.

Fig. 5

a d-valence PDF at the scale \(Q^2 = 1.7\) GeV\(^2\) as a function of Bjorken-x and b d-valence relative PDF uncertainties, determined with a fit to the HERA I data (blue), with a fit to the HERA I and Tevatron W-boson asymmetry and Z-boson data (yellow), and with a fit to the HERA I and Tevatron W-boson lepton asymmetry and Z-boson data (green). The bands represent the PDF uncertainty evaluated with the MC-replica method

Table 4 Results of a 15-parameter fit to the HERA I and Tevatron W-boson lepton asymmetry and Z-boson data. The contribution to the total \(\chi ^2_{\text {min}}\) of each data set and the corresponding number of points are shown

The impact of the posterior inclusion of the Tevatron W- and Z-boson measurements on the PDF uncertainties as estimated by CT10nlo, MMHT2014, and NNPDF3.0 is assessed by profiling and reweighting. For consistency with the other PDFs, the uncertainties of the CT10nlo PDFs, which are provided at 90 % confidence level, are scaled to the 68 % confidence level by dividing them by a factor of 1.645. The three PDF sets already include the CDF and D0 Z-boson differential cross sections as a function of rapidity, and the MMHT2014 fit also includes the D0 muon charge asymmetry in \(W \rightarrow \mu \nu \) decays and the CDF W charge asymmetry in the \(W \rightarrow e \nu \) decay channel. Only the measurements that are not included in each of the PDF sets are considered for the corresponding profiling or reweighting. The compatibility of the Tevatron data with the CT10nlo, MMHT2014 and NNPDF3.0 sets is tested by evaluating the \(\chi ^2\) function of Eq. (2), accounting for asymmetric PDF uncertainties according to Eq. (3). To perform this calculation for the NNPDF3.0 set, the covariance matrix for the predictions is decomposed using the eigenvector representation. Table 5 shows the compatibility between the Tevatron measurements and the above PDF sets, together with the partial \(\chi ^2\) of each data set. The partial \(\chi ^2\) per number of points of each of the Tevatron data sets, and the total \(\chi ^2\)/dof, are close to unity for all the PDFs when the \(\chi ^2\) evaluation includes the PDF uncertainties. The quality of the agreement deteriorates significantly if the \(\chi ^2\) evaluation neglects the PDF uncertainties. This effect is more pronounced for the CT10nlo and NNPDF3.0 sets, which include fewer data from the Tevatron. This indicates that the Tevatron data have significant constraining power. Among the Tevatron data sets, the D0 W charge asymmetry in the \(W \rightarrow e \nu \) decay channel provides the strongest constraints.
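The factor of 1.645 quoted for CT10nlo can be checked with a few lines of Python. This is only an illustrative sketch: it assumes Gaussian uncertainties and that the CT10nlo uncertainties are quoted at 90 % confidence level, so that dividing by the half-width of the central 90 % interval of a unit Gaussian gives one-standard-deviation (about 68 %) uncertainties. The function name `to_one_sigma` is hypothetical and not part of HERAFitter.

```python
from statistics import NormalDist

# Half-width, in standard deviations, of the central 90% interval
# of a unit Gaussian: 5% of the probability lies in each tail.
Z90 = NormalDist().inv_cdf(0.95)


def to_one_sigma(unc_90cl):
    """Rescale a list of 90% CL uncertainties to one standard deviation.

    Assumes Gaussian behaviour of the uncertainties (an assumption of
    this sketch, not a statement about the CT10nlo methodology).
    """
    return [u / Z90 for u in unc_90cl]


if __name__ == "__main__":
    print(f"z(90%) = {Z90:.3f}")
    print(to_one_sigma([0.10, 0.25]))
```

The same rescaling would apply to the upper and lower components of an asymmetric uncertainty separately.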

Table 5 Comparison between the Tevatron W-boson measurements and the CT10nlo, MMHT2014, and NNPDF3.0 PDFs. The partial \(\chi ^2\) of each data set, the total \(\chi ^2\), and the total \(\chi ^2\) without PDF uncertainties are shown

The CT10nlo and MMHT2014 PDFs are profiled according to Eq. (4). The results of the profiling on the d-valence PDFs, and on their relative uncertainties, are shown in Fig. 6. The profiling changes the shape of the distribution more for the CT10nlo set than for the MMHT2014 set. A significant reduction of the uncertainties is observed for both sets, in particular in the low- and medium-x range. The NNPDF3.0 PDFs are reweighted to the Tevatron data. The number of effective replicas remaining after reweighting, \(N_\mathrm {eff}\), is only 1, and hence the resulting PDFs are not shown.
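To illustrate why \(N_\mathrm {eff} = 1\) leaves nothing meaningful to show, the following sketch computes the effective number of replicas for a toy set of \(\chi ^2\) values. It assumes the Bayesian reweighting prescription commonly used for Monte Carlo replica sets, with weights \(w_k \propto (\chi ^2_k)^{(n-1)/2} e^{-\chi ^2_k/2}\) normalised to sum to the number of replicas N, and \(N_\mathrm {eff} = \exp [\frac{1}{N}\sum _k w_k \ln (N/w_k)]\). These formulas are not given in this section, and the function name and the numbers are purely illustrative.

```python
import math


def reweight(chi2, n_points):
    """Return (weights, n_eff) for a list of per-replica chi2 values.

    chi2     : chi2 of each replica with respect to the new data (> 0)
    n_points : number of new data points
    Weights are normalised so that they sum to the number of replicas.
    """
    n_rep = len(chi2)
    # Work with log-weights to avoid underflow for large chi2.
    logw = [0.5 * (n_points - 1) * math.log(c) - 0.5 * c for c in chi2]
    shift = max(logw)
    w = [math.exp(lw - shift) for lw in logw]
    norm = n_rep / sum(w)
    w = [x * norm for x in w]
    # Shannon entropy of the weights; terms with w = 0 contribute 0.
    entropy = sum(x * math.log(n_rep / x) for x in w if x > 0.0) / n_rep
    return w, math.exp(entropy)


if __name__ == "__main__":
    # All replicas describe the data equally well: nothing is lost.
    print(reweight([10.0, 10.0, 10.0, 10.0], n_points=10)[1])
    # One replica is strongly favoured: the ensemble collapses onto it.
    print(reweight([10.0, 500.0, 500.0, 500.0], n_points=10)[1])
```

When a single replica carries essentially all of the weight, the reweighted ensemble no longer samples a distribution, so neither a central value nor an uncertainty band can be estimated reliably from it.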

Fig. 6

d-valence PDF at the scale \(Q^2 = 1.7\) GeV\(^2\) as a function of Bjorken-x before and after profiling for the a CT10nlo and c MMHT2014 PDFs and the corresponding relative uncertainties for b CT10nlo and d MMHT2014

The original and the profiled d-valence PDFs are compared with the result of the fit to the HERA I and Tevatron W- and Z-boson data in Figs. 7 and 8, respectively. The profiling with the Tevatron data improves the agreement of the d-valence distribution between the MMHT2014 and CT10nlo PDF sets.

Fig. 7

a d-valence PDF at the scale \(Q^2 = 1.7\) GeV\(^2\) as a function of Bjorken-x determined from a fit to the HERA I and Tevatron W- and Z-boson data, and from the CT10nlo and MMHT2014 PDFs, b ratio of the d-valence PDF central values and uncertainties with respect to the d-valence PDF determined from a fit to the HERA I and Tevatron W- and Z-boson data

Fig. 8

a d-valence PDF at the scale \(Q^2 = 1.7\) GeV\(^2\) as a function of Bjorken-x determined from a fit to the HERA I and Tevatron W- and Z-boson data, and from the profiled CT10nlo and MMHT2014 PDFs, b ratio of the d-valence PDF central values and uncertainties with respect to the d-valence PDF determined from a fit to the HERA I and Tevatron W- and Z-boson data

6 Summary

The HERAFitter framework is used to perform a QCD analysis of the DIS data from HERA, together with W- and Z-boson production measurements performed at the Tevatron collider in Run II. The correlation model of the systematic uncertainties of the Tevatron data is investigated, and a modification is proposed which accounts for the statistical nature of some of the systematic uncertainties. The Tevatron and HERA data are well described by an NLO fit with 15 free parameters, with a new parametrisation of the PDFs which adds a combination of linear and exponential terms to the basic form. The impact of the Tevatron W- and Z-boson measurements is assessed by comparing the PDF uncertainties from a fit to the HERA data alone with those from a fit to the HERA and Tevatron data. A significant reduction of the uncertainties is observed in the latter case, for the valence quarks and \(\bar{d}\) quarks in particular.

The Tevatron measurements are also compared to predictions evaluated with modern PDF sets, and the impact of the data on the PDFs is assessed using profiling and reweighting techniques. The profiling technique takes into account asymmetric PDF uncertainties. Good agreement between measurements and predictions is observed if the PDF uncertainties of the predictions are taken into account. After the inclusion of the Tevatron data, the PDF uncertainties on the d-valence quarks are significantly reduced, especially for the PDF sets which include only the Z-boson data from the Tevatron, and the agreement between the various PDF sets is improved. These findings highlight the importance of the Tevatron W- and Z-boson production data for constraining the d-quark and valence PDFs, and they suggest that the data should be used in future global PDF analyses. All the supporting material needed to fit the Tevatron data, including the updated correlation model and the grid files for fast theory calculations, is publicly available on the web page of the HERAFitter project.