Combination and QCD analysis of charm and beauty production cross-section measurements in deep inelastic $ep$ scattering at HERA

Measurements of open charm and beauty production cross sections in deep inelastic $ep$ scattering at HERA from the H1 and ZEUS Collaborations are combined. Reduced cross sections are obtained in the kinematic range of negative four-momentum transfer squared of the photon $2.5$ GeV$^2<Q^2<2000$ GeV$^2$ and Bjorken scaling variable $3\cdot10^{-5}<x_{\text{Bj}}<5\cdot10^{-2}$. The combination method accounts for the correlations of the statistical and systematic uncertainties among the different datasets. Perturbative QCD calculations are compared to the combined data. A next-to-leading order QCD analysis is performed using these data together with the combined inclusive deep inelastic scattering cross sections from HERA. The running charm- and beauty-quark masses are determined as $m_c(m_c) = 1.290^{+0.046}_{-0.041}\text{(exp/fit)}^{+0.062}_{-0.014}\text{(model)}^{+0.003}_{-0.031}\text{(parameterisation)}$ GeV and $m_b(m_b) = 4.049^{+0.104}_{-0.109}\text{(exp/fit)}^{+0.090}_{-0.032}\text{(model)}^{+0.001}_{-0.031} \text{(parameterisation)}$~GeV.

Reduced cross sections are obtained in the kinematic range of negative four-momentum transfer squared of the photon 2.5 GeV 2 ≤ Q 2 ≤ 2000 GeV 2 and Bjorken scaling variable 3 · 10 −5 ≤ x Bj ≤ 5 · 10 −2 . The combination method accounts for the correlations of the statistical and systematic uncertainties among the different datasets. Perturbative QCD calculations are compared to the combined data. A next-to-leading order QCD analysis is performed using these data together with the combined inclusive deep inelastic scattering cross sections from HERA. The running charmand beauty-quark masses are determined as m c (m c ) =

Introduction
Measurements of open charm and beauty production in neutral current (NC) deep inelastic electron 1 -proton scattering (DIS) at HERA provide important input for tests of the theory of strong interactions, quantum chromodynamics (QCD). Measurements at HERA  have shown that heavyflavour production in DIS proceeds predominantly via the boson-gluon-fusion process, γ g → QQ, where Q is the heavy quark. The cross section therefore depends strongly on the gluon distribution in the proton and the heavy-quark mass. This mass provides a hard scale for the applicability of perturbative QCD (pQCD). However, other hard scales are also present in this process: the transverse momenta of the outgoing quarks and the negative four momentum squared, Q 2 , of the exchanged photon. The presence of several hard scales complicates the calculation of heavy-flavour production in pQCD. Different approaches have been developed to cope with the multiple scale problem inherent in this process. In this paper, the massive fixed-flavour-number scheme (FFNS) [25][26][27][28][29][30][31][32][33][34][35][36] and different implementations of the variable-flavour-number scheme (VFNS) [37][38][39][40][41] are considered. At HERA, different flavour tagging methods are applied for charm and beauty cross-section measurements: the full reconstruction of D or D * ± mesons [1,2,4-6,10-12,15,17, ai Supported by the Israel Science Foundation aj Supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) and its grants for Scientific Research ak Supported by the Swiss National Science Foundation al Supported by the Polish National Science Centre (NCN) grant no.
DEC-2014/13/B/ST2/02486 1 In this paper the term 'electron' denotes both electron and positron . [20][21][22], which is almost exclusively sensitive to charm production; the lifetime of heavy-flavoured hadrons [7][8][9]14,23] and their semi-leptonic decays [13,16,19], both enabling the measurement of the charm and beauty cross section simultaneously. In general, the different methods explore different regions of the heavy-quark phase space and show different dependencies on sources of systematic uncertainties. Therefore, by using different tagging techniques a more complete picture of heavy-flavour production is obtained.
In this paper, a simultaneous combination of charm and beauty production cross-section measurements is presented. This analysis is an extension of the previous H1 and ZEUS combination of charm measurements in DIS [42], including new charm and beauty data [13,14,16,19,[21][22][23] and extracting combined beauty cross sections for the first time. As a result, a single consistent dataset from HERA of reduced charm and beauty cross sections in DIS is obtained, including all correlations. This dataset covers the kinematic range of photon virtuality 2.5 GeV 2 ≤ Q 2 ≤ 2000 GeV 2 and Bjorken scaling variable 3 · 10 −5 ≤ x Bj ≤ 5 · 10 −2 .
The procedure follows the method used previously [42][43][44][45][46][47]. The correlated systematic uncertainties and the normalisation of the different measurements are accounted for such that one consistent dataset is obtained. Since different experimental techniques of charm and beauty tagging have been employed using different detectors and methods of kinematic reconstruction, this combination leads to a significant reduction of statistical and systematic uncertainties with respect to the individual measurements. The simultaneous combination of charm and beauty cross-section measurements reduces the correlations between them and hence also the uncertainties. The combined reduced charm cross sections of the previous analysis [42] are superseded by the new results presented in this paper.
The combined data are compared to theoretical predictions obtained in the FFNS at next-to-leading order (NLO, O(α 2 s )) QCD using HERAPDF2.0 [48], ABKM09 [29,30] and ABMP16 [32] parton distribution functions (PDFs), and to approximate next-to-next-to-leading order (NNLO, O(α 3 s )) using ABMP16 [32] PDFs. In addition, QCD calculations in the RTOPT [37] VFNS at NLO and approximate NNLO are compared with the data. The NLO calculations are at O(α 2 s ) except for the massless parts of the coefficient functions, which are at O(α s ); the NNLO calculations are one order of α s higher. A comparison is also made to predictions of two variants of the FONLL-C scheme [38][39][40] (O(α 3 s ) (NNLO) in the PDF evolution, O(α 2 s ) in all coefficient functions): the default scheme, which includes next-to-leading-log (NLL) resummation of quasi-collinear final state gluon radiation, and a variant which includes NLL low-x resummation in the PDFs and the matrix elements (NLLsx) [41] in addition.
The new data are subjected to a QCD analysis together with the final inclusive DIS cross-section data from HERA [48] allowing for the determination at NLO of the running charm-and beauty-quark masses, as defined from the QCD Lagrangian in the modified minimum-subtraction (MS) scheme.
The paper is organised as follows. In Sect. 2, the reduced heavy-flavour cross section is defined and the theoretical framework is briefly introduced. The heavy-flavour tagging methods, the data samples and the combination method are presented in Sect. 3. The resulting reduced charm and beauty cross sections are presented in Sects. 4 and 5, where they are compared with theoretical calculations based on existing PDF sets and with existing predictions at NLO and at NNLO in the FFNS and VFNS. In Sect. 6, the NLO QCD analysis is described and the measurement of the running masses of the charm and beauty quarks in the MS scheme at NLO is presented. The conclusions are given in Sect. 7.

Heavy-flavour production in DIS
In this paper, charm and beauty production via NC DIS are considered. In the kinematic range explored by the analysis of the data presented here, Q 2 is much smaller than M 2 Z , such that the virtual photon exchange dominates. Contributions from Z exchange and γ Z interference are small and therefore neglected. The cross section for the production of a heavy flavour of type Q, with Q being either beauty, b, or charm, c, may then be written in terms of the heavy-flavour contributions to the structure functions F 2 and F L , F QQ 2 (x Bj , Q 2 ) and F QQ L (x Bj , Q 2 ), as where y denotes the lepton inelasticity. The superscripts QQ indicate the presence of a heavy quark pair in the final state. The cross section d 2 σ QQ /dx Bj dQ 2 is given at the Born level without QED and electroweak radiative corrections, except for the running electromagnetic coupling, α(Q 2 ). In this paper, the results are presented in terms of reduced cross sections, defined as follows: In the kinematic range addressed, the expected contribution from the exchange of longitudinally polarised photons, F QQ L , is small. In charm production it is expected to reach a few per cent at high y [49]. The structure functions F QQ 2 and F QQ L are calculated to the same order (in most cases O(α 2 s )) in all calculations explicitly performed in this paper.
Various theoretical approaches can be used to describe heavy-flavour production in DIS. At values of Q 2 not very much larger than the heavy-quark mass, m Q , heavy flavours are predominantly produced dynamically by the photongluon-fusion process. The creation of a QQ pair sets a lower limit of 2m Q to the mass of the hadronic final state. This low mass cutoff affects the kinematics and the higher order corrections in the phase space accessible at HERA. Therefore, a careful theoretical treatment of the heavy-flavour masses is mandatory for the pQCD analysis of heavy-flavour production as well as for the determination of the PDFs of the proton from data including heavy flavours.
In this paper, the FFNS is used for pQCD calculations for the corrections of measurements to the full phase space and in the QCD fits. In this scheme, heavy quarks are always treated as massive and therefore are not considered as partons in the proton. The number of (light) active flavours in the PDFs, n f , is set to three and heavy quarks are produced only in the hardscattering process. The leading-order (LO) contribution to heavy-flavour production (O(α s ) in the coefficient functions) is the photon-gluon-fusion process. The NLO massive coefficient functions using on-shell mass renormalisation (pole masses) [25][26][27][28] were adopted by many global QCD analysis groups [31,[33][34][35], providing PDFs derived from this scheme. They were extended to the MS scheme [30], using scale dependent (running) heavy-quark masses. The advantages of performing heavy-flavour calculations in the MS scheme are reduced scale uncertainties and improved theoretical precision of the mass definition [24,36]. In all FFNS heavy-flavour calculations presented in this paper, the default renormalisation scale µ r and factorisation scale µ f are set to µ r = µ f = Q 2 + 4m 2 Q , where m Q is the appropriate pole or running mass.
For the extraction of the combined reduced cross sections of charm and beauty production, it is necessary to predict inclusive cross sections as well as exclusive cross sections with certain phase-space restrictions applied. For this purpose, the FFNS at NLO is used to calculate inclusive [25][26][27][28] and exclusive [50] quantities in the pole-mass scheme. This is currently the only scheme for which exclusive NLO calculations are available.
The QCD analysis at next-to-leading order 2 including the extraction of the heavy-quark running masses is performed in the FFNS with the OPENQCDRAD programme [31,51,52] in the xFitter (former HERAFitter) framework [53]. In OPENQCDRAD, heavy-quark production is calculated either using the MS or the pole-mass scheme of heavy-quark masses. In this paper, the MS scheme is adopted.
Predictions from different variants of the VFNS are also compared to the data. The expectations from the NLO and approximate NNLO RTOPT [37] implementation as used for HERAPDF2.0 [48] are confronted with both the charm and beauty cross sections while the FONLL-C calculations [39][40][41] are compared to the charm data only. In the VFNS, heavy quarks are treated as massive at small Q 2 up to Q 2 ≈ O(m 2 Q ) and as massless at Q 2 ≫ m 2 Q , with interpolation prescriptions between the two regimes which avoid double counting of common terms. In the FONLL-C calculations, the massive part of the charm coefficient functions is treated at NLO (O(α 2 s )) while the massless part and the PDFs are treated at NNLO (O(α 2 s ) and O(α 3 s ), respectively). In addition to the default FONLL-C scheme the NLLsx variant [41] is considered.

Combination of H1 and ZEUS measurements
The different charm-and beauty-tagging methods exploited at HERA enable a comprehensive study of heavy-flavour production in NC DIS.
Using fully reconstructed D or D * ± mesons gives the best signal-to-background ratio for measurements of the charm production process. Although the branching ratios of beauty hadrons to D and D * ± mesons are large, the contributions from beauty production to the observed D or D * ± meson samples are small for several reasons. Firstly, beauty production in ep collisions is suppressed relative to charm production by a factor 1 / 4 due to the quark's electric charge coupling to the photon. Secondly, the photon-gluon-fusion cross section depends on the invariant mass of the outgoing partons, s, which has a threshold value of 4m 2 Q . Because the beautyquark mass, m b , is about three times the charm-quark mass, m c , beauty production is significantly suppressed. Thirdly, in beauty production D and D * ± mesons originate from the fragmentation of charm quarks that are produced by the weak decay of B mesons. Therefore the momentum fraction of the beauty quark carried by the D or D * ± meson is small, so that the mesons often remain undetected.
Fully inclusive analyses based on the lifetime of the heavyflavoured mesons are sensitive to both charm and beauty production. Although the first two reasons given above for the suppression of beauty production relative to charm production also hold in this case, sensitivity to beauty production can be enhanced by several means. The proper lifetime of B mesons is on average a factor of 2 to 3 that of D mesons [59]. Therefore, the charm and beauty contributions can be disentangled by using observables directly sensitive to the lifetime of the decaying heavy-flavoured hadrons. The separation can be further improved by the simultaneous use of observables sensitive to the mass of the heavy-flavoured hadron: the relative transverse momentum, p rel T , of the particle with respect to the flight direction of the decaying heavy-flavoured hadron; the number of tracks with lifetime information; the invariant mass calculated from the charged particles attached to a secondary-vertex candidate.
The analysis of lepton production is sensitive to semileptonic decays of both charm and beauty hadrons. When taking into account the fragmentation fractions of the heavy quarks as well as the fact that in beauty production leptons may originate both from the b → c and the c → s transitions, the semi-leptonic branching fraction of B mesons is about twice that of D mesons [59]. Because of the large masses of B mesons and the harder fragmentation of beauty quarks compared to charm quarks, leptons originating directly from the B decays have on average higher momenta than those produced in D meson decays. Therefore, the experimentally observed fraction of beauty-induced leptons is enhanced relative to the observed charm-induced fraction. Similar methods as outlined in the previous paragraph are then used to further facilitate the separation of the charm and beauty contributions on a statistical basis.
While the measurement of fully reconstructed D or D * ± mesons yields the cleanest charm production sample, it suffers from small branching fractions and significant losses, because all particles from the D or D * ± meson decay have to be measured. Fully inclusive and semi-inclusive-lepton analyses, which are sensitive to both charm and beauty production, profit from larger branching fractions and better coverage in polar angle. However, they are affected by a worse signal to background ratio and the large statistical correla-tions between charm and beauty measurements inherent to these methods.

Data samples
The H1 [60][61][62] and ZEUS [63] detectors were general purpose instruments which consisted of tracking systems surrounded by electromagnetic and hadronic calorimeters and muon detectors, ensuring close to 4π coverage of the ep interaction region. Both detectors were equipped with highresolution silicon vertex detectors [64,65].
The datasets 1-8 have already been used in the previous combination [42] of charm cross-section measurements, while the datasets 9-13 are included for the first time in this analysis. Dataset 9 of the current analysis supersedes one dataset of the previous charm combination (dataset 8 in Table 1 of [42]), because the earlier analysis was based on a subset of only about 30 % of the final statistics collected during the HERA II running period.
For the inclusive lifetime analysis [14] (dataset 1) the reduced cross sections σ cc red and σ bb red are taken directly from Table 1 Datasets used in the combination. For each dataset, the tagging method, the Q 2 range, integrated luminosity (L), centre-of-mass energy ( √ s) and the numbers of charm (N c ) and beauty (N b ) measurements are given. The tagging method VTX denotes inclusive measurements based on lifetime information using a silicon vertex detector. Charge conjugates are always implied for the particles given in the column 'Tagging'

Dataset
Tagging the publication. For all other measurements, the combination starts from the measured double-differential visible cross sections σ vis,bin in bins of Q 2 and either x Bj or y, where the visibility is defined by the particular range of transverse momentum p T and pseudorapidity 3 η of the D meson, lepton or jet as given in the corresponding publications. In case of inclusive D meson cross sections, small beauty contributions as estimated in the corresponding papers are subtracted. Consistent with Eq. (1), all published visible cross-section measurements are corrected to Born level apart from the running of α, i.e. they include corrections for radiation of real photons from the incoming and outgoing lepton using the HER-ACLES programme [66]. QED corrections to the incoming and outgoing quarks are judged to be negligible and are therefore not considered. All cross sections are updated using the most recent hadron decay branching ratios [59].

QQ red
Except for dataset 1 of Table 1, for which only measurements expressed in the full phase space are available, the visible cross sections σ vis,bin measured in a limited phase space are converted to reduced cross sections σ QQ red using a common theory. The reduced cross section of a heavy flavour Q at a reference (x Bj , Q 2 ) point is extracted according to The programme for heavy-quark production in DIS, HVQDIS [50], is used with running α to calculate the theory predictions for σ QQ,th red (x Bj , Q 2 ) and σ th vis,bin in the NLO FFNS. Since the ratio in Eq. (3) describes the extrapolation from the visible phase space in p T and η of the heavy-flavour tag to the full phase space, only the shape of the cross-section predictions in p T and η is relevant for the corrections, while theory uncertainties related to normalisation cancel.
In pQCD, σ th red can be written as a convolution integral of proton PDFs with hard matrix elements. For the identification of heavy-flavour production, however, specific particles used for tagging have to be measured in the hadronic final state. This requires that in the calculation of σ th vis , the convolution includes the proton PDFs, the hard matrix elements and the fragmentation functions. In the case of the HVQDIS programme, non-perturbative fragmentation functions are used. The different forms of the convolution integrals for σ th red and σ th vis necessitate the consideration of different sets of theory parameters.
The following parameters are used in these NLO calculations and are varied within the quoted limits to estimate the uncertainties in the predictions introduced by these parameters: • The renormalisation and factorisation scales are taken as µ r = µ f = Q 2 + 4m 2 Q . The scales are varied simultaneously up or down by a factor of two.  [42,45] at NLO determined within the xFitter framework. No heavy-flavour measurements were included in the determination of these PDF sets. These PDF sets are those used in the previous combination [42] which were calculated for m c = 1.5 ± 0.15 GeV, α n f =3 s (M Z ) = 0.105 ± 0.002 and simultaneous variations of the renormalisation and factorisation scales up or down by a factor two. For the determination of the PDFs, the beauty-quark mass was fixed at m b = 4.50 GeV. The renormalisation and factorisation scales were set to µ r = µ f = Q for the light flavours and to µ r = µ f = Q 2 + 4m 2 Q for the heavy flavours. For all parameter settings considered, the respective HERAPDF1.0 set is used. As a cross check of the extrapolation procedure, the cross sections are also evaluated with the 3-flavour NLO versions of the HERAPDF2.0 set (FF3A) [48]; the differences are found to be smaller than the PDF-related cross-section uncertainties.
For the calculation of σ th vis , assumptions have been made on the fragmentation of the heavy quarks into particular hadrons and, when necessary, on the subsequent decays of the heavy flavoured hadrons into the particles used for tagging. In the calculation of σ th vis the following settings and parameters are used in addition to those needed for σ th red and are varied within the quoted limits: • The charm fragmentation function is described by the Kartvelishvili function [67] controlled by a single parameter α K to describe the longitudinal fraction of the charmquark momentum transferred to the D or D * ± meson. Depending on the invariant massŝ of the outgoing parton system, different values of α K and their uncertainties are used as measured at HERA [68,69] for D * ± mesons. The variation of α K as a function ofŝ observed in D * ± measurements has been adapted to the longitudinalfragmentation function of ground state D mesons not originating from D * ± decays [42]. Transverse fragmentation is modelled by assigning to the charmed hadron a transverse momentum k T with respect to the direction of the charmed quark with an average value of k T = 0.35 ± 0.15 GeV [42]. • The charm fragmentation fractions of a charm quark into a specific charmed hadron and their uncertainties are taken from [70]. • The beauty fragmentation function is parameterised according to Peterson et al. [71] with ǫ b = 0.0035 ± 0.0020 [72]. • The branching ratios of D and D * ± mesons into the specific decay channels analysed and their uncertainties are taken from [59]. • The branching fractions of semi-leptonic decays of heavy quarks to a muon or electron and their uncertainties are taken from [59]. • The decay spectra of leptons originating from charmed hadrons are modelled according to [73]. • The decay spectra for beauty hadrons into leptons are taken from the PYTHIA [74] Monte Carlo (MC) programme, mixing direct semi-leptonic decays and cascade decays through charm according to the measured branching ratios [59]. It is checked that the MC describes BELLE and BABAR data [75,76] well. • When necessary for the extrapolation procedure, partonlevel jets are reconstructed using the same clustering algorithms as used on detector level, and the cross sections are corrected for jet-hadronisation effects using corrections derived in the original papers [16,23]. 4 While the central values for the extrapolation factors σ QQ,th red (x Bj , Q 2 )/σ th vis,bin (see Eq. 3) are obtained in the FFNS pole-mass scheme at NLO, their uncertainties are calculated such that they should cover potential deviations from the unknown 'true' QCD result. The resulting reduced cross sections, with these uncertainties included, thus can be compared to calculations in any QCD scheme to any order.

Combination method
The quantities to be combined are the reduced charm and beauty cross sections, σ cc red and σ bb red , respectively. The combined cross sections are determined at common (x Bj , Q 2 ) grid points. For σ cc red , the grid is chosen to be the same as in [42]. The results are given for a centre-of-mass energy of √ s = 318 GeV. When needed, the measurements are transformed to the common grid (x Bj , Q 2 ) points using inclusive NLO FFNS calculations [25][26][27][28]. The uncertainties on the resulting scaling factors are found to be negligible.
The combination is based on the χ 2 -minimisation procedure [43] used previously [42,44,45,48]. The total χ 2 is defined as The three sums run over the different input datasets e, listed in Table 1, the (x Bj , Q 2 ) grid points i, for which the measured cross sections µ i,e are combined to the cross sections m i , and the sources j of the shifts b j in units of standard deviations of the correlated uncertainties. The correlated uncertainties comprise the correlated systematic uncertainties and the statistical correlation between the charm and beauty cross-section measurements. The quantities γ i,e j , δ i,e,stat and δ i,e,uncorr denote the relative correlated systematic, relative statistical and relative uncorrelated systematic uncertainties, respectively. The components of the vector m are the combined cross sections m i while those of the vector b are the shifts b j .
In the present analysis, the correlated and uncorrelated systematic uncertainties are predominantly of multiplicative nature, i.e. they are proportional to the expected cross sections m i . The statistical uncertainties are mainly background dominated and thus are treated as constant. All experimental systematic uncertainties are treated as independent between H1 and ZEUS. For the datasets 1, 8 and 11 of Table 1, statistical correlations between charm and beauty cross sections are accounted for as reported in the original papers. Where necessary, the statistical correlation factors are corrected to take into account differences in the kinematic region of the charm and beauty measurements (dataset 11) or binning schemes (dataset 1), using theoretical predictions calculated with the HVQDIS programme. The consistent treatment of the correlations of statistical and systematic uncertainties, including the correlations between the charm and beauty data sets where relevant, yields a significant reduction of the overall uncertainties of the combined data, as detailed in the following section.

Combined cross sections
The values of the combined cross sections σ cc red and σ bb red , together with the statistical, the uncorrelated and correlated systematic and the total uncertainties, are listed in Tables 2  and 3. A total of 209 charm and 57 beauty data points are combined simultaneously to obtain 52 reduced charm and 27 Table 2 Reduced cross section for charm production, σ cc red , obtained by the combination of H1 and ZEUS measurements. The cross-section values are given together with the statistical (δ stat ) and the uncorrelated (δ uncor ) and correlated (δ cor ) systematic uncertainties. The total uncertainties (δ tot ) are obtained through adding the statistical, uncorrelated and correlated systematic uncertainties in quadrature  Table 3 Reduced cross section for beauty production, σ bb red , obtained by the combination of H1 and ZEUS measurements. The cross-section values are given together with the statistical (δ stat ) and the uncorrelated (δ uncor ) and correlated (δ cor ) systematic uncertainties. The total uncertainties (δ tot ) are obtained through adding the statistical, uncorrelated and correlated systematic uncertainties in quadrature The distribution of pulls of the 266 input data points with respect to the combined cross sections is presented in Fig. 1. It is consistent with a Gaussian around zero without any significant outliers. The observed width of the pull distribution is smaller than unity which indicates a conservative estimate of the systematic uncertainties.
There are 167 sources of correlated uncertainties in total. These are 71 experimental systematic sources, 16 sources due to the extrapolation procedure (including the uncertainties on the fragmentation fractions and branching ratios) and 80 statistical charm and beauty correlations. The sources of correlated systematic and extrapolation uncertainties are listed in the "Appendix", together with the cross-section shifts induced by the sources and the reduction factors of the uncertainties, obtained as a result of the combination. Both quantities are given in units of σ of the original uncertainties. All shifts of the systematic sources with respect to  Fig. 1 The pull distribution for the combination of the charm and beauty reduced cross sections. The solid line shows a fit of a Gaussian to the pull distribution. The mean and the width quoted are the results from the fit their nominal values are smaller than 1.5σ . Several systematic uncertainties are reduced significantly -by up to factors of two or more. The reductions are due to the different heavy-flavour tagging methods applied and to the fact that for a given process (charm or beauty production), an unique cross section is probed by the different measurements at a given (x Bj , Q 2 ) point. Those uncertainties for which large reductions have been observed already in the previous analysis [42] are reduced to at least the same level in the current combination, some are further significantly reduced due to the inclusion of new precise data [21][22][23]. The shifts and reductions obtained for the 80 statistical correlations between charm and beauty cross sections are not shown. Only small reductions in the range of 10% are observed and these reductions are independent of x Bj and Q 2 . The cross-section tables of the combined data together with the full information on the uncertainties can be found elsewhere [77].
The combined reduced cross sections σ cc red and σ bb red are shown as a function of x Bj in bins of Q 2 together with the input H1 and ZEUS data in Figs. 2 and 3, respectively. The combined cross sections are significantly more precise than any of the individual input datasets for charm as well as for beauty production. This is illustrated in Fig. 4, where the charm and beauty measurements for Q 2 = 32 GeV 2 are shown. The uncertainty of the combined reduced charm cross section is 9% on average and reaches values of about 5% or better in the region 12 GeV 2 ≤ Q 2 ≤ 60 GeV 2 . The uncertainty of the combined reduced beauty cross section is about 25% on average and reaches about 15% at small x Bj and 12 GeV 2 ≤ Q 2 ≤ 200 GeV 2 .
In Fig. 5, the new combined reduced charm cross sections are compared to the results of the previously published combination [42]. Good consistency between the different combinations can be observed. A detailed analysis of the cross-section measurements reveals a relative improvement in precision of about 20 % on average with respect to the previous measurements. The improvement reaches about 30 % in the range 7 GeV 2 ≤ Q 2 ≤ 60 GeV 2 , where the newly added datasets (datasets 9-11 in Table 1) contribute with high precision.

Comparison with theory predictions
The combined heavy-flavour data are compared with calculations using various schemes and PDF sets. Predictions using the FFNS [25][26][27][28][29][30][31][32][33][34][35] and the VFNS [37][38][39][40][41] are considered, focussing on results using HERAPDF2.0 PDF sets. The data are also compared to FFNS predictions based on different variants of PDF sets at NLO and approximate NNLO provided by the ABM group [29,32]. In the case of the VFNS, recent calculations of the NNPDF group based on the NNPDF3.1sx PDF set [41] at NNLO, which specifically aim for a better description of the DIS structure functions at small x Bj and Q 2 , are also confronted with the measurements. The calculations in the FFNS based on the HERAPDF2.0 FF3A PDF set will be considered as reference calculations in the subsequent parts of the paper.

FFNS predictions
In Figs. 6 and 7, theoretical predictions of the FFNS in the MS running mass scheme are compared to the combined reduced cross sections σ cc red and σ bb red , respectively. The theoretical predictions are obtained within the open-source QCD fit framework for PDF determination xFitter [53], which uses the OPENQCDRAD programme [31,51,52] for the cross-section calculations. The running heavy-flavour masses are set to the world average values [59] of m c (m c ) = 1.27 ± 0.03 GeV and m b (m b ) = 4.18 ± 0.03 GeV. The predicted reduced cross sections are calculated using the HERA-PDF2.0 FF3A [48] and ABMP16 [32] NLO PDF sets using NLO (O(α 2 s )) coefficient functions and the ABMP16 [32] NNLO PDF set using approximate NNLO coefficient functions. The charm data are also compared to NLO predictions based on the ABKM09 [29] NLO PDF set used in the previous analysis [42] of combined charm data. This PDF set was determined using a charm-quark mass of m c (m c ) = 1.18 GeV. The PDF sets considered were extracted without explicitly using heavy-flavour data from HERA with the exception of the ABMP16 set, in which the HERA charm data from the previous combination [42] and some of the beauty data [14,23] have been included. For the predictions based on the HERAPDF2.0 FF3A set, theory uncertainties are given which are calculated by adding in quadrature the uncertainties from the PDF set, simultaneous variations of µ r and µ f by a factor of two up or down and the variation of the quark masses within the quoted uncertainties.
The FFNS calculations reasonably describe the charm data ( Fig. 6) although in the kinematic range where the data are very precise, the data show a x Bj dependence somewhat steeper than predicted by the calculations. For the different PDF sets and QCD orders considered, the predictions are quite similar at larger Q 2 while some differences can be observed at smaller Q 2 or x Bj . For beauty production (Fig. 7) the predictions are in good agreement with the data within the considerably larger experimental uncertainties.
The description of the charm-production data is illustrated further in Fig. 8, which shows the ratios of the reduced cross sections for data, ABKM09 and ABMP16 at NLO and approximate NNLO with respect to the NLO reduced cross sections predicted in the FFNS using the HERAPDF2.0 FF3A set. For Q 2 ≥ 18 GeV 2 , the theoretical predictions are similar to each other in the kinematic region accessible at HERA. In this region, the predictions based on the different

H1 and ZEUS
PDF sets and orders are well within the theoretical uncertainties obtained for the HERAPDF2.0 FF3A set. Towards smaller Q 2 and x Bj , some differences in the predictions become evident. In the region of 7 GeV 2 ≤ Q 2 ≤ 120 GeV 2 , the theory tends to be below the data at small x Bj and above the data at large x Bj , independent of the PDF set and order used. In Fig. 9, the corresponding ratios are shown for the beauty data. In the kinematic region accessible at HERA, the predictions are very similar to each other. Within the experimental uncertainties, the data are well described by all calculations.

VFNS predictions
In Fig. 10, predictions of the RTOPT [37] NLO and approximate NNLO VFNS using the corresponding NLO and NNLO HERAPDF2.0 PDF sets are compared to the charm measurements. As in Fig. 8, the ratio of data and theory predictions to the reference calculations are shown. While the NLO VFNS predictions are in general consistent with both the data cross sections and the reference calculations, the approximate NNLO cross sections show somewhat larger differences, about 10% smaller than the reference cross sections in the region 12 GeV 2 ≤ Q 2 ≤ 120 GeV 2 . On the other hand, at Q 2 ≤ 7 GeV 2 the x Bj slopes of the NNLO VFNS predictions tend to describe the data somewhat better than the reference calculations. Overall, the NLO and approximate NNLO VFNS predictions describe the data about equally well, but not better than the reference FFNS calculations.
In Fig. 11, the same ratios as in the preceding paragraph are shown for beauty production. In the kinematic region accessible in DIS beauty production at HERA, the differences between the different calculations are small in comparison to the experimental uncertainties of the measurements.
The calculations considered so far generally show some tension in describing the x Bj slopes of the measured charm data over a large range in Q 2 . Therefore the charm data are compared in Fig. 12 to recent calculations [41,78] in the FONLL-C scheme with (NNLO+NLLsx) and without (NNLO) low-x resummation in both O(α 2 s ) matrix elements and O(α 3 s ) PDF evolution, using the NNPDF3.1sx framework, which aim for a better description of the proton structure functions at low x Bj and Q 2 . The charm data from the previous combination have already been used for the determination of the NNPDF3.1sx PDFs. Both calculations provide a better description of the x Bj shape of the measured charm cross sections for Q 2 < 32 GeV 2 . However, the predictions lie significantly below the data in most of the phase space. This is especially the case for the NNLO+NLLsx calculations. Overall, the description is not improved with respect to the FFNS reference calculations. The comparison to data of the different predictions considered is summarised in Table 4 in which the agreement with data is expressed in terms of χ 2 and the corresponding fit probabilities (p values). The table also includes a comparison to the previous combined charm data [42]. The agreement of the various predictions with the charm cross-section measurements of the current analysis is poorer than with

H1 and ZEUS
the results of the previous combination, for which consistency between theory and data within the experimental uncertainties is observed for most of the calculations. As shown in Sect. 4, the charm cross sections of the current analysis agree well with the previous measurements but have considerably smaller uncertainties. The observed changes in the χ 2 values are consistent with the improvement in data precision if the predictions do not fully describe reality. The tension observed between the central theory predictions and the charm data ranges from ∼ 3σ to more than 6σ , depending on the prediction. Among the calculations considered, the NLO FFNS calculations provide the best description of the charm data. For the beauty cross sections, good agreement of theory and data is observed within the larger experimental uncertainties. In all cases, the effect of the PDF uncertainties on the χ 2 values is negligible.

QCD analysis
The combined charm and beauty data are used together with the combined HERA inclusive DIS data [48] to perform a QCD analysis in the FFNS using the MS massrenormalisation scheme at NLO. The main focus of this anal-ysis is the simultaneous determination of the running heavyquark masses m c (m c ) and m b (m b ). The theory description of the x Bj dependence of the reduced charm cross section is also investigated.

H1 and ZEUS
restricted to Q 2 ≥ Q 2 min = 3.5 GeV 2 . No such cut is applied to the charm and beauty data, since the relevant scales µ 2 r = µ 2 f = Q 2 + 4m 2 Q are above 3.5 GeV 2 for all measurements. This theory setup is slightly different from that used for the original extraction [48] of HERAPDF2.0 FF3A. In contrast to the analysis presented here, HERAPDF2.0 FF3A was determined using the on-shell mass (pole-mass) scheme for the calculation of heavy-quark production and F L was calculated to O(α 2 s ). Perturbative QCD predictions were fit to the data using the same χ 2 definition as for the fits to the inclusive DIS data (equation (32) in reference [48]). It includes an additional logarithmic term that is relevant when the estimated statistical and uncorrelated systematic uncertainties in the data are rescaled during the fit [88]. The correlated systematic uncertainties are treated through nuisance parameters.
The procedure for the determination of the PDFs follows the approach of HERAPDF2.0 [48]. At the starting scale µ f,0 , the density functions of a parton f of the proton are parametrised using the generic form: where x is the fraction of the incoming proton momentum carried by the incoming parton in the proton's infinite-momentum frame. The parametrised PDFs are the gluon distribution xg(x), the valence quark distributions xu v (x) and xd v (x), and the u-and d-type antiquark distributions xU (x) and x D(x).
At the initial QCD evolution scale 5 µ 2 f,0 = 1.9 GeV 2 , the default parameterisation of the PDFs has the form: The gluon density function, xg(x), is different from Eq. (5),  sum rules. The B and B ′ parameters determine the PDFs at small x, and the C parameters describe the shape of the distributions as x → 1. The parameter C ′ g = 25 is fixed [89]. Additional constraints B U = B D and A U = A D (1 − f s ) are imposed to ensure the same normalisation for the xu and xd distributions as x → 0. The strangeness fraction f s = xs/(xd + xs) is fixed to f s = 0.4 as in the HERA-PDF2.0 analysis [48].
The selection of parameters in Eq. (6) from the general form, Eq. (5), is made by first fitting with all D and E parameters set to zero, and then including them one at a time in the fit. The improvement in the χ 2 of the fit is monitored. If χ 2 improves significantly, the parameter is added and the procedure is repeated until no further significant improvement is observed. This leads to the same 14 free PDF parameters as in the inclusive HERAPDF2.0 analysis [48].
The PDF uncertainties are estimated according to the general approach of HERAPDF2.0 [48], in which the experimental, model, and parameterisation uncertainties are taken into account. The experimental uncertainties are determined from the fit using the tolerance criterion of χ 2 = 1. Model uncertainties arise from the variations of the strong coupling constant α An additional parameterisation uncertainty is considered by using the functional form in Eq. (6) with E u v = 0. The χ 2 in this variant of the fit is only 5 units worse than that with the released E u v parameter; changing this parameter noticeably affects the mass determination. In addition, µ 2 f,0 is varied within 1.6 GeV 2 < µ 2 f,0 < 2.2 GeV 2 . The parameterisation uncertainty is determined at each x Bj value from the maximal differences between the PDFs resulting from the central fit and all parameterisation variations. The total uncertainty is obtained by adding the fit, model and parameterisation uncertainties in quadrature. The values of the input parameters for the fit and their variations considered, to evaluate model and parameterisation uncertainties, are given in Table 5. Table 5 List of uncertainties for the charm-and beauty-quark mass determination. The PDF parameterisation uncertainties not shown have no effect on m c (m c ) and m b (m b )

Parameter
Variation

QCD fit and determination of the running heavy-quark masses
In the QCD fit, the running heavy-quark masses are fitted simultaneously with the PDF parameters in Eq. (6). The fit yields a total χ 2 = 1435 for 1208 degrees of freedom. The ratio χ 2 /d.o.f. = 1.19 is similar in size to the values obtained in the analysis of the HERA combined inclusive data [48]. The resulting PDF set is termed HERAPDF-HQMASS. The central values of the fitted parameters are given in the "Appendix". In Fig. 13, the PDFs at the scale µ 2 f,0 = 1.9 GeV 2 are presented. Also shown are the PDFs, including experimental uncertainties, obtained by a fit to the inclusive data only with the heavy-quark masses fixed to m c (m c ) = 1.27 GeV and m b (m b ) = 4.18 GeV [59]. No significant differences between the two PDF sets are observed. Only a slight enhancement in the gluon density of HERAPDF-HQMASS compared to that determined from the inclusive data only can be observed around x = 2 · 10 −3 . This corresponds to the region in x where the charm data are most precise. When used together with the inclusive HERA data, the heavy-flavour data have only little influence on the shape of the PDFs determined with quark masses fixed to their expected values. This confirms the findings [48] made with the previously published combined charm data. However, the smaller uncertainties of the new combination reduce the uncertainty of the charm-quark mass determination with respect to the previous result 6 [42]. The beauty-quark mass determination improves the previous result based on a single dataset [23]. The running heavy-quark masses are determined as: 6 The previous analysis did not consider scale variations and a less flexible PDF parameterisation was used.
The individual contributions to the uncertainties are listed in Table 5. The model uncertainties are dominated by those arising from the QCD scale variations. In the case of the charmquark mass, the variation in α s also yields a sizeable contribution while the other sources lead to uncertainties of typically a few MeV, both for m c (m c ) and m b (m b ). The main contribution to the parameterisation uncertainties comes from the fit variant in which the term E u v is set to zero, other contributions are negligible. Both mass values are in agreement with the corresponding PDG values [59] and the value of m c (m c ) determined here agrees well with the result from the previous analysis of HERA combined charm cross sections [42]. A cross check is performed using the Monte Carlo method [90,91]. It is based on analysing a large number of pseudo datasets called replicas. For this cross check, 500 replicas are created by taking the combined data and fluctuating the values of the reduced cross sections randomly within their statistical and systematic uncertainties taking into account correlations. All uncertainties are assumed to follow a Gaussian distribution. The central values for the fitted parameters and their uncertainties are estimated using the mean and RMS values over the replicas. The obtained heavyquark masses and their experimental/fit uncertainties are in agreement with those quoted in Eq. (7).
In order to study the influence of the inclusive data on the mass determination, fits to the combined inclusive data only are also tried. In this case, the fit results are very sensitive to the choice of the PDF parameterisation. When using the default 14 parameters, the masses are deter- .00 GeV. The sensitivity to the PDF parameterisation and the large experimental/fit uncertainties for a given parameterisation demonstrate that attempts to extract heavy-quark masses from inclusive HERA data alone are not reasonable in this framework. The large effect on the fitted masses observed here, when setting E u v = 0, motivates the E u v variation in the HERAPDF-HQMASS fit.
The NLO FFNS predictions based on HERAPDF-HQMASS are compared to the combined charm and beauty cross sections in Figs. 14 and 15, respectively. The predictions based on the HERAPDF2.0 set are included in the figures. Only minor differences between the different predictions can be observed. This is to be expected because of the similarities of the PDFs, in particular that of the gluon and the values of the heavy-quark masses. The description of the data is similar to that observed for the predictions based on the HERAPDF2.0 FF3A set.
In Fig. 16, the ratios of data and predictions based HERAPDF-HQMASS to the predictions based on HERA-PDF2.0 FF3A are shown for charm production. The description of the data is almost identical for both calculations. The data show a steeper x Bj dependence than expected in NLO FFNS. The partial χ 2 value of 116 for the heavy-flavour data 7 (d.o.f. = 79) in the fit presented is somewhat large. It corresponds to a p value 8 of 0.004, which is equivalent to 2.9σ . A similar behaviour can be observed already for the charm cross sections from the previous combination [42], albeit at lower significance due to the larger uncertainties.
In Fig. 17, the same ratios as in Fig. 16 are shown for beauty production. Agreement is observed between theory and data within the large uncertainties of the measurements. 7 It is not possible to quote the charm and the beauty contribution to this χ 2 value separately because of the correlations between the combined charm and beauty measurements. 8 The χ 2 and the p value given here do not correspond exactly to the statistical definition of χ 2 or p value because the data have been used in the fit to adjust theoretical uncertainties. Therefore the theory is somewhat shifted towards the measurements. However this bias is expected to be small because the predictions are mainly constrained by the much larger and more precise inclusive data sample. Since in LO QCD heavy-flavour production proceeds via boson-gluon-fusion, at least two partons, the heavy-quark pair, are present in the final state. Therefore, already in LO, the x of the incoming parton is different from x Bj measured at the photon vertex. At LO, the gluon x is given by It depends on the kinematic DIS variables x Bj and Q 2 and on the invariant massŝ of the heavy-quark pair. At higher orders, the final state contains additional partons, such that x cannot be expressed in a simple way. Independent of the order of the calculations, only an average x can be determined at a given (x Bj , Q 2 ) point by the integration over all contributions to the cross section in the vicinity of this phase space point. In Fig. 18, the ratio of the measured reduced cross sections to the NLO FFNS predictions based on HERAPDF-HQMASS is shown as a function of x instead of x Bj , where x is the geometric mean calculated at NLO with HVQDIS. While the charm measurements cover the range 0.0005 x 0.1 the beauty data are limited to a higher x range, 0.004 x 0.1, because of the large beauty-quark mass. For the charm data, a deviation from the reference calculation is evident, showing a steeper slope in x in the range 0.0005 x 0.01, consistent with being independent of Q 2 . Due to the larger experimental uncertainties, no conclusion can be drawn for the beauty data.
6.4 Increasing the impact of the charm data on the gluon density While inclusive DIS cross sections constrain the gluon density indirectly via scaling violations, and directly only through higher order corrections, heavy-flavour production probes the gluon directly already at leading order. Contributions to heavy-flavour production from light-flavour PDFs are small. For charm production they amount to five to eight per cent, varying only slightly with x Bj or Q 2 [49]. Because of the high precision of σ cc red reached in this analysis, a study is performed to enhance the impact of the charm measurement on the gluon determination in the QCD fit.
To reduce the impact of the inclusive data in the determination of the gluon density function, a series of fits is performed by requiring a minimum x Bj ≥ x Bj,min for the inclusive data included in the fit, with x Bj,min varying from 2·10 −4 to 0.1. No such cut is applied to the heavy-flavour data. The χ 2 /d.o.f. values for the inclusive plus heavy-flavour data and the partial χ 2 /d.o.f. for the heavy-flavour data only are presented in Fig. 19 as a function of x Bj,min . The partial χ 2 /d.o.f. for the heavy-flavour data improves significantly with rising x Bj,min cut reaching a minimum at x Bj,min ≈ 0.04, while the χ 2 /d.o.f. for the inclusive plus heavy-flavour data sample is slightly larger than that obtained without a cut in x Bj . For further studies x Bj,min = 0.01 is chosen. The total χ 2 is 822 for 651 degrees of freedom. The partial χ 2 of the heavy-flavour data improves to 98 for 79 degrees of freedom (corresponding to a p value of 0.07 or 1.8σ ). The resulting gluon density function, shown in Fig. 20 at the scale µ 2 f = 1.9 GeV 2 , is significantly steeper than the gluon density function determined when including all inclusive measurements in the fit. The other parton density functions are consistent with the result of the default fit.
In Fig. 21, a comparison is presented of the ratios of the combined reduced charm cross section and the cross section as calculated from the alternative fit, in which the The values of χ 2 per degree of freedom of the QCD fit to the inclusive and heavy-flavour data: (triangles) for the heavy-flavour data only and (dots) for the inclusive plus heavy-flavour data when including in the fit only inclusive data with x Bj ≥ x Bj,min inclusive data are subject to the cut x Bj ≥ 0.01, to the reference cross sections based on HERAPDF2.0 FF3A. The predictions from HERAPDF-HQMASS are also shown. As expected, the charm cross sections fitted with the x Bj cut imposed on the inclusive data rise more strongly towards small x Bj and describe the data better than the other predictions. In general, the predictions from the fit with x Bj cut follow nicely the charm data. A similar study for beauty is also made but no significant improvement in the description of the beauty data is observed. The heavy-quark masses extracted from the fit with x Bj ≥ 0.01 are consistent with those quoted in Eq. (7).
Cross-section predictions based on the three PDF sets, discussed in the previous paragraph, are calculated for inclusive DIS. In Fig. 22, these predictions are compared to the inclusive reduced cross sections [48] for NC e + p DIS. The predictions based on HERAPDF2.0 FF3A and on HERAPDF-HQMASS agree with the inclusive measurement. The calculations based on the PDF set determined by requiring x Bj ≥ 0.01 for the inclusive data predict significantly larger inclusive reduced cross sections at small x Bj .
This study shows that a better description of the charm data can be achieved by excluding the low-x Bj inclusive data in the fit. However, the calculations then fail to describe the inclusive data at low x Bj . In the theoretical framework used in this analysis, it seems impossible to resolve the 2.9σ difference in describing simultaneously the inclusive and charm measurements from HERA, using this simple approach of changing the gluon density. The comparison of various theory predictions to the charm data in section 5 suggests that the situation is unlikely to improve at NNLO because the NNLO predictions presented provide a poorer description of the charm data than that observed at NLO. The combined inclusive analysis [48] already revealed some tensions in the theory description of the inclusive DIS data. The current analysis reveals some additional tensions in describing simultaneously the combined charm data and the combined inclusive data.

Summary
Measurements of charm and beauty production cross sections in deep inelastic ep scattering by the H1 and ZEUS experiments are combined at the level of reduced cross sections, accounting for their statistical and systematic correlations. The beauty cross sections are combined for the first time. The datasets are found to be consistent and the combined data have significantly reduced uncertainties. The combined charm cross sections presented in this paper are significantly more precise than those previously published.
Next-to-leading and approximate next-to-next-to-leadingorder QCD predictions of different schemes are compared to the data. The calculations are found to be in fair agreement with the charm data. The next-to-leading-order calculations in the fixed-flavour-number scheme provide the best description of the heavy-flavour data. The beauty data, which have larger experimental uncertainties, are well described by all QCD predictions.
The new combined heavy-flavour data together with the previously published combined inclusive data from HERA are subjected to a next-to-leading-order QCD analysis in the fixed-flavour-number scheme using the MS running-mass definition. The running heavy-quark masses are determined as: The simultaneously determined parton density functions are found to agree well with HERAPDF2.0 FF3A.
The QCD analysis reveals some tensions, at the level of 3σ , in describing simultaneously the inclusive and the heavyflavour HERA DIS data. The measured reduced charm cross sections show a stronger x Bj dependence than obtained in the combined QCD fit of charm and inclusive data, in which the PDFs are dominated by the fit of the inclusive data. A study in which inclusive data with x Bj < 0.01 are excluded from the fit is carried out. A better description of the charm data can be achieved this way. However, the resulting PDFs fail to describe the inclusive data in the excluded x Bj region. Alternative next-to-leading-order and next-to-next-leadingorder QCD calculations considered, including those with low-x resummation, do not provide a better description of the combined heavy-flavour data.  Table 6 Sources of bin-to-bin correlated systematic uncertainties considered in the combination. For each source, the affected datasets are given, together with the cross-section shift induced by this source and the reduction factor of the correlated uncertainty in units of σ after the first iteration. For those measurements which have simultaneously extracted charm and beauty cross sections, a suffix b or c indicates that the given systematic source applies only to the charm or beauty measurements, respectively