Measurement of the Charm and Beauty Structure Functions using the H1 Vertex Detector at HERA

Inclusive charm and beauty cross sections are measured in e-p and e+p neutral current collisions at HERA in the kinematic region of photon virtuality 5<Q^2<2000 GeV^2 and Bjorken scaling variable 0.0002<x<0.05. The data were collected with the H1 detector in the years 2006 and 2007 corresponding to an integrated luminosity of 189 pb^-1. The numbers of charm and beauty events are determined using variables reconstructed by the H1 vertex detector including the impact parameter of tracks to the primary vertex and the position of the secondary vertex. The measurements are combined with previous data and compared to QCD predictions.


Introduction
The measurement of the inclusive charm (c) and beauty (b) quark cross sections and the derived structure functions F cc 2 and F bb 2 in DIS at HERA is an important test of the theory of the strong interaction, quantum chromodynamics (QCD), within the Standard Model. These measurements uniquely constrain the parton density functions (PDFs) of the proton, in particular its b and c content. Precise knowledge of the PDFs is for example essential at the Large Hadron Collider (LHC). The predictions of the 'standard candle' QCD processes at the LHC, such as the inclusive production of W and Z bosons, are sensitive to the theoretical treatment of heavy quarks [1][2][3][4][5][6][7]. The b quark density is important in Higgs production at the LHC in both the Standard Model and in extensions to the Standard Model such as supersymmetric models at high values of the mixing parameter tan β [8].
This paper reports on measurements made in neutral current deep inelastic scattering (DIS) at HERA of the charm and beauty contributions to the inclusive proton structure function F 2 in the range of virtuality of the exchanged photon 5 ≤ Q 2 ≤ 2000 GeV 2 and Bjorken x 0.0002 ≤ x ≤ 0.05. The analysis uses the precise spatial information from the H1 vertex detector to separate events containing c and b flavoured hadrons from light quark events. The analysis extends to lower and higher Q 2 than previous H1 measurements [9, 10] which used a similar technique to the one used in this paper.
The analysis is based on a dataset with an integrated luminosity of 189 pb −1 , which is about three times greater than in the previous measurements. The data was recorded in the years 2006 and 2007 with 54 pb −1 taken in e − p mode and 135 pb −1 in e + p mode. The ep centre of mass energy is √ s = 319 GeV, with a proton beam energy of E p = 920 GeV and electron beam energy of E e = 27.6 GeV. This dataset is referred to here as HERA II. Many details of the analysis are similar to the previous measurements [9, 10], referred to here as HERA I. The HERA I and HERA II measurements are combined to produce a complete HERA dataset. Measurements of the charm contribution to the proton structure function have also been made at HERA using D or D * meson production [11,12]. There are also measurements of charm and beauty in DIS using semi-leptonic decays [13].
Events containing heavy quarks are distinguished from those containing only light quarks using variables that are sensitive to the longer lifetimes of heavy flavour hadrons. The most important of these variables are the transverse displacement of tracks from the primary vertex and the reconstructed position of a secondary vertex in the transverse plane. For events with three or more tracks in the vertex detector the reconstructed variables are used as input to an artificial neural network. This method has better discrimination between c and b compared to previous methods [9 , 10], which used only the transverse displacement of tracks from the primary vertex. Lifetime based methods have the advantage over more exclusive methods, such as D * or muon tagging, in that a higher fraction of heavy flavour events may be used, although the background from light quark events is larger. The charm structure function F cc 2 and the beauty structure function F bb 2 are obtained from the measured c and b cross sections after applying small corrections for the longitudinal structure functions F cc L and F bb L .

Monte Carlo Simulation
Monte Carlo simulations are used to correct for the effects of the finite detector resolution, acceptance and efficiency. The Monte Carlo program RAPGAP [14] is used to generate DIS events for the processes ep → ebbX, ep → eccX and ep → eqX where q is a light quark of flavour u, d or s. RAPGAP combines O(α s ) matrix elements with higher order QCD effects modelled by parton showers. The heavy flavour event samples are generated according to the 'massive' photon gluon fusion (PGF) matrix element [15] with the mass of the c and b quarks set to m c = 1.5 GeV and m b = 4.75 GeV, respectively. The DIS cross section is calculated using the leading order (LO) 3-flavour PDFs from MRST (MRST2004F3LO [16]). The partonic system for the generated events is fragmented according to the Lund string model [17] implemented within the PYTHIA program [18]. The c and b quarks are hadronised according to the Bowler fragmentation function [19]. The HERACLES program [20] interfaced to RAPGAP calculates single photon radiative emissions off the lepton line, virtual and electroweak corrections. The Monte Carlo program PHOJET [21] is used to simulate the background contribution from photoproduction γp → X.
The samples of generated events are passed through a detailed simulation of the detector response based on the GEANT3 program [22], and through the same reconstruction software as is used for the data.

H1 Detector
Only a short description of the H1 detector is given here; a more complete description may be found in [23]. A right handed coordinate system is employed with the origin at the position of the nominal interaction point that has its Z-axis pointing in the proton beam, or forward, direction and X (Y ) pointing in the horizontal (vertical) direction. The pseudorapidity is related to the polar angle θ by η = − ln tan(θ/2).
Charged particles are measured in the central tracking detector (CTD). This device consists of two cylindrical drift chambers interspersed with Z-chambers to improve the Z-coordinate reconstruction and multi-wire proportional chambers mainly used for triggering. The CTD is operated in a uniform solenoidal 1.16 T magnetic field, enabling the momentum measurement of charged particles over the polar angular range 20 • < θ < 160 • .
The CTD tracks are linked to hits in the vertex detector, the central silicon tracker CST [24], to provide precise spatial track reconstruction. The CST consists of two layers of double-sided silicon strip detectors surrounding the beam pipe, covering an angular range of 30 • < θ < 150 • for tracks passing through both layers. The information on the Z-coordinate of the CST tracks is not used in the analysis presented in this paper. For CTD tracks with CST hits in both layers the transverse distance of closest approach (DCA) to the nominal vertex in X-Y , averaged over the azimuthal angle, is measured to have a resolution of 43 µm⊕51 µm/(P T [GeV]) where P T is the transverse momentum of the particle. The first term represents the intrinsic resolution (including alignment uncertainty) and the second term is the contribution from multiple scattering in the beam pipe and the CST.
The track detectors are surrounded in the forward and central directions (4 • < θ < 155 • ) by a fine-grained liquid argon calorimeter (LAr) and in the backward region (153 • < θ < 178 • ) by a lead-scintillating fibre calorimeter (SPACAL) [25] with electromagnetic and hadronic sections. These calorimeters are used in this analysis to measure and identify the scattered electron 1 and also provide energy and angular reconstruction for final state particles from the hadronic system.
Electromagnetic calorimeters situated downstream in the electron beam direction allow detection of photons and electrons scattered at very low Q 2 . The luminosity is measured with these calorimeters from the rate of photons produced in the Bethe-Heitler process ep → epγ.

DIS Event Selection
The events are triggered by requiring a compact, isolated electromagnetic cluster in either the LAr or SPACAL calorimeters with an overall trigger efficiency of almost 100%. The electromagnetic cluster with the highest transverse energy, which also passes stricter offline criteria is taken as the scattered electron. The Z-position of the interaction vertex, reconstructed by one or more charged tracks in the tracking detectors, must be within ±20 cm to match the acceptance of the CST. The interaction vertex approximately follows a Gaussian distribution with a standard deviation of 13 cm.
Photoproduction events and DIS events with a hard photon radiated from the initial state electron are suppressed by requiring i (E i − p z,i ) > 35 GeV. Here, E i and p z,i denote the energy and longitudinal momentum components of a particle and the sum is over all final state particles including the scattered electron and the hadronic final state (HFS). The HFS particles are reconstructed using a combination of tracks and calorimeter deposits in an energy flow algorithm that avoids double counting [26].
The event kinematics, including the photon virtuality Q 2 , the Bjorken scaling variable x and the inelasticity variable y, are reconstructed with the 'eΣ' method [27], which uses the scattered electron and the HFS. The variables obey the relation x = Q 2 /sy. In order to have good acceptance in the SPACAL and to ensure that the HFS has a significant transverse momentum, events are selected for Q 2 > 4.5 GeV 2 . The analysis is restricted to y > 0.07 in order to ensure that the direction of the quark which is struck by the photon is mostly in the CST angular range. An upper y cut is applied that varies from 0.5 at low Q 2 to 0.85 at high Q 2 in order to suppress photoproduction background. The measurement is made differentially by dividing the data into discrete y-Q 2 intervals. This binning scheme is preferable to one using x-Q 2 boundaries as it avoids event losses near the cuts on y. The RAPGAP Monte Carlo program is used to estimate the acceptance of c (b) events that have a c (b) quark with tranverse momentum greater than 0.3 GeV and within the angular acceptance of the CST. The overall c (b) quark acceptance is 89% (90%) with a minimum of 75% (75%) in any y-Q 2 bin.
The position of the beam interaction region in X-Y (beam spot) is calculated from information of tracks with CST hits and updated regularly to account for drifts during beam storage. The beam interaction region has an elliptical shape with a size of around 90 µm in x and around 22 µm in Y .

Track Selection
The impact parameter δ of a track, which is the transverse DCA of the track to the primary vertex point (see figure 1), is only determined for those tracks which are measured in the CTD and have at least two CST hits linked (referred to as CST tracks). The beam spot is used as the position of the primary vertex. CST tracks are required to have a transverse momentum P track The direction of the struck quark is used in the determination of the sign of δ. The vector of the struck quark (P q T , θ q , φ q ) is reconstructed as the azimuthal angle of the highest transverse momentum jet in the event. If there is no jet reconstructed in the event the vector is reconstructed from the electron [28] so that P q T = P elec T , cos θ q = 1−8E 2 e y 2 /(4E 2 e y 2 +Q 2 (1−y)), φ q = 180 • − φ elec , where P elec T and φ elec are the transverse momentum and the azimuthal angle in degrees of the scattered electron, respectively. Jets are reconstructed using the inclusive longitudinally invariant k T algorithm with massless P T recombination scheme and distance parameter R 0 = 1 in the η − φ plane [29]. The algorithm is run in the laboratory frame using all reconstructed HFS particles and the resultant jets are required to have transverse momentum greater than 1.5 GeV and to be in the angular range 15 • < θ < 155 o . Approximately 95% (99%) of c (b) events have φ q reconstructed from a jet, as determined from the Monte Carlo simulation. Tracks with azimuthal angle φ track outside ±90 • of φ q are assumed not to be associated to the struck quark and rejected.
If the angle α between φ q and the line joining the primary vertex to the point of DCA is less than 90 • , δ is defined as positive. It is defined as negative otherwise. Figure 1 shows an example of a track with positive δ and one with negative δ. The δ distribution, shown in figure 2, is seen to be asymmetric with positive values in excess of negative values indicating the presence of long-lived particles. It is found to be well described by the Monte Carlo simulation. CST tracks with |δ| > 0.1 cm are rejected to suppress light quark events containing long-lived strange particles.
The number of reconstructed CST tracks N track associated to the struck quark is an important quantity since for higher track multiplicities more information can be used. In the kinematic range of this measurement 22% of the events have no selected track, 26% of the events have N track = 1, 23% have N track = 2 and 29% have N track ≥ 3.

Secondary Vertex Reconstruction
The complete set of reconstructed tracks in the event is used to simultaneously reconstruct a secondary and primary vertex in the transverse plane. Two vertices are reconstructed in each event even if they are not statistically separable. Each track is assigned a weight for each vertex, which is a measure of the probability that the track originated at that vertex [30]. In this approach tracks are not assigned unambiguously to one vertex or the other. A simultaneous fit to the primary and secondary vertex is made, with the weights of all tracks of the event considered for the primary vertex, but only the weights of CST tracks considered for the secondary vertex. The beam spot together with its spread is used as an additional constraint to the primary vertex. The vertex configuration that minimises the global χ 2 is found iteratively using deterministic annealing [31].
The transverse distance between the secondary and primary vertex is defined as L xy . The secondary vertex significance S L is L xy /σ(L xy ), where σ(L xy ) is the uncertainty on L xy . If the absolute difference between the azimuthal angle of the secondary vertex (taking the primary vertex as the origin) and φ q is less than 90 • , S L is signed positive and negative otherwise. A measure of the decay multiplicity N SV track is made by counting the number of tracks with weight greater than 0.8 to the secondary vertex. This method was shown in [9] to yield consistent results with the default method that used track significances.

Quark Flavour Separation
The track significance is defined as δ/σ(δ), where σ(δ) is the uncertainty on δ. The significances S 1 , S 2 and S 3 , are defined as the significance of the CST track with the highest, second highest and third highest absolute significance, respectively. The significances take the sign of δ (see section 4.2). Tracks with a negative sign for significance are likely not to arise from particles with a large lifetime and are used in this analysis to estimate the light quark contribution.
Tracks that do not have the same significance sign as S 1 are ignored. The S 1 distribution is used for events with one remaining CST track after this selection and the S 2 distribution is used if there are two remaining CST tracks. For events with three or more remaining CST tracks, where more information is available, information is combined from the significance distributions and the reconstructed secondary vertex using an artificial neural network (NN) that takes into account correlations of the input variables [32]. In this way each event with at least 1 CST track appears in exactly one distribution and the distributions are statistically independent. The S 1 and S 2 distributions are shown in figure 3.
The NN has inputs of S 1 , S 2 , S 3 , S L , N track , N SV track and P track T of the CST tracks with the highest and second highest transverse momentum. The NN has one hidden layer with 5 nodes. It was trained using a sample of around 5000 Monte Carlo b events as 'signal' and a similar number of Monte Carlo c events as 'background'. No attempt is made to discriminate against the light quark events since their impact is minimized by the subtraction procedure described below. The same NN is used for all y-Q 2 bins. The distributions of the NN inputs are shown in figures 4 and 5. These distributions are dominated by light quark events. It can be seen that the Monte Carlo simulation gives a reasonable description of these distributions. It is also apparent that these distributions have good separation power between light, c and b events. The decrease in events around zero for the S 1 and S 2 distributions is due to the requirement that |S 1 | > |S 2 | > |S 3 |. In order to see how well the Monte Carlo simulation describes the heavy flavour contribution of the NN input distributions, the distribution for those events with S 2 > 3 is taken and the distribution for those events with S 2 < −3 is subtracted from it. This has the effect of greatly reducing the light quark distribution which is almost symmetric in S 2 . This subtraction method can be used for any distribution. The subtracted distributions of the NN inputs are shown in figures 6 and 7. The Monte Carlo simulation gives a good description of these distributions. It can be seen that b events tend to have a higher track multiplicity and more tracks at higher P T . Other distributions are checked in a similar way. Distributions of variables related to the struck quark, P q T and η q , as well as the kinematic variables log Q 2 and log x are shown in figure 8 for N track ≥ 2. The Monte Carlo simulation gives a good description of these distributions.
The output of the NN is shown in figure 9. It gives output values in the range from about 0.2 to 0.95. The NN output is signed according to the sign of S 1 . It can be seen that: the light quark distribution is approximately symmetric and peaks towards low absolute values; the c and b distributions are asymmetric with more positive than negative entries; the b events are peaked towards 1, whereas the c events are peaked towards 0. The Monte Carlo simulation gives a good description of the distribution.
Since the light quark S 1 , S 2 and NN distributions are nearly symmetric around zero the sensitivity to the modelling of the light quarks can be reduced by subtracting the contents of the negative bins from the contents of the corresponding positive bins. The subtracted distributions are shown in figure 10. The resulting distributions are dominated by c quark events, with a b quark fraction increasing towards the upper end of the distributions. The light quarks contribute a small fraction, although this fraction is larger than in [10], mainly due to the lower Q 2 and P track T selections in the present analysis.
The fractions of c, b and light quarks of the data are extracted in each y-Q 2 interval using a least squares simultaneous fit to the subtracted S 1 , S 2 and NN distributions (as in figure 10) and the total number of inclusive events before any CST track selection. Only those bins in the significance distributions which have at least 25 events before subtraction are considered in the fit, since Gaussian errors are assumed. The last fitted bin of the significance distributions, which usually has the lowest statistics, is made 3 times as wide as the other bins (see figure 10). If any of the bins before subtraction within the NN output range contain less than 25 events the bin size is doubled, which ensures all bins contain at least 25 events.
The light, c and b Monte Carlo simulation samples are used as templates. The Monte Carlo light, c and b contributions in each y-Q 2 interval are scaled by factors ρ l , ρ c and ρ b , respectively, to give the best fit to the observed subtracted S 1 , S 2 and NN distributions and the total number of inclusive events in each y − Q 2 interval. Only the statistical errors of the data and Monte Carlo simulation are considered in the fit. The fit to the subtracted significance and NN distributions mainly constrains ρ c and ρ b , whereas the overall normalisation constrains ρ l .
Since the error on ρ c is much smaller than that of ρ b the c cross section is measured in more y-Q 2 intervals than the b cross section. Therefore two sets of fits are performed, one with a fine binning of 29 bins and the other with a coarse binning of 12 bins. The two sets of fits are performed in an identical manner. The results of the two sets of fits are listed in tables 1 and 2. Also included in the tables are the χ 2 /n.d.f. evaluated using statistical errors only. Acceptable values are obtained for all bins. The fit is also performed to the complete data sample and shown in figure 10. The stability of the method is checked by repeating the fit to the complete data in a variety of ways: fitting the e + p and e − p data separately; using S 3 instead of the NN; using S L instead of the NN; using the NN alone without S 1 and S 2 ; using S 1 and S 2 without the NN; using the NN alone without negative subtraction. All give consistent results within statistical and systematic errors.
The results of the fit in each y-Q 2 interval are converted to a measurement of the 'reduced c cross section' defined from the differential cross section as using:σ where α is the fine structure constant evaluated with scale Q 2 , and N MCgen l , N MCgen c , and N MCgen b are the number of light, c, and b quark events generated from the Monte Carlo simulation in each bin. The inclusive reduced cross sectionσ(x, Q 2 ) is taken from H1 measurements: Tables 17 and 19 from [33], Tables 10 and 11 from [34] and Table 11 from [35]. A bin centre correction δ BCC is applied using a NLO QCD expectation forσ cc andσ to convert the bin averaged measurement into a measurement at a given x-Q 2 point. This NLO QCD expectation is calculated from the results of a fit similar to that performed in [36] but using the massive scheme to generate heavy flavours.
The reduced cross section is corrected using the Monte Carlo simulation for pure QED radiative effects. The photoproduction background is not subtracted, which, due to the method used to calculate the reduced cross sections, means that the fraction of c and b events in the photoproduction background is assumed to be the same as in the DIS data. In most of the y range the background from photoproduction events is negligible and does not exceed 4% in any y-Q 2 interval used in this analysis. Events that contain c hadrons via the decay of b hadrons are not included in the definition of the reduced c cross section. The reduced b cross section is evaluated in the same manner.

Systematic Uncertainties
The following uncertainties are taken into account in order to evaluate the systematic error.
• An uncertainty in the δ resolution of the tracks is estimated by varying the resolution by an amount that encompasses any difference between the data and simulation evaluated from figure 2. This was achieved by applying an additional Gaussian smearing in the Monte Carlo simulation of 200 µm to 5% of randomly selected tracks and 12 µm to the rest.
• A track efficiency uncertainty is assigned of 1% due to the CTD and of 2% due to the CST.
• The uncertainties on the various D and B meson lifetimes, decay branching fractions and mean charge multiplicities are estimated by varying the input values of the Monte Carlo simulation by the errors on the world average measurements. For the branching fractions of b quarks to hadrons and the lifetimes of the D and B mesons the central values and errors on the world averages are taken from [37]. For the branching fractions of c quarks to hadrons the values and uncertainties are taken from [38], which are consistent with measurements made in DIS at HERA [39]. For the mean charged track multiplicities the values and uncertainties for c and b quarks are taken from MarkIII [40] and LEP/SLD [41] measurements, respectively.
• The uncertainty on the fragmentation function of the heavy quarks is estimated by reweighting the events according to the longitudinal string momentum fraction z carried by the heavy hadron in the Lund model using weights of ( for charm quarks and of (1 ∓ 0.5) · (1 − z) + z · (1 ± 0.5) for beauty quarks. The variations for the charm fragmentation are motivated by comparison of the Monte Carlo simulation with H1 data [42].
• An uncertainty on the QCD model of heavy quark production is estimated by reweighting the jet transverse momentum and pseudorapidity by (P jet T /(10 GeV)) ±0.2 and (1 ± η jet ) ±0.15 for charm jets and (P jet T /(10 GeV)) ±0. 3 and (1 ± η jet ) ±0.3 for bottom jets. These values are obtained by comparing these variations with the measured reduced cross section for b and c jets [43]. The effects of each of these uncertainties on the subtracted reconstructed distributions of P q T and η q are shown in figure 8, where the data is seen to be consistent with the Monte Carlo simulation within the uncertainties.
• The uncertainty on the asymmetry of the light quark δ distribution is estimated by repeating the fits with the subtracted light quark distributions (figure 10) changed by ±30%.
The light quark asymmetry was checked to be within this uncertainty by comparing the asymmetry of Monte Carlo simulation events to that of the data for K 0 candidates, in the region 0.1 < |δ| < 0.5 cm, where the light quark asymmetry is enhanced.
• An error on φ q is estimated by shifting φ q by 2 • (5 • ) for events with (without) a reconstructed jet. These shifts were estimated by comparing the difference between φ q and the track azimuthal angle in data and Monte Carlo simulation.
• The uncertainty on the hadronic energy scale is estimated by changing the hadronic energy by ±2%.
• The uncertainty in the photoproduction background is taken as 100% of the fraction of photoproduction events in each bin, for events with N track ≥ 1.
• Uncertainties on the acceptance and bin centre correction due to the input heavy quark structure functions used are estimated by reweighting the inputσ cc distribution by x ±0.1 and 1 ± 0.2 ln[Q 2 /(10 GeV 2 )] andσ bb by x ±0.3 and 1 ± 0.4 ln[Q 2 /(10 GeV 2 )]. The range of variation of the input structure functions was estimated by comparing to the measured values obtained in this analysis. The effects of each of these uncertainties on the subtracted distributions of log Q 2 and log x are shown in figure 8, where the data is seen to be consistent with the Monte Carlo simulation within the uncertainties.
The above systematic uncertainties are evaluated by making the changes described above to the Monte Carlo simulation and repeating the procedure to evaluate the reduced c and b cross sections, including the fits. The uncertainties are evaluated separately for each x-Q 2 measurement.
Additional contributions to the systematic error due to the inclusive cross section are taken from the corresponding x-Q 2 bin in [33][34][35], since the measurements are normalised to the inclusive cross section measurements (see equation 2). The error due to the inclusive DIS selection includes a 1.1-1.5% uncertainty on the luminosity measurement; an uncertainty on the scattered electron polar angle of 0.2-3.0 mrad and energy of 0.2-2.0% depending on the energy and angle; typically < 1% combined error due to trigger and scattered electron identification efficiency; and a 0.5-1.0% uncertainty on the reduced cross section evaluation due to QED radiative corrections.
A detailed list of the systematic effect on each reduced cross section measurement is given in tables 3 and 4. The errors of δ resolution and track efficiency are considered uncorrelated between the HERA I and HERA II datasets. All other errors are assumed 100% correlated.

Comparison and Combination of Data
The measurements ofσ cc andσ bb are shown as a function of x for fixed values of Q 2 in figures 11 and 12 respectively. Also shown in these figures are the HERA I data extracted using measurements based on the displacement of tracks [9, 10]. The HERA I measurements for Q 2 ≤ 60 GeV 2 are normalised to the H1 inclusive cross sections measurements from [33] and [34]. Theσ cc andσ bb data from HERA I and HERA II show good agreement for all measured x and Q 2 values. In figure 11 theσ cc data are also compared with those extracted from D * meson measurements by H1 [12], which were obtained using a NLO program [44] based on DGLAP evolution to extrapolate the measurements outside the visible D * range. The D * data agree well with the measurements from the present analysis.
The HERA I and HERA II datasets are combined for each x-Q 2 point where there are two measurements. The combination is performed using a weighted mean using those errors considered uncorrelated between the two data sets (the statistical errors and the systematic errors of δ resolution and track efficiency, see section 5): where σ comb is the combined measurement and σ I and σ II are the HERA I and HERA II measurements respectively, with their respective errors of δ I = i δ i I and δ II = i δ i II , where i 12 denotes the statistical error and each source of uncorrelated error between the two data sets. These contributions to the combined error are evaluated as: For those systematics j considered correlated between the two datasets (all apart from the errors of δ resolution and track efficiency) the systematic error as evaluated from HERA II is taken, δ j II . The total error on the combined measurement is therefore evaluated from: All data that is subsequently shown is from the combined dataset. Tables 3 and 4 list the combined results for c and b. The results listed in these tables supersede the HERA I measurements published in [9] and [10]. If fits are performed with the data listed in these tables it should be noted that, although each x-Q 2 measurement is statistically independent, there is a correlation between the c and b measurements. Since the c and b binning is very different this correlation may be neglected for most bins. However, for the first and last bin the binning is identical so the correlation coefficient C bc listed in table 1 should be used.

Comparison with QCD
The leading contribution to heavy flavour production in the region Q 2 < ∼ M 2 is given by the massive boson-gluon fusion matrix element [15,45] convoluted with the gluon density of the proton. In the region where Q 2 is much larger than M 2 the massive approach may be a poor approximation due to the large logarithms log Q 2 /M 2 which are not resummed [2]. Here, the heavy quarks can be treated as massless partons with the leading order contribution coming from the quark parton model and the heavy quark parton densities. In QCD fits to global hard-scattering data the parton density functions are usually extracted using the general mass variable flavour number scheme (GM VFNS) [1][2][3][4] for heavy quarks which interpolates from the massive approach at low scales to the massless approach at high scales.
The data are compared with recent QCD predictions based on the GM VFNS from MSTW [7] (at NLO and NNLO), CTEQ [6] (at NLO) and from H1 [34] (at NLO). The MSTW predictions use the MSTW08 PDFs which have m c = 1.4 GeV and m b = 4.75 GeV, and the renormalization and factorization scales are set to µ r = µ f = Q. The CTEQ predictions use the CTEQ6.6 PDFs where m c = 1.3 GeV, m b = 4.5 GeV and µ r = µ f = Q 2 + M 2 . The predictions from H1 use the H1PDF 2009 PDFs and the same heavy flavour treatment, including the quark masses and perturbative scales, as for the MSTW08 NLO predictions [7].
The data are also compared with predictions based on CCFM [46] parton evolution and massive heavy flavour production. The CCFM predictions use the A0 PDF set [47] with m c = 1.4 GeV, m b = 4.75 GeV and µ r = µ f = ŝ + Q 2 T , whereŝ is the square of the partonic centre of mass energy and Q T is the transverse momentum of the heavy quark pair.
Theσ cc data as a function of x for fixed values of Q 2 are compared in figure 13 with the QCD predictions from CCFM, CTEQ and MSTW at NNLO, and in figure 14 with the predictions 13 from H1 and MSTW at NLO. In figure 13 the GM VFNS predictions from CTEQ and MSTW at NNLO are observed to be similar with the size of the largest differences between the two being at the level of the total experimental errors on the data. The CTEQ and MSTW predictions provide a reasonable description of the rise of the data with decreasing x across the whole of the measured kinematic range thus supporting the validity of PDFs extracted using the GM VFNS. The predictions based on CCFM evolution tend to undershoot the data at the lowest values of Q 2 and x but also provide a reasonable description for the rest of the measured phase space. In figure 14 the GM VFNS predictions from H1 and MSTW at NLO, which implement the same heavy flavour treatment [7], are similar and also provide a reasonable description of the data. The H1 predictions are shown with uncertainty bands representing the experimental and theoretical uncertainties [34]. The inner error band describes the experimental fit uncertainty, the middle error band represents the experimental and model uncertainties added in quadrature and the outer error band represents the fit parameterisation uncertainty added in quadrature with all the other uncertainties. The largest contribution to the uncertainty comes from the model which is dominated by the variation of the charm quark mass, which is varied from 1.38 to 1.47 GeV. The total uncertainties on the H1PDF 2009 reduced charm cross section predictions are generally smaller than those on the data.
Theσ bb data as a function of x for fixed values of Q 2 are compared with the QCD predictions in figures 15 and 16. In figure 15 the CTEQ and CCFM predictions are seen to be very similar across the whole range of the measurements. The MSTW NNLO predictions are around 35% higher than CTEQ and CCFM at low values of Q 2 , with the difference decreasing with increasing values of Q 2 . The differences between the theory predictions for the b cross section at low Q 2 are much reduced compared with the theoretical status at the time of the HERA I publication where there was a factor 2 difference at Q 2 = 12 GeV 2 [10]. In figure 16 the MSTW and H1 NLO QCD predictions forσ bb are observed, as for the case ofσ cc , to be very similar. The uncertainty on the H1 predictions is again dominated by the model uncertainty due to the variation of the b quark mass, which is varied from 4.3 to 5.0 GeV. At lower values of Q 2 the uncertainties on the H1 PDF predictions are larger than those on the data. The reduced b cross section data, including the points in the newly measured regions, are well described by all the present QCD predictions.
The structure function F cc 2 is evaluated from the reduced cross sectioñ where the longitudinal structure function F cc L is estimated from the same NLO QCD expectation as used for the bin centre correction. The correction due to F cc L is negligible for most bins but contributes up to 6.7% of the reduced cross section at the highest value of y. The structure function F bb 2 is evaluated in the same manner. The measurements of F cc 2 and F bb 2 are presented in table 3 and shown as a function of Q 2 in figure 17 and figure 18 respectively. The data are compared with the GM VFNS QCD predictions from CTEQ [6] at NLO and from MSTW at NLO and NNLO [7]. The description of the charm data by the MSTW QCD calculations is reasonable, with the NNLO being somewhat better than NLO. The CTEQ NLO prediction also gives a reasonable description of the data.
The measurements are presented in figure 19 in the form of the fractional contribution to the total ep cross section The b fraction f bb is defined in the same manner. In the present kinematic range the value of f cc is around 17% on average and increases slightly with increasing Q 2 and decreasing x.
The value of f bb increases rapidly with Q 2 from about 0.2% at Q 2 = 5 GeV 2 to around 1% for Q 2 > ∼ 60 GeV 2 . The NNLO QCD predictions of MSTW shown in figure 19 are found to describe the data well.

Conclusion
The reduced charm and beauty cross sections in deep inelastic scattering are measured for a wide range of Q 2 and Bjorken x using the HERA II data. The analysis was performed using several variables including the significance (the impact parameter divided by its error) and the position of the secondary vertex as reconstructed from the vertex detector. For selected track multiplicities of 1 or 2 the highest and second highest significance distributions are used to evaluate the charm and beauty content of the data. For selected track multiplicities ≥ 3 several variables are combined using an artificial neural network.
The reduced cross sections agree with previous measurements using a similar technique, but have reduced errors and cover an extended Q 2 range. HERA I and HERA II data are combined resulting in more precise reduced cross section and structure function measurements. The charm and beauty fractional contributions to the total ep cross section are also measured. In this kinematic range the charm cross section contributes on average 17% and the beauty fraction increases from about 0.2% at Q 2 = 5 GeV 2 to 1.0% for Q 2 > ∼ 60 GeV 2 . The measurements are described by predictions using perturbative QCD in the general mass variable flavour number scheme at NLO and NNLO.    The measured values and relative errors for the reduced DIS cross section (σ qq ) and structure function F qq 2 for charm (c) and beauty (b) quarks for the combined datasets. The values for F qq 2 are obtained from the measured reduced cross sections using the NLO QCD fit to correct for the contributions from F qq L . The table shows the statistical error (δ stat ), the systematic error (δ sys ), the total error (δ tot ) and the uncorrelated systematic error (δ unc ) on the reduced cross section. The next ten columns show the effect of a +1σ shift for the correlated systematic error contributions to the reduced cross section from: track impact parameter resolution, CJC track efficiency, CST track efficiency, c fragmentation, b fragmentation, light quark contribution, struck quark angle φ q , hadronic energy scale, photoproduction background and the DIS event selection. The −1σ errors are taken as the negative of the +1σ errors. The correlated systematic errors are continued in table 4. bin    Table 4: The correlated errors continued from table 3. The first eight errors represent a +1σ shift for the correlated systematic error contributions from: reweighting the Q 2 distribution, reweighting the x distribution, reweighting the jet transverse momentum distribution P jet T , and reweighting the jet pseudorapidity η jet distribution for c and b events. The remaining six columns show the contributions from: c hadron branching fractions and multiplicities, and the b quark multiplicity. Only those uncertainties where there is an effect of > 1% for any x-Q 2 point are listed; the remaining uncertainties are included in the uncorrelated error.   Figure 8: The subtracted distributions for P q T , η q , log Q 2 , log x. Each plot shows the difference between the distributions for those events with S 2 > 3 and the corresponding distribution with S 2 < −3. Included in the figure is the expectation from the Monte Carlo simulation for light, c and b quarks. The contributions from the various quark flavours in the Monte Carlo are shown after applying the scale factors ρ l , ρ c and ρ b obtained from the fit to the complete data sample. Also shown are the variations in P q T and η q resulting from uncertainties in the QCD model of heavy quark production, as well as the variations in log Q 2 and log x, resulting from uncertainties on the input heavy quark structure functions.