1 Introduction

In the parton model [1,2,3] formulated by Bjorken, Feynman, and Gribov, the quarks and gluons of a nucleon are viewed as “quasi-free” particles probed by an external hard probe in the infinite momentum frame. The parton that participates in the hard interaction with the probe, e.g., a virtual photon, is expected to be causally disconnected from the rest of the nucleon. On the other hand, the parton and the rest of the nucleon have to form a colour-singlet state due to colour confinement. In order to further understand the role of colour confinement in high energy collisions, it has been suggested [4, 5] that quantum entanglement of partons could be an important probe of the underlying mechanism. Measurements of charged particle multiplicities are proposed [5,6,7,8,9,10] to be related to the entanglement entropy predicted from the gluon density [5], as an indication of quantum entanglement of partons inside the proton. The entanglement entropy is defined here for the bipartite quantum system consisting of the struck partons and the proton remnants, respectively. It is then determined by the von Neumann entropy [11] of either reduced state, for example by that of the struck partons.

Fig. 1
Reconstructed momentum transfer \(Q^2\), inelasticity y, pseudorapidity \(\eta _{_{\mathrm{lab}}}\), and charged particle multiplicity \(N_{\mathrm{rec}}\) for data (open circle), and the DJANGOH (red line) and the RAPGAP (blue line) MC models. The phase space restrictions are given in Table 1. The photoproduction background simulated from PYTHIA 6.4 [55] is shown for the  \(Q^2\)  and y distributions. Error bars indicate the statistical uncertainty of data

Particle multiplicity distributions have previously been measured in deep inelastic scattering (DIS) at HERA [12,13,14,15,16]. In this paper, charged particle multiplicity distributions in positron-proton (ep) DIS at \(\sqrt{s}=319\) GeV are reported using high statistics data collected with the H1 detector. The phase space of the measurement of multiplicity distributions is defined in bins of the virtuality of the photon, \(5<Q^2<100~\mathrm {GeV}^2\), the inelasticity, \(0.0375<y<0.6\), and the pseudorapidity \(\eta \) of charged particles, with \(|\eta _{_{\text {lab}}}|<1.6\) in the laboratory frame and \(0<\eta ^{*}<4\) in the hadronic centre-of-mass (HCM) frame. The first and the second moments of the multiplicity distributions, corresponding to the mean and the variance, are measured as a function of the hadronic centre-of-mass energy W, in different bins of \(Q^2\). The KNO scaling function [17] is also measured in different W and \(Q^2\) regions.

The final-state hadron entropy, \(S_{\mathrm{hadron}}\), as a function of the Bjorken variable \(x_{\mathrm{bj}}\) in different   \(Q^2\)   bins is also measured. A relation between the \(S_{\mathrm{hadron}}\) and the initial-state parton entropy, \(S_{\mathrm{gluon}}\), due to “parton liberation” [18] and “local parton-hadron duality (LPHD)” [19], is described by [5]:

$$\begin{aligned} S_{\mathrm{hadron}} \!\equiv \! -\sum {P(N)\ln {P(N)}}\!=\!\ln {[xG(x,Q^2)]} \equiv S_{\mathrm{gluon}}.\nonumber \\ \end{aligned}$$
(1)

Here, P(N) is the charged particle multiplicity distribution measured in either the current fragmentation region or the target fragmentation region, and the gluon density \(xG(x,Q^{2})\) is evaluated at \(x =\) \(x_{\mathrm{bj}}\).

The measurements are compared to theoretical predictions obtained from simulations including parton showers and hadronisation (RAPGAP [20], DJANGOH [21] and PYTHIA 8 [22]).

Table 1 Summary of the fiducial kinematic phase space used in this analysis

2 H1 detector

A full description of the H1 detector can be found elsewhere [23,24,25,26,27,28,29] and only the components most relevant for this analysis are briefly mentioned here. The coordinate system of H1 is defined such that the positive z axis is pointing in the proton beam direction (forward direction) and the nominal interaction point is located at z = 0. The polar angle \(\theta \) is defined with respect to this axis. The pseudorapidity is defined to be \(\eta \equiv -\ln {(\tan {(\theta /2)})}\).
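As an illustration of the pseudorapidity definition above, the following minimal Python sketch converts between polar angle and pseudorapidity. The function names are hypothetical and are not taken from any H1 software.

```python
import math


def pseudorapidity(theta_deg: float) -> float:
    """Return eta = -ln(tan(theta/2)) for a polar angle given in degrees."""
    theta = math.radians(theta_deg)
    return -math.log(math.tan(theta / 2.0))


def polar_angle_deg(eta: float) -> float:
    """Invert the definition: theta = 2 * atan(exp(-eta)), in degrees."""
    return math.degrees(2.0 * math.atan(math.exp(-eta)))
```

With this definition, a track at \(\theta = 90^{\circ}\) has \(\eta \approx 0\), and \(\eta = 1.6\) corresponds to a polar angle of roughly \(23^{\circ}\).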

Charged particles are measured in the polar angle range \({15}^{\circ }<\theta <{165}^{\circ }\) using the central tracking detector (CTD), which is also used to reconstruct the interaction vertex. The CTD comprises two large cylindrical, concentric and coaxial jet chambers (CJCs), and the silicon vertex detector [26, 27]. The CTD is operated inside a 1.16 T solenoidal magnetic field. The CJCs are separated by a cylindrical drift chamber which improves the z coordinate reconstruction. A cylindrical multiwire proportional chamber [28], which is mainly used in the trigger, is situated inside the inner CJC. The trajectories of charged particles are measured with a transverse momentum resolution of \(\sigma (p_{T})/p_{T} \approx 0.2\%/\,\text{ GeV }\oplus 1.5\%\). The forward tracking detector (FTD) [29] measures the tracks of charged particles at polar angles \({6}^{\circ }<\theta <{25}^{\circ }\). In the region of angular overlap, FTD and short CTD track segments are used to reconstruct combined tracks, extending the detector acceptance for well-reconstructed tracks. Both CTD tracks and combined tracks are linked to hits in the vertex detectors: the central silicon tracker (CST) [26, 27], the backward silicon tracker (BST) and the forward silicon tracker (FST). These detectors provide precise spatial coordinate measurements and therefore significantly improve the primary vertex spatial resolution. The CST consists of two layers of double-sided silicon strip detectors surrounding the beam pipe covering an angular range of \({30}^{\circ }<\theta <{150}^{\circ }\) for tracks passing through both layers. The BST consists of six double wheels of silicon strip detectors measuring the transverse coordinates of charged particles. The FST design is similar to that of the BST and consists of five double wheels of single-sided silicon strip detectors. 
The lead-scintillating fibre calorimeter (SpaCal) [25] covering the region \({153}^{\circ }<\theta <{177.5}^{\circ }\) has electromagnetic and hadronic sections. The calorimeter is used to measure the scattered positron and the backward hadronic energy flow. The energy resolution for positrons in the electromagnetic section is \(\sigma (E)/E \approx 7.1\%/ \sqrt{E/\,\text{ GeV }} \oplus 1\%\), as determined in test beam measurements [30]. The SpaCal provides energy and time-of-flight information used for triggering purposes. A backward proportional chamber (BPC) in front of the SpaCal is used to improve the angular measurement of the scattered lepton. The liquid argon (LAr) calorimeter [31] covers the range \({4}^{\circ }<\theta <{154}^{\circ }\) and is used in this analysis in the reconstruction of the hadronic final state. It has an energy resolution of \(\sigma (E)/E \approx 50\%/ \sqrt{E/\,\text{ GeV }} \oplus 2\%\) for hadronic showers, as obtained from test beam measurements [32].
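The symbol \(\oplus \) in the calorimeter resolution formulas denotes addition in quadrature. A minimal sketch of how such a parametrisation can be evaluated is given below; the function name and argument names are illustrative only.

```python
import math


def relative_energy_resolution(energy_gev: float,
                               stochastic: float,
                               constant: float) -> float:
    """Evaluate sigma(E)/E = stochastic/sqrt(E/GeV) (+) constant.

    The two terms are added in quadrature. Both coefficients are given
    as fractions, e.g. 0.071 for 7.1%.
    """
    return math.hypot(stochastic / math.sqrt(energy_gev), constant)


# SpaCal electromagnetic section: 7.1%/sqrt(E) (+) 1%
spacal_at_beam_energy = relative_energy_resolution(27.6, 0.071, 0.01)

# LAr calorimeter, hadronic showers: 50%/sqrt(E) (+) 2%
lar_at_10_gev = relative_energy_resolution(10.0, 0.50, 0.02)
```

For a positron at the beam energy of 27.6 GeV, the SpaCal parametrisation gives a relative resolution of about 1.7%.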

Table 2 Summary of the systematic uncertainties in this analysis

3 Theoretical predictions

The DIS process is simulated by different Monte Carlo (MC) event generators, which include the hard scattering process and simulate higher-order QCD corrections in the form of parton showers and hadronisation. A brief description of the MC event generators is given below:

  • The RAPGAP 3.1 [20] MC event generator matches first order Quantum Chromodynamics (QCD) matrix elements to the Dokshitzer–Gribov–Lipatov–Altarelli–Parisi (DGLAP) [33,34,35,36] parton showers with strongly ordered transverse momenta of subsequently emitted partons. The factorisation and renormalisation scales are set to \(\mu _{f}=\mu _{r}=\sqrt{Q^2+{\hat{p}}^{2}_{T}}\), where \({\hat{p}}_{T}\) is the transverse momentum of the outgoing hard parton from the matrix element in the centre-of-mass frame of the hard subsystem. The CTEQ 6L [37] leading order parametrisation of the parton density function (PDF) is used.

  • The DJANGOH 1.4 [21] MC event generator uses the Colour Dipole Model (CDM) as implemented in ARIADNE [38], which models first order QCD processes and creates dipoles between coloured partons. Gluon emission is treated as radiation from these dipoles, and new dipoles are formed from the emitted gluons from which further radiation is possible. The radiation pattern of the dipoles includes interference effects, thus modelling gluon coherence. The emitted partons are not ordered in transverse momentum with respect to rapidity, producing a configuration similar to the Balitsky–Fadin–Kuraev–Lipatov (BFKL) [39,40,41] treatment of parton evolution [42]. The CTEQ 6L [37] leading order parametrisation is used as the PDF.

  • The PYTHIA 8 [22] MC event generator models the hard collision by LO-pQCD cross sections. In order to enable the PYTHIA 8 program to simulate DIS processes, the \(\mathcal{O}(\alpha _s^0)\) process together with the DIrect REsummation (DIRE) parton shower [43] is applied. The parton shower recoil is treated such that the kinematic variables of the DIS process (y and \(\,\,Q^2\)) are unchanged. The PDF CTEQ 5L [37] is used.

Fig. 2
Charged particle multiplicity distributions P(N) as a function of the number of particles N at \(\sqrt{s}=319\) GeV ep collisions. Different panels correspond to different   \(Q^2\)   and y bins, as indicated by the text in the figure. The phase space restrictions are given in Table 1. Predictions from DJANGOH, RAPGAP and PYTHIA 8 are also shown. The total uncertainty is denoted by the error bars

Fig. 3
Charged particle multiplicity distributions P(N) as a function of the number of particles N at \(\sqrt{s}=319\) GeV ep collisions in the range \(5<Q^2<10~\mathrm {GeV}^2 \). Further phase space restrictions are given in Table 1. Different panels correspond to different \(\eta _{_{\text {lab}}}\) and y bins, as indicated by the text in the figure. Predictions from DJANGOH, RAPGAP and PYTHIA 8 are also shown. The total uncertainty is denoted by the error bars

Fig. 4
Charged particle multiplicity distributions P(N) as a function of the number of particles N at \(\sqrt{s}=319\) GeV ep collisions in the range \(10<Q^2<20~\mathrm {GeV}^2 \). Further phase space restrictions are given in Table 1. Different panels correspond to different \(\eta _{_{\text {lab}}}\) and y bins, as indicated by the text in the figure. Predictions from DJANGOH, RAPGAP and PYTHIA 8 are also shown. The total uncertainty is denoted by the error bars

Fig. 5
Charged particle multiplicity distributions P(N) as a function of the number of particles N at \(\sqrt{s}=319\) GeV ep collisions in the range \(20<Q^2<40~\mathrm {GeV}^2 \). Further phase space restrictions are given in Table 1. Different panels correspond to different \(\eta _{_{\text {lab}}}\) and y bins, as indicated by the text in the figure. Predictions from DJANGOH, RAPGAP and PYTHIA 8 are also shown. The total uncertainty is denoted by the error bars

Fig. 6
Charged particle multiplicity distributions P(N) as a function of the number of particles N at \(\sqrt{s}=319\) GeV ep collisions in the range \(40<Q^2<100~\mathrm {GeV}^2 \). Further phase space restrictions are given in Table 1. Different panels correspond to different \(\eta _{_{\text {lab}}}\) and y bins, as indicated by the text in the figure. Predictions from DJANGOH, RAPGAP and PYTHIA 8 are also shown. The total uncertainty is denoted by the error bars

Fig. 7
Charged particle multiplicity distributions P(N) as a function of the number of particles N at \(\sqrt{s}=319\) GeV ep collisions with additional restriction to the current hemisphere \(0<\eta ^{*}<4\). Further phase space restrictions are given in Table 1. Different panels correspond to different   \(Q^2\)   and y bins, as indicated by the text in the figure. Predictions from DJANGOH, RAPGAP and PYTHIA 8 are also shown. The total uncertainty is denoted by the error bars

Fig. 8
Mean multiplicity \(\langle N \rangle \) as a function of W measured at \(\sqrt{s}=319\) GeV ep collisions (left) and with an additional restriction to the current hemisphere \(0<\eta ^{*}<4\) (right). Further phase space restrictions are given in Table 1. The corresponding \(\langle y \rangle \) is indicated by the top axis for each measured W. Predictions from the RAPGAP model are shown by dashed lines. The total uncertainty is represented by the error bar

DJANGOH and RAPGAP, as well as the PYTHIA 6 [44] photoproduction events, are also used together with the H1 detector simulation in order to determine acceptance, efficiency and backgrounds, as well as to estimate systematic uncertainties associated with the measurement. DJANGOH and RAPGAP are interfaced to HERACLES [45,46,47] to simulate QED radiative effects. The generated events are passed through a detailed simulation of the H1 detector response based on the GEANT3 simulation program [48] and are processed using the same reconstruction and analysis chain as used for the data. For the determination of the detector effects both the RAPGAP and DJANGOH predictions are studied.

4 Event selection

The data set used for this analysis was collected with the H1 detector in the years 2006 and 2007 when positrons and protons were collided at energies of 27.6 GeV and 920 GeV, respectively. The integrated luminosity of the data set is 136 pb\(^{-1}\) [49]. DIS events were recorded using triggers based on electromagnetic energy deposits in the SpaCal calorimeter. The trigger inefficiency is determined using independently triggered data and is negligible in the kinematic region of the analysis.
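The quoted beam energies can be related to the centre-of-mass energy \(\sqrt{s}=319\) GeV by a short calculation. The sketch below assumes head-on collisions and neglects the particle masses, so that \(s \approx 4E_{e}E_{p}\); the function name is illustrative.

```python
import math


def centre_of_mass_energy(e_lepton_gev: float, e_proton_gev: float) -> float:
    """Return sqrt(s) for head-on beams, neglecting masses: s = 4 * E_e * E_p."""
    return math.sqrt(4.0 * e_lepton_gev * e_proton_gev)


sqrt_s = centre_of_mass_energy(27.6, 920.0)  # approximately 319 GeV
```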

The scattered positron is defined by the energy cluster in the SpaCal calorimeter with the highest transverse momentum. The energy of the cluster is further required to be larger than 12 GeV, and the radial position of the cluster is required to be between 15 and 70 cm with respect to the beam axis. The z coordinate of the event vertex is required to be within \(|z_v|<35~\mathrm{cm}\) of the nominal interaction point. The \(E-p_{z}\) variable is required to be between 35 and 75 GeV in order to reduce QED radiation and photoproduction backgrounds. Here \(E-p_{z}\) is defined as the sum of \(E_{i}-p_{z,i}\) of both the scattered positron and the hadronic final-state (HFS) particles. The HFS particles are reconstructed using an energy flow algorithm [50,51,52]. This algorithm combines charged particle tracks and calorimetric energy clusters into hadronic objects, taking into account their respective resolution and geometric overlap, while avoiding double counting of energy.
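The \(E-p_{z}\) requirement can be illustrated as follows. By energy and longitudinal momentum conservation, the sum of \(E_{i}-p_{z,i}\) over all final-state particles equals that of the initial state, which is about twice the positron beam energy (55.2 GeV), since the proton travels along the positive z axis and contributes approximately zero. The sketch below represents particles as hypothetical (E, p_z) pairs in GeV and treats the window boundaries as exclusive, which is an assumption.

```python
def e_minus_pz(particles):
    """Sum E - p_z over an iterable of (E, p_z) pairs given in GeV."""
    return sum(energy - pz for energy, pz in particles)


def passes_e_minus_pz_cut(particles, low=35.0, high=75.0):
    """Check the 35 to 75 GeV window (exclusive boundaries assumed)."""
    return low < e_minus_pz(particles) < high


# Initial state in the massless approximation: the positron moves along -z,
# the proton along +z.
initial_state = [(27.6, -27.6), (920.0, 920.0)]
```

Events in which a high-energy photon escapes along the incoming positron direction, or in which the scattered positron is lost, as in photoproduction, yield values well below 55.2 GeV and are suppressed by the lower bound.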

Tracks measured in the CTD alone (central tracks) and the combination of CTD and FTD (combined tracks) are used in this analysis. Both central and combined tracks are required to have transverse momenta \(p_{\mathrm{T, lab}}>150\) MeV. Furthermore, the total momentum for the combined tracks is required to be larger than 500 MeV in order to ensure particles have enough momentum to cross the material between the CJC and FTD. The pseudorapidity of both central and combined tracks is required to be within \(|\eta _{_{\text {lab}}}|<1.6\). In addition, the central tracks are required to have their \(|dca'\sin {\theta }|<2~\mathrm{cm}\), where \(dca'\) is the distance of closest approach with respect to the primary vertex and \(\theta \) is the polar angle of the track. The innermost hit in the CTD is required for central tracks to be less than 50 cm away from the z axis, where the radial track length is required to be larger than 10 cm. Neutral particles are not considered in the multiplicity analysis. Using only tracks assigned to the primary event vertex, the contributions from in-flight decays of \(K^{0}_{S}\), \(\Lambda \), from photon conversions and from other secondary decays and interactions with detector material are reduced. Details of the track selection are described elsewhere [53].

4.1 Event reconstruction

The scattered positron and the HFS particles are used for the reconstruction of the kinematic variables, \(x_{\mathrm{bj}}\), y and \(Q^2\). Similar to the H1 measurement of charged particle momentum spectra [53], the e-\(\Sigma \) method is used [54], where these variables are defined as

$$\begin{aligned} Q^2= & {} 4E_{e}E^{'}_{e}\cos ^{2}{\frac{\theta _{e}}{2}},\nonumber \\ y= & {} 2E_{e}\frac{\Sigma }{[\Sigma +E^{'}_{e}(1-\cos {\theta _{e}})]^{2}},\nonumber \\ x_{\mathrm{bj}}= & {} \frac{Q^2}{sy}. \end{aligned}$$
(2)

Here, s is the ep centre-of-mass energy squared, \(E_e\) is the incoming lepton energy, and \(E^{'}_e\) and \(\theta _{e}\) are the scattered positron energy and polar angle, respectively. The quantity \(\Sigma \) is defined as \(\sum _{i}{(E_{i}-p_{z,i})}\), where the sum runs over all the HFS particles. This method provides optimal resolution of the kinematic variables and shows little sensitivity to QED radiative effects. The hadronic centre-of-mass energy W is defined as:

$$\begin{aligned} W = \sqrt{sy-Q^{2}+M^{2}_{p}}, \end{aligned}$$
(3)

where \(M_{p}\) is the proton rest mass.
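The e-\(\Sigma \) reconstruction and the definition of W can be sketched in a few lines of Python. This is an illustration of the formulas only, not the H1 reconstruction code; the function and argument names are hypothetical, energies are in GeV, and the polar angle is in radians.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton rest mass in GeV


def e_sigma_kinematics(e_beam, e_scattered, theta_e, sigma, s):
    """Return (Q2, y, x_bj, W) from the e-Sigma method.

    e_beam      -- incoming lepton energy E_e
    e_scattered -- scattered positron energy E'_e
    theta_e     -- scattered positron polar angle in radians
    sigma       -- sum of (E_i - p_z,i) over the HFS particles
    s           -- ep centre-of-mass energy squared in GeV^2
    """
    q2 = 4.0 * e_beam * e_scattered * math.cos(theta_e / 2.0) ** 2
    denominator = sigma + e_scattered * (1.0 - math.cos(theta_e))
    y = 2.0 * e_beam * sigma / denominator ** 2
    x_bj = q2 / (s * y)
    w = math.sqrt(s * y - q2 + PROTON_MASS_GEV ** 2)
    return q2, y, x_bj, w
```

For a perfectly measured event without QED radiation, \(\Sigma +E^{'}_{e}(1-\cos {\theta _{e}})=2E_{e}\), and the expression for y reduces to \(\Sigma /(2E_{e})\).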

The hadronic centre-of-mass frame is defined as the frame where \(\vec{p}+\vec{q}=0\), with \(\vec{p}\) and \(\vec{q}\) being the three-momenta of the proton and the virtual photon, respectively. In this frame, the positive z axis is aligned with the direction of the virtual photon.

In Fig. 1, distributions of the reconstructed quantities of \(Q^2\), y, \(\eta _{_{\text {lab}}}\), and the number of charged particles \(N_\mathrm{rec}\), are shown in comparison with predictions from DJANGOH and RAPGAP. The panels of the   \(Q^2\)   and y distribution also contain the contribution of expected photoproduction background estimated based on PYTHIA 6.4 [55]. The photoproduction background is found to be less than 0.5%, and therefore is neglected in the subsequent analysis.

4.2 Experimental observables and kinematic phase space

For a given range in \(Q^2\) and y, the probability P(N) is defined as the fraction of events for which N charged particles are produced in the specified \(\eta \) range relative to the total number of events in that \(Q^2\), y range. Based on the distribution P(N), the first and second moments of the multiplicity distributions, the KNO function \(\Psi (z)\) and the final-state hadron entropy \(S_{\mathrm{hadron}}\) are defined as:

$$\begin{aligned}&\langle N \rangle \equiv \frac{\sum {N\cdot P(N)}}{\sum {P(N)}},\nonumber \\&Var(N) \equiv \frac{\sum {(N-\langle N \rangle )^{2}\cdot P(N)}}{\sum {P(N)}}, \end{aligned}$$
(4)
$$\begin{aligned}&\Psi (z)=\langle N \rangle P(N), \end{aligned}$$
(5)
$$\begin{aligned}&S_{\mathrm{hadron}} = -\sum {P(N)\ln {P(N)}}. \end{aligned}$$
(6)

Here, the sum runs over the number of charged particles and the z variable is equal to \(N/\langle N \rangle \). These quantities are measured within the fiducial kinematic phase space listed in Table 1. The selection in the HCM frame differs from that in the laboratory frame only by the additional restriction of the charged particles to the current hemisphere, \(0<\eta ^{*}<4\).
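The observables of Eqs. 4, 5 and 6 can be computed from a multiplicity distribution as in the following minimal sketch. It assumes that P(N) is stored as a list indexed by N, which is a choice made here for illustration, and all function names are hypothetical.

```python
import math


def mean_multiplicity(p):
    """First moment <N> of P(N), Eq. 4; p[n] holds P(N = n)."""
    norm = sum(p)
    return sum(n * p_n for n, p_n in enumerate(p)) / norm


def variance(p):
    """Second central moment Var(N) of P(N), Eq. 4."""
    norm = sum(p)
    mean = mean_multiplicity(p)
    return sum((n - mean) ** 2 * p_n for n, p_n in enumerate(p)) / norm


def kno_points(p):
    """Return (z, Psi(z)) pairs with z = N/<N> and Psi(z) = <N> P(N), Eq. 5."""
    mean = mean_multiplicity(p)
    return [(n / mean, mean * p_n) for n, p_n in enumerate(p)]


def hadron_entropy(p):
    """S_hadron = -sum P(N) ln P(N), Eq. 6; empty bins contribute zero."""
    return -sum(p_n * math.log(p_n) for p_n in p if p_n > 0.0)


# Toy distribution, not taken from the measurement.
p_example = [0.25, 0.5, 0.25]
```

For the toy distribution, the mean is 1, the variance is 0.5, and the entropy is \(1.5\ln 2 \approx 1.04\).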

4.3 Data correction

The detector-level charged track multiplicity distributions are corrected to stable particles with proper lifetime \(c\tau >10~\mathrm{mm}\) including charged hyperons. The probabilities P(N) are derived from the distributions of events with \(N=0,1,2,\ldots \) tracks reconstructed in the specified \(\eta \) range and with \(Q^2\) and y reconstructed in the kinematic bin of interest. First, the observed event counts in the three-dimensional grid of N, \(Q^2\), y are unfolded, such that migrations between bins as well as efficiency and acceptance distortions are removed. The second step is to normalise the event counts as a function of N to the total number of events in each \(Q^2\), y bin and thus obtain the probabilities P(N). The last step is to correct for QED radiation from the lepton line.
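The second step of this procedure, the normalisation of unfolded event counts to probabilities, can be sketched as follows. The unfolding itself is not reproduced here, and the function name and input layout are assumptions made for illustration.

```python
def normalise_counts(counts):
    """Turn unfolded event counts per multiplicity N into probabilities P(N).

    counts[n] is the unfolded number of events with N = n in one (Q2, y) bin.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("the (Q2, y) bin contains no events")
    return [c / total for c in counts]


# Toy event counts, not taken from the measurement.
p_of_n = normalise_counts([10.0, 30.0, 40.0, 20.0])
```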

The unfolding is done within the TUnfold framework [56]. In order to better resolve migration effects, the number of bins in \(Q^2\), y and N is chosen to be higher when counting events in reconstructed quantities as compared to the truth quantities. The phase space in   \(Q^2\)   and y is enlarged over the phase space given in Table 1, such that the measured phase space is guarded against migrations at the phase space boundaries. The unfolding turns out to be robust against variations of the regularisation scheme. The dominant systematic uncertainty in the detector correction procedure arises from constructing the matrix of migration with an alternative MC model, DJANGOH instead of the RAPGAP default.

QED radiative effects are corrected for using the MC event generators. In the radiative MC, the scattered positron at the particle level is defined as the scattered positron combined with photons that lie within a cone of 0.4 radians. The correction factors are derived bin-by-bin in each measured phase space at the particle level using radiative and non-radiative MC samples.
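A bin-by-bin correction of this kind can be sketched as below. The sketch assumes that the factor in each bin is the ratio of the non-radiative to the radiative particle-level prediction and that it is applied multiplicatively to the measured value; the text does not spell out these details, so both are assumptions, as are the function names.

```python
def qed_correction_factors(p_nonradiative, p_radiative):
    """Bin-by-bin factors c = P_nonrad / P_rad (ratio direction assumed).

    Bins with an empty radiative prediction are left uncorrected (c = 1).
    """
    factors = []
    for p_norad, p_rad in zip(p_nonradiative, p_radiative):
        factors.append(p_norad / p_rad if p_rad > 0.0 else 1.0)
    return factors


def apply_correction(p_measured, factors):
    """Multiply each measured bin by its correction factor."""
    return [p * c for p, c in zip(p_measured, factors)]
```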

The binning of the multiplicity N at the particle level after the unfolding is made to be wider than at the detector level when \(N>3\), in order to avoid large negative bin-to-bin correlations. For those wide bins, the reported values of P(N) are defined as \(\sum {P(N)}/\Delta \), where \(\Delta \) is the number of distinct values of N included in the bin. In order to determine moments and the hadron entropy, the measured distribution is extrapolated within the wide bins using an exponentiated cubic spline. Correlations between the extrapolated values are taken into account. Bin centres of the wide bins are also reported.
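The definition of the reported value in a wide bin can be illustrated with a short sketch. The spline extrapolation within the bins is not reproduced here; the function name and the inclusive bin boundaries are assumptions made for illustration.

```python
def wide_bin_value(p, n_first, n_last):
    """Average P(N) over the wide bin covering N = n_first, ..., n_last.

    Implements sum(P(N)) / Delta, where Delta is the number of distinct
    values of N in the bin; p[n] holds P(N = n).
    """
    delta = n_last - n_first + 1
    return sum(p[n_first:n_last + 1]) / delta


# Toy distribution, not taken from the measurement.
p_fine = [0.1, 0.2, 0.3, 0.2, 0.1, 0.06, 0.04]
p_wide = wide_bin_value(p_fine, 4, 6)  # average over N = 4, 5, 6
```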

5 Systematic uncertainties

Systematic uncertainties are studied based on systematic variations on fully corrected results, where details of the variations are listed below. The systematic uncertainties are found to depend on N, \(\eta _{_{\text {lab}}}\), \(Q^2\) and y. They are observed to be similar in size for the HCM results as compared to those obtained in the laboratory frame. The following systematic uncertainty sources are considered in this analysis:

  • Sys. 1 – Radiative corrections: the difference in correction factors between the MC generators DJANGOH and RAPGAP is taken as the systematic uncertainty. It is found to be 1–2% for the P(N) distributions and up to 1% for their moments.

  • Sys. 2 – MC model dependence: the P(N) distributions and their moments are compared between results that are corrected by DJANGOH or RAPGAP. This leads to 1–4% systematic uncertainty on the multiplicity distributions and their moments.

  • Sys. 3 – SpaCal energy scale and angular resolution: a variation of 1% on the energy scale of the SpaCal [57] and \(1^{\circ }\) on the angular direction is considered. The combined systematic uncertainty on the distribution of P(N) and its moments is found to be 1–5% and 2%, respectively.

  • Sys. 4 – Hadronic energy scale: a variation of 2% on the energy scale of hadronic final-state objects [58] results in 1–3% systematic uncertainty on the multiplicity distribution and 2% on the moments.

  • Sys. 5 – Single track efficiency: a tracking efficiency uncertainty of 0.5% for central tracks and 10% for combined tracks is applied. This leads to 1–5% systematic uncertainty on the P(N) distributions, and 1% on the moments.

  • Sys. 6 – V0 particle contamination: a 50% uncertainty on tracks originating from \(K^{0}_{S}\) decays and a 0.5% uncertainty on tracks from photon conversions and Dalitz decays surviving the primary track selection result in 1–7% systematic uncertainty on the P(N) distributions and up to 2% on the moments.

  • Sys. 7 – Diffractive contributions: variation of the diffractive MC contribution by a factor of 2 results in 1–5% systematic uncertainty on the P(N) distributions and up to 1% on the moments.

  • Sys. 8 – Extrapolation: values of P(N) that are not explicitly measured are extrapolated, and a difference of 1% is found for the mean, variance, and entropy with respect to the generator value based on MC.

Systematic uncertainties associated with photoproduction background are negligible. A summary of systematic uncertainties as a function of the multiplicity N can be found in Table 2. The total systematic uncertainties are obtained by adding in quadrature all individual contributions. The tables of the results contain a complete breakdown of the systematic uncertainty contributions from different sources for each measured data point.
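Adding the individual contributions in quadrature amounts to the following calculation. The numerical inputs in the sketch are placeholders chosen for illustration and are not values from Table 2; the function name is hypothetical.

```python
import math


def total_systematic(contributions):
    """Combine independent relative uncertainties in quadrature."""
    return math.sqrt(sum(c * c for c in contributions))


# Placeholder relative uncertainties (1%, 2%, 2%), for illustration only.
total = total_systematic([0.01, 0.02, 0.02])  # 3% in total
```

This combination treats the sources as uncorrelated with each other.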

6 Results

6.1 Multiplicity distributions

The charged particle multiplicity distributions in ep DIS at \(\sqrt{s}=319\) GeV are measured in different bins of \(Q^2\) and y for particles with \(p_{_{\mathrm{T,lab}}}>150~\,\text{ MeV }\) and \(|\eta _{_{\mathrm{lab}}}|<1.6\) in the laboratory frame. The data are presented in Fig. 2, and numerical values are reported in the appendix. The data are compared with predictions from the DJANGOH, RAPGAP, and PYTHIA 8 event generators without simulation of QED radiation. The multiplicity distributions P(N) are found to broaden as y increases for fixed \(Q^2\). Since y can be related to the hadronic centre-of-mass energy W (cf. Eq. 3), the increase in multiplicity is qualitatively expected because more energy is available for hadronisation. On the other hand, the P(N) distributions are found to be almost independent of \(Q^2\) for fixed y. Qualitatively, the MC predictions describe the peak position of the multiplicity distributions well. However, they tend to underestimate the data both at low and high multiplicities, especially towards low \(Q^2\) and high y. Among all MC models considered, the PYTHIA 8 model gives the poorest description and peaks significantly below the data, especially at high y.

In Figs. 3, 4, 5 and 6 the charged particle multiplicity distributions P(N) are presented in four bins of \(Q^2\). In each figure, the P(N) distributions are shown differentially in bins of y (identical binning as in Fig. 2) and in three different ranges of \(\eta _{_{\text {lab}}}\). Numerical values are given in the appendix. The overlapping \(\eta _{_{\text {lab}}}\) ranges are chosen such that a pseudorapidity window of 1.4 units around the scattered parton direction in the leading order Quark Parton Model (QPM) picture can be selected. Similar to the measurements over the full \(\eta _{_{\text {lab}}}\) range, the P(N) distributions are found to broaden as y increases, independent of the considered \(\eta _{_{\text {lab}}}\) ranges. For fixed y, the P(N) distributions also broaden as \(\eta _{_{\text {lab}}}\) increases. Similar to the results presented in Fig. 2, the MC models underestimate the high multiplicity tail, where the deviation is found to be the strongest at low \(Q^2\).

Table 3 Mean multiplicity measured in ep DIS at \(\sqrt{s}=319\) GeV as a function of W. The measurement is repeated in four ranges of \(Q^2\) as indicated. The phase-space is further restricted as shown in Table 1
Table 4 Mean multiplicity measured in ep DIS at \(\sqrt{s}=319\) GeV as a function of W. The measurement is repeated in four ranges of   \(Q^2\)   as indicated. The phase-space is further restricted as shown in Table 1. For this dataset, the track pseudorapidities in the hadronic centre-of-mass frame are restricted to the range \(0<\eta ^* <4\)
Fig. 9
Second moment (variance) of the multiplicity distributions Var(N) as a function of W measured at \(\sqrt{s}=319\) GeV ep collisions (left) and with an additional restriction to the current hemisphere \(0<\eta ^{*}<4\) (right). Further phase space restrictions are given in Table 1. The corresponding \(\langle y \rangle \) is indicated by the top axis for each measured W. Predictions from the RAPGAP model are shown by dashed lines. The total uncertainty is represented by the error bar

Table 5 Variance of multiplicity distributions measured in ep DIS at \(\sqrt{s}=319\) GeV as a function of W. The measurement is repeated in four ranges of   \(Q^2\)   as indicated. The phase-space is further restricted as shown in Table 1
Table 6 Variance of multiplicity distributions measured in ep DIS at \(\sqrt{s}=319\) GeV as a function of W. The measurement is repeated in four ranges of   \(Q^2\)   as indicated. The phase-space is further restricted as shown in Table 1. For this dataset, the track pseudorapidities in the hadronic centre-of-mass frame are restricted to the range \(0<\eta ^* <4\)
Fig. 10
KNO function \(\Psi (z)\) as a function of z measured at \(\sqrt{s}=319\) GeV in ep collisions in bins of   \(Q^2\)   with an additional restriction to the current hemisphere \(0<\eta ^{*}<4\). Further phase space restrictions are given in Table 1. The total uncertainty is denoted by the error bars

In Fig. 7, the multiplicity distributions P(N) are presented with the additional restriction in the HCM frame \(0<\eta ^{*}<4\), mainly selecting particles originating from the current hemisphere. Numerical values are given in the appendix. Predictions obtained from DJANGOH, RAPGAP, and PYTHIA 8 are compared with data. Qualitatively, the comparisons are very similar to those of Fig. 2. For the quantities presented in the subsequent sections the differences between RAPGAP and DJANGOH are small, thus only RAPGAP is compared to the data.

6.2 Moments of multiplicity distributions

The mean multiplicity \(\langle N \rangle \) as a function of W is presented in Fig. 8 and Tables 3 and 4, for both the full phase space and with an additional restriction in \(\eta ^{*}\) (see Table 1). Predictions from the RAPGAP model are also shown. The average multiplicity \(\langle N \rangle \) rises with the hadronic centre-of-mass energy W, which is in agreement with previous observations [13, 14, 16, 59,60,61,62]. For higher \(Q^2\), the increase with W is faster than at lower \(Q^2\). For high W, the \(Q^2\) dependence is observed to be stronger in the full phase space than with the \(\eta ^{*}\) restriction to the current hemisphere. The RAPGAP prediction yields a reasonable description of the data. Only at high W and low \(Q^2\) does the MC tend to underestimate the data.

Similarly, in Fig. 9 and Tables 5 and 6, second moments of the multiplicity distributions (variances) are presented as a function of W, for both the full phase space and with the additional restriction in \(\eta ^{*}\). The variances rise strongly with W, almost independent of \(Q^2\) within uncertainties. The restriction to the current hemisphere has little impact on the variance. For \(Q^2>40~\mathrm {GeV}^2 \), RAPGAP describes the data reasonably well. However, towards high W, the MC not only underestimates the data but also shows a \(Q^2\) dependence, which is absent in data. This effect is more pronounced when restricting the analysis to the current hemisphere.

6.3 The KNO-scaling function \(\Psi (z)\)

In order to further study the multiplicity distribution, the KNO function \(\Psi (z)\) measured as a function of \(z=N/\langle N \rangle \) is shown in different bins of   \(Q^2\)   in Fig. 10. Numerical values are given in the appendix. The analysis is restricted to the current hemisphere (\(0<\eta ^{*}<4\)). In all   \(Q^2\)   bins, KNO scaling is observed, in broad agreement with many past experiments. However, in proton-proton collisions at LHC energies, violations of KNO scaling have been reported recently [61].

6.4 Entropy

It was recently suggested [5, 10] that the final-state hadron entropy \(S_{\mathrm{hadron}}\) calculated from charged particle multiplicity distributions might be related to the entanglement entropy of gluons \(S_{\mathrm{gluon}}\) at low \(x_{\mathrm{bj}}\) (Eq. 1). In Fig. 11 and Table 7, \(S_{\mathrm{hadron}}\) is studied as a function of \(\langle x_{\mathrm{bj}} \rangle \) in different \(Q^2\) bins. Similar to the observable studied in Ref. [10], a moving \(\eta _{_{\mathrm{lab}}}\) window depending on \(\langle x_{\mathrm{bj}} \rangle \) is chosen to match the rapidity of the scattered quark in a leading order QPM picture. The respective selected pseudorapidity window in the laboratory frame is indicated in Table 7. The hadron entropy observed in data is consistent with being constant in \(\langle x_{\mathrm{bj}} \rangle \), increasing only slightly with \(Q^2\). The same \(\langle x_{\mathrm{bj}} \rangle \) and \(Q^2\) dependence is also observed for RAPGAP, however with slightly smaller values of \(S_{\mathrm{hadron}}\). Predictions for the entanglement entropy \(S_{\mathrm{gluon}}\) based on the gluon density \(xG(x,Q^{2})\) are also shown at various \(Q^2\) values, which correspond to the lower boundaries of the \(Q^2\) bins in data. Here the PDF set HERAPDF 2.0 [63] at leading order is used. Neither the dependence on \(\langle x_{\mathrm{bj}} \rangle \) nor the magnitude agrees with data. Thus the prediction \(S_{\mathrm{hadron}}=S_{\mathrm{gluon}}\) (Eq. 1) is not confirmed by the present measurement.

To investigate further, the hadron entropy determined in the current hemisphere, \(0<\eta ^*<4\), is presented in Fig. 12 and Table 8. The hadron entropy based on multiplicity distributions is shown as a function of \(\langle x_{\mathrm{bj}} \rangle \) in different bins of \(Q^2\). In contrast to Fig. 11, no moving pseudorapidity range is applied. Independent of \(Q^2\), the measured hadron entropy, \(S_{\mathrm{hadron}}\), rises with decreasing \(\langle x_{\mathrm{bj}} \rangle \), with similar slopes in the different \(Q^2\) bins. Also shown are the predictions from RAPGAP, which closely follow the data at high \(Q^2\) but underestimate the data at low \(Q^2\). Since \(S_{\mathrm{hadron}}\) is calculated from the multiplicity distributions, this behaviour of the MC is related to what has been observed for the moments in Figs. 8 and 9. The predictions from the entanglement approach based on the gluon density again fail to describe \(S_\mathrm{hadron}\) in magnitude. However, at low \(Q^2\) the slope of \(S_\mathrm{gluon}\) has some similarities with that observed for \(S_\mathrm{hadron}\), while it becomes steeper than observed with increasing \(Q^2\).

7 Summary

The charged particle multiplicity distributions P(N) are measured in deep inelastic scattering at \(\sqrt{s}=319\) GeV using the H1 detector at HERA. The integrated luminosity used in this analysis is 136 pb\(^{-1}\), recorded in the years 2006 and 2007 in positron-proton scattering. The P(N) distributions are measured in bins of \(Q^2\), y and pseudorapidity \(\eta \). Predictions based on simulations of partonic tree level matrix elements, supplemented by parton shower and hadronisation, are generally found to be consistent with the measurement at low multiplicities, while they underestimate the data in the high multiplicity regions, especially at low \(Q^2\). For measurements of moments, the predictions generally describe the data well at low hadronic centre-of-mass energy W and high \(Q^2\) but less so at high W and low \(Q^2\). In addition, KNO scaling is observed within the kinematic phase space of the analysis.

Table 7 Hadron entropy derived from multiplicity distributions measured in ep DIS at \(\sqrt{s}=319\) GeV as a function of \(\langle x_{\mathrm{bj}} \rangle \). The measurement is repeated in four ranges of \(Q^2\) as indicated. The phase space is further restricted as shown in Table 1. For this dataset, the entropy is determined in a sliding window in pseudorapidity as indicated in the table
Table 8 Hadron entropy derived from multiplicity distributions measured in ep DIS at \(\sqrt{s}=319\) GeV as a function of \(\langle x_{\mathrm{bj}} \rangle \). The measurement is repeated in four ranges of \(Q^2\) as indicated. The phase space is further restricted as shown in Table 1. For this dataset, the entropy is determined with the track pseudorapidities in the hadronic centre-of-mass frame restricted to the range \(0<\eta ^* <4\)
Fig. 11

Hadron entropy \(S_{\mathrm{hadron}}\) derived from multiplicity distributions, reported as a function of \(\langle x_{\mathrm{bj}} \rangle \) in different \(Q^2\) ranges, measured in \(\sqrt{s}=319\) GeV ep collisions. For each \(\langle x_{\mathrm{bj}} \rangle \), the multiplicity is determined in a dedicated pseudorapidity window as discussed in the text. Further phase space restrictions are given in Table 1. Predictions for \(S_{\mathrm{hadron}}\) from the RAPGAP model and for the entanglement entropy \(S_{\mathrm{gluon}}\) based on an entanglement model are shown by the dashed lines and solid lines, respectively. For each \(Q^2\) range, the value of the lower boundary is used for predicting \(S_{\mathrm{gluon}}\). The total uncertainty on the data is represented by the error bars

Fig. 12

Hadron entropy \(S_{\mathrm{hadron}}\) derived from multiplicity distributions, reported as a function of \(\langle x_{\mathrm{bj}} \rangle \) in different \(Q^2\) ranges, measured in \(\sqrt{s}=319\) GeV ep collisions. Here, a restriction to the current hemisphere \(0<\eta ^{*}<4\) is applied. Further phase space restrictions are given in Table 1. Predictions for \(S_{\mathrm{hadron}}\) from the RAPGAP model and for the entanglement entropy \(S_{\mathrm{gluon}}\) based on an entanglement model are shown by the dashed lines and solid lines, respectively. For each \(Q^2\) range, the value of the lower boundary is used for predicting \(S_\mathrm{gluon}\). The total uncertainty on the data is represented by the error bars

The measurement of the charged particle multiplicity distributions is also used to test, for the first time, predictions based on quantum entanglement on sub-nucleonic scales in deep inelastic ep scattering. The predictions from the entropy of gluons are found to grossly disagree with the hadron entropy obtained from the multiplicity measurements presented here. The data therefore do not support the basic concept of equality of the parton and hadron entropy at the current level of theory development.

The measurements reported in this paper not only provide valuable information for a better understanding of particle production mechanisms, but also serve as an important testing ground for the development of new concepts, such as quantum entanglement at sub-nucleonic scales.