1 Introduction and motivation

This paper presents experimental results on event-by-event fluctuations of multiplicities and transverse momenta of charged particles produced in inelastic p+p interactions at 20, 31, 40, 80 and 158 \(\text{ GeV }\!/\!c\). The measurements were performed by the multi-purpose NA61/SHINE [1, 2] experiment at the CERN Super Proton Synchrotron (SPS). They are part of the strong interaction program devoted to the study of the properties of the onset of deconfinement and the search for the critical point of strongly interacting matter. Within this program a two-dimensional scan in collision energy and size of colliding nuclei is in progress. Data on p+p, Be+Be and Ar+Sc collisions have already been recorded, and data on p+Pb and Xe+La collisions will be registered within the coming years. The expected signal of a critical point is a non-monotonic dependence of various fluctuation measures in such a scan; for a recent review see Ref. [3].

The NA49 experiment [4] published results for central Pb+Pb collisions in the collision energy range 20A to 158\(A\,\text{ GeV }\), as well as for p+p, C+C and Si+Si reactions at 158\(A\,\text{ GeV }\). Multiplicity fluctuations were measured in terms of the scaled variance of the multiplicity distribution [5, 6]. Fluctuations of the transverse momentum of the particles were studied employing the measure \(\varPhi _{p_T}\) [7, 8] and, more recently, \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\) [9].

Also at SPS energies, results on event-by-event fluctuations of the mean transverse momentum in Pb+Au collisions were published by the CERES experiment [10]. For Pb+Pb collisions, the WA98 Collaboration published results on fluctuations of the charged particle multiplicity and transverse energy [11] as well as of the ratio of the charged to neutral pion multiplicity [12].

An interpretation of the experimental results on nucleus–nucleus collisions relies to a large extent on a comparison with the corresponding data on p+p and p+A interactions. However, the available data, in particular on fluctuations, are sparse. Suitable fluctuation measurements for p+p interactions only exist at 158 GeV/c beam momentum [5, 7]. Moreover, fluctuation measurements cannot be corrected in a model-independent manner for partial phase space acceptance. Thus all measurements of the scan should be performed in the same phase space region.

In nucleus–nucleus reactions the impact parameter of the collisions cannot be tightly controlled. This results in additional unwanted contributions to fluctuations, the effect of which needs to be suppressed by employing so-called strongly intensive measures. In addition to \(\varPhi _{p_T}\), two recently proposed strongly intensive quantities, \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\), are studied in this publication.

The paper is organized as follows. In Sect. 2 strongly intensive fluctuation measures are introduced. The experimental setup is presented in Sect. 3. Data processing and simulation and the analysis procedure are described in Sects. 4 and 5, respectively. Results are presented and discussed in Sect. 6 and compared to model calculations in Sect. 7. A summary and outlook in Sect. 8 close the paper.

Across this paper the pion rapidity is calculated in the collision center-of-mass system: \(y_{\pi } = {{\mathrm{atanh}}}(\beta _L)\), where \(\beta _L = p_L/E\) is the longitudinal (z) component of the velocity, \(p_L\) and E are pion longitudinal momentum and energy given in the collision center-of-mass system. The transverse component of the momentum is denoted \(p_T\) and the transverse mass \(m_T\) is defined as \(m_T = \sqrt{m_{\pi }^2 + (cp_T)^2}\), where \(m_{\pi }\) is the charged pion mass in \(\text{ GeV }\). The azimuthal angle \(\phi \) is the angle between transverse momentum vector and the horizontal (x) axis. The nucleon mass and collision energy per nucleon pair in the center-of-mass system are denoted \(m_\mathrm {N}\) and \(\sqrt{s_\mathrm {NN}}\), respectively.
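The kinematic definitions above can be illustrated with a short sketch. It is not taken from the NA61/SHINE software: the function names, the rounded mass values and the use of natural units (c = 1, so that \(cp_T\) is written as `p_t`) are assumptions made for this example. It evaluates the rapidity, the transverse mass and the standard fixed-target relation \(\sqrt{s_\mathrm {NN}} = \sqrt{2m_\mathrm {N}^2 + 2m_\mathrm {N}E_\mathrm {beam}}\) for the five beam momenta.

```python
import math

M_PION = 0.13957     # charged pion mass in GeV (rounded value, assumption)
M_NUCLEON = 0.93827  # proton mass in GeV, taken here as m_N (assumption)

def rapidity(p_l, p_t, mass=M_PION):
    """y = atanh(beta_L) with beta_L = p_L / E (natural units, c = 1)."""
    energy = math.sqrt(mass**2 + p_t**2 + p_l**2)
    return math.atanh(p_l / energy)

def transverse_mass(p_t, mass=M_PION):
    """m_T = sqrt(m^2 + p_T^2) in GeV."""
    return math.sqrt(mass**2 + p_t**2)

def sqrt_s_nn(p_beam, m_n=M_NUCLEON):
    """Center-of-mass energy per nucleon pair for a fixed-target collision."""
    e_beam = math.sqrt(p_beam**2 + m_n**2)
    return math.sqrt(2.0 * m_n**2 + 2.0 * m_n * e_beam)

def cms_rapidity_shift(p_beam, m_n=M_NUCLEON):
    """Rapidity of the center-of-mass system as seen in the laboratory."""
    e_beam = math.sqrt(p_beam**2 + m_n**2)
    return math.atanh(p_beam / (e_beam + m_n))

# Beam momenta of the energy scan in GeV/c.
for p_beam in (20, 31, 40, 80, 158):
    print(p_beam, round(sqrt_s_nn(p_beam), 2), round(cms_rapidity_shift(p_beam), 2))
```

A rapidity computed from laboratory momenta is moved to the center-of-mass system by subtracting `cms_rapidity_shift(p_beam)`.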

2 Fluctuation measures

2.1 Intensive fluctuation measures

Event quantities are called intensive if they do not depend on the volume of the system within the grand canonical ensemble of statistical mechanics. Examples are the mean transverse momentum of particles or ratios of particle numbers in the events. In contrast, so-called extensive quantities (for example the mean multiplicity or the variance of the multiplicity distribution) are proportional to the system volume. Within the Wounded Nucleon Model [13] intensive quantities are those which are independent of the number of wounded nucleons, and extensive ones those which are proportional to the number of wounded nucleons. The ratio of two extensive quantities is an intensive quantity [14]. Therefore, the scaled variance of a quantity A

$$\begin{aligned} \omega [A] = \frac{\mathrm{Var}(A)}{\langle A \rangle } = \frac{\langle A^2 \rangle - \langle A \rangle ^2}{\langle A \rangle } \end{aligned}$$
(1)

is an intensive measure. In fact, due to its intensity property, the scaled variance (\(\omega [N]\)) of the distribution of multiplicity N in the events is widely used to quantify multiplicity fluctuations in high-energy heavy-ion experiments.

The scaled variance assumes the value \(\omega [N] = 0\) for \(N = \mathrm{const}.\) and \(\omega [N] = 1\) for a Poisson multiplicity distribution.
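As an illustration of Eq. (1), the following minimal sketch computes the scaled variance of a list of event multiplicities and checks the two limiting cases quoted above. The helper names and the simple Poisson sampler (Knuth's multiplication method) are choices made for this example only.

```python
import math
import random

def scaled_variance(values):
    """omega[A] = (<A^2> - <A>^2) / <A>, Eq. (1)."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    return (mean_sq - mean * mean) / mean

def poisson_sample(rng, lam):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(1)
constant = [5] * 1000                                      # N = const.
poisson = [poisson_sample(rng, 4.0) for _ in range(20000)]  # Poisson N

print(scaled_variance(constant))  # exactly zero
print(scaled_variance(poisson))   # close to one, up to statistical noise
```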

2.2 Strongly intensive fluctuation measures

Unfortunately, the volume of the matter produced in heavy-ion collisions cannot be fixed and changes significantly from one event to another. Therefore, it is very important to be able to measure the properties of the created matter independently of its volume fluctuations. The quantities which allow this are called strongly intensive measures. They depend neither on the volume nor on the fluctuations of the volume. Ratios of mean multiplicities are both intensive and strongly intensive measures. The situation is, however, much more difficult for the analysis of fluctuations. For example the scaled variance is an intensive but not strongly intensive measure.

It was shown in Ref. [14] that, for certain combinations of scaled variances, terms dependent on the volume fluctuations cancel out. There are at least two families of strongly intensive measures of two fluctuating extensive quantities A and B:

$$\begin{aligned} \varDelta [A,B]= & {} \frac{1}{C_{\varDelta }} \biggl [ \langle B \rangle \omega [A] - \langle A \rangle \omega [B] \biggr ] \end{aligned}$$
(2)
$$\begin{aligned} \varSigma [A,B]= & {} \frac{1}{C_{\varSigma }} \biggl [ \langle B \rangle \omega [A] + \langle A \rangle \omega [B] - 2 \bigl ( \langle AB \rangle - \langle A \rangle \langle B \rangle \bigr ) \biggr ]. \end{aligned}$$
(3)

For the study of transverse momentum fluctuations one uses

$$\begin{aligned} A = P_{T} = \sum \limits _{i=1}^{N} p_{T_{i}}, \qquad B = N , \end{aligned}$$

where \(p_{T_{i}}\) is the modulus of the transverse momentum of particle i.

There is an important difference between \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\). Only the first two moments: \(\langle P_{T} \rangle \), \(\langle N \rangle \), and \(\langle P_{T}^2 \rangle \), \(\langle N^2 \rangle \) are required to calculate \(\varDelta [P_{T},N]\), whereas \(\varSigma [P_{T},N]\) includes the correlation term \(\langle P_{T}N \rangle - \langle P_{T} \rangle \langle N \rangle \). Thus \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\) can be sensitive to various physics effects in different ways. In Ref. [14] strongly intensive quantities including the correlation term are named the \(\varSigma \) family, and those based only on mean values and variances the \(\varDelta \) family.

Historically, the first proposed strongly intensive fluctuation measure was \(\varPhi \) [15]. When applied to transverse momentum fluctuations the measure is called \(\varPhi _{p_T}\). It has already been used extensively by the NA49 experiment [7, 8]. The measure is a member of the \(\varSigma \) family:

$$\begin{aligned} \varPhi _{p_T} = \sqrt{\overline{p_T} \omega [p_T]} \left[ \sqrt{\varSigma [P_{T},N]}-1\right] , \end{aligned}$$
(4)

where \(\overline{p_T}\) and \(\omega [p_T]\) denote the average and scaled variance of the inclusive \(p_T\) distribution.

With the normalization proposed in Ref. [16],

$$\begin{aligned} C_{\varDelta } = C_{\varSigma } = \langle N \rangle \omega [p_{T}], \end{aligned}$$
(5)

these measures are dimensionless and have a common scale required for a quantitative comparison of fluctuations of different, in general dimensional, extensive quantities. More precisely, the values of \(\varDelta \) and \(\varSigma \) are equal to zero in the absence of event-by-event fluctuations (\(N = \mathrm{const}.\), \(P_{T} = \mathrm{const}.\)) and equal to one for fluctuations given by the independent particle production model (IPM) [16, 17]. The values of \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\) have already been determined in several models. The results of the IPM, the Model of Independent Sources (MIS), source-by-source temperature fluctuations (example of MIS), event-by-event (global) temperature fluctuations, correlation between average \(p_T\) per event and its multiplicity were published in Ref. [17]. The effects of acceptance losses, efficiency losses, quantum (Bose–Einstein and Fermi–Dirac) statistics, and centrality dependence (UrQMD) were investigated in Ref. [18]. Finally, the system size and energy dependence in the UrQMD model was studied in Ref. [16]. One of the conclusions (supported by the UrQMD calculations) is that the \(\varDelta [P_{T},N]\), \(\varSigma [P_{T},N]\), and \(\varPhi _{p_T}\) quantities measure deviations from the superposition model in different ways. Therefore, in the analysis of experimental data a simultaneous measurement of all three quantities is highly desirable.
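The definitions in Eqs. (2)–(5) can be checked numerically with a small sketch. It is an illustration under stated assumptions, not the analysis code of the experiment: the function name, the toy event generator (uniformly distributed multiplicity, exponential \(p_T\) spectrum with a mean of 0.3 \(\text{ GeV }\!/\!c\)) and the clamping of \(\varSigma \) before the square root are all choices made here. Because multiplicities and momenta are drawn independently, the toy sample realizes the IPM, and \(\varDelta \) and \(\varSigma \) should come out close to one and \(\varPhi _{p_T}\) close to zero.

```python
import math
import random

def fluctuation_measures(events):
    """Return (Delta, Sigma, Phi) for a list of events.

    Each event is a list of p_T values (GeV/c) of its accepted particles.
    """
    n_ev = len(events)

    def mean(values):
        return sum(values) / n_ev

    n = [len(ev) for ev in events]                    # N
    pt = [sum(ev) for ev in events]                   # P_T
    pt2 = [sum(p * p for p in ev) for ev in events]   # P_T,2

    mean_n, mean_pt = mean(n), mean(pt)
    omega_n = (mean([v * v for v in n]) - mean_n**2) / mean_n
    omega_pt_event = (mean([v * v for v in pt]) - mean_pt**2) / mean_pt
    cov = mean([a * b for a, b in zip(pt, n)]) - mean_pt * mean_n

    # Inclusive p_T distribution, expressed through event quantities only.
    pt_bar = mean_pt / mean_n
    omega_pt_incl = (mean(pt2) / mean_n - pt_bar**2) / pt_bar

    c_norm = mean_n * omega_pt_incl                   # Eq. (5)
    delta = (mean_n * omega_pt_event - mean_pt * omega_n) / c_norm              # Eq. (2)
    sigma = (mean_n * omega_pt_event + mean_pt * omega_n - 2.0 * cov) / c_norm  # Eq. (3)
    # The clamp only guards against tiny negative round-off values;
    # a genuinely negative Sigma would leave Phi undefined.
    phi = math.sqrt(pt_bar * omega_pt_incl) * (math.sqrt(max(sigma, 0.0)) - 1.0)  # Eq. (4)
    return delta, sigma, phi

# Toy IPM sample: multiplicity and momenta are generated independently.
rng = random.Random(12345)
ipm_events = [[rng.expovariate(1.0 / 0.3) for _ in range(rng.randint(1, 8))]
              for _ in range(50000)]
delta, sigma, phi = fluctuation_measures(ipm_events)
print(f"Delta = {delta:.3f}, Sigma = {sigma:.3f}, Phi = {phi * 1000:.2f} MeV/c")
```

For a sample of identical events (\(N = \mathrm{const}.\), \(P_{T} = \mathrm{const}.\)) the same function returns \(\varDelta = \varSigma = 0\) up to round-off.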

A comparison of the properties of these three measures within the IPM and MIS models is shown in Table 1. If one finds, e.g. \(\varPhi _{p_T}=10\) \(\text{ MeV }/c\), one does not know whether this is a large or a small effect, especially when the magnitudes of \(\varPhi _{p_T}\) from several “trivial” effects (Bose–Einstein statistics, resonance decays, etc.) are not estimated. The situation is, however, different for \(\varSigma [P_{T},N]\). If one measures, for example, \(\varSigma [P_{T},N]=1.1\) this means that (for this specific combination of moments) one measures a 10% deviation from the IPM (fluctuations are 10% larger than in the IPM). Therefore, the new measures \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\) have the advantages of \(\omega [N]\) but they also preserve the advantage of \(\varPhi _{p_T}\), i.e. they are strongly intensive measures of fluctuations.

Table 1 Properties of \(\varPhi _{p_T}\), \(\varDelta [P_{T},N]\), and \(\varSigma [P_{T},N]\) in the absence of fluctuations, and in the independent particle model (IPM) and the model of independent sources (MIS) (\(N_S\) denotes the number of sources)

3 Experimental facility

3.1 The NA61/SHINE detector

The NA61/SHINE experimental facility [2] consists of a large acceptance hadron spectrometer located in the CERN North Area Hall 887 (EHN1) and the H2 beam-line to which beams accelerated in the CERN accelerator complex are delivered from the Super Proton Synchrotron (SPS). The schematic layout of the NA61/SHINE detector is shown in Fig. 1.

Fig. 1 The schematic layout of the NA61/SHINE spectrometer (horizontal cut, not to scale). The beam and trigger detector configuration used for data taking in 2009 is shown in the inset. The chosen coordinate system is drawn on the lower left; its origin lies in the middle of VTPC-2, on the beam axis. The nominal beam direction is along the z axis. The magnetic field bends charged particle trajectories in the xz (horizontal) plane. Positively charged particles are bent towards the top of the plot. The drift direction in the TPCs is along the y (vertical) axis

A set of scintillation and Cherenkov counters as well as beam position detectors (BPDs) upstream of the spectrometer provide timing reference, identification and position measurements of incoming beam particles. Trajectories of individual beam particles were measured in a telescope of beam position detectors placed along the beam line (BPD-1/2/3 in Fig. 1). These counters are small (\(4.8\times 4.8\) cm\(^2\)) proportional chambers with cathode strip readout, providing a resolution of about 100 \(\upmu \)m in two orthogonal directions. Due to properties of the H2 beam line both the beam width and the divergence at the NA61/SHINE target increase with decreasing beam momentum. The trigger scintillator counter S4 placed downstream of the target is used to select events with collisions in the target area. The liquid hydrogen target as well as the proton beams and triggers are described in Sects. 3.2 and 3.3, respectively.

The main tracking devices of the spectrometer are four large volume Time Projection Chambers (TPCs). Two of them, the vertex TPCs (VTPC-1 and VTPC-2 in Fig. 1), are located in the magnetic fields of two super-conducting dipole magnets with a maximum combined bending power of 9 Tm, which corresponds to about 1.5 T and 1.1 T fields in the upstream and downstream magnets, respectively. In order to optimize the acceptance of the detector at each collision momentum, the field in both magnets was adjusted proportionally to the beam momentum.

Two large TPCs (MTPC-L and MTPC-R) are positioned downstream of the magnets symmetrically to the beam line. A fifth, small TPC (GAP-TPC) is placed between VTPC-1 and VTPC-2 directly on the beam line. It closes the gap along the beam axis between the sensitive volumes of the other TPCs.

The TPCs are filled with Ar:CO\(_2\) gas mixtures in proportions 90:10 for the VTPCs and the GAP-TPC, and 95:5 for the MTPCs.

The particle identification capability of the TPCs based on measurements of the specific energy loss, \(\mathrm{d}E\!/\!\mathrm{d}x\), is augmented by time-of-flight measurements using Time-of-Flight (ToF) detectors. The high resolution forward calorimeter, the Projectile Spectator Detector (PSD), measures energy flow around the beam direction, which in nucleus–nucleus collisions is primarily given by the projectile spectators.

The results presented in this paper were obtained using information from the TPCs, the Beam Position Detectors, and the beam and trigger counters.

3.2 Target

NA61/SHINE uses various solid nuclear targets and a liquid hydrogen target (see Sect. 3.3 for details). For data taking on p+p interactions a liquid hydrogen target of 20.29 cm length (2.8% interaction length) and 3 cm diameter was placed 88.4 cm upstream of VTPC-1. The Liquid Hydrogen Target facility (LHT) filled the target cell with para-hydrogen obtained in a closed-loop liquefaction system which was operated at 75 mbar overpressure with respect to the atmosphere. At the atmospheric pressure of 965 mbar the liquid hydrogen density is \(\rho _\mathrm {LH} = 0.07\) g/cm\(^3\).

Data taking with inserted (I) and removed (R) liquid hydrogen (LH) in the LHT was alternated in order to calculate a data-based correction for interactions with the material surrounding the liquid hydrogen.

3.3 Beams and triggers

Secondary beams of positively charged hadrons at 20, 31, 40, 80 and 158 \(\text{ GeV }\!/\!c\) were produced from 400 \(\text{ GeV }\!/\!c\) protons extracted from the SPS onto a beryllium target in a slow extraction mode with a flat-top of 10 s. The secondary beam momentum and intensity were adjusted by proper setting of the H2 beam-line magnet currents and collimators. The beam was transported along the H2 beam-line towards the experiment. The precision of the setting of the beam magnet currents was approximately 0.5%. This was verified by a direct measurement of the beam momentum at 31 \(\text{ GeV }\!/\!c\) by bending the incoming beam particles into the TPCs with the maximum magnetic field [19]. Selected beam properties are given in Table 2.

The setup of the beam detectors is illustrated in the inset of Fig. 1. Protons from the secondary hadron beam were identified by two Cherenkov counters, a CEDAR (either CEDAR-W or CEDAR-N) and a threshold counter (THC). The CEDAR counter, using a coincidence of six out of the eight photo-multipliers placed radially along the Cherenkov ring, provided positive identification of protons, while the THC, operated at a pressure lower than the proton threshold, was used in anti-coincidence in the trigger logic. Due to their limited range of operation two different CEDAR counters were employed, namely for beams at 20, 31, and 40 \(\text{ GeV }\!/\!c\) the CEDAR-W counter and for beams at 80 and 158 \(\text{ GeV }\!/\!c\) the CEDAR-N counter. The threshold counter was used for 20, 31, and 40 \(\text{ GeV }\!/\!c\) beams. A selection based on signals from the Cherenkov counters allowed one to identify beam protons with a purity of about 99%. A consistent value for the purity was found by bending the 30.1 \(\text{ GeV }\!/\!c\) beam into the TPCs with the full magnetic field and using the \(\mathrm{d}E\!/\!\mathrm{d}x\) identification method. The fraction of protons in the beams is given in Table 2.

Table 2 Basic properties of the beam used in the study of p+p interactions at 20, 31, 40, 80, and 158 \(\text{ GeV }\!/\!c\). The first column gives the beam momentum. The second and third columns list typical numbers of beam particles at NA61/SHINE per spill (about 10 s) and the fraction of protons in the beam, respectively

Two scintillation counters, S1 and S2, provided beam definition, together with the three veto counters V0, V1 and V1\(^\mathrm {p}\) with a 1 cm diameter hole, which defined the beam before the target. The S1 counter also provided the timing (start time for the gating of all counters). Beam protons were then selected by the coincidence:

$$\begin{aligned} T_\mathrm{beam} = \text {S1} \wedge \text {S2} \wedge \overline{\text {V0}} \wedge \overline{\text {V1}} \wedge \overline{\text {V1}^\mathrm {p}} \wedge \text {CEDAR} \wedge \overline{\text {THC}}. \end{aligned}$$
(6)

The interaction trigger (\(T_\mathrm{int}\)) was provided by the anti-coincidence of the incoming proton beam and a scintillation counter S4 (\(T_\mathrm{int} = T_\mathrm{beam} \wedge \overline{\text {S4}}\)). The S4 counter, with a diameter of 2 cm, was placed between the VTPC-1 and VTPC-2 detectors along the beam trajectory at about 3.7 m from the target; see Fig. 1. A large fraction of beam protons that interact in the target does not reach S4. The interaction and beam triggers were run simultaneously. The beam trigger events were recorded with a frequency about 10 times lower than that of the interaction trigger events.
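The trigger logic of Eq. (6) and the definition of \(T_\mathrm{int}\) can be written as two Boolean functions. This is only a sketch of the logic: the class `CounterSignals` and its field names are hypothetical, and the hardware implementation is not described here. For the 80 and 158 \(\text{ GeV }\!/\!c\) beams, where the threshold counter was not used, the `thc` flag would simply stay `False`.

```python
from dataclasses import dataclass

@dataclass
class CounterSignals:
    """Fired / not fired flags of the beam and trigger counters (hypothetical record)."""
    s1: bool
    s2: bool
    v0: bool
    v1: bool
    v1p: bool
    cedar: bool
    thc: bool = False
    s4: bool = False

def beam_trigger(c: CounterSignals) -> bool:
    """T_beam = S1 and S2 and not V0 and not V1 and not V1p and CEDAR and not THC, Eq. (6)."""
    return (c.s1 and c.s2 and not c.v0 and not c.v1 and not c.v1p
            and c.cedar and not c.thc)

def interaction_trigger(c: CounterSignals) -> bool:
    """T_int = T_beam and not S4: the beam proton did not reach S4."""
    return beam_trigger(c) and not c.s4

good_beam = CounterSignals(s1=True, s2=True, v0=False, v1=False, v1p=False, cedar=True)
print(beam_trigger(good_beam), interaction_trigger(good_beam))
```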

4 Data processing and simulation

Detector parameters were optimized by a data-based calibration procedure which also took into account their time dependence; for details see Refs. [20, 21].

The main steps of the data reconstruction procedure were:

  (i) cluster finding in the TPC raw data, calculation of the cluster center of gravity and total charge,
  (ii) reconstruction of local track segments in each TPC separately,
  (iii) matching of track segments into global tracks,
  (iv) track fitting through the magnetic field and determination of track parameters at the first measured TPC cluster,
  (v) determination of the interaction vertex using the beam trajectory (x and y coordinates) fitted in the BPDs and the trajectories of tracks reconstructed in the TPCs (z coordinate),
  (vi) refitting the particle trajectory using the interaction vertex as an additional point and determining the particle momentum at the interaction vertex,
  (vii) matching of ToF hits with the TPC tracks.

The accuracy of the transverse position of the main vertex is given by the resolution of the BPDs (\(\approx \)100 \(\upmu \)m). The resolution of the longitudinal position determination is given by the TPC track reconstruction procedure and depends on the track multiplicity and magnetic field. For inelastic p+p interactions at 158 \(\text{ GeV }\!/\!c\) it is about 2 cm.

A simulation of the NA61/SHINE detector response was used to correct the reconstructed data. Several MC models were compared with the NA61/SHINE results on p+p, p+C and \(\pi \)+C interactions: FLUKA2008, URQMD1.3.1, VENUS4.12, EPOS1.99, GHEISHA2002, QGSJetII-3 and Sibyll2.1 [19, 22]. Based on these comparisons, and taking into account continuous support and documentation from the developers, the EPOS model was selected for the MC simulation. The simulation consisted of the following steps:

  (i) generation of inelastic p+p interactions using the EPOS model,
  (ii) propagation of outgoing particles through the detector material using the GEANT 3.21 package which takes into account the magnetic field as well as relevant physics processes, such as particle interactions and decays,
  (iii) simulation of the detector response using dedicated NA61/SHINE packages which simulate charge clusters in the TPCs and introduce distortions corresponding to all corrections applied to the real data,
  (iv) simulation of the interaction trigger selection by checking whether a charged particle hits the S4 counter, see Sect. 3.3,
  (v) storage of the simulated events in a file which has the same format as the raw data,
  (vi) reconstruction of the simulated events with the same reconstruction chain as used for the real data,
  (vii) matching of the reconstructed to the simulated tracks based on the cluster positions.

It should be underlined that only inelastic p+p interactions in the hydrogen in the target cell were simulated and reconstructed. Thus the Monte Carlo-based corrections (see Sect. 5) can be applied only for inelastic events. The contribution of elastic events is removed by the event selection cuts (see Sect. 5.1), whereas the contribution of off-target interactions is subtracted based on the data (see Sect. 5.4).

5 Analysis procedure

The analysis procedure consisted of the following steps:

  (i) applying event and track selection criteria,
  (ii) evaluation of the moments of distributions of quantities needed to calculate fluctuations (Eqs. 1, 2, 3, 4),
  (iii) evaluation of corrections to the moments based on experimental data and simulations,
  (iv) calculation of the corrected fluctuations.

Corrections for the following biases were evaluated and applied:

  (i) contribution of off-target interactions,
  (ii) losses of inelastic p+p interactions due to the trigger and the event and track selection criteria,
  (iii) contribution of particles other than primary charged hadrons,
  (iv) losses of primary charged hadrons due to the track selection criteria.

The final results refer to charged hadrons produced in the analysis acceptance in inelastic proton–proton interactions at 20, 31, 40, 80, and 158 \(\text{ GeV }\!/\!c\) beam momenta. Products of electromagnetic decays are included. Products of weak decays and secondary interactions among the tracks satisfying the selection criteria are corrected for. The result is referred to as accepted primary hadrons.

The list of analyzed data sets together with statistics of all recorded and selected events in target inserted and target removed configurations is presented in Table 3.

Table 3 Data sets together with the statistics of events recorded and selected for the analysis in target inserted and target removed configurations

5.1 Event selection criteria

The following event selection criteria were applied to the events recorded with the interaction trigger (Table 4):

  (i) no off-time beam particle was detected within \(\pm 1.5\ \upmu \)s around the trigger particle,
  (ii) the beam particle trajectory was measured in BPD-3 and at least one of BPD-1 or BPD-2 detectors,
  (iii) there was at least one track reconstructed in the TPCs and fitted to the interaction vertex,
  (iv) events with a single, well-measured positively charged track with absolute momentum close to the beam momentum (\(p > p_\mathrm{beam} -1~\) \(\text{ GeV }\!/\!c\)) were rejected,
  (v) the vertex z position (fitted using the beam and TPC tracks) was not farther away than 50 cm from the center of the LHT.

The off-line (listed above) and on-line (the interaction trigger condition; see Sect. 3.3) event cuts select well-measured (cuts (i) and (ii)) inelastic p+p interactions. The background due to elastic interactions is removed (cuts (iii) and (iv)). The contribution of off-target interactions is reduced (cut (v)) and subsequently subtracted using data recorded in the target removed configuration (see Sect. 5.4). The losses of inelastic interactions due to the event selection procedure were corrected using a simulation (see Sect. 5.5).
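The off-line event cuts (i)–(v) can be summarized as a single predicate. The sketch below is illustrative only: the `Event` record, its field names and the parameter `z_target_center` are hypothetical, and the notion of a "well-measured" track in cut (iv) is reduced here to the track being fitted to the interaction vertex.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Event:
    """Minimal hypothetical event record used only for this illustration."""
    has_offtime_beam: bool                    # off-time beam particle within +-1.5 us
    bpd_measured: Tuple[bool, bool, bool]     # trajectory measured in BPD-1, BPD-2, BPD-3
    vertex_tracks: List[Tuple[int, float]]    # (charge, momentum in GeV/c) of vertex tracks
    vertex_z: float                           # fitted vertex z position in cm

def passes_event_selection(ev, p_beam, z_target_center, max_dz=50.0):
    """Apply event cuts (i)-(v); returns True if the event is kept."""
    bpd1, bpd2, bpd3 = ev.bpd_measured
    if ev.has_offtime_beam:                          # cut (i)
        return False
    if not (bpd3 and (bpd1 or bpd2)):                # cut (ii)
        return False
    if len(ev.vertex_tracks) < 1:                    # cut (iii)
        return False
    if len(ev.vertex_tracks) == 1:                   # cut (iv): elastic-like topology
        charge, momentum = ev.vertex_tracks[0]
        if charge > 0 and momentum > p_beam - 1.0:
            return False
    if abs(ev.vertex_z - z_target_center) > max_dz:  # cut (v)
        return False
    return True
```

The target position is passed as an argument because its value in the detector coordinate system is not quoted in this section.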

Table 4 Summary of event and track selection criteria used in the analysis

5.2 Track selection criteria

In order to select well-measured tracks of primary charged hadrons as well as to reduce the contamination of tracks from secondary interactions, weak decays and off-time interactions, the following track selection criteria were applied (Table 4):

  (i) the track momentum fit at the interaction vertex should have converged,
  (ii) the total number of reconstructed points on the track should be greater than 30,
  (iii) the sum of the number of reconstructed points in VTPC-1 and VTPC-2 should be greater than 15 or the number of reconstructed points in the GAP-TPC should be greater than 5,
  (iv) the distance between the track extrapolated to the interaction plane and the interaction point (impact parameter) should be smaller than 4 cm in the horizontal (bending) plane and 2 cm in the vertical (drift) plane,
  (v) the track should be measured in a high (\({\ge }90\%\)) TPC acceptance and tracking efficiency region (see Sect. 5.3),
  (vi) tracks with energy loss and total momentum values characteristic for electrons were rejected,
  (vii) the transverse momentum was required to be less than 1.5 \(\text{ GeV }\!/\!c\).
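The track cuts (i)–(vii) can likewise be expressed as one predicate. As before, this is a sketch under assumptions: the `Track` record and its field names are hypothetical, and the acceptance and electron flags are taken as precomputed inputs.

```python
from dataclasses import dataclass

@dataclass
class Track:
    """Minimal hypothetical track record used only for this illustration."""
    fit_converged: bool
    n_points_vtpc1: int
    n_points_vtpc2: int
    n_points_gap: int
    n_points_total: int
    b_x: float             # impact parameter in the bending plane, cm
    b_y: float             # impact parameter in the drift plane, cm
    in_acceptance: bool    # lies in a high-efficiency bin of the acceptance map
    electron_like: bool    # dE/dx and momentum characteristic for electrons
    p_t: float             # transverse momentum in GeV/c

def passes_track_selection(t: Track) -> bool:
    """Apply track cuts (i)-(vii); returns True if the track is kept."""
    return (
        t.fit_converged                                                    # (i)
        and t.n_points_total > 30                                          # (ii)
        and (t.n_points_vtpc1 + t.n_points_vtpc2 > 15 or t.n_points_gap > 5)  # (iii)
        and abs(t.b_x) < 4.0 and abs(t.b_y) < 2.0                          # (iv)
        and t.in_acceptance                                                # (v)
        and not t.electron_like                                            # (vi)
        and t.p_t < 1.5                                                    # (vii)
    )
```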

5.3 Determination of the analysis kinematical acceptance

The detection and reconstruction inefficiencies were corrected using the simulation. However, in order to limit the impact of possible inaccuracies of this simulation, only regions were accepted where the reconstruction efficiency (defined as the ratio of the number of reconstructed and matched Monte Carlo tracks passing the track selection criteria to the number of generated tracks) is greater than 90%. These regions were identified using a separate, statistically independent simulation in three-dimensional bins of rapidity, azimuthal angle and transverse momentum. The result is stored in the form of three-dimensional tables [23], in which 0 marks bins excluded from the acceptance and 1 marks bins that are included. The population of charged particles within this acceptance is shown in Fig. 2 for 20 and 158 \(\text{ GeV }\!/\!c\) p+p interactions.
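The construction of such a 0/1 acceptance table can be sketched as follows. The bin widths, the use of degrees for the azimuthal angle and the dictionary representation are assumptions made for this example; the published tables [23] define the actual binning.

```python
import math
from collections import Counter

def bin_index(y, phi, p_t, dy=0.2, dphi=10.0, dpt=0.1):
    """Map (rapidity, azimuthal angle in degrees, p_T in GeV/c) to a 3D bin index.

    The bin widths are hypothetical placeholders.
    """
    return (math.floor(y / dy), math.floor(phi / dphi), math.floor(p_t / dpt))

def build_acceptance(generated, reconstructed, threshold=0.9):
    """Return {bin: 1 or 0}: 1 where matched-reconstructed / generated > threshold."""
    n_gen = Counter(bin_index(*trk) for trk in generated)
    n_rec = Counter(bin_index(*trk) for trk in reconstructed)
    return {b: 1 if n_rec[b] / n > threshold else 0 for b, n in n_gen.items()}

def is_accepted(table, y, phi, p_t):
    """Bins never populated in the simulation are treated as excluded."""
    return table.get(bin_index(y, phi, p_t), 0) == 1

# Toy input: one fully efficient bin and one bin with 80% efficiency.
generated = [(0.1, 5.0, 0.25)] * 10 + [(1.1, 5.0, 0.25)] * 10
matched = [(0.1, 5.0, 0.25)] * 10 + [(1.1, 5.0, 0.25)] * 8
table = build_acceptance(generated, matched)
print(is_accepted(table, 0.1, 5.0, 0.25), is_accepted(table, 1.1, 5.0, 0.25))
```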

Fig. 2 Population of all charged hadrons in the analysis acceptance used in this paper to study transverse momentum and multiplicity fluctuations: the two top plots refer to 20 \(\text{ GeV }\!/\!c\) and the two bottom plots to 158 \(\text{ GeV }\!/\!c\) p+p interactions

5.4 Data-based correction for off-target interactions

The event quantities used to derive final fluctuation measures were calculated for events recorded in the LH filled (target inserted, I) and removed (target removed, R) configurations. The latter data set represents interactions with material downstream and upstream of the liquid hydrogen (off-target interactions). Then, in the absence of other corrections, the corrected mean value of the distribution of any quantity (denoted X) was calculated as

$$\begin{aligned} \langle X \rangle = \frac{1}{N_\mathrm{ev}^{I} - \varepsilon \cdot N_\mathrm{ev}^{R}} \left( \sum \limits _{i=1}^{N_\mathrm{ev}^{I}} X_{i}^{I} - \varepsilon \cdot \sum \limits _{j=1}^{N_\mathrm{ev}^{R}} X_{j}^{R} \right) , \end{aligned}$$
(7)

where \(N_\mathrm{ev}\) denotes the number of events and \(\varepsilon \) is a normalization factor. The value of \(\varepsilon \) was derived based on the distribution of the fitted z coordinate of the interaction vertex. All vertices far away from the target originate from interactions with the beam line and detector materials. Neglecting the beam attenuation in the target one gets

$$\begin{aligned} \varepsilon = \frac{N_\mathrm{ev}^{I}}{N_\mathrm{ev}^{R}}\bigg |_{z>-450\mathrm {\ cm}}. \end{aligned}$$
(8)

Examples of distributions of the z coordinate of the reconstructed interaction vertex for events recorded with the liquid hydrogen inserted and removed are shown in Fig. 3.
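Equations (7) and (8) translate directly into code. The sketch below uses hypothetical function names and toy numbers: the normalization factor is obtained by counting vertices with \(z > -450\) cm in both configurations, and the target removed sample, scaled by \(\varepsilon \), is subtracted from the target inserted one.

```python
def normalization_factor(z_inserted, z_removed, z_min=-450.0):
    """epsilon = N_ev^I / N_ev^R counted for vertices with z > z_min (cm), Eq. (8)."""
    n_i = sum(1 for z in z_inserted if z > z_min)
    n_r = sum(1 for z in z_removed if z > z_min)
    return n_i / n_r

def off_target_corrected_mean(x_inserted, x_removed, eps):
    """<X> corrected for off-target interactions, Eq. (7)."""
    numerator = sum(x_inserted) - eps * sum(x_removed)
    denominator = len(x_inserted) - eps * len(x_removed)
    return numerator / denominator

# Toy example: 90 target events with X = 2 and 10 off-target events with X = 5
# in the inserted sample, 5 off-target events in the removed sample, eps = 2.
x_ins = [2.0] * 90 + [5.0] * 10
x_rem = [5.0] * 5
print(off_target_corrected_mean(x_ins, x_rem, eps=2.0))  # recovers the target-only mean
```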

Fig. 3 Distributions of the z coordinate of the reconstructed interaction vertex for events recorded with the target inserted (I) and removed (R). The target removed distribution was normalized to the target inserted one in the region \(z > -450\) cm

5.5 Simulation-based correction for other biases

The correction for losses due to event and track selections, reconstruction inefficiency and the interaction trigger, as well as for background of non-primary charged hadrons was calculated using the EPOS1.99 [24] event generator. The simulated data were reconstructed with the standard NA61/SHINE procedure. Tables of correction factors were calculated as the ratio of generated to reconstructed tracks. The reconstructed tracks were required to pass the event and track selection criteria. The corrections were obtained in bins of N, \(P_{T}\) and \(P_{T,2} = \sum \nolimits _{i=1}^{N} p^{2}_{T_{i}}\) for positively, negatively and all charged hadrons, separately. The event quantity \(P_{T,2}\) is needed to calculate \(\omega [p_T]\) using only event quantities. \(\omega [p_T]\) appears in the normalization factors \(C_{\varDelta }\) and \(C_{\varSigma }\). Thus there are three three-dimensional tables of correction factors. Then for a given charge selection an event i with N, \(P_{T}\) and \(P_{T,2}\) is weighted with the correction factor \(c_{i}\) from the table of corrections for this charge selection and from the bin which corresponds to N, \(P_{T}\) and \(P_{T,2}\). Thus, in the absence of off-target interactions, the corrected mean value of a quantity X is

$$\begin{aligned} \langle X \rangle = \frac{1}{ M_\mathrm{ev} } \left( \sum \limits _{i=1}^{ N_\mathrm{ev} } c_{i}~X_{i} \right) , \end{aligned}$$
(9)

where \(M_\mathrm{ev} = \sum \limits _{i=1}^{ N_\mathrm{ev} } c_{i} \).
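The event weighting of Eq. (9) can be sketched as a table lookup followed by a weighted mean. The dictionary layout of the table, the bin widths and the fallback weight of 1 for bins missing from the table are assumptions made for this illustration and are not taken from the analysis.

```python
def correction_factor(table, n, pt_sum, pt2_sum, d_pt=0.1, d_pt2=0.1):
    """Look up c_i in a 3D table binned in (N, P_T, P_T,2).

    The bin widths d_pt and d_pt2 are hypothetical. Events falling into a bin
    that is absent from the table are left unweighted (assumption).
    """
    key = (n, int(pt_sum // d_pt), int(pt2_sum // d_pt2))
    return table.get(key, 1.0)

def corrected_mean(values, weights):
    """<X> = sum_i c_i X_i / M_ev with M_ev = sum_i c_i, Eq. (9)."""
    return sum(c * x for c, x in zip(weights, values)) / sum(weights)

# Toy table with a single populated bin.
toy_table = {(2, 5, 1): 1.2}
c = correction_factor(toy_table, n=2, pt_sum=0.55, pt2_sum=0.17)
print(c, corrected_mean([1.0, 3.0], [1.0, 3.0]))
```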

5.6 The final correction

The final results were obtained by combining the data-based correction for off-target interactions with the Monte Carlo based correction for other biases. The fully corrected mean value of a quantity X was calculated as

$$\begin{aligned} \langle X \rangle = \frac{1}{ M_\mathrm{ev}^I - \varepsilon ~M_\mathrm{ev}^R } \left( \sum \limits _{i=1}^{ N_\mathrm{ev}^I } c_{i}~X_{i}^I - \varepsilon \cdot \sum \limits _{j=1}^{ N_\mathrm{ev}^R } c_{j}~X_{j}^R \right) . \end{aligned}$$
(10)
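A minimal sketch of Eq. (10) is given below, with hypothetical argument names. With all weights equal to one it reduces to Eq. (7), and with an empty target removed sample it reduces to Eq. (9), which provides a simple consistency check.

```python
def fully_corrected_mean(x_ins, c_ins, x_rem, c_rem, eps):
    """<X> with event weights c and off-target subtraction, Eq. (10)."""
    numerator = (sum(c * x for c, x in zip(c_ins, x_ins))
                 - eps * sum(c * x for c, x in zip(c_rem, x_rem)))
    denominator = sum(c_ins) - eps * sum(c_rem)   # M_ev^I - eps * M_ev^R
    return numerator / denominator

# Unit weights: same toy numbers as a pure off-target subtraction.
x_i = [2.0] * 90 + [5.0] * 10
x_r = [5.0] * 5
print(fully_corrected_mean(x_i, [1.0] * 100, x_r, [1.0] * 5, eps=2.0))

# No target removed events: a pure weighted mean.
print(fully_corrected_mean([1.0, 3.0], [1.0, 3.0], [], [], eps=2.0))
```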

In order to illustrate the correction procedure and its impact on the results, selected distributions of the relevant event quantities, N, \(P_T\), \(NP_T\) and \(P_{T,2}\), and results on \(\varDelta [P_{T},N]\), \(\varSigma [P_{T},N]\), \(\varPhi _{p_{T}}\) and \(\omega [N]\) obtained at the subsequent stages of the procedure are presented and discussed.

Figure 4 shows uncorrected distributions of the event quantities for data recorded with proton beams at 20 \(\text{ GeV }\!/\!c\) and 158 \(\text{ GeV }\!/\!c\) with the liquid hydrogen inserted and removed. The spectra with the removed liquid hydrogen are multiplied by the \(\varepsilon \) factor defined in Eq. 8. The distributions with the LH inserted and removed have similar shapes. The normalized spectra for the LH removed are about 10 times lower than the ones for the LH inserted. Thus the correction for the off-target interactions is expected to be small (see below).

Fig. 4 Examples of uncorrected distributions of event quantities for p+p interactions at 20 and 158 \(\text{ GeV }\!/\!c\) beam for data recorded with the liquid hydrogen inserted (I) and removed (R). The spectra with the removed liquid hydrogen are multiplied by the \(\varepsilon \) factor (see Eq. 8)

Figure 5 presents the ratio of fully corrected (see Eq. 10) to uncorrected distributions of the event quantities for p+p interactions at 158 \(\text{ GeV }\!/\!c\) for positively and negatively charged particles, separately. The ratio varies significantly from about 0.5 to about 1.7.

Fig. 5 Examples of ratios of corrected to uncorrected distributions of event quantities for p+p interactions at 158 \(\text{ GeV }\!/\!c\)

Results for uncorrected data, data corrected only for the off-target interactions, and fully corrected data are shown in Fig. 6. Statistical and systematic uncertainties (see below) of the fully corrected points are also plotted for comparison. The corrections for off-target interactions change the results only weakly. The corrections for the remaining experimental biases have a significant impact, in particular on the results for \(\omega [N]\) and \(\varDelta [P_T,N]\). This is mostly due to the requirement of a well-fitted interaction vertex as well as corrections for the trigger bias and the off-line selection of events. This is illustrated in Fig. 7, where the collision energy dependence of \(\omega [N]\) and \(\varDelta [P_T,N]\) is presented for fully corrected data, data uncorrected for the trigger bias, data uncorrected for the trigger bias and for the off-line event selection, as well as fully uncorrected data. In addition, the results with all corrections except the correction for the contribution of non-primary tracks (feed-down) are shown.

Fig. 6 Results on \(\varDelta [P_{T},N]\), \(\varSigma [P_{T},N]\), \(\varPhi _{p_{T}}\) and \(\omega [N]\) as a function of collision energy before corrections (open squares), corrected for contributions of off-target interactions (open circles) and additionally corrected for all other experimental biases; see Sect. 5.6 (filled circles). Statistical uncertainties (mostly invisible) are shown by vertical bars, systematic uncertainties by shaded bands

Fig. 7 Results on \(\omega [N]\) and \(\varDelta [P_{T},N]\) as a function of collision energy for all charged hadrons after all corrections (full circles), without the correction for the trigger bias (upward pointing triangles), without corrections for the trigger bias and the off-line event selection (crosses), without correction for feed-down (diamonds) and uncorrected (open circles). Statistical uncertainties (mostly invisible) are shown by vertical bars, systematic uncertainties by shaded bands

5.7 Statistical uncertainties

The statistical uncertainties of \(\varPhi _{p_{T}}\), \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\) were estimated as follows. The whole sample of events was divided into 30 independent sub-samples. The values of \(\varPhi _{p_{T}}\), \(\varDelta [P_{T},N]\), and \(\varSigma [P_{T},N]\) were evaluated for each sub-sample separately (following all the procedures described above, including target removed and Monte Carlo corrections) and the dispersions (\(D_{\varPhi }\), \(D_{\varDelta }\), and \(D_{\varSigma }\)) of the results were then calculated. The statistical uncertainty of \(\varPhi _{p_{T}}\) (\(\varDelta [P_{T},N]\) or \(\varSigma [P_{T},N]\)) is taken to be equal to \(D_{\varPhi }/\sqrt{30}\) (\(D_{\varDelta }/\sqrt{30}\) or \(D_{\varSigma }/\sqrt{30}\)).
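The sub-sample procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis code: the function name `subsample_uncertainty` is hypothetical, events are assigned to sub-samples in an interleaved way, and the dispersion is taken to be the sample standard deviation (the text does not specify either choice). The `estimator` argument stands for the full chain that turns a sub-sample into one value of the fluctuation measure.

```python
import math
import statistics


def subsample_uncertainty(events, estimator, n_sub=30):
    """Statistical uncertainty from the spread of sub-sample results."""
    if len(events) < n_sub:
        raise ValueError("need at least one event per sub-sample")
    # Assumption: interleaved split, event k goes to sub-sample k % n_sub.
    subsamples = [events[k::n_sub] for k in range(n_sub)]
    results = [estimator(sample) for sample in subsamples]
    # Assumption: "dispersion" is the sample standard deviation.
    dispersion = statistics.stdev(results)
    return dispersion / math.sqrt(n_sub)
```

In an actual analysis `estimator` would apply the target-removed and Monte Carlo corrections to each sub-sample before evaluating the measure.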

For each beam momentum, 5 million events were generated and reconstructed, several times more than the recorded experimental data. Therefore statistical uncertainties arising from the event statistics of the simulation were neglected.

5.8 Systematic uncertainties

Systematic uncertainties were estimated by changing:

  (i) event and track selection criteria: tight and loose cuts (see Table 4), and

  (ii) the model chosen for the simulation: Epos1.99 [24] and Venus4.12 [25, 26].

To estimate the contribution of each source, the complete analysis was repeated under these different conditions.

The uncertainties of the corrections for the event selection and feed-down are expected to be mostly due to uncertainties in the modeling of p+p interactions, whereas the uncertainties of the remaining corrections (e.g. for the track reconstruction inefficiency) are expected to be mostly due to imperfections of the detector simulation. Total systematic uncertainties were calculated by adding in quadrature the uncertainties calculated in (i) and (ii).
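The quadrature sum of the two contributions can be written as a short Python sketch. The function and argument names are hypothetical, and how each individual contribution is extracted from the varied analyses is not specified here.

```python
import math


def total_systematic(sigma_selection, sigma_model):
    """Add the contributions (i) and (ii) in quadrature."""
    # math.hypot(a, b) returns sqrt(a*a + b*b)
    return math.hypot(sigma_selection, sigma_model)
```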

6 Results

The results shown in this section refer to primary accepted hadrons produced in all inelastic p+p interactions. The accepted hadrons are hadrons produced within the kinematical acceptance selected for the analysis [23]; see also Sect. 5. The results are corrected for event and track losses due to detector inefficiencies, selection criteria and the interaction trigger, as well as for contamination by tracks from weak decays and secondary interactions and by leptons from primary interactions. The correction procedure is described in detail in Sect. 5. Results are not corrected for the kinematic acceptance. This acceptance should be taken into account when the data are compared with models. Table 5 shows mean multiplicities of negatively and positively charged hadrons within the kinematical acceptance selected for the analysis in this paper.

Table 5 Mean multiplicities of negatively and positively charged hadrons produced in inelastic p+p interactions at 20, 31, 40, 80 and 158 \(\text{ GeV }\!/\!c\) in the kinematical acceptance used in this paper and in the NA49 publications. For comparison also mean multiplicity of \(\pi ^-\) mesons in full phase space as obtained by NA61/SHINE  [20] is shown in the second column

Figure 8 shows the results on \(\varDelta [P_{T},N]\), \(\varSigma [P_{T},N]\), \(\varPhi _{p_T}\) and \(\omega [N]\) calculated separately for all charged, negatively charged, and positively charged hadrons produced in inelastic p+p interactions at 20–158 \(\text{ GeV }\!/\!c\) beam momentum.

Fig. 8 Collision energy dependence of \(\varDelta [P_{T},N]\), \(\varSigma [P_{T},N]\), \(\varPhi _{p_{T}}\) and \(\omega [N]\) for inelastic p+p interactions. The NA61/SHINE measurements (filled circles) are compared with predictions of the Epos1.99 (upward pointing triangles) and UrQMD (downward pointing triangles) models (both in the NA61/SHINE acceptance). Statistical uncertainties (mostly invisible) are shown by vertical bars, systematic uncertainties by shaded bands

First, one observes that the results, in general, significantly deviate from the reference values \(\varDelta [P_{T},N] = \varSigma [P_{T},N] = 1\) and \(\varPhi _{p_T} =0\) \(\text{ MeV }/c\), which are expected in the case of independent particle production.
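The reference values can be checked with a small Python sketch that evaluates the measures from the event quantities N, \(P_T\) and \(P_{T,2}\). It assumes that the definitions given earlier in the paper take the commonly used form \(\varDelta = (\langle N \rangle \omega [P_T] - \langle P_T \rangle \omega [N])/C\), \(\varSigma = (\langle N \rangle \omega [P_T] + \langle P_T \rangle \omega [N] - 2(\langle P_T N \rangle - \langle P_T \rangle \langle N \rangle ))/C\) with \(C = C_{\varDelta } = C_{\varSigma } = \langle N \rangle \omega [p_T]\), and \(\varPhi _{p_T} = \sqrt{\overline{p_T}\,\omega [p_T]}\,(\sqrt{\varSigma } - 1)\). The function names are hypothetical and no correction factors are applied.

```python
import math


def mean(values):
    return sum(values) / len(values)


def fluctuation_measures(n, pt, pt2):
    """n[i], pt[i], pt2[i] are N, P_T and P_{T,2} of event i (unweighted)."""
    mean_n, mean_pt = mean(n), mean(pt)
    var_n = mean([x * x for x in n]) - mean_n ** 2
    var_pt = mean([x * x for x in pt]) - mean_pt ** 2
    cov = mean([a * b for a, b in zip(n, pt)]) - mean_n * mean_pt
    omega_n = var_n / mean_n        # omega[N]
    omega_big_pt = var_pt / mean_pt  # omega[P_T]
    # Inclusive p_T moments from event quantities (assumed form)
    pt_incl = mean_pt / mean_n
    pt2_incl = mean(pt2) / mean_n
    omega_small_pt = (pt2_incl - pt_incl ** 2) / pt_incl  # omega[p_T]
    c_norm = mean_n * omega_small_pt  # C_Delta = C_Sigma
    delta = (mean_n * omega_big_pt - mean_pt * omega_n) / c_norm
    sigma = (mean_n * omega_big_pt + mean_pt * omega_n - 2.0 * cov) / c_norm
    phi = math.sqrt(pt_incl * omega_small_pt) * (math.sqrt(sigma) - 1.0)
    return {"omega_N": omega_n, "Delta": delta, "Sigma": sigma, "Phi": phi}


# Toy independent-particle sample: N = 2 in every event, each particle has
# p_T = 1 or 2 with equal probability; all four combinations are listed once.
toy = fluctuation_measures(n=[2, 2, 2, 2],
                           pt=[2.0, 3.0, 3.0, 4.0],
                           pt2=[2.0, 5.0, 5.0, 8.0])
```

For this toy sample the sketch gives \(\varDelta \) and \(\varSigma \) equal to one and \(\varPhi _{p_T}\) equal to zero, up to floating-point rounding.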

Second, the results for the three charge selections show differences. The effect of long-range correlations, e.g., caused by conservation laws, decreases with decreasing fraction of accepted particles. In particular, many charged hadrons come from decays of resonances into two or more hadrons, e.g., \(\rho \rightarrow \pi ^+ + \pi ^-\). Positively charged hadrons are mostly \(\pi ^{+}\) mesons and protons. Fewer of them come from resonance decays into two or more positively charged hadrons. The majority of negatively charged hadrons are negatively charged pions and only a small fraction comes from resonance decays into two or more negatively charged hadrons. Thus the correlations due to resonance decay kinematics decrease from all charged hadrons to positively and negatively charged ones. Other sources of correlations like quantum statistics and Coulomb interactions are also expected to impact fluctuations differently for different charge selections.

Third, the collision energy dependences of \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\) are opposite. With increasing collision energy \(\varDelta [P_{T},N]\) decreases, whereas \(\varSigma [P_{T},N]\) increases. The largest deviations from unity for both \(\varDelta [P_{T},N]\) and \(\varSigma [P_{T},N]\) are observed for all charged hadrons at 158 \(\text{ GeV }\!/\!c\). Note that at this energy the fraction of accepted particles is the largest.

7 Comparison with models and central Pb+Pb collisions

Figure 8 shows a comparison of the measured fluctuations with predictions of two commonly used models: Epos1.99  [24] and UrQMD [27]. The predictions were calculated for the NA61/SHINE acceptance [23].

The Epos1.99 predictions agree quantitatively with results on \(\omega [N]\). They exhibit the same trend with increasing collision energy as the experimental results on \(\varSigma [P_{T},N]\) and \(\varPhi _{p_T}\), but there is qualitative disagreement with results on \(\varDelta [P_{T},N]\). Note that the Epos1.99 model agrees reasonably well with single particle spectra of identified hadrons produced in the same inelastic p+p interactions [28].

Essentially all UrQMD predictions disagree with the data. In many cases even qualitative disagreement is observed. Note that UrQMD also fails to describe single particle spectra in p+p reactions [28]. These disagreements are probably due to problems in modeling of hadron production via resonance decays [29].

NA61/SHINE records data at beam momenta per nucleon identical to or close to those at which NA49 [4] performed measurements of central Pb+Pb collisions. This allows for a direct study of the system size dependence of various hadron production properties in the SPS energy range.

The NA61/SHINE results presented in this paper are obtained in a phase space acceptance [23] which is larger than the acceptances used by NA49 to obtain the corresponding results for central Pb+Pb collisions [5,6,7,8]. Thus, in order to compare the NA61/SHINE measurements with the NA49 data, the more restrictive NA49 cuts were applied to the NA61/SHINE data.

The narrowest acceptance (referred to as the NA49-N acceptance) was used in the NA49 study of the collision energy dependence of (transverse momentum)-multiplicity fluctuations [8]. The NA49-N acceptance is limited to the rapidity range \(1.1< y_{\pi } < 2.6\), where \(y_{\pi }\) is the rapidity calculated in the cms assuming the pion mass, and selects particles in a common narrow azimuthal angle wedge at all beam momenta.

Figure 9 shows the NA61/SHINE results on \(\varPhi _{p_T}\) in inelastic p+p interactions within the full NA61/SHINE acceptance and within the NA49-N acceptance. As expected, the restriction of the acceptance strongly reduces the values of the fluctuation measure.

Fig. 9 NA61/SHINE results on \(\varPhi _{p_T}\) vs. collision energy for inelastic p+p interactions within the full NA61/SHINE acceptance (see Fig. 8) and within the NA49-N acceptance (see Ref. [8]). Statistical errors (mostly invisible) are shown by vertical bars, systematic uncertainties by shaded bands

Figure 10 presents a comparison of \(\varPhi _{p_T}\) for inelastic p+p (NA61/SHINE) interactions and the 7.2% most central Pb+Pb (NA49) collisions within the NA49-N acceptance. No significant difference is observed between the results for the two reactions. Moreover, neither shows a structure in the collision energy dependence which could be attributed to the onset of deconfinement or the critical point.

Fig. 10 \(\varPhi _{p_T}\) versus collision energy for inelastic p+p (NA61/SHINE) interactions and the 7.2% most central Pb+Pb (NA49 [8]) collisions in the NA49-N acceptance. Statistical errors are shown by vertical bars, systematic uncertainties by shaded bands

Figure 11 shows the collision energy dependence of the scaled variance of the multiplicity distributions for inelastic p+p (NA61/SHINE) interactions and the 1% most central Pb+Pb (NA49) collisions within the NA49-M (top row) and NA49-B (bottom row) acceptances [6]. The NA49 medium (NA49-M) acceptance includes all particles well measured by the NA49 detector within the rapidity range \(1.1< y_{\pi } < y_\mathrm{beam}\) and the NA49 broad acceptance (NA49-B) extends the range to \(0< y_{\pi } < y_\mathrm{beam}\). Significant differences are observed between the results for p+p and Pb+Pb reactions; see below for a discussion. However, neither shows a structure in the collision energy dependence which could be attributed to the onset of deconfinement or the critical point.

Fig. 11 Collision energy dependence of the scaled variance of the multiplicity distribution for inelastic p+p (NA61/SHINE) interactions and the 1% most central Pb+Pb (NA49 [6]) collisions in the NA49-M (top) and NA49-B (bottom) acceptances (see text for details). Statistical errors (mostly invisible) are shown by vertical bars, systematic uncertainties by shaded bands

The scaled variance is significantly larger for inelastic p+p interactions at 158 \(\text{ GeV }\!/\!c\) than for the 1% most central Pb+Pb collisions at 158\(A\,\text{ GeV }\!/\!c\). In the following this difference will be discussed within the Wounded Nucleon Model (WNM) [13] and the Statistical Model (SM) [30] of particle production. These models are selected because they are simple and play a special role in analyzing the physics of heavy-ion collisions. The discussion will be focused on the results for negatively charged hadrons as they are weakly influenced by decays of resonances [31].

The Wounded Nucleon Model [13] assumes that particle production in nucleon–nucleon and nucleus–nucleus collisions is an incoherent superposition of particle production from wounded nucleons (nucleons which interacted inelastically and whose number is calculated using the Glauber approach). Properties of wounded nucleons are assumed to be independent of the size of the colliding nuclei, e.g. they are the same in p+p and Pb+Pb collisions at the same collision energy per nucleon. The scaled variance of the multiplicity distribution of produced particles calculated within the WNM reads [14]

$$\begin{aligned} \omega [N] = \omega ^*[N] + \langle N \rangle / \langle W \rangle \cdot \omega [W]~, \end{aligned}$$
(11)

where W is the number of wounded nucleons and \(\omega ^*[N]\) denotes the scaled variance calculated for any fixed value of W. The number of wounded nucleons (protons) in p+p interactions is fixed to two and the measured scaled variance for p+p interactions can be used as \(\omega ^*[N]\). The second term of the sum in Eq. 11 is either zero or positive. Consequently the WNM predicts that the scaled variance in nucleus–nucleus collisions has to be equal to or larger than the one in inelastic proton–proton interactions. The NA61/SHINE and NA49 results presented in Fig. 12 clearly contradict this prediction.
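The structure of Eq. 11 can be verified exactly in a toy setting with the following Python sketch. All ingredients are illustrative assumptions and not part of the measurement: each wounded nucleon independently emits zero or one particle with probability `p`, and W takes the values 2 and 4 with equal probability. The multiplicity distribution is built by exact summation, so no random numbers are involved.

```python
from math import comb


def moments(pmf):
    """Mean and scaled variance of a distribution given as {value: prob}."""
    mean = sum(k * prob for k, prob in pmf.items())
    variance = sum(k * k * prob for k, prob in pmf.items()) - mean ** 2
    return mean, variance / mean


def multiplicity_pmf(w_pmf, p):
    """Distribution of N when each of W sources emits 0 or 1 particle."""
    n_pmf = {}
    for w, prob_w in w_pmf.items():
        for n in range(w + 1):
            binom = comb(w, n) * p ** n * (1.0 - p) ** (w - n)
            n_pmf[n] = n_pmf.get(n, 0.0) + prob_w * binom
    return n_pmf


w_pmf = {2: 0.5, 4: 0.5}  # toy distribution of wounded nucleons
p = 0.3                   # toy emission probability per wounded nucleon

mean_w, omega_w = moments(w_pmf)
mean_n, omega_n = moments(multiplicity_pmf(w_pmf, p))
# Fixed W = 2, as in p+p interactions, gives omega*[N]
_, omega_star = moments(multiplicity_pmf({2: 1.0}, p))

# Right-hand side of Eq. 11
rhs = omega_star + mean_n / mean_w * omega_w
```

In this toy case `omega_n` and `rhs` coincide, and `omega_n` exceeds `omega_star` because \(\omega [W] > 0\), which is the inequality the WNM prediction rests on.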

Fig. 12 Scaled variance of the multiplicity distribution of negatively charged hadrons as a function of the mean number of wounded nucleons. Results for inelastic p+p (NA61/SHINE) interactions (filled circles) and the 1% most central Pb+Pb (NA49 [6]) collisions (squares) in the NA49-B acceptance are shown together with the predictions of the Wounded Nucleon Model (left) and the statistical model (right) (see text for explanations)

Note that at SPS energies multiplicity distributions of negatively charged hadrons in inelastic p+p, n+p and n+n interactions are different [32]. The largest difference is observed between the distributions in p+p and n+n interactions. Thus the prediction of the WNM for Pb+Pb collisions which takes into account the isospin effects estimated using the Epos model ranges between the results for p+p (the measured NA61/SHINE data) and n+n interactions (estimated based on the Epos predictions and the NA61/SHINE data). This range is bounded in Fig. 12 (left) by the horizontal dashed lines.

The Statistical Model of multi-particle production was initiated by Fermi in 1950 [30]. Its basic assumption states that all possible micro-states of the macroscopic system created in a collision are equally probable. For a non-interacting (ideal) gas of Boltzmann particles in the grand canonical ensemble (IB-GCE) the multiplicity distribution is Poissonian (\(\omega [N] = 1\)) independent of the (fixed) system volume (upper dashed line in Fig. 12 (right)). Resonance decays and Bose effects increase the scaled variance from 1 to 1.1, whereas the conservation laws reduce it if the system volume is sufficiently large [31]. The combined effect is demonstrated by the lower dashed line in Fig. 12 (right). In fact the NA49 result for the 1% most central Pb+Pb collisions at 158\(A\,\text{ GeV }\!/\!c\) is well described by the hadron gas model in the micro-canonical ensemble (HG-MCE) [31]. Within the statistical models a scaled variance significantly larger than one as measured in inelastic p+p interactions at high collision energies (top SPS and higher) can be understood as a result of volume and/or energy fluctuations [33].

Multiplicity and transverse momentum fluctuations quantified using strongly intensive measures were studied in a number of theoretical papers. In particular, the influence of resonance decays [17, 34], quantum statistics [18, 34] and a dependence of the mean transverse momentum on multiplicity [17, 34] was considered. These studies are important for a qualitative understanding of experimental data and predictions of complicated Monte Carlo models. However, the obtained results cannot be directly compared to the measurements as they did not include important effects. In particular, the limited experimental acceptance is difficult to take into account in simple models.

8 Summary

This paper presents measurements of multiplicity and transverse momentum fluctuations of negatively, positively and all charged hadrons produced in inelastic p+p interactions at 20, 31, 40, 80 and 158 \(\text{ GeV }\!/\!c\) beam momentum. Values for the scaled variance of multiplicity distributions and three strongly intensive measures of multiplicity–transverse momentum fluctuations \(\varDelta [P_{T},N]\), \(\varSigma [P_{T},N]\) and \(\varPhi _{p_T}\) were obtained. These results were calculated in the NA61/SHINE acceptance which has to be taken into account in a quantitative comparison with models and other results. For the first time the results on fluctuations are fully corrected for experimental biases, in particular, for the losses of inelastic events due to the trigger and analysis event selections as well as for the contamination of particles from weak decays and secondary interactions.

The measurements of multiplicity and transverse momentum fluctuations significantly deviate from expectations for independent particle production (\(\varDelta [P_{T},N] = \varSigma [P_{T},N] = 1\), \(\varPhi _{p_T} =0\) \(\text{ MeV }/c\)). They also depend on the charges of the selected hadrons. The scaled variances of the multiplicity distributions increase with increasing collision energy and for all charged hadrons at 158 \(\text{ GeV }\!/\!c\) reach the value of 2.

The string-resonance Monte Carlo models Epos1.99 and UrQMD do not describe the data well. In several cases the collision energy dependence predicted by the models even shows a trend qualitatively different from the measurements.

The scaled variance of multiplicity distributions is significantly higher in inelastic p+p interactions than in the 1% most central Pb+Pb collisions measured by NA49 at the same energy per nucleon. The largest difference is observed at 158\(A\,\text{ GeV }\!/\!c\). This result is in qualitative disagreement with the predictions of the Wounded Nucleon Model. The low level of multiplicity fluctuations measured in central Pb+Pb collisions is well reproduced by the statistical model in the micro-canonical ensemble formulation. Within the statistical framework the enhanced multiplicity fluctuations in inelastic p+p interactions can be interpreted as due to event-by-event fluctuations of the fireball energy and/or volume.