Study of the inclusive production of charged pions, kaons, and protons in pp collisions at sqrt(s) = 0.9, 2.76, and 7 TeV

Spectra of identified charged hadrons are measured in pp collisions at the LHC for sqrt(s) = 0.9, 2.76, and 7 TeV. Charged pions, kaons, and protons in the transverse-momentum range pt approximately 0.1-1.7 GeV and for rapidities abs(y)<1 are identified via their energy loss in the CMS silicon tracker. The average pt increases rapidly with the mass of the hadron and the event charged-particle multiplicity, independently of the center-of-mass energy. The fully corrected pt spectra and integrated yields are compared to various tunes of the PYTHIA6 and PYTHIA8 event generators.


Introduction
The study of hadron production has a long history in high-energy particle and nuclear physics, as well as cosmic-ray physics. The absolute yields and the transverse momentum (p T ) spectra of identified hadrons in high-energy hadron-hadron collisions are among the basic physical observables that can be used to test the predictions for non-perturbative quantum chromodynamics processes like hadronization and soft parton interactions, and their implementation in Monte Carlo (MC) event generators. The dependence of these quantities on the hardness of the pp collision provides valuable information on multi-parton interactions as well as on other final-state effects. In addition, the measurements of baryon (and notably proton) production are not reproduced by the existing models, and more data at higher energy may help to improve the models. Spectra of identified particles in proton-proton (pp) collisions also constitute an important reference for high-energy heavy-ion studies, where final-state effects are known to modify the spectral shape and yields of different hadron species.
The present analysis focuses on the measurement of the p T spectra of charged hadrons, identified mostly via their energy deposits in silicon detectors, in pp collisions at √ s = 0.9, 2.76, and 7 TeV. In certain phase space regions, particles can be identified unambiguously while in other regions the energy loss measurements provide less discrimination power and more sophisticated methods are necessary. This paper is organized as follows. The Compact Muon Solenoid (CMS) detector, operating at the Large Hadron Collider (LHC), is described in Section 2. Elements of the data analysis, such as event selection, tracking of charged particles, identification of interaction vertices, and treatment of secondary particles are discussed in Section 3. The applied energy loss parametrization, the estimation of energy deposits in the silicon, and the calculation of the energy loss rate of tracks are explained in Section 4. In Section 5 the various aspects of the unfolding of particle yields are described. After a detailed discussion of the applied corrections (Section 6), the final results are shown in Section 7 and summarized in the conclusions.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter. Within the field volume are the silicon pixel and strip tracker, the crystal electromagnetic calorimeter, and the brass/scintillator hadron calorimeter. In addition to the barrel and endcap detectors, CMS has extensive forward calorimetry. CMS uses a right-handed coordinate system, with the origin at the nominal interaction point and the z axis along the counterclockwise beam direction. The pseudorapidity and rapidity of a particle with energy E, momentum p, and momentum along the z axis p z are defined as η = − ln tan(θ/2), where θ is the polar angle with respect to the z axis, and y = (1/2) ln[(E + p z )/(E − p z )], respectively. A more detailed description of CMS can be found in Ref. [1].
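As a small illustration of the two definitions above, the following Python sketch evaluates η from the polar angle and y from (E, p z ). The function names and the helper that builds y from (p T , η, m) are illustrative choices, not part of the CMS software.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2), with theta the polar angle w.r.t. the z axis."""
    return -math.log(math.tan(theta / 2.0))

def rapidity(energy, pz):
    """y = (1/2) ln[(E + pz) / (E - pz)]."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

def rapidity_from_pt_eta(pt, eta, mass):
    """Hypothetical helper: rapidity for a given mass hypothesis.

    Uses pz = pT sinh(eta), |p| = pT cosh(eta), E = sqrt(p^2 + m^2).
    """
    pz = pt * math.sinh(eta)
    p = pt * math.cosh(eta)
    energy = math.sqrt(p * p + mass * mass)
    return rapidity(energy, pz)
```

For a massless particle the two quantities coincide; for a massive particle at fixed η, |y| is smaller than |η|, which is why the accessible (y, p T ) region differs between pions, kaons, and protons.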
Two elements of the CMS detector monitoring system, the beam scintillator counters (BSCs) and the beam pick-up timing for the experiments (BPTX) devices, were used to trigger the detector readout. The two BSCs are located at a distance of ±10.86 m from the nominal interaction point (IP) and are sensitive to particles in the |η| range from 3.23 to 4.65. Each BSC is a set of 16 scintillator tiles. The BSC elements are designed to provide hit and coincidence rates. The two BPTX devices, located around the beam pipe at a distance of 175 m from the IP on either side, are designed to provide precise information on the bunch structure and timing of the incoming beam. A steel/quartz-fibre forward calorimeter (HF) covers the region of |η| between about 3.0 and 5.0. The HF tower segmentation in η and azimuthal angle φ is 0.175×0.175, except for |η| above 4.7 where the segmentation is 0.175×0.35.
Figure 1: (a) [...] [2]. The inset shows the region 1 < p < 5 GeV/c. (b) For each particle, the accessible (y, p T ) area is contained between the upper thicker (determined by particle identification capabilities) and the lower thinner lines (determined by acceptance and efficiency). More details are given in Section 2.1.
The tracker measures charged particles within the pseudorapidity range |η| < 2.4. It has 1440 silicon pixel and 15 148 silicon strip detector modules and is located in the 3.8 T field of the solenoid. The pixel detector [3] consists of three barrel layers (PXB) at radii of 4.4, 7.3, and 10.2 cm as well as two endcap disks (PXF) on each side of the PXB. The detector units are segmented n-on-n silicon sensors of 285 µm thickness. Each readout chip serves a 52 × 80 array of 150 µm × 100 µm pixels. In the data acquisition system, zero suppression is performed with adjustable thresholds for each pixel. Offline, pixel clusters are formed from adjacent pixels, including both side-by-side and corner-by-corner adjacent pixels. The strip tracker [4] employs p-in-n silicon wafers. It is partitioned into different substructures: the tracker inner barrel (TIB) and the tracker inner disks (TID) are the innermost part with 320 µm thick sensors, surrounded by the tracker outer barrel (TOB) with 500 µm thick sensors. On both sides, the tracker is completed by endcaps with a mixture of 320 µm thick sensors (TEC3) and 500 µm thick sensors (TEC5). The first two layers of TIB and TOB and some of the TID and TEC contain "stereo" modules: two silicon modules mounted back-to-back with a 100 mrad angle to provide two-dimensional hit resolution. Each readout chip serves 128 strips. Algorithms are run in the Front-End Drivers (FED) to perform pedestal subtraction, common-mode subtraction and zero suppression. Only a small fraction of the channels are read out in one event. Offline, clusters are formed by combining contiguous hits. The tracker provides an impact-parameter resolution of ∼15 µm and an absolute p T resolution of about 0.02 GeV/c in the range p T ≈ 0.1-2 GeV/c, of relevance here.

Particle identification capabilities
The identification of charged particles is often based on the relationship between energy loss rate and total momentum (Fig. 1a). Particle reconstruction at CMS is limited by the acceptance (C a ) of the tracker (|η| < 2.4) and by the low tracking efficiency (C e ) at low momentum (p > 0.05, 0.10, 0.20, and 0.35 GeV/c for e, π, K, and p, respectively), while particle identification capabilities are restricted to p < 0.15 GeV/c for electrons, p < 1.20 GeV/c for pions, p < 1.05 GeV/c for kaons, and p < 1.70 GeV/c for protons (Fig. 1a). Pions are accessible up to a higher momentum than kaons because of their high relative abundance, as discussed in Section 5.2. The (y, p T ) region where pions, kaons and protons can all be identified is visible in Fig. 1b. The region −1 < y < 1 was chosen for the measurement, since it maximizes the p T coverage.

Data analysis
The 0.9 and 7 TeV data were taken during the initial low multiple-interaction rate (low "pileup") runs in early 2010, while the 2.76 TeV data were collected in early 2011. The requirement of similar amounts of produced particles at the three center-of-mass energies and that of small average number of pileup interactions led to 8.80, 6.74 and 6.20 million events for √ s = 0.9 TeV, 2.76 TeV, and 7 TeV, respectively. The corresponding integrated luminosities are 0.227±0.024 nb −1 , 0.143±0.008 nb −1 and 0.115±0.005 nb −1 [5,6], respectively.

Event selection and related corrections
The event selection consists of the following requirements:
• at the trigger level, the coincidence of signals from both BPTX devices, indicating the presence of both proton bunches crossing the interaction point, along with the presence of signals from either of the BSCs;
• offline, the presence of at least one tower with energy above 3 GeV in each of the HF calorimeters; at least one reconstructed interaction vertex (Section 3.3); the suppression of beam-halo and beam-induced background events, which usually produce an anomalously large number of pixel hits [7].
The efficiencies for event selection, tracking, and vertexing were evaluated by means of simulated event samples produced with the PYTHIA 6.420 [8] MC event generator at each of the three center-of-mass energies. The events were reconstructed in the same way as the collision data. The PYTHIA tunes D6T [9], Z1, and Z2 [10] were chosen, since they describe the measured event properties reasonably well, notably the reconstructed track multiplicity distribution. Tune D6T is a pre-LHC tune with virtuality-ordered showers using the CTEQ6L parton distribution functions (PDF). The tunes Z1 and Z2 are based on the early LHC data and generate p T -ordered showers using the CTEQ5L and CTEQ6L PDFs, respectively.
The final results were corrected to a particle level selection, which is very similar to the actual selection described above: at least one particle (τ > 10 −18 s) with E > 3 GeV in the range −5 < η < −3 and one in the range 3 < η < 5; this selection is referred to in the following as "double-sided" (DS) selection. The overall efficiency of the DS selection for a zero-bias sample, according to PYTHIA, is about 66-72% (0.9 TeV), 70-76% (2.76 TeV), and 73-78% (7 TeV). The ranges given represent the spread of the predictions of the different tunes. Mostly nondiffractive (ND) events are selected, with efficiencies in the 88-98% range, but smaller fractions of double-diffractive (DD) events (32-38%) and single-diffractive dissociation (SD) events (13-26%) are accepted as well. About 90% of the selected events are ND, while the rest are DD or SD, in about equal measure. In order to compare to measurements with a non-single-diffractive (NSD) selection, the particle yields given in this study should be divided by factors of 0.86, 0.89, and 0.91 according to PYTHIA, for √ s = 0.9, 2.76, and 7 TeV, respectively. The systematic uncertainty on these numbers due to the tune dependence is about 3%.
Figure 2: The ratio of selected events to double-sided events (ratio of the corresponding efficiencies in the inelastic sample), according to the PYTHIA6 tunes (0.9 TeV-D6T, 2.76 TeV-Z2, 7 TeV-Z1), as a function of the reconstructed primary charged track multiplicity.
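The conversion from the DS selection to an NSD selection quoted above is a simple division by an energy-dependent factor. A minimal sketch follows; the function name is illustrative, and reading the quoted "about 3%" as a relative uncertainty on the converted yield is an assumption.

```python
# PYTHIA-based DS -> NSD factors quoted in the text, keyed by sqrt(s) in TeV.
NSD_FACTORS = {0.9: 0.86, 2.76: 0.89, 7.0: 0.91}

# Assumption: the "about 3%" tune dependence is treated as a relative uncertainty.
REL_UNCERTAINTY = 0.03

def ds_to_nsd(yield_ds, sqrt_s_tev):
    """Convert a DS-selected yield to the NSD selection.

    Returns (converted yield, systematic uncertainty from the tune dependence).
    """
    factor = NSD_FACTORS[sqrt_s_tev]
    yield_nsd = yield_ds / factor
    return yield_nsd, yield_nsd * REL_UNCERTAINTY
```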
The ratios of the data selection efficiency to the DS selection efficiency are shown as a function of the reconstructed track multiplicity in Fig. 2 for the three center-of-mass energies studied. The ratios are used to correct the measured events; they are approximately independent of the PYTHIA tune. The different behavior of the 2.76 TeV data results from a change in the HF configuration in 2011. The results are also corrected for the fraction of DS events without a reconstructed track. This fraction, as given by the simulation, is about 4%, 3%, and 2.5% for 0.9, 2.76, and 7 TeV, respectively. Since these events do not contain reconstructed tracks, only the event yield must be corrected.

Tracking of charged particles
The extrapolation of particle spectra into the unmeasured regions is model dependent, particularly at low p T . A good measurement therefore requires reliable track reconstruction down to the lowest possible p T . The present analysis extends to p T ≈ 0.1 GeV/c by exploiting special tracking algorithms [11], used in previous studies [7,12], to provide high reconstruction efficiency and low background rate. The charged pion hypothesis was assumed when fitting particle momenta.
The performance of the charged-particle tracking was quantified in terms of the geometrical acceptance, the tracking efficiency, and the fraction of misreconstructed tracks; all these quantities were evaluated by means of simulated events and validated in previous studies [7,12]. The acceptance of the tracker (when at least two pixel hits are required) is flat in the region −2 < η < 2 and p T > 0.4 GeV/c, and its value is about 96-98%. The loss of acceptance at p T < 0.4 GeV/c is caused by energy loss and multiple scattering of particles, which both depend on the particle mass. Likewise, the reconstruction efficiency is about 80-90%, degrading at low p T , also in a mass-dependent way. The misreconstructed-track rate (C f ) is very small, reaching 0.3% only for p T < 0.25 GeV/c; it rises slightly above 2 GeV/c because of the steeply falling p T distribution. The probability of reconstructing multiple tracks (C m ) from a true single track is about 0.1%, mostly due to particles spiralling in the strong magnetic field. The efficiencies and background rates largely factorize in η and p T , but for the final corrections an (η, p T ) grid is used.

Vertexing and secondary particles
The region where pp collisions occur (beam spot) is well measured by reconstructing vertices from many events. Since the bunches are very narrow, the transverse position of the interaction vertices is well constrained; conversely, their z coordinates are spread over a relatively long distance and must be determined on an event-by-event basis. Reconstructed tracks are used for determining the vertex position if they have p T > 0.1 GeV/c and originate from the vicinity of the beam spot, i.e. their transverse impact parameter satisfies the condition d T < 3σ T ; here σ T is the quadratic sum of the uncertainty of d T and the RMS of the beam spot distribution in the transverse plane. The agglomerative vertex-reconstruction algorithm [13] was used, with the z coordinates (and their uncertainty) of the tracks at the point of closest approach to the beam axis as input. This algorithm keeps clustering tracks into vertices as long as the smallest distance between the vertices of the remaining groups of tracks, divided by its uncertainty, is below 35. Simulations indicate that this value minimizes the number of merged vertices (vertices with tracks from two or more true vertices) and split vertices (two or more vertices with tracks from a single true vertex). For single-vertex events, there is no lower limit on the number of tracks associated to the vertex. If multiple vertices are present, only those with at least three tracks are kept.
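The clustering step described above can be sketched as a one-dimensional agglomerative merge in z. This is a simplified illustration and not the algorithm of Ref. [13]: the use of inverse-variance weighted means, the restriction to neighbouring clusters in z, and all names are assumptions; only the stopping value of 35 comes from the text.

```python
import math

def cluster_vertices(tracks, d_max=35.0):
    """Agglomerative clustering of tracks along z.

    tracks: list of (z, sigma_z) at the point of closest approach to the beam axis.
    Returns a list of clusters (z_mean, variance_of_mean, n_tracks), ordered in z.
    Clusters are merged while the smallest normalised distance is below d_max.
    """
    # Start with one cluster per track, ordered in z.
    clusters = sorted((z, s * s, 1) for z, s in tracks)
    while len(clusters) > 1:
        best_i, best_d = None, float("inf")
        # Simplification: only neighbouring clusters in z are compared.
        for i in range(len(clusters) - 1):
            z1, v1, _ = clusters[i]
            z2, v2, _ = clusters[i + 1]
            d = abs(z2 - z1) / math.sqrt(v1 + v2)  # distance over its uncertainty
            if d < best_d:
                best_i, best_d = i, d
        if best_d >= d_max:
            break
        z1, v1, n1 = clusters[best_i]
        z2, v2, n2 = clusters[best_i + 1]
        w1, w2 = 1.0 / v1, 1.0 / v2
        merged = ((w1 * z1 + w2 * z2) / (w1 + w2), 1.0 / (w1 + w2), n1 + n2)
        clusters[best_i:best_i + 2] = [merged]
    return clusters
```

With five tracks of 0.01 cm resolution, three near z = 0 and two near z = 5 cm, the sketch returns two vertices with three and two tracks.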
The distribution of the z coordinates of the reconstructed primary vertices is Gaussian, with standard deviations of 6 cm at 0.9 and 2.76 TeV, and 3 cm at 7 TeV. The simulated data were reweighted so as to have the same vertex z coordinate distributions as the data. The distribution of the distance ∆z between vertices was used to quantify the effect of pileup and the quality of vertex reconstruction. There is an empty region around ∆z = 0, which corresponds to cases in which two true vertices are closer than about 0.4 cm to each other and are merged during vertex reconstruction. The ∆z distribution was therefore used to determine the fraction of merged (and thus lost) vertices, and to estimate the fraction of split vertices (via the non-Gaussian tails). Both effects are at the 0.1% level and were neglected in this study.
The number of primary vertices in a bunch crossing follows a Poisson distribution. The fraction of events with more than one vertex (due to pileup) is small in the 0.9 and 7 TeV data (1.6% and 0.9%, respectively), but is 9.4% at 2.76 TeV. The interaction-region and pileup parameters are summarized in Table 1. For the 0.9 and 2.76 TeV data, bunch crossings with either one or two reconstructed vertices were used, while for the 7 TeV data the analysis was restricted to events with a single reconstructed vertex to suppress the larger background from pileup, split and merged vertices.
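To connect the Poisson statement above to a multi-vertex fraction, the sketch below computes, for a Poisson mean µ, the fraction of bunch crossings with at least two vertices among those with at least one, and inverts it by bisection. This is an idealised illustration: it ignores vertex reconstruction efficiency, merging, and splitting, and the choice of conditioning on at least one vertex is an assumption rather than the definition used for Table 1.

```python
import math

def multi_vertex_fraction(mu):
    """P(n >= 2 | n >= 1) for a Poisson-distributed number of vertices with mean mu."""
    p_ge1 = -math.expm1(-mu)               # 1 - exp(-mu)
    p_ge2 = p_ge1 - mu * math.exp(-mu)     # 1 - exp(-mu) - mu exp(-mu)
    return p_ge2 / p_ge1

def mean_from_fraction(frac, lo=1e-9, hi=10.0, tol=1e-12):
    """Invert multi_vertex_fraction by bisection (the fraction increases with mu)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if multi_vertex_fraction(mid) < frac:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For small µ the fraction is close to µ/2, so percent-level multi-vertex fractions correspond to µ of a few percent in this idealised picture.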
The hadron spectra were corrected for particles of non-primary origin. The main source of secondary particles is the feed-down from weakly decaying particles, mostly K 0 S , Λ/Λ̄, and Σ + /Σ − . While the correction (C s ) is around 1% for pions, it is up to 15% for protons with p T ≈ 0.2 GeV/c. This is expected because the daughter p or p̄ takes most of the momentum of the primary Λ/Λ̄, and therefore has a higher probability of being (mistakenly) fitted to the primary vertex than a pion from a K 0 S decay. Since none of the weakly decaying particles mentioned decay into kaons, the correction for kaons is small. The corrections were derived from PYTHIA and cross-checked with data [14] by comparing measured and predicted spectra of particles. While data and simulation generally agree, the Λ/Λ̄ correction had to be multiplied by a factor of 1.6.
For p < 0.15 GeV/c, electrons can be clearly identified. According to PYTHIA, the overall e ± contamination of the hadron yields is below 0.2%. Although muons cannot be separated from pions, their fraction is negligible, below 0.05%. Since both contaminations are small no corrections were applied.

Energy deposits and estimation of energy loss rate
The silicon layers of the tracker are thin and the energy depositions do not follow a Gaussian distribution, but exhibit a long tail at high values. Ideally, the estimates of the energy loss rate should not depend on the path lengths of the track through the sensitive parts of the silicon or on the detector details. However this is not the case with the often used truncated, power, or weighted means of the differential deposits, ∆E/∆x. Some of the dependence on the path length can be corrected for, but a method based on the proper knowledge of the underlying physical processes is preferable.
In the present paper a novel analytical parametrization [15] has been used to approximate the energy loss of charged particles. The method provides the probability density p(y|ε, l) of energy deposit y, if the most probable energy loss rate ε at a reference path-length l 0 and the path-length l are known. The method can be used in conjunction with a maximum likelihood estimation. The deposited energy is estimated from the measured charge deposits in individual channels (pixels or strips) contributing to hit clusters. Deposits below the readout threshold or above the saturation level of the readout electronics are estimated from the length of the track segment in the silicon. This results in a wider accessible energy deposit range and better particle identification power. The method can be applied to the energy loss rate estimation of tracks and to calibrate the gain of the tracker detector front-end electronics. In this analysis, for each track, the estimated ε value at l 0 = 450 µm was used for particle identification and yield determination.
For pixel clusters, the energy deposits (and their variances) were calculated as the sum of individual pixel deposits (and variances). The noise contribution is Gaussian, with a standard deviation σ n ≈ 10 keV per pixel. In the case of strips, the energy deposits were corrected for capacitive coupling and cross-talk between neighboring strips. The readout threshold t, the coupling parameter α c , and the standard deviation σ n of the Gaussian noise for strips were determined from the data, by means of tracks with close-to-normal incidence (Table 2).
Table 3: Tight requirements for approximate particle identification. All ε values are functions of p. Subscripts π, K, and p refer to the most probable value for a given particle species, as expected from simulation. Columns: particle, momentum, most probable energy loss rate.

Detector gain calibration with tracks
For an accurate determination of ε, it is crucial to calibrate the response of all readout chips. It is also important to compare the measured energy deposit spectra to the energy loss parametrization, and introduce corrections if needed.
The value of ε was estimated for each track using an initial gain calibration of the pixel and strip readout chips. Approximate particle identification was performed starting from a sample of identified tracks selected as follows: a track was identified as pion, kaon, or proton if its momentum p and most probable energy loss rate ε satisfied the tight requirements listed in Table 3. In addition, tracks with p > 2 GeV/c, or ε < 3.2 MeV/cm, or from identified K 0 S two-body charged decays were assumed to be pions. Identified electrons were not used. The expected ε, path length l, and energy deposit y were collected for each hit, and stored for every readout chip separately. For each chip, the joint energy deposit log-likelihood, −2 ∑ j log p(g · y j |ε j , l j ), of all selected hits (index j) was minimized by varying the multiplicative gain correction g. At each center-of-mass energy, approximately 10% of the data were sufficient to perform a gain calibration with sufficient resolution. The expected gain uncertainty is 0.5% on average for pixel chips and 0.5-2% for strips readout chips, depending on the chip position.
After the detector gain calibration, the energy loss parametrization was validated with particles identified by the selection discussed above. As examples, the measured energy deposit distributions of positively charged hadrons for different path lengths at βγ = p/m = 1.39 and 3.49 are shown for PXB and TIB in Fig. 3, for the 7 TeV dataset. Similar results were obtained from the data taken at 0.9 and 2.76 TeV. Separate corrections for positive and negative particles were necessary since some effects are not charge symmetric. The energy loss parametrization [15] (solid lines in the figures) gives a good description of the data. In order to describe deviations from the parametrization, we allow for an affine transformation of the theoretical distributions (log ε → α log ε + δ), the parameters of which are determined from the hit-level residuals. The scale factors (α) and the shifts (δ) are both functions of the βγ value of the particle and the length of the track segment l in silicon. The scale factors are around unity for most βγ values and increase to 1.2-1.4 for βγ < 2. Shifts (δ) are generally a few keV with deviations up to 10 keV for βγ < 1. A slight path-length dependence was found for both scale factors and shifts. The observed behavior of these hit-level residuals, as a function of βγ and l, was parametrized with polynomials. These corrections were applied to individual hits during the determination of the log ε templates, as described below.
Table 2: [...] The average cluster noise σ n is also given.

Estimation of the most probable energy loss rate for tracks
The best value of ε for each track was calculated with the corrected energy deposits. The log ε values in (η, p T ) bins were then used in the yield unfolding (Section 5). Removal of hits with incompatible energy deposits and the creation of fit templates, giving the expected log ε distributions for all particle species (electrons, pions, kaons, and protons), are discussed here.
The value of ε was estimated by minimizing the joint energy deposit negative log-likelihood of all hits on the trajectory (index i), χ 2 = −2 ∑ i log p(y i |ε, l i ). Distributions of log ε as a function of total momentum p are plotted in Fig. 4 for electrons, pions, kaons, and protons, and compared to the predictions of the energy loss method. The low momentum region is not well described, with the log ε estimates slightly shifted towards higher values. This is because charged particles slow down when traversing the detector, which leads to hits with higher average energy deposit than expected by the parameterization. The observed deviations were taken into account by means of track-level corrections (cf. Section 5).
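The track-level estimate above is a one-parameter minimization of χ 2 = −2 ∑ i log p(y i |ε, l i ). The sketch below shows the mechanics only. The density used here is a stand-in Gaussian with mean ε·l and a fixed width, which is an assumption made for illustration and is not the parametrization of Ref. [15]; the golden-section search and all names are likewise illustrative.

```python
import math

def neg2_log_like(eps, hits, sigma=0.05):
    """-2 sum_i log p(y_i | eps, l_i) with a STAND-IN Gaussian density.

    hits: list of (y, l) with y the deposit in MeV and l the path length in cm.
    The real analysis uses the analytical parametrization of Ref. [15] instead.
    """
    norm = 2.0 * math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(((y - eps * l) / sigma) ** 2 + norm for y, l in hits)

def estimate_eps(hits, lo=0.1, hi=50.0, tol=1e-6):
    """Minimise neg2_log_like over eps (MeV/cm) with a golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = neg2_log_like(c, hits), neg2_log_like(d, hits)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = neg2_log_like(c, hits)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = neg2_log_like(d, hits)
    return 0.5 * (a + b)
```

The same one-dimensional minimization pattern applies to the per-chip gain correction g of Section 4.1, with g replacing ε as the free parameter.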
Since the association of hits to tracks is not always unambiguous, some hits, usually from noise or hit overlap, do not belong to the actual track. These false hits, or "outliers", can be removed. The tracks considered for hit removal were those with at least three hits and for which the joint energy-deposit χ 2 is larger than 1.3 n hits + 4 √ 1.3 n hits , where n hits denotes the number of hits on the track. If the exclusion of a hit decreased the χ 2 by at least 12, the hit was removed. At most one hit was removed; this affected about 1.5% of the tracks. If there is an outlier, it is usually the hit with the lowest ∆E/∆x value.
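The hit-removal rule above can be written compactly. In this sketch the thresholds (at least three hits, χ 2 > 1.3 n hits + 4 √(1.3 n hits ), a decrease of at least 12, at most one hit removed) are taken from the text, while the function names and the `chi2_of` callable, which stands for a re-evaluation of the joint energy-deposit χ 2 on a subset of hits, are hypothetical.

```python
import math

def chi2_threshold(n_hits):
    """Tracks above this joint chi2 are considered for hit removal."""
    return 1.3 * n_hits + 4.0 * math.sqrt(1.3 * n_hits)

def remove_outlier(hits, chi2_of, min_hits=3, min_decrease=12.0):
    """Remove at most one hit whose exclusion lowers chi2 by at least min_decrease.

    hits: list of hit objects; chi2_of: callable returning the joint chi2 of a hit list
    (hypothetical interface; in practice this would include re-estimating epsilon).
    """
    n = len(hits)
    if n < min_hits:
        return hits
    chi2_all = chi2_of(hits)
    if chi2_all <= chi2_threshold(n):
        return hits
    best_idx, best_chi2 = None, chi2_all
    for i in range(n):
        trial = chi2_of(hits[:i] + hits[i + 1:])
        if trial < best_chi2:
            best_idx, best_chi2 = i, trial
    if best_idx is not None and chi2_all - best_chi2 >= min_decrease:
        return hits[:best_idx] + hits[best_idx + 1:]
    return hits
```

In a toy setting where each hit carries a fixed χ 2 contribution and `chi2_of` is simply `sum`, a track with contributions 1.0, 0.8, 1.2, and 30.0 loses its last hit, while a track with three contributions of 1.0 is left unchanged.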
In addition to the most probable value of log ε, the shape of the log ε distribution was also determined from the data. The template distribution for a given particle species was built from tracks with estimated ε values within three standard deviations of the theoretical value at a given βγ. All kinematical parameters and hit-related observables were kept, but the energy deposits were re-generated by sampling from the analytical parametrization. This procedure exploits the success of the method at the hit level to ensure a meaningful template determination, even for tracks with very few hits.
Figure 5: [...] (b) Example distribution of log ε in a narrow momentum slice at p = 0.80 GeV/c for the high-purity pion sample. Curves are template fits to the data, with scale factors (α) and shifts (δ) also given. The inset shows the distributions with a logarithmic vertical scale. Both plots are from data at 7 TeV center-of-mass energy.

Fitting the log ε distributions
As seen in Fig. 4, low-momentum particles can be identified unambiguously and can therefore be counted. Conversely, at high momentum, the log ε bands overlap (above about 0.5 GeV/c for pions and kaons, and 1.2 GeV/c for protons); the particle yields therefore need to be determined by means of a series of template fits in bins of η and p T . This is described in the following.
The starting point is the histogram of estimated log ε values m_i in a given (η, p T ) bin (i runs over the histogram bins), along with normalised template distributions x_ki, with k indicating electron, pion, kaon, or proton. The goal is to determine the yield of each particle type (a_k) contributing to the measured distribution. Since the entries in a histogram are Poisson-distributed, the corresponding log-likelihood function to minimize is
χ 2 = 2 ∑_i [t_i − m_i + m_i log(m_i/t_i)], (1)
where t_i = ∑_k a_k x_ki contains the quantity to be fitted. The minimum for this non-linear expression can be found by using Newton's method [16], usually within three iterations. Although the templates describe the measured log ε distributions reasonably well, for a precision measurement further (track-level) corrections are needed to account for the remaining discrepancies between data and simulation. Hence, we allow for an affine transformation of the templates with scale factors and shifts that depend on η and p T , the particle charge, and the particle mass.
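A minimal sketch of such a template fit is given below. It assumes the standard Poisson merit function χ 2 = 2 ∑_i [t_i − m_i + m_i log(m_i/t_i)] with t_i = ∑_k a_k x_ki, whose gradient and Hessian with respect to the yields a_k are 2 ∑_i x_ki (1 − m_i/t_i) and 2 ∑_i x_ki x_li m_i/t_i², and applies plain Newton steps. The starting values, the small linear solver, and all names are illustrative; no safeguards against non-positive t_i are included.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def fit_yields(m, templates, n_iter=20, tol=1e-10):
    """Fit yields a_k so that t_i = sum_k a_k x_ki describes the histogram m_i.

    m: list of bin contents; templates: list of normalised templates x_k (one list per species).
    """
    n_species, n_bins = len(templates), len(m)
    a = [sum(m) / n_species] * n_species  # illustrative starting point: equal shares
    for _ in range(n_iter):
        t = [sum(a[k] * templates[k][i] for k in range(n_species)) for i in range(n_bins)]
        grad = [2.0 * sum(templates[k][i] * (1.0 - m[i] / t[i]) for i in range(n_bins))
                for k in range(n_species)]
        hess = [[2.0 * sum(templates[k][i] * templates[l][i] * m[i] / t[i] ** 2
                           for i in range(n_bins))
                 for l in range(n_species)] for k in range(n_species)]
        step = solve_linear(hess, grad)
        a = [a[k] - step[k] for k in range(n_species)]
        if max(abs(s) for s in step) < tol:
            break
    return a
```

For a histogram built exactly as 100 times one template plus 50 times another, the iteration recovers the two yields after a handful of steps.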
For a less biased determination of track-level corrections, enhanced samples of each particle type were also employed. For electrons and positrons, photon conversions in the beam-pipe or in the first pixel layer were used. For high-purity π and enhanced p samples, weakly decaying hadrons were selected (K 0 S , Λ/Λ). Both photon conversions and weak decays were reconstructed by means of a simple neutral-decay finder, followed by a narrow mass cut. Invariantmass distributions of the selected candidates are shown in Fig. 5a. A sample with enhanced kaon content was obtained by tagging K ± mesons (with the requirements listed in Table 3) and looking for an opposite-sign particle which, with the kaon mass assumption, would give an invariant mass close to that of the φ(1020), within 2Γ = 8.52 MeV/c 2 . An example distribution of log ε for the high-purity pion sample in a narrow momentum slice is plotted in Fig. 5b.

Additional information for particle identification
At low momentum, the log ε templates for electrons and pions can be compared to the log ε distributions of high-purity samples, but this type of validation does not work at higher momenta because of lack of statistics; for the same reason, it does not work for kaons and protons. It is therefore important to study the log ε distributions in more detail: they contain useful additional information that can be used to determine the track-level corrections, thus reducing the systematic uncertainties of the extracted yields. This is discussed in the following.

a) Fitting log ε in n hits slices.
The n hits distribution in a given (η, p T ) bin is different for different particle types. Pions have a higher average number of hits per track, with fewer hits for kaons and even fewer for protons. These differences are due to physical effects, such as the different inelastic hadron-nucleon cross section, multiple Coulomb scattering, and decay in flight. It is therefore advantageous to simultaneously perform differential fits in n hits bins (Fig. 6a).

b) Fitting log ε in track-fit χ 2 /ndf slices.
The value of the global χ 2 per number of degrees of freedom (ndf) of the Kalman filter used for fitting the track [17], assuming the charged pion mass, can also be used to identify charged particles. This approach relies on the knowledge of the detector material and the local spatial resolution, and exploits the known physics of multiple scattering and energy loss; it can be used to enhance or suppress a specific particle type. The quantity x = χ 2 /ndf has an approximately Gaussian distribution with mean value 1 and standard deviation σ ≈ 1/√(2 · ndf) if the track fitted is indeed a pion. If it is not, both the mean and sigma are larger by a factor β(m 0 )/β(m), where m 0 is the pion mass and m is the particle mass. Three classes were defined such that each contains an equal number of genuine pions. The condition x − 1 < −0.43σ favors pions, and the requirements −0.43σ ≤ x − 1 < 0.43σ and x − 1 ≥ 0.43σ enhance kaons and protons, respectively. An example of log ε distributions in a χ 2 /ndf slice, with the corresponding fits, is shown in Fig. 6b. The increase of the kaon and proton yields with increasing x is visible, when compared to pions.

c) Difference of hit losses.
The n hits distribution depends on the particle species, with pions producing more hits than other particles. Furthermore, the n hits distributions of two particle types are related to each other. Let f_n denote the number of particles of type f with n hits (n ≥ 1), in an (η, p T ) bin. Let us assume that another particle species g produces fewer hits, i.e. has a higher probability of hit loss q, taken to be roughly independent of the hit position along the track. The distribution of the number of hits g_k can then be predicted, with g_k = r (1 − q)^k [f_k + q ∑_{n=k+1}^{n_max} f_n], where r is the ratio of particle abundances (g/f). The hit loss (compared to pions) is primarily a function of momentum. At lower momenta, the best value of q can be estimated for each (η, p T ) bin by comparing the measured kaon or proton distributions to the ones predicted with the pion n hits distribution according to the formula above. An example of the n hits distributions and the corresponding fits is shown in Fig. 7a. The resulting values of q as a function of p are shown in Fig. 7b, for the kaon-pion and proton-pion pairs. The data points with q < 0.2 can be approximated with a sum of two exponentials in p. This can be motivated by the decay in flight for kaons, but also by the increase of multiple Coulomb scattering with decreasing momentum. The weaker dependence at low momentum (q > 0.2) is due to the increasing multiple scattering for pions; however, this region is not used in the present analysis. The relation between the n hits distributions of two particle types has very important consequences: since the number of charged particles at each n hits value is known, only the local ratio r of particle abundances (K/π, p/π) has to be determined from the fits.

d) [...] kinematical region varies. The fit parameters are therefore smoothed by taking the median of the (η, p T ) bin and its 8 neighbors.

e) Convergence of parameters.
While the track-level corrections are independent, they should converge to similar values at a momentum, p c , where the ε values are the same for two particle types, although the energy deposit distributions can be slightly different. These momenta are p c = 1.56 GeV/c for the pion-kaon and 2.58 GeV/c for the pion-proton pair. The differences of fitted scale factors and shifts were studied as a function of ∆ log ε, in narrow η slices. The parameter values were determined in the ranges 0.50 < p < 1.00 GeV/c for kaons and 1.30 < p < 1.65 GeV/c for protons. In these regions, the parameters were fitted and extrapolated to p c . At p c , the scale factors are expected to be the same and their ∆ log ε dependence is well described with first-order (proton-pion) or second-order polynomials (kaon-pion), in each η slice separately. More freedom had to be allowed for the shifts. While their ∆ log ε dependence can be described with first-order polynomials, their difference is not required to converge to 0, but to a second-order polynomial of η.

Determination of yields
In summary, in a given (η, p T ) bin, the free parameters are: the scale factors (usually in the range 0.98-1.02) and the shifts (from −0.01 to 0.01) for track-level corrections; the yields of particles for each χ 2 /ndf bin or their ratios if the relationship between the n hits distributions of different particle species is used. The fit was performed simultaneously in all (n hits , χ 2 /ndf) bins with nested minimizations. The optimization of the parameters was carried out with the SIMPLEX package [18], but the determination of local particle yields was performed with the log-likelihood merit function (Eq. (1)).
In order to obtain a stable result, the fits were carried out in several passes, each containing iterative steps. After each step, the resulting scale factors and shifts were the new starting points for the next iteration. In the first pass, log ε distributions in narrow momentum slices were fitted using the enhanced electron, pion, proton, and kaon samples, as defined in Section 5. The fitted parameters were then used for a fit in the same slices of the inclusive dataset. In this way the scale factors and shifts were estimated as a function of p. In the second pass, the log ε distributions in each (η, p T ) bin were fitted. The η bins are 0.1 units wide and cover the range −2.4 < η < 2.4. The p T bins are 0.05 GeV/c wide and cover the range p T < 2 GeV/c. The latter choice reflects the p T resolution (0.015-0.025 GeV/c). The procedure was repeated with the enhanced samples, followed again by the inclusive sample. The n hits distributions were used to extract the relationship between different particle species and this is used in all subsequent steps. The shifts are determined and constrained first, and then the scale factors are obtained. Example fits are shown in Fig. 8. In the last pass all parameters are kept constant and the final normalised log ε templates for each particle species are extracted and used to measure the particle yields.
The results of the fitting sequence are the yields for each particle species and charge, both inclusive and divided into track multiplicity bins. While the yields are flat in η, they decrease with increasing p T , as expected. At the end of the fitting sequence χ 2 /ndf values are usually close to unity, except for some low-p T fits. At low p the pions are well fitted, and the different species are well separated. Hence, instead of fitting kaon or proton yields, it is sufficient to count the number of entries above the fitted shape of the pion distribution. Table 4 summarizes the particle-specific momentum ranges for the following procedures: counting the yields (Count); using a particle species in the fits (Fit, paragraphs a and b in Section 5.1); using the correspondence between hit losses in the fits (Hit loss, paragraph c); using the principle of convergence for track-level corrections in the fits (Convergence, paragraph e); and using the fitted yields for physics (Physics). The use of these ranges limits the systematic uncertainties at high momentum. The ranges, after evaluation of the individual fits, were set such that the systematic uncertainty of the measured yields does not exceed 10%. For p > 1.30 GeV/c, pions and kaons were not fitted separately, but were regarded as one particle species (π+K row in Table 4). In fact, fitted pion and kaon yields were not used for p > 1.20 GeV/c and p > 1.05 GeV/c, respectively. Although pion and kaon yields cannot be determined in this high-momentum region, their sum can be measured. This information is an important constraint when fitting the p T spectra (Section 7).
The statistical uncertainties for the extracted yields are given by the fits. The observed local (η, $p_\mathrm{T}$) variations of parameters for track-level corrections cannot be attributed to statistical fluctuations and indicate that the average systematic uncertainties of the scale factors and shifts are about $10^{-2}$ and $2 \cdot 10^{-3}$, respectively. The systematic uncertainties on the yields in each bin were obtained by refitting the histograms with the parameters changed by these amounts.

Corrections
The measured yields in each (η, $p_\mathrm{T}$) bin, $\Delta N_\mathrm{measured}$, were first corrected for the misreconstructed-track rate ($C_f$, Section 3.2) and the fraction of secondaries ($C_s$, Section 3.3). Bins in which the misreconstructed-track rate was larger than 0.1 or the fraction of secondaries was larger than 0.25 were rejected.
The distributions were then unfolded to take into account the finite η and $p_\mathrm{T}$ resolutions. The η distribution of the tracks is flat and the η resolution is very good. Conversely, the $p_\mathrm{T}$ distribution is steep in the low-momentum region and separate corrections in each η bin were necessary. In addition, the reconstructed $p_\mathrm{T}$ distributions for kaons and protons, at very low $p_\mathrm{T}$, are shifted with respect to the generated distributions by about 0.025 GeV/c. This bias is a consequence of using the pion mass for all charged particles (see Section 5.1). A straightforward unfolding procedure with linear regularization [16] was used, based on response matrices R obtained from MC samples for each particle species. With o and m denoting the vectors of original and measured differential yields ($d^2N/d\eta\,dp_\mathrm{T}$), the sum of the chi-squared term $(Ro - m)^T V^{-1} (Ro - m)$ and a regularizer term $\lambda\, o^T H o$ is minimized by varying o, where H is a tridiagonal matrix. The covariance of measured values is approximated by $V_{ij} \approx m_i \delta_{ij}$, where $\delta_{ij}$ is Kronecker's delta. The value of λ is adjusted such that the minimized sum of the two terms equals the number of degrees of freedom. In practice the parameter λ is small, of the order of $10^{-5}$.
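For a fixed λ the minimization described above is a linear problem, which the following sketch solves through the normal equations. The text only states that H is tridiagonal; the specific choice made here (the matrix of the first-difference penalty) is an assumption, and the response matrix and yields are toy values. The adjustment of λ until the minimum equals the number of degrees of freedom is not shown.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def unfold(R, m, lam):
    """Minimize (Ro - m)^T V^-1 (Ro - m) + lam * o^T H o over o.

    V is diagonal with V_ii = m_i (all m_i must be positive).
    ASSUMPTION: H is the tridiagonal matrix of sum_i (o_{i+1} - o_i)^2,
    which requires at least two true bins.
    """
    n_meas, n_true = len(R), len(R[0])
    # Normal equations: (R^T V^-1 R + lam * H) o = R^T V^-1 m
    A = [[sum(R[k][i] * R[k][j] / m[k] for k in range(n_meas))
          for j in range(n_true)] for i in range(n_true)]
    # V^-1 m is a vector of ones, so R^T V^-1 m is the column sum of R.
    rhs = [sum(R[k][i] for k in range(n_meas)) for i in range(n_true)]
    for i in range(n_true):
        A[i][i] += lam * (1.0 if i in (0, n_true - 1) else 2.0)
        if i + 1 < n_true:
            A[i][i + 1] -= lam
            A[i + 1][i] -= lam
    return solve(A, rhs)


# Toy response matrix (rows: measured bins, columns: true bins).
R = [[0.8, 0.1, 0.0],
     [0.2, 0.8, 0.2],
     [0.0, 0.1, 0.8]]
truth = [1000.0, 600.0, 300.0]
measured = [sum(R[k][j] * truth[j] for j in range(3)) for k in range(3)]
unfolded = unfold(R, measured, lam=1e-5)
```

With λ = 0 and an invertible response matrix the procedure reduces to a plain matrix inversion; a small positive λ damps bin-to-bin oscillations at the cost of a small bias.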
The corrected yields were obtained by applying corrections (cf. Section 3.2) for acceptance ($C_a$), efficiency ($C_e$), and multiple reconstruction rate ($C_m$), and by normalizing to $N_\mathrm{ev}$, the corrected number of DS events (see Section 3). Bins with acceptance smaller than 0.5, efficiency smaller than 0.5, or multiple-track rate greater than 0.1 were rejected.
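The two correction steps and the bin rejection can be sketched as follows. The explicit correction formulae are not reproduced in this text, so the multiplicative forms used here, (1 − C_f)(1 − C_s) for the first step and 1/[C_a C_e (1 + C_m)] for the second, are assumptions, as are the function name and the example numbers. The unfolding that takes place between the two steps is left out.

```python
def corrected_yield(dn_measured, c_f, c_s, c_a, c_e, c_m,
                    n_ev, d_eta, d_pt):
    """Return (1/N_ev) d2N/(d eta d pT) for one bin, or None if rejected.

    ASSUMED functional form (not quoted from the text):
      dN' = dN_measured * (1 - c_f) * (1 - c_s)
      yield = dN' / (c_a * c_e * (1 + c_m)) / (n_ev * d_eta * d_pt)
    The rejection thresholds are the ones quoted in the text.
    """
    # Reject bins with large misreconstructed-track rate or secondaries.
    if c_f > 0.1 or c_s > 0.25:
        return None
    # Reject bins with low acceptance or efficiency, or many duplicates.
    if c_a < 0.5 or c_e < 0.5 or c_m > 0.1:
        return None
    dn_prime = dn_measured * (1.0 - c_f) * (1.0 - c_s)
    return dn_prime / (c_a * c_e * (1.0 + c_m)) / (n_ev * d_eta * d_pt)


# Hypothetical bin: 1000 measured tracks in a 0.1 x 0.05 GeV/c bin.
good = corrected_yield(1000.0, 0.02, 0.05, 0.9, 0.8, 0.01, 1e5, 0.1, 0.05)
bad = corrected_yield(1000.0, 0.02, 0.05, 0.4, 0.8, 0.01, 1e5, 0.1, 0.05)
```

Returning `None` for rejected bins keeps them distinguishable from bins with a genuinely vanishing yield.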
Finally, the differential yields $d^2N/d\eta\,dp_\mathrm{T}$ were transformed to invariant yields as a function of the rapidity y by multiplying by the Jacobian E/p, and the (η, $p_\mathrm{T}$) bins were mapped into a (y, $p_\mathrm{T}$) grid. The invariant yields $(1/N_\mathrm{ev})\, d^2N/dy\,dp_\mathrm{T}$ as a function of $p_\mathrm{T}$ were obtained by averaging over y in the range −1 < y < 1. They are largely independent of y in the narrow region considered, as expected.
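The change of variables from pseudorapidity to rapidity follows from standard kinematics: at fixed transverse momentum dy/dη = p/E, so a yield per unit η is converted to a yield per unit y by the factor E/p. A minimal sketch, with c = 1 and GeV units (the function name and the example values are illustrative):

```python
import math


def eta_to_y(eta, pt, mass):
    """Return (y, E/p) for a particle with pseudorapidity eta, pT and mass.

    Since dy/d eta = p/E at fixed pT,
    d2N/(dy dpT) = (E/p) * d2N/(d eta dpT).
    """
    pz = pt * math.sinh(eta)
    p = pt * math.cosh(eta)
    energy = math.sqrt(p * p + mass * mass)
    y = 0.5 * math.log((energy + pz) / (energy - pz))
    return y, energy / p


M_PROTON = 0.938272  # GeV
y, jacobian = eta_to_y(eta=1.0, pt=0.5, mass=M_PROTON)
```

For a massless particle y equals η and the factor is one; for slow protons the factor is well above one and |y| is noticeably smaller than |η|, which is why the bins have to be remapped species by species.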

Systematic uncertainties
The systematic uncertainties are summarized in Table 5; they are subdivided in three categories.
• The uncertainties of the corrections related to the event selection (Section 3.1) and pileup (Section 3.3) are fully or mostly correlated and were treated as normalisation uncertainties. They amount to a 3.0% systematic uncertainty on the yields and 1.0% on the average p T .
• The pixel hit efficiency and the effects of a possible misalignment of the detector elements are mostly uncorrelated. Their contribution to the yield uncertainty is about 0.3% [7].
• Other mostly uncorrelated systematic effects are the following: the tracker acceptance and the track reconstruction efficiency (Section 3.2) generally have small uncertainties (1% and 2%, respectively), but change rapidly at very low $p_\mathrm{T}$, leading to a 5-6% uncertainty on the yields in that range; for the multiple-track and misreconstructed-track rate corrections (Section 3.2), the uncertainty is assumed to be 50% of the correction, while for the correction for secondary particles it is 20% (Section 3.3). The uncertainty of the fitted yields (Section 5.2) also belongs to this category.
In the weighted averages and the fits discussed in the following, the quadratic sum of statistical and systematic uncertainties (referred to as combined uncertainty) is used. The fully correlated systematic uncertainties (event selection and pileup) are not displayed in the plots.

Results
In previously published measurements of unidentified and identified particle spectra, the following form of the Tsallis-Pareto-type distribution [19,20] was fitted to the data:
$$\frac{d^2N}{dy\,dp_\mathrm{T}} = \frac{dN}{dy} \cdot C \cdot p_\mathrm{T} \left[ 1 + \frac{m_\mathrm{T} - m}{nT} \right]^{-n}, \qquad (4)$$
where
$$C = \frac{(n-1)(n-2)}{nT\,[nT + (n-2)\,m]} \qquad (5)$$
and $m_\mathrm{T} = \sqrt{m^2 + p_\mathrm{T}^2}$ (c factors are omitted from the preceding formulae). The free parameters are the integrated yield dN/dy, the exponent n, and the inverse slope T. The above formula is useful for extrapolating the spectra to $p_\mathrm{T} = 0$, and for extracting $\langle p_\mathrm{T} \rangle$ and dN/dy. Its validity in the present analysis was cross-checked by fitting MC spectra and verifying that the fitted values of $\langle p_\mathrm{T} \rangle$ and dN/dy were consistent with the generated values. According to some models of particle production based on non-extensive thermodynamics [20], the parameter T is connected with the average particle energy, while n characterizes the "non-extensivity" of the process, i.e. the departure of the spectra from a Boltzmann distribution.
As discussed earlier, pions and kaons cannot be unambiguously distinguished at higher momenta (Section 5.2). Because of this, the pion-only (kaon-only) $d^2N/dy\,dp_\mathrm{T}$ distribution was fitted for |y| < 1 and p < 1.20 GeV/c (p < 1.05 GeV/c); the joint pion and kaon distribution was instead fitted if |η| < 1 and 1.05 < p < 1.5 GeV/c. Since the ratio p/E for the pions (which are more abundant than kaons) at these momenta can be approximated by $p_\mathrm{T}/m_\mathrm{T}$ at η ≈ 0, Eq. (4) becomes:
$$\frac{d^2N}{d\eta\,dp_\mathrm{T}} \approx \frac{dN}{dy} \cdot C \cdot \frac{p_\mathrm{T}^2}{m_\mathrm{T}} \left[ 1 + \frac{m_\mathrm{T} - m}{nT} \right]^{-n}. \qquad (6)$$
In the case of pions and protons, the measurements cover a wide $p_\mathrm{T}$ range: the yields and average $p_\mathrm{T}$ can thus be determined with small systematic uncertainty. For the kaons the number of measurements is small and the $p_\mathrm{T}$ range is limited. Therefore, for the combined pion and kaon fits, the kaon component was weighted by a factor of four, leading to the following function to be minimized: $\chi^2_\pi + \chi^2_{\pi+K} + 4\chi^2_K$. This weight accounts for the $p_\mathrm{T}$ range, which is narrower by a factor of about two, and also for the partial correlation between the pion measurement and that of the sum of pions and kaons, which gives another factor of two.
The average transverse momentum $\langle p_\mathrm{T} \rangle$ and its uncertainty were obtained by numerical integration of Eq. (4) with the fitted parameters.
The results discussed in the following are for |y| < 1 at √ s = 0.9, 2.76, and 7 TeV. In all cases, error bars indicate the uncorrelated statistical uncertainties, while bands show the uncorrelated systematic uncertainties. The fully correlated normalisation uncertainty (not shown) is 3.0%. For the p T spectra, the average transverse momentum, and the ratio of particle yields, the data are compared to the D6T and Z2 tunes of PYTHIA6 [8] as well as to the 4C tune of PYTHIA8 [21].

Inclusive measurements
The transverse momentum distributions of positive and negative hadrons (pions, kaons, protons) are shown in Fig. 9, along with the results of the fits to the Tsallis-Pareto parametrization (Eqs. (4) and (6)). The fits are of good quality with χ 2 /ndf values in the range 0.6-1.5 for pions, 0.6-2.1 for kaons, and 0.4-1.1 for protons. Figure 10 presents the data compared to various PYTHIA tunes. Tunes D6T and 4C tend to be systematically below or above the spectra, whereas Z2 is generally closer to the measurements (except for low-p T protons).
Ratios of particle yields as a function of the transverse momentum are plotted in Fig. 11. While the p/π ratios are well described by all tunes, there are substantial deviations for the K/π ratios, also seen by other experiments and at different energies. CMS measurements of $K^0_\mathrm{S}$ and $\Lambda/\bar{\Lambda}$ production [14] are consistent with the discrepancies seen here. The ratios of the yields for oppositely charged particles are close to one, as expected for pair-produced particles at midrapidity. Ratios for pions and kaons are compatible with unity, independently of $p_\mathrm{T}$. While the $\bar{p}/p$ ratios are also flat as a function of $p_\mathrm{T}$, they increase with increasing √s.

Multiplicity-dependent measurements
This study is motivated by the intriguing hadron correlations measured in pp collisions at high track multiplicities [22], which suggest possible collective effects in "central" pp collisions at the LHC. In addition, the multiplicity dependence of particle yield ratios is sensitive to various final-state effects (hadronization, color reconnection, collective flow) implemented in MC models used in collider and cosmic-ray physics [23].
The normalized transverse-momentum distributions of identified charged hadrons in selected multiplicity classes, for |y| < 1 and √s = 0.9, 2.76, and 7 TeV, are shown in Figs. 12, 13, and 14, for pions, kaons, and protons, respectively. The distributions of negatively and positively charged particles have been summed. The distributions are fitted to the Tsallis-Pareto parametrization. In the case of pions, the distributions are remarkably similar, and essentially independent of √s and multiplicity. For kaons and protons, there is a clear evolution as the multiplicity increases. The inverse slope parameter T increases with multiplicity for both kaons and protons, while the exponent n is independent of the multiplicity (not shown in the figures).
The ratios of particle yields as a function of track multiplicity are displayed in Fig. 15. The K/π and p/π ratios are flat as a function of N tracks . Although the trend at low N tracks is not reproduced by any of the tunes, the values are approximately correct for tunes D6T and Z2, while 4C is off, especially for K/π. The ratios of yields of oppositely charged particles are independent of N tracks .
The average transverse momentum $\langle p_\mathrm{T} \rangle$ is shown as a function of multiplicity in Fig. 16. The plots are similar, and largely independent of √s, for all the particle species studied. Pions and kaons are well described by the Z2 and 4C tunes, while D6T predicts values that are too high at high multiplicities. None of the tunes provides an acceptable description of the multiplicity dependence of $\langle p_\mathrm{T} \rangle$ for protons, and the measured values lie between D6T and Z2. For the dependence of T on multiplicity (not shown in the figures), the predictions are consistently higher than the pion data for all tunes; the kaon and proton data are again between D6T and Z2, somewhat closer to the latter. Tune 4C gives a flat multiplicity dependence for T and is not favored by the kaon and proton measurements.
The center-of-mass energy dependence of dN/dy, the average transverse momentum $\langle p_\mathrm{T} \rangle$, and the particle yield ratios are shown in Fig. 17. For dN/dy, the Z2 tune gives the best overall description. The $\langle p_\mathrm{T} \rangle$ of pions is reproduced by tune 4C, that of the kaons is best described by Z2, and that of the protons is not reproduced by any of the tunes, with D6T closest to the data. The ratios of the yields for oppositely charged mesons are independent of √s and have values of about 0.98 for the pions; the kaon ratios are compatible with those of the pions and also with unity. The slight deviation from unity observed for the pions probably reflects the initial charge asymmetry of pp collisions. The $\bar{p}/p$ yield ratio appears to increase with √s, though it is difficult to draw definite conclusions because of the large systematic uncertainties. The K/π and p/π ratios are flat as a function of √s, and have values of 0.13 and 0.06-0.07, respectively. The exponent n (not shown in the figures) decreases with increasing √s for pions and protons. For the kaons the systematic uncertainties are too large to draw a definite conclusion. The inverse slope T (also not shown) is flat as a function of √s for the pions but exhibits a slight increase for the protons. The universality of the relation of $\langle p_\mathrm{T} \rangle$ and the particle-yield ratios with the track multiplicity, and its independence of the collision energy, are demonstrated in Fig. 18.
The transverse-momentum distributions of identified charged hadrons at central rapidity are compared to those of the ALICE Collaboration [24] at √ s = 0.9 TeV in Fig. 19 (|y| < 1 for CMS, |y| < 0.5 for ALICE). While the rapidity coverage is different, the measurements can be compared because the p T spectra are largely independent of y for |y| < 1. The results from the two experiments agree well for the mesons, and exhibit some small discrepancies for the protons.
The center-of-mass energy dependence of dN/dy in the central rapidity region and the average transverse momentum for pions, kaons, and protons are shown in Fig. 20. Measurements from UA2 [25], E735 [26], PHENIX [27], STAR [28], ALICE [24], and CMS are shown. The observed √ s evolution of both quantities is consistent with a power-law increase.
The comparison of the central rapidity $\bar{p}/p$ ratio as a function of the rapidity interval ∆y is displayed in Fig. 21. The interval is defined as $\Delta y = y_\mathrm{beam} - y_\mathrm{baryon}$, where $y_\mathrm{beam}$ ($y_\mathrm{baryon}$) is the rapidity of the incoming beam (outgoing baryon). Measurements from ISR energies [29,30], NA49 [31], BRAHMS [32], PHENIX [33], PHOBOS [34], and STAR [35] are shown together with LHC (ALICE [36] and CMS) data. The curve represents the expected ∆y dependence in a Regge-inspired model, where baryon pair production is governed by Pomeron exchange, and baryon transport by string-junction exchange [37]. The functional form used is $(\bar{p}/p)^{-1} = 1 + C \exp[(\alpha_J - \alpha_P)\,\Delta y]$ with C = 10, $\alpha_P$ = 1.2, and $\alpha_J$ = 0.5, as used in the ALICE paper. While the low ∆y region is not properly described, the agreement is good at higher ∆y. The CMS data are consistent with previous measurements, as well as with the proposed function. New data from the LHCb Collaboration [38] in the forward region could further constrain the parameters of the model.
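The functional form quoted above is easy to evaluate. In the sketch below the parameter values are those given in the text, while the beam rapidity is computed from standard kinematics, y_beam = arcosh(√s / 2m_p), and the baryon is taken at y = 0 so that ∆y = y_beam; the function names are illustrative.

```python
import math

M_PROTON = 0.938272  # GeV, proton mass


def beam_rapidity(sqrt_s):
    """Rapidity of a beam proton in the centre-of-mass frame (sqrt_s in GeV)."""
    return math.acosh(sqrt_s / (2.0 * M_PROTON))


def pbar_over_p(delta_y, c=10.0, alpha_p=1.2, alpha_j=0.5):
    """Invert (pbar/p)^-1 = 1 + C exp[(alpha_J - alpha_P) * delta_y]."""
    return 1.0 / (1.0 + c * math.exp((alpha_j - alpha_p) * delta_y))


# Midrapidity ratio at the three collision energies (sqrt_s in GeV).
ratios = {s: pbar_over_p(beam_rapidity(s)) for s in (900.0, 2760.0, 7000.0)}
```

Because $\alpha_J - \alpha_P$ is negative, the exponential term dies away with increasing ∆y and the ratio approaches one from below, which corresponds to the rise with √s described in the text.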

Conclusions
Measurements of identified charged hadrons produced in pp collisions at √s = 0.9, 2.76, and 7 TeV have been presented, based on data collected in events with simultaneous hadronic activity at pseudorapidities −5 < η < −3 and 3 < η < 5. Charged pions, kaons, and protons were identified from the energy deposited in the silicon tracker (pixels and strips) and other track information (number of hits and goodness of track-fit). CMS data extend the center-of-mass energy range of previous measurements and are consistent with them at lower energies. Moreover, in the present analysis the data have been studied differentially, as a function of the particle multiplicity in the event and of the collision energy. The results can be used to further constrain models of hadron production and contribute to the understanding of basic non-perturbative dynamics in hadronic collisions.
The measured track multiplicity dependence of the rapidity density and of the average transverse momentum indicates that particle production at LHC energies is strongly correlated with event particle multiplicity rather than with the center-of-mass energy of the collision. This correlation may reflect the fact that at TeV energies the characteristics of particle production in hadronic collisions are constrained by the amount of initial parton energy available in a given collision.

Figure 14: Normalized transverse momentum distributions of charged protons in a few representative multiplicity classes, in the range |y| < 1, at √s = 0.9, 2.76, and 7 TeV, fitted to the Tsallis-Pareto parametrization (solid lines). For better visibility, the result for any given $N_\mathrm{tracks}$ bin is shifted by 0.5 units with respect to the adjacent bins. Error bars indicate the uncorrelated statistical uncertainties, while bands show the uncorrelated systematic uncertainties.

Figure 17: Center-of-mass energy dependence of dN/dy, average transverse momentum $\langle p_\mathrm{T} \rangle$, and ratios of particle yields. Error bars indicate the uncorrelated combined uncertainties, while boxes show the uncorrelated systematic uncertainties. For dN/dy ($\langle p_\mathrm{T} \rangle$) the fully correlated normalisation uncertainty (not shown) is 3.0% (1.0%). Curves indicate predictions from PYTHIA6 (D6T and Z2 tunes) and the 4C tune of PYTHIA8.

Figure 18: Left: average transverse momentum of identified charged hadrons (pions, kaons, protons) in the range |y| < 1, for all particle types, as a function of the true track multiplicity for |η| < 2.4, for all energies. Right: ratios of particle yields as a function of particle multiplicity for |η| < 2.4, for all energies. Error bars indicate the uncorrelated combined uncertainties, while boxes show the uncorrelated systematic uncertainties. For $\langle p_\mathrm{T} \rangle$ the fully correlated normalisation uncertainty (not shown) is 1.0%. Lines are drawn to guide the eye (solid: 0.9 TeV, dotted: 2.76 TeV, dash-dotted: 7 TeV).

Figure 19: Comparison of transverse momentum distributions of identified charged hadrons (pions, kaons, protons) at central rapidity (|y| < 1 for CMS, |y| < 0.5 for ALICE [24]), for positive hadrons (left) and negative hadrons (right), at √s = 0.9 TeV. To improve clarity, the kaon and proton points are scaled by the quoted factors. Error bars indicate the uncorrelated statistical uncertainties, while bands show the uncorrelated systematic uncertainties. In the CMS case the fully correlated normalisation uncertainty (not shown) is 3.0%. The ALICE results were corrected to inelastic pp collisions and therefore the CMS points are scaled by an empirical factor of 0.78 so as to correct for the different particle level selection used by ALICE.

Figure 20: Comparison of the center-of-mass energy dependence of the central rapidity density dN/dy (left) and the average transverse momentum $\langle p_\mathrm{T} \rangle$ (right). Low-energy data (UA2 [25], E735 [26], PHENIX [27], STAR [28]) are shown with LHC data (ALICE [24] and CMS). For the CMS points, the error bars indicate the uncorrelated combined uncertainties, while boxes show the uncorrelated systematic uncertainties. The fully correlated normalisation uncertainty (not shown) is around 3.0% (left plot) and 1.0% (right plot).

Figure 21: Comparison of the central rapidity $\bar{p}/p$ yield ratio as a function of the rapidity difference ∆y, plotted together with the prediction of the Regge-inspired model [37]. Measurements at low energies (ISR, [29,30]), NA49 [31], BRAHMS [32], PHENIX [33], PHOBOS [34], and STAR [35]