Shape, transverse size, and charged-hadron multiplicity of jets in pp collisions at sqrt(s) = 7 TeV

Measurements of jet characteristics from inclusive jet production in proton-proton collisions at a centre-of-mass energy of 7 TeV are presented. The data sample was collected with the CMS detector at the LHC during 2010 and corresponds to an integrated luminosity of 36 inverse picobarns. The mean charged hadron multiplicity, the differential and integral jet shape distributions, and two independent moments of the shape distributions are measured as functions of the jet transverse momentum for jets reconstructed with the anti-kT algorithm. The measured observables are corrected to the particle level and compared with predictions from various QCD Monte Carlo generators.


Introduction
The jet transverse momentum profile (shape) [1,2], transverse size, and charged-hadron multiplicity in jets are sensitive to multiple parton emissions from the primary outgoing parton and provide a powerful test of the parton showering approximation of quantum chromodynamics (QCD), the theory of strong interactions. Many methods have recently been proposed to search for heavy particles by studying the substructure of jets formed by their decay products, as these particles can be highly boosted and thus their decay products are well collimated [3-6]. Jets arising from the fragmentation of a single parton, hereafter referred to as QCD jets, contribute to backgrounds in searches for such boosted-object jets. A good understanding of the QCD jet structure is therefore important for these searches to be successful. The structures of gluon-initiated and quark-initiated jets differ because of their different fragmentation properties. QCD predicts gluon-initiated jets to have a higher average particle multiplicity and a broader distribution of particle transverse momentum with respect to the jet direction than quark-initiated jets. Jet structure measurements test these predictions and can be used to develop techniques to discriminate between gluon and quark jets. Such discrimination techniques can enhance both standard model measurements and the ability to search for physics beyond the standard model.
Historically, the jet shape has been used to test perturbative QCD (pQCD) calculations up to the third power in the coupling constant αs [7,8]. These leading-order calculations, with only one additional parton in a jet, showed reasonable agreement with the observed jet shapes. While confirming the validity of pQCD calculations, jet shape studies also indicated that jet clustering, underlying event contributions, and hadronization effects must be considered. Currently, these effects are modelled within the framework of Monte Carlo (MC) event generators, which use QCD parton shower models, in conjunction with hadronization and underlying event models, to generate final-state particles. These MC event generators are used extensively to model the signal and background events for a variety of standard model studies and searches for new physics at hadron colliders. Jet shapes are used to tune phenomenological parameters in the event generators. Jet shapes have been measured previously in proton-antiproton collisions at the Tevatron [7-10] and ep collisions at HERA [11-15].
We present measurements of the charged-hadron multiplicity, shape, and transverse size for jets with transverse momentum up to 1 TeV and rapidity up to 3 using 36 pb⁻¹ of pp collisions at a centre-of-mass energy of 7 TeV collected by the CMS experiment at the Large Hadron Collider (LHC). A similar measurement has been performed by the ATLAS Collaboration [16].
This paper is organized as follows. Section 2 contains a brief description of the CMS detector. In section 3 we present the event selection and reconstruction. The jet observables are defined in section 4 and the results are given in section 5. The conclusions are summarized in section 6.

The CMS detector
CMS uses a right-handed coordinate system in which the z axis points in the anticlockwise beam direction, the x axis points towards the centre of the LHC ring, and the y axis points up, perpendicular to the plane of the LHC ring. The azimuthal angle φ is measured in radians with respect to the x axis, and the polar angle θ is measured with respect to the z axis. A particle with energy E and momentum p is characterized by transverse momentum pT = |p| sin θ, rapidity y = (1/2) ln[(E + pz)/(E − pz)], and pseudorapidity η = −ln[tan(θ/2)].
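The three kinematic definitions above can be illustrated with a minimal sketch. It assumes natural units (c = 1) and a particle given by its energy and Cartesian momentum components; the function name `kinematics` and the example values are hypothetical and not part of the analysis software.

```python
import math

def kinematics(E, px, py, pz):
    """Return (pT, rapidity y, pseudorapidity eta) for one particle (c = 1)."""
    p = math.sqrt(px**2 + py**2 + pz**2)
    theta = math.acos(pz / p)                   # polar angle w.r.t. the z axis
    pt = p * math.sin(theta)                    # pT = |p| sin(theta)
    y = 0.5 * math.log((E + pz) / (E - pz))     # rapidity
    eta = -math.log(math.tan(theta / 2.0))      # pseudorapidity
    return pt, y, eta

# Hypothetical massless particle (E = |p|), for which y and eta coincide
pt, y, eta = kinematics(E=5.0, px=3.0, py=0.0, pz=4.0)
```

For a massless particle the rapidity and pseudorapidity are equal, which gives a quick consistency check of the two formulas.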
The CMS superconducting solenoid, 12.5 m long with an internal diameter of 6 m, provides a uniform magnetic field of 3.8 T. The inner tracking system is composed of a pixel detector with three barrel layers at radii between 4.4 and 10.2 cm and a silicon strip tracker with 10 barrel detection layers extending outwards to a radius of 1.1 m. This system is complemented by two endcaps, extending the acceptance up to |η| = 2.5. The momentum resolution for reconstructed tracks in the central region is about 1% at pT = 100 GeV/c.
The calorimeters inside the magnet coil consist of a lead tungstate crystal electromagnetic calorimeter (ECAL) and a brass-scintillator hadron calorimeter (HCAL) with coverage up to |η| = 3. The quartz/steel forward hadron calorimeters extend the calorimetry coverage up to |η| = 5. Muons are measured in gas-ionization detectors embedded in the steel return yoke of the magnet. The calorimeter cells are grouped in projective towers of granularity ∆η × ∆φ = 0.087 × 0.087 for the central rapidities considered in this paper. The ECAL was initially calibrated using test beam electrons and then, in situ, with photons from π0 and η meson decays and electrons from Z boson decays [17]. The energy scale in data agrees with that in the simulation to better than 1% in the barrel region (|η| < 1.5) and better than 3% in the endcap region (1.5 < |η| < 3.0) [18]. Hadron calorimeter cells in the |η| < 3 region are calibrated primarily with test-beam data and radioactive sources [19,20]. A detailed description of the CMS detector may be found in [21].

Event selection and reconstruction
The data were recorded using a set of inclusive single-jet high-level triggers [22] requiring at least one jet in the event to have an online jet pT of at least 15, 30, 50, 70, 100, or 140 GeV/c. These jets are reconstructed only from energy deposits in the calorimeters using an iterative cone algorithm. In addition, a minimum-bias trigger, defined as a signal from at least one of two beam scintillator counters in coincidence with a signal from one of two beam pickup timing devices, was used to collect low-pT jets. These datasets are combined to measure the jet characteristics in bins spanning the range 20 GeV/c < pT < 1 TeV/c, so that the trigger contributing to each bin is fully efficient. Only a fraction of events satisfying the lower threshold jet triggers were recorded because of limited data acquisition system bandwidth. Thus the effective integrated luminosity for jets with pT < 140 GeV/c is less than 36 pb⁻¹.
Jets are reconstructed offline using the anti-kT jet clustering algorithm [23-25]. This algorithm is similar to the well-known kT algorithm, except that it uses 1/pT instead of pT as the weighting factor for the scaled distance. The algorithm is collinear- and infrared-safe, and it produces circular jets in y-φ space except when jets overlap. Two different types of inputs are used with this algorithm. In the first method, individually calibrated particle candidates are used as inputs to the jet clustering algorithm. These particle candidates (photons, electrons, muons, charged hadrons, and neutral hadrons) are reconstructed using the CMS particle flow (PF) algorithm [26]. This algorithm combines the information from all the subdetectors, including the silicon tracking system, the electromagnetic calorimeter, the hadron calorimeter, and the muon system, in order to reconstruct and identify individual particles in an event. The charged-particle information is primarily derived from the tracking system, and the photons are reconstructed using information from the electromagnetic calorimeter. The neutral hadrons, e.g. neutrons and K0_L mesons, carry on average about 15% of the jet momentum, and are reconstructed using information from the hadron calorimeter. Jets reconstructed from these inputs are referred to as PF jets.
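The role of the 1/pT weighting can be illustrated with a naive clustering sketch. It assumes the standard anti-kT distance measures, d_ij = min(1/pT,i², 1/pT,j²) ΔR_ij²/D² between two objects and d_iB = 1/pT,i² between an object and the beam, which are not spelled out in the text above. As a further simplification (an assumption of this sketch, not the recombination used in the analysis), merged objects take the pT-weighted average of (y, φ) instead of a four-vector sum. The function names and example particles are hypothetical.

```python
import math

def delta_phi(a, b):
    """Azimuthal difference a - b wrapped into (-pi, pi]."""
    d = (a - b) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d

def anti_kt(particles, D=0.5):
    """Naive anti-kT clustering of (pT, y, phi) tuples; returns jets, hardest first."""
    objs = [tuple(p) for p in particles]
    jets = []
    while objs:
        # Beam distance d_iB = 1/pT^2: the hardest object has the smallest value
        best = min(range(len(objs)), key=lambda i: objs[i][0] ** -2)
        best_d, pair = objs[best][0] ** -2, None
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                pti, yi, phii = objs[i]
                ptj, yj, phij = objs[j]
                dr2 = (yi - yj) ** 2 + delta_phi(phii, phij) ** 2
                dij = min(pti ** -2, ptj ** -2) * dr2 / D ** 2
                if dij < best_d:
                    best_d, pair = dij, (i, j)
        if pair is None:
            jets.append(objs.pop(best))       # smallest distance is to the beam
        else:
            i, j = pair
            pti, yi, phii = objs[i]
            ptj, yj, phij = objs[j]
            pt = pti + ptj
            y = (pti * yi + ptj * yj) / pt    # simplified pT-weighted merge
            phi = phii + ptj * delta_phi(phij, phii) / pt
            objs.pop(j)                       # j > i, so remove j first
            objs.pop(i)
            objs.append((pt, y, phi))
    return sorted(jets, reverse=True)

# Hypothetical event: two hard particles, each with a soft neighbour
jets = anti_kt([(100.0, 0.0, 0.0), (10.0, 0.2, 0.1),
                (50.0, 2.0, 3.0), (5.0, 2.1, 2.9)], D=0.5)
```

Because the distance is weighted by the inverse pT of the harder object, soft particles are absorbed into nearby hard ones first, which is why the resulting jets are approximately circular around the hardest constituents.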
In the second method, called the jet-plus-track (JPT) algorithm [27], the energy deposits in the electromagnetic and hadron calorimeter cells, which are combined into calorimeter towers, are used as inputs to the clustering algorithm to form calorimeter jets. Tracks originating from the interaction vertex [28] are associated with these calorimeter jets based on the separation in η-φ space between the jet direction and track direction at the interaction vertex. In the case of partially overlapping jets, tracks are assigned to the jet with the minimum pT-weighted distance between each track and the jet axis. These tracks are categorized as muon, charged pion, and electron candidates, and the jet momentum is corrected by substituting their expected particle energy deposition in the calorimeter with their momentum. These track-corrected jets are referred to as JPT jets.

JHEP06(2012)160
The pT of both types of jets is corrected to the particle-level jet pT [29]. In both cases, the ratio of the reconstructed jet pT to the particle jet pT is close to unity, and only small additional corrections to the jet energy scale, of the order of 5-10%, are needed. These corrections are derived from GEANT4-based [30] CMS simulations, based on the pT ratio of the particle jet formed from all stable (cτ > 1 cm) particles to the reconstructed jet, and also from in situ measurements using dijet and photon + jet events [29]. The uncertainty on the absolute jet energy scale is studied using both data and MC events and is found to be less than 5% for all values of jet pT and η. In order to remove jets coming from instrumental noise, jet quality requirements are applied [31].
The JPT jets are reconstructed with the anti-kT jet clustering algorithm and distance parameter D = 0.5 [23]. The tracks associated with JPT jets are used to measure the charged-hadron multiplicity and the transverse size of the jets in the jet pT range 50 GeV/c < pT < 1 TeV/c. The PF jets reconstructed with a distance parameter D = 0.7 are used to measure the jet shapes in the jet pT range 20 GeV/c < pT < 1 TeV/c. Owing to the larger jet size, jet shape measurements evaluate a larger fraction of the momentum from the originating parton and are relatively more sensitive to momentum deposited by multiple-parton interactions (MPIs), thus providing important information to tune both the parton showering and MPI models in the event generators. To minimize the contribution from additional pp interactions in a triggered event (pileup), events with only one reconstructed primary vertex are selected for jet shape measurements, as the measurements use both charged and neutral particles. For charged-hadron multiplicity and jet transverse size studies, the events with multiple vertices are also considered, as these studies use only those tracks that are associated with the primary vertex. The primary vertex is defined as the vertex with the highest sum of transverse momenta of all reconstructed tracks pointing to it.
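The vertex requirements described above can be sketched as follows. The data layout (a dictionary mapping a vertex label to the pT values of its tracks), the function names, and the example numbers are assumptions made purely for illustration.

```python
def primary_vertex(vertices):
    """Return the label of the vertex with the highest summed track pT.

    vertices: dict mapping a vertex label to a list of track pT values (GeV/c).
    """
    return max(vertices, key=lambda v: sum(vertices[v]))

def use_for_jet_shapes(vertices):
    """Jet shape measurements keep only events with exactly one vertex."""
    return len(vertices) == 1

# Hypothetical event with two reconstructed vertices
event = {"v0": [1.2, 0.8, 3.5], "v1": [25.0, 14.0, 2.1, 0.9]}
pv = primary_vertex(event)
```

In this example the event would be rejected for the jet shape measurement but kept for the track-based observables, using only the tracks attached to `pv`.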

Jet observables
We have studied several observables to characterize the jet structure. These observables are complementary and they can provide a more comprehensive picture of the composition of jets. In order to compare the resulting measurements with theoretical predictions, all the observables are corrected back to the particle level by taking into account detector effects using MC simulations.

Jet shapes
The differential jet shape ρ(r) is defined as the average fraction of the transverse momentum contained inside an annulus of inner radius r_a = r − δr/2 and outer radius r_b = r + δr/2, normalized to the annulus width δr, as illustrated in figure 1:

\[ \rho(r) = \frac{1}{\delta r}\, \frac{\sum_{r_a < r_i < r_b} p_{T,i}}{\sum_{r_i < R} p_{T,i}} . \]

The integrated jet shape Ψ(r) is defined as the average fraction of the transverse momentum of particles inside a cone of radius r around the jet axis:

\[ \Psi(r) = \frac{\sum_{r_i < r} p_{T,i}}{\sum_{r_i < R} p_{T,i}} . \]

The sums run over the reconstructed particles, with the distance r_i = √[(y_i − y_jet)² + (φ_i − φ_jet)²] relative to the jet axis described by y_jet and φ_jet, and R = 0.7.

Figure 1. Pictorial definition of the differential (top) and integrated (bottom) jet shape quantities. Analytical definitions of these quantities are given in the text.
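As an illustration of these definitions, the following minimal sketch evaluates ρ(r) and Ψ(r) for a single jet; the measured quantities are averages of such per-jet values over all jets in a pT and rapidity bin. The annulus width δr = 0.1, the 1/δr normalization of ρ(r), the function names, and the example constituents are assumptions of this sketch.

```python
import math

def _dphi(a, b):
    """Azimuthal difference a - b wrapped into (-pi, pi]."""
    d = (a - b) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d

def jet_shapes(constituents, y_jet, phi_jet, R=0.7, dr=0.1):
    """Return (rho, psi) for one jet in radial steps of dr.

    constituents: iterable of (pT, y, phi).
    rho[k] covers the annulus k*dr <= r_i < (k+1)*dr,
    psi[k] covers the cone r_i < (k+1)*dr.
    """
    n_bins = round(R / dr)
    annulus_pt = [0.0] * n_bins
    for pt, y, phi in constituents:
        r = math.hypot(y - y_jet, _dphi(phi, phi_jet))
        k = int(r / dr)
        if k < n_bins:                       # keep only particles with r_i < R
            annulus_pt[k] += pt
    total = sum(annulus_pt)
    rho = [a / total / dr for a in annulus_pt]
    psi, running = [], 0.0
    for a in annulus_pt:
        running += a
        psi.append(running / total)
    return rho, psi

# Hypothetical jet along (y, phi) = (0, 0) with three constituents
rho, psi = jet_shapes([(60.0, 0.02, 0.01), (30.0, 0.15, 0.0), (10.0, 0.0, 0.45)],
                      y_jet=0.0, phi_jet=0.0)
outside_03 = 1.0 - psi[2]                    # fraction of jet pT outside r = 0.3
```

With this normalization the values of ρ(r) multiplied by δr sum to one, and Ψ(R) = 1 by construction.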
The observed detector-level jet shapes and true particle-level jet shapes differ because of jet energy resolution effects, detector response to individual particles, smearing of the jet directions, smearing of the individual particle directions, and inefficiency of particle reconstruction, especially at low pT. The data are unfolded to the particle level using bin-by-bin corrections derived from the CMS simulation based on the pythia 6.4 (pythia6) MC generator [32] tuned to the CMS data (tune Z2). The Z2 tune is identical to the Z1 tune described in [33], except that Z2 uses the CTEQ6L [34] parton distribution function (PDF), while Z1 uses the CTEQ5L [35] PDF. The correction factors are determined as functions of r for each jet pT and rapidity bin and vary between 0 and 20%. Since the MC model affects the momentum and angular distributions and flavour composition of particles in a jet, and therefore the simulated detector response to the jet, the unfolding factors depend on the MC model. In order to estimate the systematic uncertainty due to the fragmentation model, the corrections are also derived using pythia8 [36], pythia6 tune D6T [32], and herwig++ [37]. The largest difference of these three sets of correction factors from those of pythia6 tune Z2 is assigned as the uncertainty on the correction. This uncertainty is typically 2-3% in the region where the bulk of the jet energy is deposited and increases to as high as 15% at large radii where the momentum of particles is very small. For very high pT jets, where the fraction of jet momentum deposited at large radii is extremely small, the uncertainty is less than 1% at r = 0.1 and reaches 25% at high radii.
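The bin-by-bin procedure and the model-dependence estimate can be sketched as follows, assuming the correction factor in each bin is the ratio of the particle-level to the detector-level prediction of a given generator. All function names and numerical values are hypothetical placeholders, not results from the analysis.

```python
def bin_by_bin_factors(mc_particle, mc_detector):
    """Per-bin correction factors: particle-level over detector-level MC."""
    return [p / d for p, d in zip(mc_particle, mc_detector)]

def unfold(data, nominal, alternatives):
    """Apply the nominal factors to data and estimate the model uncertainty.

    The relative uncertainty in each bin is the largest deviation of any
    alternative set of factors from the nominal set.
    """
    corrected = [x * c for x, c in zip(data, nominal)]
    rel_unc = [max(abs(alt[k] - nominal[k]) for alt in alternatives) / nominal[k]
               for k in range(len(nominal))]
    return corrected, rel_unc

# Hypothetical two-bin example: nominal factors from one generator,
# alternative factors from two other generators
nominal = bin_by_bin_factors([6.0, 2.0], [5.0, 2.5])      # [1.2, 0.8]
alternatives = [[1.23, 0.84], [1.18, 0.78]]
corrected, rel_unc = unfold([5.5, 2.4], nominal, alternatives)
```

A bin-by-bin correction of this kind is only reliable when bin-to-bin migrations are small or well modelled, which is why the spread between generators is taken as a systematic uncertainty.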

The impact of the calibration uncertainties for particles used to measure the jet shapes is studied separately for charged hadrons, neutral hadrons, and photons. The calibration of each type of particle is varied within its measurement uncertainty, depending on its pT and η. The resulting change in the jet shape distributions is negligible, as expected, since the effect largely cancels in the jet shapes, which are defined as ratios of pT sums.
The jet energy scale uncertainty has a larger impact on the jet shape measurements because it affects the migration of jets between different jet pT bins. The jet energy scale uncertainty is estimated to be less than 5% for all jet pT and η bins [29] and results in a maximum uncertainty of 2-3% in both the differential and integrated jet shape distributions.

The charged-hadron multiplicity and the transverse size of jets
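In addition to the study of ρ(r) and Ψ(r), we have measured characteristics of the charged components of jets, namely, the mean charged-hadron multiplicity per jet, ⟨Nch⟩, and the second moments of the transverse jet size, defined by

\[ \delta\eta^2 = \frac{\sum_{i \in \mathrm{jet}} (\eta_i - \eta_C)^2\, p_{T,i}}{\sum_{i \in \mathrm{jet}} p_{T,i}}, \qquad \delta\phi^2 = \frac{\sum_{i \in \mathrm{jet}} (\phi_i - \phi_C)^2\, p_{T,i}}{\sum_{i \in \mathrm{jet}} p_{T,i}}, \]

where

\[ \eta_C = \frac{\sum_{i \in \mathrm{jet}} \eta_i\, p_{T,i}}{\sum_{i \in \mathrm{jet}} p_{T,i}}, \qquad \phi_C = \frac{\sum_{i \in \mathrm{jet}} \phi_i\, p_{T,i}}{\sum_{i \in \mathrm{jet}} p_{T,i}}, \]

and pT,i, η_i, and φ_i are the transverse momentum, pseudorapidity, and azimuthal direction of a particle i in the jet. These moments are combined to obtain the second moment of the jet transverse width:

\[ \langle \delta R^2 \rangle = \langle \delta\eta^2 \rangle + \langle \delta\phi^2 \rangle . \]

We measure ⟨Nch⟩ and ⟨δR²⟩ using tracks with pT > 0.5 GeV/c associated with JPT jets. The tracks identified as electrons or muons are explicitly removed. As the tracks are required to be attached to the primary vertex, the tracks resulting from photon conversions are not used either.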
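The track-based observables can be illustrated for a single jet with the sketch below. It assumes that the second moments are pT-weighted and taken about the pT-weighted centroid of the selected tracks, and that the azimuthal angles are given relative to the jet axis so that no wrap-around at ±π needs handling. The function name and the example tracks are hypothetical.

```python
def track_moments(tracks, pt_min=0.5):
    """Return (n_ch, d_eta2, d_phi2, d_r2) for one jet.

    tracks: iterable of (pT, eta, phi) with pT in GeV/c and phi measured
    relative to the jet axis (assumption of this sketch).
    """
    selected = [t for t in tracks if t[0] > pt_min]      # pT > 0.5 GeV/c
    sum_pt = sum(pt for pt, _, _ in selected)
    eta_c = sum(pt * eta for pt, eta, _ in selected) / sum_pt
    phi_c = sum(pt * phi for pt, _, phi in selected) / sum_pt
    d_eta2 = sum(pt * (eta - eta_c) ** 2 for pt, eta, _ in selected) / sum_pt
    d_phi2 = sum(pt * (phi - phi_c) ** 2 for pt, _, phi in selected) / sum_pt
    return len(selected), d_eta2, d_phi2, d_eta2 + d_phi2

# Hypothetical jet: two tracks above threshold and one below it
n_ch, d_eta2, d_phi2, d_r2 = track_moments(
    [(10.0, 0.1, 0.0), (10.0, -0.1, 0.0), (0.3, 0.4, 0.4)])
```

The reported ⟨Nch⟩ and ⟨δR²⟩ would then be averages of these per-jet values over all jets in a given pT and rapidity bin, before the efficiency and resolution corrections are applied.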
The particle-level ⟨Nch⟩ and ⟨δR²⟩ values, defined to correspond to all stable charged hadrons with pT > 0.5 GeV/c, are obtained by separately correcting the measured observables for the tracking inefficiency and the jet energy resolution. The corrections to the track detection efficiency are applied in two steps. First, corrections for the tracker acceptance and for losses due to interactions in the detector material are determined for isolated charged pions as functions of pT and η using CMS simulation and applied as a weight assigned to each track [38]. Next, residual corrections for both the tracking inefficiency and misidentified tracks inside the dense high-pT jet environment are calculated for ⟨Nch⟩ and ⟨δR²⟩ as functions of jet pT in two jet rapidity ranges: |y| < 1 and 1 < |y| < 2. These corrections are derived from MC by comparing the detector-level and particle-level ⟨Nch⟩ and ⟨δR²⟩ for each jet pT bin. The correction factors for ⟨Nch⟩ increase from about 2% for jets with pT = 40 GeV/c to 5% for jets with pT = 200 GeV/c. The corrections increase to 20% for a jet with pT of 800 GeV/c. For ⟨δR²⟩, the corrections increase from 3 to 8% as the jet pT goes from 40 GeV/c to 200 GeV/c, and rise to 20% for jets with pT of 800 GeV/c. The uncertainty on ⟨Nch⟩ due to these residual corrections is 1%, while the uncertainty on ⟨δR²⟩ is 2-5%.
The jet energy resolution corrections are extracted bin by bin from the CMS simulation based on the pythia6 tune Z2 MC samples. The uncertainty on ⟨Nch⟩ (⟨δR²⟩) due to jet energy resolution is 1-2% (2-5%). A cross-check of the correction procedure is performed using the Tikhonov regularization method with a quasi-optimal solution [39,40]. The results obtained with these two methods are consistent to within 2%.

Results
In this section, the data are compared with MC simulations using the pythia6 [32], pythia8 [36], and herwig++ [37] event generators. Three different tunes of the pythia 6.4 generator are considered: tune D6T, tune Z2, and the Perugia2010 tune [41]. Tune D6T uses virtuality-ordered parton showers, while tune Z2 and the Perugia2010 tune use pT-ordered parton showers. Generator input parameters controlling the underlying event, radiation, and hadronization are tuned in order to provide a better description of collider data. Tune D6T was developed using previous hadron and lepton collider data, while tune Z2 also uses CMS soft-pT data [42]. The Perugia2010 tune [41] was tuned using LEP and Tevatron data, notably the CDF jet shape results [9]. Tunes D6T and Z2 are simulated with pythia 6.4.22, and the Perugia2010 tune is simulated with pythia 6.4.24. The CTEQ6L1 [34] parton distribution function (PDF) of the proton is used with tunes D6T and Z2, while the CTEQ5L [35] PDF is used with the Perugia2010 tune. The pythia 8.145 generator, tune 2C, uses an improved diffraction model, and the herwig++ 2.4.2 generator uses angular-ordered parton showers and a cluster-based fragmentation model. For herwig++ 2.4.2, the default underlying event tune is used together with the MRST2001 [43] PDF.
The differential jet shape measurements for central jets (|y| < 1) in representative bins of jet pT are presented in figures 2 and 3, along with their statistical and systematic uncertainties, and are compared with predictions from different MC generators and tunes.
Larger values of ρ(r) denote a larger transverse momentum fraction in a particular annulus. At high jet pT, the data are peaked at low radius r, indicating that jets are highly collimated with most of their pT close to the jet axis, while they widen at lower jet pT. For the lowest jet pT bins, the pT distribution within the jet flattens considerably. For 20 GeV/c jets in the central rapidity region, approximately 15% of the jet pT is within a radius of r = 0.1 around the jet axis, whereas at 600 GeV/c this fraction increases to about 90%. This behaviour is illustrated in figures 4 and 5, where the amount of jet energy deposited outside a cone of r = 0.3, 1 − Ψ(r = 0.3), is shown as a function of jet pT for central jets and also in six different jet rapidity regions up to |y| = 3. These figures also show comparisons of the data with the pythia6, pythia8, and herwig++ generators.

Figure 3. Differential jet shape as a function of the distance from the jet axis for central jets (|y| < 1) with jet transverse momentum ranging from 140 to 1000 GeV/c for representative jet pT bins. The data are compared to particle-level herwig++, pythia8, and pythia6 predictions with various tunes. Statistical uncertainties are shown as error bars on the data points and the shaded region represents the total systematic uncertainty of the measurement. Data points are placed at the bin centre; the horizontal bars show the size of the bin. The ratio of each MC prediction to the data is also shown in the lower part of each plot.

Figure 5. Measured integrated jet shape, 1 − Ψ(r = 0.3), as a function of jet pT in different jet rapidity regions, compared to herwig++, pythia8, and pythia6 predictions with various tunes. Statistical uncertainties are shown as error bars on the data points and the shaded region represents the total systematic uncertainty of the measurement. Data points are placed at the bin centre; the horizontal bars show the size of the bin. The ratio of each MC prediction to the data is also shown in the lower part of each plot.

As depicted in figures 4 and 5, at low jet pT the pythia8 generator predicts somewhat narrower jets than those found in data, while pythia6 tune D6T predicts wider jets. Tune Z2 provides a good description of data at low jet pT. At jet pT ≳ 40 GeV/c the Perugia2010 and D6T tunes describe the data better than tune Z2. This trend holds for all rapidity ranges. herwig++ predicts wider jets than observed in data over most of the jet pT region, except in the forward rapidity regions where the agreement is better. The measurement is presented as a function of jet rapidity for different pT regions in figure 6, which shows that jets become somewhat narrower with increasing |y| in both data and simulation.
The measured ⟨Nch⟩ and ⟨δR²⟩ as functions of jet pT are presented in figures 7 and 8 for two different rapidity intervals, |y| < 1 and 1 < |y| < 2, along with their statistical and systematic uncertainties. The total systematic uncertainty includes the uncertainty on the jet energy scale, jet energy resolution, tracking inefficiency, jet unsmearing procedure, and pileup contribution. The ratios of the MC predictions to data, corrected to the particle level, of these two observables are shown at the bottom of the figures. The measured values of ⟨Nch⟩ are systematically lower than the values predicted by both pythia6 and herwig++. In the case of ⟨δR²⟩ the predicted values are in agreement with the measured values, with the exception of some disagreement observed with pythia6 tune Z2 at |y| < 1.
The ratio of the second moments in the η and φ directions is shown as a function of jet pT for |y| < 1 in figure 9. Systematic uncertainties largely cancel in this ratio. The measured jet width in the η direction is slightly larger than in the φ direction. These results agree with pythia6 predictions, while herwig++ predicts a larger difference of the jet width in the η and φ directions.
A comparison of the ⟨Nch⟩ and ⟨δR²⟩ values obtained from the data as functions of jet pT in two ranges of jet rapidity is shown in figure 10. The data are in good agreement with the hypothesis that the fraction of quark-induced jets increases with increasing jet pT and jet rapidity.
Tables containing the measured jet shape, charged-hadron multiplicity, and transverse size data are available as a supplement to the online version of this article.

Summary
We have presented measurements of jet shapes, mean charged-hadron multiplicity, and transverse width for jets produced in proton-proton collisions at a centre-of-mass energy of 7 TeV, collected by the CMS detector at the LHC. Jets become narrower with increasing jet pT, and they also show a mild rapidity dependence in which jets become somewhat narrower with increasing |y|, in the manner predicted by various QCD Monte Carlo models. At low jet pT, the pythia6 Z2 model tuned to the initial CMS soft-pT data [42] provides a fair description of the measured jet shapes. At jet pT ≳ 40 GeV/c, tune Z2 predicts slightly narrower jets than those observed in data, whereas the D6T and Perugia2010 tunes describe the data better. The measurements may be used to further improve these Monte Carlo models.
The mean charged-hadron multiplicity and the second moment of the jet width are compared with predictions from the pythia6 (tunes D6T and Z2) and herwig++ generators. All these models predict slightly higher mean charged-hadron multiplicities than found in the data; however, good agreement is observed between the models and the measured second moment of the jet transverse width. The observed behaviour of the mean multiplicity and jet transverse width agrees with the predicted increase in the fraction of quark-induced jets at higher jet transverse momentum and rapidity. Decomposition of the transverse width second moment into second moments for η and φ demonstrates that jets are slightly wider in the η direction than in the φ direction. This observation is in good quantitative agreement with pythia6 predictions, while herwig++ predicts a larger difference between jet widths in the η and φ directions.

Figure 10. Average charged-particle multiplicity ⟨Nch⟩ (top) and average transverse jet size ⟨δR²⟩ (bottom) as functions of jet pT for jets with 0 < |y| < 1 (solid squares) and with 1 < |y| < 2 (open squares). Data are shown with statistical error bars and a band denoting the systematic uncertainty. Also shown are predictions for quark-induced and gluon-induced jets for |y| < 1 based on the pythia6 tune Z2 event generator.

[17] CMS collaboration, ECAL 2010 performance results, CMS-DP-2011-008 (2011).