Study of hadronic event-shape variables in multijet final states in pp collisions at sqrt(s) = 7 TeV

Event-shape variables, which are sensitive to perturbative and nonperturbative aspects of quantum chromodynamic (QCD) interactions, are studied in multijet events recorded in proton-proton collisions at sqrt(s) = 7 TeV. Events are selected with at least one jet with transverse momentum pt>110 GeV and pseudorapidity abs(eta)<2.4, in a data sample corresponding to integrated luminosities of up to 5 inverse femtobarns. The distributions of five event-shape variables in various leading jet pt ranges are compared to predictions from different QCD Monte Carlo event generators.


Introduction
Event-shape variables measure the properties of the energy flow in the final states of high energy particle collisions. Their simplicity, combined with their sensitivity to many important signatures of quantum chromodynamics (QCD) [1,2], makes them interesting observables that have been studied extensively in hadronic final states of electron-positron (e+e−) and deep inelastic scattering (DIS) collisions [3-5]. Such event-shape variables are theoretically defined in an infrared- and collinear-safe manner and can be computed using perturbative techniques. Their measurements have improved our understanding of many perturbative and nonperturbative aspects of QCD, including the determination of the strong coupling constant α_s, details of parton radiation and hadronization, tests of the colour structure of the theory, as well as modelling and validation of Monte Carlo (MC) event generators.
Measurements of event-shape variables in hadron-hadron collisions are more complicated than in e+e− or DIS collisions, because a larger fraction of the final-state activity is emitted at very forward pseudorapidities not covered by the detectors, and also because the elementary (parton-parton) kinematics cannot be determined as precisely. These difficulties have led to the redefinition of event-shape variables in the transverse plane, where the energy flow can be measured with small systematic uncertainty. A large set of event-shape variables in proton-proton (pp) collisions, which are sensitive to different aspects of the rich dynamics of the strong interaction from soft (hadronization) to hard (multijet radiation) scales, has been proposed [1,2]. These variables are normalized to the sum of the measured transverse momenta (p_T) of all reconstructed objects in the event to reduce the systematic uncertainty due to the jet energy scale.
Previous studies of event-shape variables at hadron colliders include those of the CDF experiment at the Tevatron [6], and early measurements at the LHC [7,8]. More recently, event-shape variables have been studied in the associated production of Z bosons with jets [9]. In the previous analysis of the CMS experiment with 3.2 pb^−1 of data [7], the transverse thrust and thrust minor variables were studied to improve the modelling of multijet production in MC generators. This study is expanded here using a larger data set corresponding to an integrated luminosity of 5 fb^−1 in pp collisions at √s = 7 TeV with an expanded set of five event-shape variables [1,2]: the transverse thrust, jet broadening, jet mass (both total and in the transverse plane), and the third-jet resolution parameter. The significant increase in luminosity allows the measurement of variables with three jets, not accessible with earlier data, and the latter four observables are analysed in CMS for the first time. Therefore, this analysis is sensitive to features of the event generators that were not probed in the previous CMS result.
The paper is organized as follows. In Section 2, elements of the CMS detector relevant to this analysis are described. Section 3 introduces the event-shape variables studied in this work. The data and MC simulated event samples are summarized in Section 4 along with the event selection criteria. Section 5 describes the unfolding technique employed and the propagated systematic uncertainties. Section 6 compares the five event-shape distributions in data with several QCD event generators. The results are summarized in Section 7.

The CMS detector
[...] the azimuthal angle φ is measured in the x-y plane in radians. Pseudorapidity is defined as η = −ln[tan(θ/2)], where θ is the polar angle.
The central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the field volume, there are silicon pixel and strip trackers, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a sampling hadron calorimeter made up of layers of brass plates and plastic scintillators. The calorimeters provide coverage in pseudorapidity up to |η| = 3.0. A preshower detector consisting of two planes of silicon sensors interleaved with lead is located in front of the ECAL at 1.7 < |η| < 2.6. An iron and quartz fiber Cherenkov hadron calorimeter covers pseudorapidities 3.0 < |η| < 5.0. The muons are measured in the pseudorapidity range |η| < 2.4, with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive plate chambers.
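As a small illustration of the pseudorapidity definition η = −ln[tan(θ/2)], the sketch below converts between polar angle and pseudorapidity, for example to see which polar angles correspond to the acceptance boundaries quoted above. The function names are hypothetical and are not taken from any CMS software.

```python
import math


def pseudorapidity(theta):
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse relation: theta = 2 * atan(exp(-eta)), in radians."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    # Acceptance boundaries mentioned in the detector description.
    for eta in (2.4, 3.0, 5.0):
        theta_deg = math.degrees(polar_angle(eta))
        print(f"|eta| = {eta}: polar angle = {theta_deg:.2f} degrees from the beam axis")
```

A particle emitted perpendicular to the beam (θ = π/2) has η = 0, and larger |η| corresponds to directions closer to the beam line.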
The particle-flow (PF) algorithm [11,12] combines information on charged particles from the tracking system, energy deposits in the electromagnetic and hadron calorimeters, as well as signals in the preshower detector and muon systems to assign a four-momentum vector to particles, i.e. γ, e±, µ±, and charged and neutral hadrons. Jets are reconstructed using these particles. The energy calibration of individual particle types is performed separately. At the PF level, the jet constituents are almost fully calibrated and require only a small correction (less than 10%) [13] due to tracking inefficiencies and threshold effects. The jet clustering is performed using the anti-k_T clustering algorithm [14,15] with a distance parameter R = 0.5. The jets are ordered by descending p_T, with p_T,1 and p_T,2 representing the transverse momenta of the leading and the second leading jets, respectively.

Event-shape variables
Five event-shape variables are analysed in this paper: the transverse thrust τ_⊥, the total jet broadening B_tot, the total jet mass ρ_tot, the total transverse jet mass ρ^T_tot, and the third-jet resolution parameter Y_23. In the formulae below, p_T,i, η_i, and φ_i represent the transverse momentum, pseudorapidity, and azimuthal angle of the ith jet, and n̂_T is the unit vector that maximizes the sum of the projections of p_T,i. The transverse thrust axis n̂_T and the beam form the so-called event plane. Based on the direction of n̂_T, the transverse region is separated into an upper side C_U, consisting of all jets with p_T · n̂_T > 0, and a lower side C_L, with p_T · n̂_T < 0. The jet broadening and third-jet resolution variables require at least three selected jets.
Transverse thrust: The event thrust observable in the transverse plane is defined as
\[ \tau_\perp \equiv 1 - \max_{\hat{n}_T} \frac{\sum_i |\vec{p}_{T,i} \cdot \hat{n}_T|}{\sum_i p_{T,i}} . \]
This variable probes the hadronization process and is sensitive to the modelling of two-jet and multijet topologies. In this paper "multijet" refers to "more-than-two-jet". In the limit of a perfectly balanced two-jet event, τ_⊥ is zero, while in isotropic multijet events it amounts to (1 − 2/π).
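The two limiting values quoted above can be checked numerically. The following is a minimal sketch, assuming the usual form of the transverse thrust, τ_⊥ = 1 − max over n̂_T of Σ|p_T,i · n̂_T| / Σ p_T,i. The function name and the (p_T, φ) input format are hypothetical. The maximization uses the fact that the best axis is parallel to a signed sum Σ s_i p_T,i, with the signs s_i = ±1 given by a half-plane split of the jets, so only a finite set of sign patterns has to be tried.

```python
import math


def transverse_thrust(jets):
    """Return tau_perp for a list of (pt, phi) jets, with phi in radians."""
    total_pt = sum(pt for pt, _ in jets)
    if total_pt <= 0.0:
        raise ValueError("need at least one jet with positive pt")

    # Axis directions (folded into [0, pi)) at which some jet is perpendicular
    # to the axis. The sign pattern of the projections is constant between
    # two consecutive such directions.
    critical = sorted((phi + math.pi / 2.0) % math.pi for _, phi in jets)
    critical.append(critical[0] + math.pi)  # close the cycle

    best = 0.0
    for low, high in zip(critical, critical[1:]):
        mid = 0.5 * (low + high)
        nx, ny = math.cos(mid), math.sin(mid)
        sum_x = sum_y = 0.0
        for pt, phi in jets:
            px, py = pt * math.cos(phi), pt * math.sin(phi)
            sign = 1.0 if px * nx + py * ny >= 0.0 else -1.0
            sum_x += sign * px
            sum_y += sign * py
        # |signed sum| equals the maximal sum of projections for this pattern.
        best = max(best, math.hypot(sum_x, sum_y))

    return 1.0 - best / total_pt


if __name__ == "__main__":
    back_to_back = [(100.0, 0.0), (100.0, math.pi)]
    isotropic = [(1.0, 2.0 * math.pi * k / 360) for k in range(360)]
    print(transverse_thrust(back_to_back))  # close to 0
    print(transverse_thrust(isotropic))     # close to 1 - 2/pi
```

Because only ratios of transverse momenta enter, rescaling all jet p_T values by a common factor leaves the result unchanged, which is the reason such normalized variables are less affected by the jet energy scale.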

Jet broadening:
The pseudorapidities and the azimuthal angles of the axes for the upper and lower event regions are defined by
\[ \eta_X \equiv \frac{\sum_{i \in C_X} p_{T,i}\, \eta_i}{\sum_{i \in C_X} p_{T,i}}, \qquad \phi_X \equiv \frac{\sum_{i \in C_X} p_{T,i}\, \phi_i}{\sum_{i \in C_X} p_{T,i}}, \]
where X refers to the upper (U) or lower (L) side. From these, the jet broadening variable in each region is defined as
\[ B_X \equiv \frac{1}{2 P_T} \sum_{i \in C_X} p_{T,i} \sqrt{(\eta_i - \eta_X)^2 + (\phi_i - \phi_X)^2}, \]
where P_T is the scalar sum of the transverse momenta of all the jets. The total jet broadening is then defined as
\[ B_{tot} \equiv B_U + B_L . \]

Jet masses:
The normalized squared invariant mass of the jets in the upper and lower regions of the event is defined by
\[ \rho_X \equiv \frac{M_X^2}{P^2}, \]
where M_X is the invariant mass of the constituents of the jets in the region X, and P is the scalar sum of the momenta of all constituents in both sides.
The jet mass variable is defined as the sum of the masses in the upper and lower regions,
\[ \rho_{tot} \equiv \rho_U + \rho_L . \]
The corresponding jet mass in the transverse plane, ρ^T_tot, is calculated analogously in the transverse plane.
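The jet broadening can be sketched in a few lines, assuming the p_T-weighted form B_X = (1 / 2P_T) Σ p_T,i √[(η_i − η_X)² + (φ_i − φ_X)²] over the jets of each side, with B_tot = B_U + B_L. The function names and the (p_T, η, φ) tuple format are hypothetical. One further assumption is made explicit in the code: azimuthal angles are measured relative to the central direction of each side and wrapped into (−π, π], so that the weighted average of φ is not distorted by the 2π periodicity.

```python
import math


def wrap(dphi):
    """Wrap an azimuthal difference into (-pi, pi]."""
    while dphi > math.pi:
        dphi -= 2.0 * math.pi
    while dphi <= -math.pi:
        dphi += 2.0 * math.pi
    return dphi


def jet_broadening(jets, axis_phi):
    """Return (B_U, B_L, B_tot) for jets given as (pt, eta, phi) tuples.

    axis_phi is the azimuth of the transverse thrust axis, supplied by the caller.
    """
    total_pt = sum(pt for pt, _, _ in jets)
    per_side = []
    # Upper side points along the axis, lower side points opposite to it.
    for side_phi in (axis_phi, axis_phi + math.pi):
        # Assumption: phi is taken relative to the side direction and wrapped.
        side = [(pt, eta, wrap(phi - side_phi))
                for pt, eta, phi in jets
                if math.cos(phi - side_phi) > 0.0]
        side_pt = sum(pt for pt, _, _ in side)
        if side_pt == 0.0:
            per_side.append(0.0)
            continue
        eta_x = sum(pt * eta for pt, eta, _ in side) / side_pt
        phi_x = sum(pt * dphi for pt, _, dphi in side) / side_pt
        spread = sum(pt * math.hypot(eta - eta_x, dphi - phi_x)
                     for pt, eta, dphi in side)
        per_side.append(spread / (2.0 * total_pt))
    b_upper, b_lower = per_side
    return b_upper, b_lower, b_upper + b_lower


if __name__ == "__main__":
    three_jets = [(100.0, 0.0, 0.0),
                  (50.0, 0.5, math.pi - 0.3),
                  (50.0, -0.5, math.pi + 0.3)]
    print(jet_broadening(three_jets, axis_phi=0.0))
```

A side that contains a single jet contributes nothing, which is why the variable is only informative for events with at least three jets.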

Third-jet resolution parameter:
The third-jet resolution parameter is defined as
\[ Y_{23} \equiv \frac{\min\left(p_{T,3}^2,\ \left[\min(p_{T,i}, p_{T,j})^2 \times (\Delta R_{ij})^2 / R^2\right]\right)}{P_{12}^2}, \]
where i, j run over all three jets, (∆R_ij)^2 = (η_i − η_j)^2 + (φ_i − φ_j)^2, and p_T,3 is the transverse momentum of the third jet in the event. If there are more than three jets in the event, they are iteratively merged using the k_T algorithm [16,17] with a distance parameter R = 0.6. To compute P_12, three jets are merged into two using the procedure described above, and P_12 is then defined as the scalar sum of the transverse momenta of the two remaining jets.
The Y_23 variable estimates the relative strength of the p_T of the third jet with respect to the other two jets. It vanishes for two-jet events, and a nonzero value of Y_23 indicates the presence of hard parton emission, which tests the parton showering model of QCD event generators. A test like this is less sensitive to the details of the underlying event (UE) and parton hadronization models than the other event-shape variables [2].
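For an event with exactly three jets, the numerator of Y_23 can be evaluated directly. The sketch below assumes the form Y_23 = min(p_T,3², min over pairs of [min(p_T,i, p_T,j)² (∆R_ij)² / R²]) / P_12². The function names are hypothetical. P_12 is taken as an input here because the recombination details of the three-to-two merging are not spelled out in this section, and wrapping the azimuthal difference into [−π, π] is an additional assumption.

```python
import math
from itertools import combinations


def delta_r2(jet_a, jet_b):
    """Squared eta-phi distance between two (pt, eta, phi) jets."""
    _, eta_a, phi_a = jet_a
    _, eta_b, phi_b = jet_b
    # Assumption: the azimuthal difference is wrapped into [-pi, pi].
    dphi = math.remainder(phi_a - phi_b, 2.0 * math.pi)
    return (eta_a - eta_b) ** 2 + dphi ** 2


def y23(jets, p12, r=0.6):
    """Third-jet resolution parameter for exactly three (pt, eta, phi) jets.

    p12 is the scalar pt sum of the two jets left after merging three into two;
    it must be computed separately by the caller.
    """
    if len(jets) != 3:
        raise ValueError("this sketch handles exactly three jets")
    # The third jet is the one with the lowest pt of the three.
    pt3 = sorted(pt for pt, _, _ in jets)[0]
    pair_scale = min(
        min(a[0], b[0]) ** 2 * delta_r2(a, b) / r ** 2
        for a, b in combinations(jets, 2)
    )
    return min(pt3 ** 2, pair_scale) / p12 ** 2


if __name__ == "__main__":
    # Third jet close to the leading jet: the pairwise distance sets the scale.
    near = [(100.0, 0.0, 0.0), (90.0, 0.0, math.pi), (40.0, 0.3, 0.2)]
    print(y23(near, p12=230.0))
```

When the third jet is well separated from the other two, the p_T,3² term is the smaller one and Y_23 reduces to (p_T,3 / P_12)².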

Event selection and Monte Carlo samples
This analysis extends the phase space compared to the previous study [7] to |η| < 2.4, and considers several different p_T ranges for the leading jet. The events used are collected with single-jet triggers, which are reconstructed from calorimeter information only, where the p_T of at least one jet is above a certain threshold, p_T,th. Events are divided into five bins of p_T,1, where each bin uses data from one trigger path. The choice of p_T,1 ranges (Table 1) has been determined by the trigger criteria, while the p_T threshold (>30 GeV) for the other jets and their geometric acceptance (|η| < 2.4) are chosen to ensure a good jet energy scale and resolution. Spurious jets, which are due to noise in the calorimeters or other noncollision backgrounds, are eliminated using jet quality criteria, e.g. jets must consist of at least two particles, including at least one charged hadron, and not more than 99% of the jet energy may be carried by neutral hadrons alone, or by photons alone. Jets that do not satisfy the identification requirements are not included in the calculation of the event-shape variables.
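The example jet quality criteria quoted above can be written as a simple predicate. This is only a sketch of the listed examples, not the complete identification used in the analysis, and the function and argument names are hypothetical.

```python
def passes_jet_id(n_constituents, n_charged_hadrons,
                  neutral_hadron_fraction, photon_fraction):
    """Return True if a jet satisfies the example quality criteria.

    The energy fractions are relative to the total jet energy (0 to 1).
    """
    return (
        n_constituents >= 2                  # at least two particles
        and n_charged_hadrons >= 1           # including one charged hadron
        and neutral_hadron_fraction <= 0.99  # not dominated by neutral hadrons
        and photon_fraction <= 0.99          # not dominated by photons
    )


if __name__ == "__main__":
    print(passes_jet_id(12, 7, 0.20, 0.30))   # typical jet
    print(passes_jet_id(5, 2, 0.995, 0.00))   # noise-like neutral deposit
```

Jets failing the predicate would simply be left out of the jet list before any event-shape variable is computed.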
An event is discarded if
• either of the two highest-p_T jets in the event lies outside the central region (|η| < 2.4); for the measurement of B_tot and Y_23 a third jet satisfying the jet selection criteria is required within the same detector acceptance region;
• either of the two highest-p_T jets is spurious;
• all selected jets of an event lie only on one side of the line perpendicular to n̂_T. This criterion ensures that events will be rejected if jets are missed in the forward direction. Events of interest for this analysis should be well-balanced in p_T and hence have jets on both sides of this line.

Table 1 shows the numbers of events, as well as the fractions of events with two, three, four, or more jets, for various ranges of the leading jet p_T,1, along with the effective integrated luminosity for each data sample. The effective luminosities differ due to variations of the prescale factor of the trigger paths associated with each p_T,1. The average number of additional pp interactions per bunch crossing (pileup) in the collected dataset is ≈8. The effect of pileup on the distributions of event-shape variables has been studied by grouping the events in different ranges of the number of reconstructed primary vertices, and no bias has been found. This is expected due to the fact that after the jet energy calibration, there is no residual pileup dependence.

Table 1: Characteristics of the data samples selected for this analysis, in categories of leading jet transverse momentum p_T,1: effective integrated luminosity, selected number of events, and relative abundances of the numbers of selected jets, N_jet, for jets with p_T > 30 GeV and |η| < 2.4.

[Table 1 column headers: Range of p_T,1; Luminosity; Number of events; Fraction of events (%)]

[...] [20], and MADGRAPH 5.1.5.7 (MADGRAPH) [21] are chosen to generate multijet events. Particles with a lifetime larger than 30 ps are declared stable and handled by the full CMS detector simulation based on GEANT4 [22]. These generators reproduce the single differential jet spectra measured at the LHC [23-25]. The simulated events are then reconstructed in the same way as the real data. The MC simulations are also used to obtain the unfolding corrections, described in the next section, and to estimate the associated uncertainties.
Events are generated with PYTHIA 6 using three different models: (i) D6T [26], which uses virtuality-ordered parton showering (PS) and is based on Tevatron data; and two models that use p_T-ordered PS: (ii) Perugia-P0 [27], based on LEP and Tevatron data, and (iii) Z2 [28], based on CMS data collected at √s = 900 GeV and 7 TeV. The generator PYTHIA 8 uses p_T-ordered PS, a UE description based on the multiple parton interaction (MPI) model of PYTHIA 6 interleaved with initial- and final-state radiation, and the tune4C [29] settings. The HERWIG++ generator is run with tune23 settings, where the PS evolution is based on angular ordering and an eikonal MPI model is used for the UE. Finally, the MADGRAPH MC employs matrix element (ME) calculations to generate events with two to four partons, plus PYTHIA 6-tuneZ2 for the PS and UE. The MLM matching procedure [30] is imposed to avoid a double counting of jets between the ME and PS, for a minimum jet p_T threshold of 20 GeV.
All MC generators use the CTEQ6L1 parametrization as the choice of parton distribution function (PDF), except for Perugia-P0 which uses CTEQ5L [31].
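The event-level rejection criteria listed earlier in this section can be summarized in a short sketch. The `Jet` class, the `select_event` function, and their arguments are hypothetical names introduced for illustration; the thresholds are the ones quoted in the text (leading jet p_T > 110 GeV, other jets p_T > 30 GeV, |η| < 2.4), and the thrust-axis azimuth is assumed to be computed beforehand.

```python
import math
from dataclasses import dataclass


@dataclass
class Jet:
    pt: float           # transverse momentum in GeV
    eta: float          # pseudorapidity
    phi: float          # azimuthal angle in radians
    good: bool = True   # True if the jet passes the quality criteria


def select_event(jets, axis_phi, require_third_jet=False,
                 min_leading_pt=110.0, min_jet_pt=30.0, max_abs_eta=2.4):
    """Return True if the event survives the rejection criteria."""
    ordered = sorted(jets, key=lambda jet: jet.pt, reverse=True)
    if len(ordered) < 2 or ordered[0].pt <= min_leading_pt:
        return False
    # Both leading jets must be central and must not be spurious.
    for jet in ordered[:2]:
        if abs(jet.eta) >= max_abs_eta or not jet.good:
            return False
    selected = [jet for jet in ordered
                if jet.good and jet.pt > min_jet_pt
                and abs(jet.eta) < max_abs_eta]
    # B_tot and Y_23 need a third selected jet.
    if require_third_jet and len(selected) < 3:
        return False
    # Selected jets must populate both sides of the line perpendicular to the axis.
    projections = [math.cos(jet.phi - axis_phi) for jet in selected]
    return any(p > 0.0 for p in projections) and any(p < 0.0 for p in projections)


if __name__ == "__main__":
    dijet = [Jet(150.0, 0.5, 0.0), Jet(140.0, -0.3, math.pi)]
    print(select_event(dijet, axis_phi=0.0))
    print(select_event(dijet, axis_phi=0.0, require_third_jet=True))
```

Keeping the third-jet requirement as a flag mirrors the fact that only some of the variables need it, so the same function can serve all five measurements.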

Unfolding and systematic uncertainties
Jets at generator level are defined as a collection of stable particles with the same kinematic criteria used for the real data. The distributions of a variable obtained using parton- and detector-level information differ because of the finite energy and angular resolutions of the experimental apparatus. In order to correct the measured distributions for bin migrations due to detector effects, a response matrix is constructed with simulated events. The D'Agostini method [32] is employed to unfold the experimental data, using the response matrix obtained from PYTHIA 6-tuneZ2, PYTHIA 8, and MADGRAPH samples. Although the results are consistent among the generators, small differences (<3%) are observed, which are taken as a systematic uncertainty. Another source of systematic uncertainty in the unfolding procedure is the choice of the unsmearing method. A regularized unfolding method based on singular value decomposition (SVD) of the response matrix [33] is also used as a consistency check. The difference between the D'Agostini and SVD unfolding methods is less than 5% for the τ_⊥ distribution. It can be as high as 20% for the distributions of other event-shape variables, which require more than three jets in the event, mainly as a consequence of the lower number of events in the lower ranges of p_T,1.
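To make the idea of iterative unfolding concrete, the following is a minimal sketch of a D'Agostini-style (iterative Bayesian) procedure in plain Python. It is not the implementation used in the analysis: the function name, the flat starting prior, and the number of iterations are assumptions chosen for illustration, and no uncertainty propagation is included.

```python
def dagostini_unfold(response, measured, prior=None, iterations=4):
    """Iteratively unfold a measured histogram.

    response[j][i] is the probability that an event generated in true bin i
    is reconstructed in measured bin j. Returns the estimated true counts.
    """
    n_reco = len(response)
    n_true = len(response[0])
    if prior is None:
        prior = [1.0 / n_true] * n_true  # assumption: flat starting prior
    # Efficiency of each true bin: chance of being reconstructed anywhere.
    efficiency = [sum(response[j][i] for j in range(n_reco))
                  for i in range(n_true)]

    estimate = list(prior)
    for _ in range(iterations):
        norm = sum(estimate)
        current_prior = [value / norm for value in estimate]
        unfolded = [0.0] * n_true
        for j in range(n_reco):
            # Probability of observing bin j under the current prior.
            denom = sum(response[j][k] * current_prior[k] for k in range(n_true))
            if denom == 0.0:
                continue
            for i in range(n_true):
                # Bayes' theorem: share of measured bin j assigned to true bin i.
                weight = response[j][i] * current_prior[i] / denom
                unfolded[i] += weight * measured[j]
        estimate = [unfolded[i] / efficiency[i] if efficiency[i] > 0.0 else 0.0
                    for i in range(n_true)]
    return estimate


if __name__ == "__main__":
    response = [[0.8, 0.2],
                [0.2, 0.8]]
    truth = [100.0, 50.0]
    measured = [sum(response[j][i] * truth[i] for i in range(2)) for j in range(2)]
    print(dagostini_unfold(response, measured, iterations=4))
    print(dagostini_unfold(response, measured, iterations=200))
```

In this closure test the estimate moves from the smeared input towards the true spectrum as the number of iterations grows. In practice the iteration count acts as a regularization choice, since more iterations also amplify statistical fluctuations.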
Other sources of systematic uncertainty include the finite jet energy and angular resolutions and the jet energy scale [13]. In order to propagate the uncertainties due to the jet resolutions, the response matrix used in the unfolding is rebuilt with generator-level jets smeared randomly using increased and decreased values of the resolution parameters. The corresponding differences in the unfolded data distributions are considered as systematic uncertainties, which are found to be less than 2% in most cases, but can be as high as 5% in some corners of phase space. Similarly, the jet energy scale is increased and decreased by one standard deviation with respect to its central value and the unfolded distributions are compared with the nominal one to estimate the effect of this scale correction; the resulting uncertainty is less than 3%. The effect of pileup on the event-shape variables is found to be negligible.

Results
The distributions of the logarithms of the five event-shape variables analysed (τ_⊥, B_tot, jet mass (both ρ_tot and ρ^T_tot), and Y_23) are shown in Figs. 1-5 for the five leading jet p_T ranges listed in Table 1. All distributions are unfolded and normalized to unity, and they are compared to the predictions from the six generator models. The error bars around the data points indicate the statistical uncertainties and the shaded bands represent the sum in quadrature of statistical and systematic uncertainties. The corresponding ratios between the model predictions and the data are shown in the lower plots. The distributions are plotted on a logarithmic horizontal axis so that the details at small values of the event-shape variables are also visible.
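The two bookkeeping steps mentioned above, normalizing a distribution to unity and adding uncertainties in quadrature, are sketched below. Two assumptions are made: "normalized to unity" is interpreted as unit area over the binned range (unit sum would be the other possible reading), and the individual systematic sources are treated as uncorrelated. The function names and the numbers in the usage lines are illustrative only and are not results of the analysis.

```python
import math


def normalize_to_unity(counts, bin_edges):
    """Return per-bin densities whose integral over the binned range is 1."""
    total = sum(counts)
    return [count / (total * (high - low))
            for count, low, high in zip(counts, bin_edges, bin_edges[1:])]


def total_uncertainty(stat, systematics):
    """Combine a statistical uncertainty with systematic sources in quadrature."""
    return math.sqrt(stat ** 2 + sum(s ** 2 for s in systematics))


if __name__ == "__main__":
    # Illustrative histogram with unequal bin widths.
    edges = [0.0, 1.0, 3.0, 6.0]
    densities = normalize_to_unity([10.0, 30.0, 60.0], edges)
    print(densities)
    # Illustrative relative uncertainties in percent.
    print(total_uncertainty(2.0, [3.0, 5.0, 2.0, 3.0]))
```

With unequal bin widths, dividing by the width is what keeps the integral, rather than the sum of bin contents, equal to one.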
Overall, the models tend to reproduce the transverse thrust τ_⊥, total transverse jet mass ρ^T_tot, and third-jet resolution parameter Y_23 distributions better than the total jet mass ρ_tot and jet broadening B_tot ones. The model that consistently reproduces all the distributions within the uncertainties is the MADGRAPH matrix-element calculator combined with PYTHIA 6-tuneZ2 for the PS and UE.
Similar data-MC comparisons are performed using different jet definitions: (i) anti-k_T jets with a distance parameter R = 0.7, (ii) k_T jets with a distance parameter R = 0.4, and (iii) jets built from calorimeter energy deposits only instead of PF candidates. In all cases, the results are similar and in agreement with each other. Also, the effect due to the choice of a particular PDF set in the MC predictions has been estimated using the MSTW2008lo68cl set [34,35]. A negligible effect has been found by varying the PDF eigenvalues within one standard deviation.
The transverse thrust variable τ_⊥ (Fig. 1) is insensitive to the longitudinal component of the particles' momenta, and thus to the modelling of MPI and colour connection between soft scatters and beam remnants. The data-MC agreement for this observable is at the 5-10% level for all p_T bins except at the highest τ_⊥, where differences as large as 20% are observed. The agreement is better than for the other event-shape variables, which are more sensitive to MPI and colour connection effects. The τ_⊥ distributions also reveal that the predictions for the lower p_T bins from PYTHIA 8, HERWIG++, MADGRAPH, and PYTHIA 6 (with model D6T) are closer to the data than the ones from PYTHIA 6 with models Z2 and Perugia-P0.
The jet broadening distribution (Fig. 2) is poorly described by all the models at both low and high B_tot values, except for the MADGRAPH generator. This variable is insensitive to the UE and hadronization details, but a precise modelling of the ME and PS is crucial in order to correctly predict its distribution. Both model ingredients are expected to be more adequately described in MADGRAPH, where the multijet final states are directly obtained from the hard ME calculations, unlike the PYTHIA and HERWIG++ parton showers, which work best for 2→2 processes. In addition, the jet broadening is sensitive to colour coherence effects, which have an improved description [36] in the current version of HERWIG++; this explains the better agreement of this model relative to all PYTHIA models. Similar arguments are also applicable for the total jet mass ρ_tot and the third-jet resolution parameter Y_23.
The total jet mass ρ_tot distribution (Fig. 3) shows a similar behaviour between the measurement and the different model predictions as observed for the jet broadening case. MADGRAPH and HERWIG++ reproduce this observable better than the various PYTHIA models. This variable is more sensitive to (initial-state) forward radiation than the jet broadening [2], which indicates that such QCD emission is adequately described in the former two models.
The transverse jet mass ρ^T_tot distributions (Fig. 4) show agreement between data and predictions within 20%, which is better than that seen for the total jet mass ρ_tot. This is expected, because transverse variables are less sensitive to the longitudinal energy flow [2] and to colour connection effects. Among the PYTHIA 6 models, the D6T one shows the best agreement with the data.
The third-jet resolution parameter Y_23 distribution (Fig. 5) is sensitive to the properties of multijet emission and is robust with respect to the modelling of the UE and hadronization. For such an ME-sensitive observable, the MADGRAPH generator again shows better data-model agreement than the other MC simulations.

Summary
An extended set of five event-shape variables (the transverse thrust τ_⊥, the jet broadening B_tot, the total jet mass ρ_tot, the total transverse jet mass ρ^T_tot, and the third-jet resolution parameter Y_23) has been studied in multijet final states measured in pp collisions at √s = 7 TeV. Such observables are sensitive to perturbative and nonperturbative aspects of QCD, and allow the validation of hadronic event generators. The experimental distributions have been measured in five different ranges of leading jet transverse momenta from 110 < p_T < 170 GeV up to p_T > 390 GeV, and compared to the predictions of six different event generators.
For the transverse thrust, all generators show an overall agreement with the data within 10%, with PYTHIA 8 and HERWIG++ exhibiting a better agreement than the others. A 20% level of agreement is also found for the total transverse jet mass distributions. However, event-shape variables that are more sensitive to the longitudinal energy flow (such as the total jet mass) or to hard parton emissions (such as the jet broadening) show a larger discrepancy between data and parton shower MC simulations. The predictions of PYTHIA 6-D6T show better agreement with data for the event-shape variables that make use only of the jet p_T, but have worse agreement for Y_23 compared to other PYTHIA 6 models. The modelling of colour connection between the soft scatters and beam remnants, and of initial- and final-state radiation, are the major sources of differences between the various QCD event generators. The generator that consistently reproduces all distributions within the uncertainties is the MADGRAPH matrix-element calculator combined with PYTHIA 6-tuneZ2 for multiparton interactions, parton showering, and hadronization. The study of infrared- and collinear-safe event-shape variables presented here provides detailed information to further improve the modelling of parton radiation and hadronization in event generators for high energy hadronic collisions.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Austria); FNRS and [...]

References
[5] L3 Collaboration, "Study of hadronic event shape in flavour tagged events in e+e− annihilation at √s = 197 GeV", PMC Phys. A 2 (2008) 6, doi:10.1186/1754-0410-2-6, arXiv:0907.2658.