Study of quark and gluon jet substructure in Z+jet and dijet events from pp collisions

Measurements of jet substructure describing the composition of quark- and gluon-initiated jets are presented. Proton-proton (pp) collision data at $\sqrt{s}$ =13 TeV collected with the CMS detector are used, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. Generalized angularities are measured that characterize the jet substructure and distinguish quark- and gluon-initiated jets. These observables are sensitive to the distributions of transverse momenta and angular distances within a jet. The analysis is performed using a data sample of dijet events enriched in gluon-initiated jets, and, for the first time, a Z+jet event sample enriched in quark-initiated jets. The observables are measured in bins of jet transverse momentum, and as a function of the jet radius parameter. Each measurement is repeated applying a"soft drop"grooming procedure that removes soft and large angle radiation from the jet. Using these measurements, the ability of various models to describe jet substructure is assessed, showing a clear need for improvements in Monte Carlo generators.


Introduction
Interactions between quarks and gluons, commonly referred to as partons, are governed by quantum chromodynamics (QCD). When quarks or gluons are produced by collisions in particle colliders, collimated showers of hadrons are formed. Theoretical calculations and experimental methods cluster such showers into jets using dedicated clustering algorithms [1] that correlate the kinematic properties of the jets with those of the participating partons. Furthermore, the study of the distribution of particles within a jet, the jet substructure, can be analyzed to determine the likelihood that a jet originates either from a quark or from a gluon. Heavy, Lorentz-boosted particles that decay to quarks and/or gluons, e.g. top quarks or W, Z or Higgs bosons, have characteristic distributions of jet substructure observables. Several algorithms have been developed to distinguish such jets from those originating from single partons [2][3][4]. These are used in numerous measurements and searches at the CERN LHC, and frequently the modelling of jet substructure contributes significantly to the systematic uncertainties [2]. Many of them would benefit from a better understanding and simulation of jet substructure observables.
Although there is no unique way of defining whether a jet originates from a quark or gluon in QCD [5], at leading order (LO) in perturbation theory, one can label jets according to the parton initiating the particle shower. Jets originating from single quarks, quark jets, and those arising from single gluons, gluon jets, are known to differ in their substructure. At LO in perturbative QCD, the difference in the Casimir colour factor between quarks (C F = 4/3) and gluons (C A = 3) leads to a higher probability for gluons to radiate a soft gluon by a factor of C A /C F = 9/4 [6]. Therefore, we expect gluon jets to contain, on average, more constituents, with a wider geometric spread in the detector [7][8][9][10][11][12]. However, there are also a variety of other nonperturbative effects that determine the jet substructure, including hadronization and colour reconnection, which are typically described by phenomenological models in Monte Carlo (MC) event generators. The description of jet fragmentation and substructure thus requires a combined understanding of both perturbative calculations and nonperturbative effects. Although the description of jets initiated by quarks is well constrained by many previous measurements, the modelling of gluon jets is less well understood [2,5].
In this paper, we report on a measurement of a set of five observables that can be used to distinguish between quark-and gluon-initiated jets, the generalized angularities λ κ β [48], defined as: where z i is the fractional transverse momentum carried by the ith jet constituent, R is the jet radius parameter, and ∆R i is the displacement of the constituent from the jet axis, defined as ∆R i = √ (∆y i ) 2 + (∆φ i ) 2 where ∆y i and ∆φ i are the separations in rapidity and azimuthal angle (in radians), respectively, between the jet axis and the i th constituent. The parameters κ ≥ 0 and β ≥ 0 control the momentum and angular contributions, respectively. The five observables measured in this paper are: Les Houches Angularity (LHA) = λ 1 0.5 [49]; width = λ 1 1 ; thrust = λ 1 2 [50]; multiplicity = λ 0 0 ; and (p D T ) 2 = λ 2 0 . These are shown in Fig. 1 as points in the (κ, β) plane. These quantities have previously been used by the CMS [51,52] and ATLAS [53] Collaborations to discriminate between quark and gluon jets. They have also been proposed as a tool to measure parton distribution functions [54] and double-parton scattering [55] in Z+jets events. Figure 1: The five generalized angularities λ κ β used in this analysis, represented in the (κ, β) plane. The Les Houches Angularity is denoted by LHA. Adapted from Ref. [48].
We measure these observables in quark-jet enriched Z+jet and gluon-jet enriched dijet samples. In the Z+jet sample, we study a jet recoiling against a Z boson decaying to a pair of muons. The measurement focuses on muons rather than electrons, since effects from bremsstrahlung are negligible for them. Within the dijet event sample, we separately measure the distributions for the more central (smaller absolute rapidity |y|) of the two jets, and the more forward (larger |y|) of the two jets. The sample of the more central jets is predicted to provide a higher purity of gluon jets [17]. Each quantity is measured as a function of jet transverse momentum p T , with different jet radius parameters, including all particles and only charged particles, and with and without applying a soft drop grooming procedure [56] that removes soft and wide angle radiation from the jet, thereby making the jet substructure observables more resilient to effects from pileup, underlying event, and initial state radiation. The detector level distributions are unfolded to particle level, i.e. only considering final state particles after hadronization with proper lifetime cτ > 10 mm, for each configuration and observable, and normalized to unity in each p T bin. The measurements are finally compared with predictions based on analytic resummation [57].
After a brief description of the CMS detector in Section 2, details about the data and simulated samples are given in Section 3, followed by a description of the event selection in Section 4. The jet substructure observables are then more precisely introduced in Section 5. The unfolding procedure and uncertainties of the measurement are described in Section 6. Finally, results are presented and discussed in Section 7, and briefly summarized in Section 8.

The CMS detector and event reconstruction
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [58].
The silicon tracker measures charged particles within the range |η| < 2.5. It consists of 1440 silicon pixel and 15 148 silicon strip detector modules. For particles of 1 < p T < 10 GeV and |η| < 1.4, the track resolutions are typically 1.5% in p T and 25-90  µm in the transverse (longitudinal) impact parameter [59].
The candidate vertex with the largest value of summed physics-object p 2 T is the primary protonproton (pp) interaction vertex. The physics objects are jets, clustered using the anti-k T jet finding algorithm [60,61] with the tracks assigned to a candidate vertex as inputs, and the associated missing transverse momentum, which is the negative vector sum of the p T of those jets.
Muons are measured in the pseudorapidity range |η| < 2.4, with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive plate chambers. The single muon trigger efficiency exceeds 90% over the full η range, and the efficiency to reconstruct and identify muons is greater than 96%. Matching muons to tracks measured in the silicon tracker results in a relative transverse momentum resolution, for muons with p T up to 100 GeV, of 1% in the barrel and 3% in the endcaps. The p T resolution in the barrel is better than 7% for muons with p T up to 1 TeV [62].
The particle-flow algorithm [63] is used to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corrected ECAL and HCAL energies.
For each event, hadronic jets are clustered from these reconstructed particles using the infrared and collinear (IRC) safe anti-k T algorithm [60,61] with distance parameters of 0.4 and 0.8, referred to as AK4 and AK8 jets in the following. Jet momentum is determined as the vectorial sum of all particle momenta in the jet. It is found from simulation to be, on average, within 5 to 10% of the true momentum over the whole p T spectrum and detector acceptance.
Additional pp interactions within the same or nearby bunch crossings (pileup) can contribute additional tracks and calorimetric energy depositions to the jet momentum. The pileup per particle identification algorithm [64,65] is used to mitigate the effect of pileup at the level of reconstructed particles, making use of the local momentum distribution, event pileup properties and tracking information. For each neutral particle, a momentum distribution variable log ∑ p 2 T /∆R 2 is computed using the surrounding charged particles compatible with the primary vertex within the tracker acceptance (|η| < 2.5), and using both charged and neutral particles in the region outside of the tracker coverage, where p T and ∆R correspond to the transverse momenta and distances of the surrounding particles in the η-φ plane w.r.t. the particle direction. The probability to originate from the primary interaction vertex is deduced by comparing the momentum distribution variable to its event median characterizing the expected value for particles from pileup vertices. The momenta of the neutral particles are rescaled according to their probability, superseding the need for jet-based pileup corrections [65]. Charged particles identified to be originating from pileup vertices are discarded.
Jet energy corrections are derived from simulation to adjust the measured response of jets to that of particle level jets on average. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to account for any residual differences in jet energy scale in experimental data and simulation [66]. The jet energy resolution amounts typically to 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV. Additional selection criteria are applied to each jet to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures [65].
Events of interest are selected using a two-tiered trigger system [67]. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [68]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage.

Data and simulated samples
The analysis is based on pp collision data collected by the CMS experiment in 2016 at a centreof-mass energy of √ s = 13 TeV, corresponding to an integrated luminosity of 35.9 fb −1 and with an average of 23 interactions per bunch crossing. Events are selected by the trigger system using multiple sets of trigger paths. The Z+jet candidate events are collected with a trigger requiring at least one isolated muon with p T > 24 GeV and |η| < 2.4; the efficiency to select an event with at least one muon exceeds 90% [69]. Dijet events with high-p T jets are collected with triggers requiring at least one AK4 or AK8 jet above a certain threshold in p T with an efficiency exceeding 99%. The triggers with p T < 450 GeV are "prescaled", i.e. only a fraction of the collision events passing the trigger requirements are accepted, which results in a lower integrated luminosity for these triggers. Dijet events with jet p T < 40 GeV are collected with a zero bias trigger that randomly selects a fraction of the collision events. Table 1 summarizes the zero bias and jet triggers with their integrated luminosity and the offline p T bin thresholds for which the triggers are used. The processes of interest in this analysis are the Z boson production in association with jets, and events uniquely composed of jets produced through the strong interaction (QCD multijet production). Their simulation is performed with multiple combinations of MC event generators to estimate the detector response and systematic uncertainties. The Z boson production in association with jets is simulated at LO with the MADGRAPH5 aMC@NLO v2.2.2 [70] generator with up to four outgoing partons in the matrix element calculations, and fragmented with PYTHIA v8.212 [71]. PYTHIA8 implements a dipole shower ordered in p T and the hadronization of quarks and gluons into final hadrons is described by the Lund string model [72,73]. Double counting of jets from the matrix element calculation and parton shower is eliminated using the MLM merging scheme [74]. This sample is hereafter referred to by MG5+PYTHIA8. The PYTHIA8 parameters for the underlying event are set according to the CUETP8M1 tune [75,76]. A second Z+jet sample is generated at LO with HERWIG++ v2.7.1 [77,78] in standalone mode with the UE-EE-5C underlying event tune [79] to assess systematic uncertainties related to the modelling of the parton showering and hadronization. In HERWIG++, the parton shower follows angular-ordered radiation [80], and the hadronization is described by the cluster model [81]. This sample has one outgoing parton in the matrix element calculations, recoiling against the Z boson. The MADGRAPH5 aMC@NLO and HERWIG++ samples are normalized to the differential cross sections at next-to-LO (NLO) in electroweak and next-to-NLO (NNLO) in strong interactions [82].
The QCD multijet production is simulated separately with two different generator configurations: MADGRAPH5 aMC@NLO combined with PYTHIA8, and HERWIG++ in a standalone mode. In the former sample, again referred to by MG5+PYTHIA8, up to four outgoing partons are allowed in the matrix element calculations using the MLM jet merging scheme. In the standalone HERWIG++ sample, the matrix element has two outgoing partons, and the same underlying event tune is used as in the HERWIG++ Z+jet sample. The LO NNPDF 3.0 [83] parton distribution functions (PDFs) with α S (m Z ) = 0.130 are used in all generated samples, where α S is the strong coupling and m Z is the Z boson mass. Backgrounds from top quarkantiquark pair production, W boson production in association with jets, as well as diboson production, are estimated to contribute less than 1% to the analysis phase space from MC and are not included.
All generated samples are passed through a detailed simulation of the CMS detector using GEANT4 [84]. To simulate the effect of pileup, multiple inelastic events are generated using PYTHIA8 and superimposed on the primary interaction events. The MC simulated events are scaled to reproduce the distribution of the number of interactions observed in the experimental data. The MG5+PYTHIA8 and HERWIG++ simulations have additional selection requirements to ensure that the energy scale of the generated primary interaction events is greater than those of the overlapping pileup events. This can affect the MC p T distribution slope at low p T , within the total MC uncertainty; however, this has a negligible effect on the unfolded results and uncertainties.
Additional samples of Z boson production in association with jets and QCD multijet production, without detector simulation, are compared with the unfolded experimental data distributions. Predictions are computed at LO with PYTHIA v8.243 with the CP2 and CP5 tunes [85], HERWIG v7.1.4 [86] at LO with angular-ordered showering and the CH3 tune [87], and with SHERPA v2.2.10 [88][89][90] at LO, and also with two (one) extra jets for QCD multijet (Z+jet) production with the CKKW matching procedure [91][92][93]. The parton shower in SHERPA is based on the Catani-Seymour dipole factorization [94], and hadrons are formed by a modified cluster hadronization model [95].
As discussed in Refs. [41,85,87,96], the choice of α S (m Z ) used in the final-state shower evolution in MC event generators has a significant impact on the shape of jet substructure observables for quark and gluon jets. Whereas the MG5+PYTHIA8, HERWIG++, and PYTHIA8 CP2 predictions use larger values of α S (m Z ) for the final-state showering of 0.1383, 0.127, and 0.130, respectively, the PYTHIA8 CP5, HERWIG CH3, and SHERPA predictions all use a common value of 0.118. The PYTHIA8 CP2 and PYTHIA8 CP5 tunes differ mainly in their choice of α S (m Z ) that is also used in the underlying event model. They were obtained by tuning the other parameters controlling the underlying event in PYTHIA8 (except α S (m Z )) to describe LHC data [85]. Their comparison thus reflects the difference due to the choice of α S (m Z ), while keeping a good description of the underlying event. In addition to the choice of α S (m Z ), the shapes of jet substructure observables are sensitive to the description of gluon splitting to quarks and colour reconnection in the final state [5].

Event selection
The measurement of the generalized angularities ( Fig. 1) is carried out using two event samples: a Z+jet sample that is enriched in quark jets, and a dijet sample that is enriched in gluon jets. To calculate each angularity for AK4 and AK8 jets, the following event selection is carried out with the AK4 and AK8 jet collections, respectively. The Z+jet selection requires at least two reconstructed muons with p T > 26 GeV, which results in a trigger efficiency exceeding 99%. A Z boson candidate is reconstructed from two oppositely-charged muons with |η| < 2.4 and a dimuon invariant mass within 20 GeV of m Z , requiring p Z T > 30 GeV. At least one reconstructed jet with p T > 30 GeV and |y| < 1.7 that does not overlap with the muons of the Z boson candidate (∆R(j 1 , µ 1,2 ) > R = 0.4 or 0.8 for AK4 or AK8 jets, respectively) is required. The requirement on the jet rapidity ensures that jets are fully within the tracker coverage. The Z boson candidate with transverse momentum p Z T , and the jet with the largest p T , with transverse momentum p j 1 T , are required to have a p T asymmetry |p , and an azimuthal separation ∆φ(j 1 , Z) > 2 radians. The requirements on p T asymmetry and azimuthal separation are introduced to restrict the measurement to a phase space dominated by two-body scattering where a single parton is produced together with the Z boson in the hard scattering process. In this phase space large contributions from higher order QCD corrections, arising when p j 1 T is significantly larger than p Z T [97], are suppressed. The substructure of this jet is measured.
The dijet selection requires at least two reconstructed jets with p T > 30 GeV and |y| < 1.7. The two jets with largest transverse momentum, p j 1,2 T , are required to have a transverse momentum asymmetry |p T ) < 0.3, and an azimuthal separation larger than 2 radians. The angularities are calculated for these two jets. In the following, the jet with smaller |y| is labelled as the central jet, whereas the jet with larger |y| is labelled as the forward jet. Table 2 summarizes the selection criteria of the Z+jet and dijet event samples. The detectorlevel distributions are unfolded to particle-level, i.e. only considering final-state particles. At particle level the same selection criteria are applied except that the muons of the Z boson candidate are removed from the particles used as input to the jet clustering, instead of requiring ∆R(j 1 , µ 1,2 ) > R. Muons at particle level are not "dressed", i.e. radiated photons are not considered part of the muon candidate, since this has a negligible effect on the muons. Although the detector resolution of jet |y| and φ, as well as that of the muon p T , is small enough to maintain equivalent particle-and detector-level selection criteria, the resolution of jet p T is significantly worse. The measurement is defined for jets with p T > 50 GeV at particle level. However, because of resolution effects, jets with p T ∼ 50 GeV at particle level can migrate to lower transverse momentum bins at detector level. The unfolding procedure is therefore carried out for jets with p T > 30 GeV to minimise modelling uncertainties associated with such migrations.
We quantify the fraction of quark and gluon jets in these event samples using the MG5+PYTHIA8 simulation. As discussed in Ref. [5], there is no unique way of defining a quark or gluon jet, and our choice of approach is as follows. The parton from which a jet originates is obtained by first clustering jets from all generated partons (from the matrix element and the shower up to hadronization) together with the generated final-state particles, where the four-momenta of the generated partons are scaled by a very small number. The jets clustered in this way are almost identical to those clustered without generator partons because the added partons, with extremely small energy, do not affect the clustered jet momentum. The quark or gluon with highest p T in the jet then defines the origin of the jet. Figure 2 shows the resulting fraction of AK4 jets which are gluon jets in the Z+jet sample, and central and forward dijet samples as a function of jet p T . The Z+jet sample is enriched in quark jets, making up 64-76% of the jets in the sample. The central and forward dijet samples are dominated by 69-72% gluon jets at lowest p T , and by 55-68% quark jets at highest p T . The central dijet jets have a consistently higher fraction of gluon jets than the forward dijet jets across the entire p T range. The fraction of gluon jets was also computed with HERWIG++, PYTHIA8 CP5, PYTHIA8 CP2, HERWIG7 CH3 and SHERPA. Although the gluon jet fraction in all generators agrees with MG5+PYTHIA8 within 6% in the dijet samples, up to 25% lower gluon fractions are predicted in the Z+jet sample. The spread reflects the rather large uncertainties associated with LO MC generator predictions.
In Fig. 2, we consider all quark flavours, except for top, to identify quark jets. Typically, however, quark jets are considered to be jets initiated only by light-flavour quarks (u, d, s). The Z+jet and dijet samples also have a small contribution of 5-15% from jets initiated by heavyflavour quarks (b, c), whose behaviour was measured in Ref. [41], and is in between that of light-flavour quark jets and gluon jets. Their presence will therefore reduce the expected differences between jets from the Z+jet and dijet samples.
The p T distribution in the CMS data and simulation for the Z+jet and central dijet AK4 jets is shown in Fig. 3. The data distribution in bins of jet p T collected with prescaled triggers (see Table 1) is multiplied by the corresponding prescale factors to reproduce the smoothly falling physical spectrum. The data distribution is compared with predictions from two generator configurations, where the total number of events in the simulations has been scaled to match the data distribution. In both selection regions, the MG5+PYTHIA8 simulation provides a better description of the experimental data, and is therefore used to estimate the detector response consistently in both samples in this paper, whereas HERWIG++ is used to evaluate the systematic uncertainties. The disagreements observed in the Z+jet and central dijet p T distributions indicate a sensitivity to higher order corrections not fully modelled in these two generators. These differences have, however, a smaller effect on the jet substructure distributions that are measured in bins of jet p T .

Jet substructure observables
For the AK4 and AK8 jets of interest, we compute and measure five generalized angularities as defined in Equation 1 that discriminate quark and gluon jets. Of these variables, only LHA, width, and thrust are IRC-safe as a result of having κ = 1 [98]. However, multiplicity and (p D T ) 2 have been widely used for quark and gluon discrimination [51,52]. For β > 1, the antik T jet axis is used to calculate ∆R. For β ≤ 1, ∆R is calculated using the jet axis constructed by the winner-takes-all (WTA) method [98]. The use of the WTA axis significantly simplifies theoretical calculations of the angularities, whereas for angularities with β > 1 computation is feasible with both axis definitions. However, using the anti-k T jet axis results in an observable that is more akin to the jet mass [42]. To calculate the WTA axis, the constituents of the antik T jet are reclustered using the p T -based WTA scheme. The resultant jet has the (y, φ) of the constituent with the largest p T . The calculation of multiplicity includes only constituents with p T > 1 GeV, at both the detector and particle levels.
Each of the angularities is designed to emphasize a particular feature of the jet. The LHA, width, and thrust include both the momentum fraction and the angular distribution of the constituents within the jet. Since the weighting of the angular distribution differs across these variables, comparing them can highlight differences in the constituent positions within the jet. We expect gluon jets to have, on average, larger values of these angularities due to the larger number of constituents that are further from the jet axis. In contrast, (p D T ) 2 places more value on those higher-momentum constituents, irrespective of their position in the jet. With gluon jets typically having more lower-momentum constituents, we therefore expect them to generally have smaller values of (p D T ) 2 . Whereas the three IRC-safe angularities are particularly sensitive to the modelling of perturbative emissions in jets, the other two have larger contributions from nonperturbative effects and are thus subject to larger uncertainties in their predictions and measurements.
In this paper, we measure each substructure observable with various configurations. Each quantity is computed in multiple bins of jet p T , and for two different jet radius parameters, R = 0.4 and 0.8. For each quantity, we define a variant where the observables is calculated using only the charged constituents in anti-k T algorithm ("charged"). While observables computed with both charged and neutral constituents can be described more easily from first-principle calculations, the charged variants can be measured with a better resolution as a result of the high efficiency and precision of the tracking detector.
Additionally, we compute a groomed variant of each observable, where the jet is reclustered with the Cambridge-Aachen (CA) algorithm [99], and then groomed using the modified massdrop algorithm [100,101], known as the soft drop algorithm [56]. This splits the jet into two subjets by undoing the last step of the CA jet clustering. It regards the jet as the final soft drop jet if the two subjets satisfy the condition: where p (1) T and p (2) T are the transverse momenta of the two subjets, R is the jet radius parameter, ∆R 12 = (∆y 12 ) 2 + (∆φ 12 ) 2 is the distance between the two subjets, and z cut and β sd are tunable parameters of the soft drop algorithm, set to z cut = 0.1 and β sd = 0 in this study. If the condition is not met, the subjet with the lower p T is rejected. The declustering procedure is repeated splitting the higher p T subjet into two subjets by undoing another CA clustering step, iterating until the condition is met. This grooming procedure removes soft and wideangle radiation from the jet, thereby making the jet substructure observables more resilient to effects from pileup, underlying event, and initial-state radiation. In all cases, the jet p T used for binning is taken from the original ungroomed anti-k T jet (i.e. using charged+neutral constituents, before calculating the WTA axis, and before grooming). A summary of all variations of the observables and the p T ranges measured in this paper is given in Table 3. Whereas the measurements are carried out for all variations of the observables and made public in HEP-Data record [102], this paper will focus on representative distributions, typically taking the ungroomed charged+neutral distribution with R=0.4 at 120 < p T < 150 GeV as a reference for comparisons. This p T range is chosen as a compromise between lowest p T with highest enrichment in gluons for the dijet central sample and higher p T yielding lower statistical uncertainties for the dijet sample (see Table 1) and lower systematic uncertainties because of better jet energy resolutions.

Unfolding and systematic uncertainties
The experimental data distributions are unfolded to particle level using unregularized unfolding as implemented in the TUNFOLD package [103]. The particle-level distribution and covariance matrix are obtained by minimizing , where A is the particle-to-reconstructed phase space migration matrix, V yy is an estimate of the variance of the observations y, and x is an estimator of the true particle-level values. Here, V yy is a diagonal matrix formed by the square of the bin errors. The migration matrix in the 2D phase space of (p T , λ κ β ) is derived from the MG5+PYTHIA8 simulation, matching particle-level jets with detector-level jets by requiring an angular separation of less than half of the jet radius parameter. Before the detector-level distribution is unfolded, background from misreconstructed jets, i.e. jets that pass the reconstruction criteria, but fail the particle-level selection, are subtracted. The proportion of misreconstructed jets ranges from <10% for jets at 50 GeV to <1% at 1 TeV. The contribution from other standard model processes was negligible. The migration matrix also includes those particle-level jets that fail the reconstruction criteria, to correct for the reconstruction inefficiency.
The binning of the migration matrix includes the detector resolution. We define purity as the fraction of reconstructed events that are generated in the same (p T , λ κ β ) bin, and stability as the fraction of generated events that are reconstructed in the same bin. Both quantities are ≥50% in most bins. The detector-level bins have half the width of the particle-level bins in both p T and λ κ β , to guarantee an over constrained system for the minimization carried out during unfolding. Various sources of systematic uncertainty are considered, which are treated as uncorrelated among each other, but correlated in the (p T , λ κ β ) plane. These are divided into experimental sources, uncertainties related to the theoretical model used to derive the migration matrix, and inherent unfolding uncertainties. Experimental and physics model uncertainties are treated by constructing variations in the migration matrix, and propagating the changes to the final unfolded 1D distribution.
Uncertainties in the calibration of the jet energy scale (JES) and jet energy resolution (JER) measurement [66] are included by varying the jet p T when constructing the migration matrix. Furthermore, we account for variations in the measurement of the particle-flow candidates, which are propagated to the computation of the jet substructure observables in the migration matrix. These include variations of the charged-hadron momentum scale by ±1%, photon energy scale by ±1%, and neutral-hadron energy scale by ±3% [63,66] that are estimated from single-particle response measurements [104]. We find uncertainties related to the angular resolution of the particle-flow reconstruction and charged-hadron reconstruction efficiency and momentum resolution are negligible. The simulation of pileup is tuned to match the particle multiplicities and p T spectra observed in data with a dedicated MC tune [85]. We estimate the uncertainty in jet p T and substructure observables related to pileup by reweighting the simulated events within the uncertainty of the distribution of the average number of interactions per bunch crossing. For the Z+jet region, this corresponds to a ±4.6% shift in the total inelastic cross section [105], whereas for the dijet region it is a +4.6/−13.8% shift to account for larger variations coming from the use of prescaled triggers. Since the luminosity collected by prescaled triggers varies over the data-taking periods, the pileup profile differs from the pileup distribution of the full dataset. Uncertainties related to the inelastic pp model [76] used to simulate pileup interactions are assumed to be negligible, because the variations in the model would produce a smaller impact on the charged particle multiplicity than the changes in the average number of interactions per bunch crossing (discussed above). No uncertainty related to the integrated luminosity is considered since this affects only the overall normalization, and therefore does not affect the normalized distributions presented in this paper. Uncertainties in the muon reconstruction and identification are negligible.
Uncertainties because of the choices of the factorization and renormalization scales (µ F , µ R ) are computed from the envelope of six variations by factors (0.5, 0.5), (0.5,1), (1,0.5), (2,2), (2,1) and (1,2) [106,107]. The uncertainty from the PDFs is determined using the LO NNPDF replica set, where the root-mean-square of 100 pseudo-experiments provided by the PDF set represents the uncertainty. For the MG5+PYTHIA8 prediction, these uncertainties related to scale choices and PDFs mainly vary the prediction of jet momenta, whereas the substructure observables are only mildly modified. The uncertainty in the modelling of the parton shower and hadronization process is estimated using HERWIG++ to construct an alternative response matrix. The statistical uncertainty in the HERWIG++ simulation is small compared to the parton shower and hadronization uncertainty. This uncertainty is treated as single-sided when building the covariance matrix, since the experimental data distributions are generally enveloped by the MG5+PYTHIA8 and HERWIG++ predictions. We tested that rescaling the p T spectrum of the HERWIG++ prediction to match that of the MG5+PYTHIA8 prediction does not significantly change the estimated uncertainty, confirming that this uncertainty mainly captures differences in the shower and hadronization modelling of substructure rather than the modelling of the p T spectrum. We conclude that the mismodeling of the p T spectrum in Fig. 3 has a negligible effect on the result.
The statistical uncertainty in the simulation that is used to derive the migration matrix is propagated through the unfolding and results in a contribution to the covariance matrix. The statistical uncertainties in the presented distributions correspond to those of the input experimental data, whereas statistical uncertainties in the response matrix are considered as part of the systematic uncertainties. The unfolding procedure reproduces the particle-level distributions when unfolding a statistically independent sample from the same generator that is used to derive the response matrix. Figure 4 shows the distributions of two representative jet substructure observables in experimental data, computed using AK4 jets with 120 < p T < 150 GeV in the central dijet and Z+jet regions, at both the detector and particle levels. The discrimination power between quark and gluon jets for each observable can be deduced by comparing the distributions in the Z+jet and central dijet regions that are enriched in quark and gluon jets, respectively. As previously mentioned, gluon jets exhibit typically larger values of LHA, width, thrust, and multiplicity, and smaller values of (p D T ) 2 , than quark jets. The distortion of the distribution by the detector effects can be inferred by comparing the detector-level data distributions with their corresponding particle-level (unfolded) data distributions. The charged (p D T ) 2 (Fig. 4, right) is an example where the difference between detectorlevel data and particle-level (unfolded) data is small as a result of considering only charged constituents. To aid comparison, the mean of each distribution is quoted. The mean is computed from the binned distribution, which is an approximation to the real mean of the underlying distribution, treating experimental data and simulation consistently, both at detector and particle levels. Although the uncertainty quoted in the mean of the detector-level data is purely statistical, the uncertainty quoted for the mean in particle-level (unfolded) data also includes systematic uncertainties.  Figure 4: Detector-level and particle-level (unfolded) experimental data distributions of LHA (λ 1 0.5 ) (left) using charged+neutral constituents, and (p D T ) 2 (λ 2 0 ) (right) using only charged constituents, for jets with 120 < p T < 150 GeV in the Z+jet (red) and central dijet (black) regions. The detector-level data uncertainties are statistical. The particle-level (unfolded) data uncertainties include the systematic components. Also shown is the mean of each distribution, calculated from the binned data. The ratio plots show the ratio of dijet to Z+jet distributions, for both the detector-and particle-level distributions. jets with 120 < p T < 150 GeV in the central dijet region before (left) and after (right) the distribution for the given p T bin is normalized. Although the JES uncertainty has the largest effect in the yield variations due to migration across the jet p T boundaries, it plays a limited role for the normalized jet substructure distributions, where the shower and hadronization model uncertainty is typically dominant. The shape of the shower and hadronization uncertainty is related to some extend to the shape of the LHA (λ 1 0.5 ) distribution of HERWIG++, exhibiting a lower mean than MG5+PYTHIA8. The total systematic uncertainty is computed from the sum in quadrature of statistical and systematic uncertainties in each bin.

Results and discussion
Experimental data distributions unfolded to particle level in the Z+jet and central dijet regions are presented in Figs. 6-8 for the ungroomed substructure observables of AK4 jets with 120 < p T < 150 GeV. The data are compared with MG5+PYTHIA8 and HERWIG++ predictions at particle level. For the IRC-safe observables (LHA, width, and thrust) in the Z+jet region, the data are also compared with predictions from Ref. [57] based on analytic resummation of large logarithms at next-to-leading logarithmic accuracy (NLL), matched to the exact NLO prediction, plus nonperturbative (NP) corrections derived from Sherpa, labeled NLO+NLL'+NP, where the superscript ' of NLL indicates that a matching procedure keeping track of the jet flavour is used. The uncertainty band of the NLO+NLL'+NP prediction includes variations of For intermediate values of λ κ β the prediction has small contributions from NP effects, which dominate at very low and high values of λ κ β . As a simple quantitative assessment of agreement of each prediction with CMS data, values of χ 2 /N bins are quoted, where N bins is the number of bins of the distribution. The χ 2 /N bins is computed including the covariance between the measured cross sections including all statistical and experimental systematic uncertainties. For the NLO+NLL'+NP prediction theoretical uncertainties are taken into account as well, whereas for the MC generator prediction no theoretical systematic uncertainties are considered.
The agreement of the MG5+PYTHIA8 and HERWIG++ simulations with the experimental data varies depending on the observable and sample. In both Z+jet and central dijet regions, the MG5+PYTHIA8 and HERWIG++ predictions tend to envelop the experimental data distributions. The significant difference between the MG5+PYTHIA8 and HERWIG++ predictions reflects the uncertainty in the prediction from MC event generators. These two predictions differ in the modelling of perturbative (matrix element, parton shower) and nonperturbative (hadronization) effects. In the Z+jet region, MG5+PYTHIA8 provides the best description of all angularities. HERWIG++ predicts smaller values than MG5+PYTHIA8 for all observables, except (p D T ) 2 . In the central dijet region, the agreement of both generators is worse than in the Z+jet region. Here, HERWIG++ shows a slightly better description of the experimental data than MG5+PYTHIA8, except for multiplicity and (p D T ) 2 . The NLO+NLL'+NP prediction describes the thrust well, with a χ 2 /N bins below unity. The description of the width distribution is slightly worse, in particular at very low and high values of width, where NP effects contribute. The LHA shows a significant disagreement, indicating the need for further investigation of this prediction. These observations are consistent with the conclusions from Ref. [57] where reasonable control of the thrust and width was obtained, although the LHA was not well under control. In the following, we focus on the LHA distribution where data may guide investigations. Figures 9-10 show multiple variants of LHA. Based on the χ 2 /N bins , the level of data-tosimulation agreement of MG5+PYTHIA8 for AK4 jets with 408 < p T < 1500 GeV in the Z+jet region (Fig. 9, upper left) is worse than in the low-p T Z+jet region (Fig. 8, left). The level of data-to-simulation agreement of MG5+PYTHIA8 for 1 < p T < 4 TeV in the central dijet region (enriched in quark jets, Fig. 9, upper right) is better than in the low-p T central dijet region (enriched in gluon jets, Fig. 8, right). For HERWIG++, the level of agreement changes in opposite directions going from low-p T to high-p T . This is consistent with the hypothesis that the level of agreement is related to the expected gluon fraction in the sample. The NLO+NLL'+NP prediction yields a similar level of agreement at low-p T and high-p T .
The LHA distribution for AK8 jets with 120 < p T < 150 GeV (Fig. 9 lower) is similar to that for AK4 jets. Its level of agreement between data and the MG5+PYTHIA8 and NLO+NLL'+NP predictions is worse than for AK4 jets, although the level of agreement remains more similar for HERWIG++. Compared to LHA from charged+neutral constituents, the charged LHA distribution for ungroomed AK4 jets with 120 < p T < 150 GeV (Fig. 10 upper) is binned more finely taking advantage of the good track resolution, resolving LHA values well below 0.1. The data-to-theory agreement is similar in magnitude and shape above 0.1, though the χ 2 /N bins is worse, suggesting that the majority of the underlying difference to experimental data in the theory prediction may be probed using a charged observable. This measurement is consistent with conclusions in Ref. [57], where the NP corrections were very similar between charged+neutral and charged observables.
The LHA distribution for groomed AK4 jets with 120 < p T < 150 GeV (Fig. 10 lower) is wider than for ungroomed jets. The data-to-theory agreement for LHA in groomed jets is similar compared with that in ungroomed jets. Although grooming is expected to reduce the influence from pileup, underlying event, and initial-state radiation, which are all difficult to model, we find no significant improvement in the description of LHA with grooming. This measurement is consistent with conclusions in Ref. [57], where groomed LHA has a large remaining contribution from NP effects.
To study the behaviour of the measured jet substructure observables in the dimensions summarized in Table 3, we focus on the mean value of each substructure distribution. Figure 11 shows the measured mean of ungroomed LHA (λ 1 0.5 ) for AK4 jets as a function of jet p T , including also predictions from more recent generators. The mean decreases with the jet p T in both the Z+jet and central dijet regions, as a result of constituents being located closer to the jet axis due to the larger Lorentz boost at higher p T . This trend is displayed by all generators. The mean in the Z+jet region is well-described at low p T and overestimated at high p T by MG5+PYTHIA8, whereas all other predictions significantly underestimate it across the entire p T range. In the central dijet sample, MG5+PYTHIA8 and HERWIG++ generator predictions deviate significantly from the measurement. Although MG5+PYTHIA8 predicts a larger mean than that measured in  2 ) and (lower) ungroomed width (λ 1 1 ) in 120 < p T < 150 GeV in the Z+jet region (left) and central dijet region (right). The error bars on the data correspond to the total uncertainties. For the NLO+NLL'+NP prediction, the theory uncertainty is displayed as a red hashed band. The coarse-grained blue hashed region in the ratio plot indicates the statistical uncertainty of the data, and the fine-grained grey hashed region represents the total uncertainty. The lowest bin extends down to λ κ β >= 0. 0.5 ) in 120 < p T < 150 GeV in the Z+jet region (left) and central dijet region (right). The error bars on the data correspond to the total uncertainties. For the NLO+NLL'+NP prediction, the theory uncertainty is displayed as a red hashed band. The coarse-grained blue hashed region in the ratio plot indicates the statistical uncertainty of the data, and the fine-grained grey hashed region represents the total uncertainty. The lowest bin extends down to λ κ β >= 0.
the experimental data, HERWIG++ underestimates it across the whole p T range. The HERWIG7 CH3 and PYTHIA8 CP2 simulations provide the best description, followed by PYTHIA8 CP5 and SHERPA. One can infer from this observation that the data-to-simulation agreement depends on how generators model the difference between quark and gluon jets, consistent with previous measurements of observables related to jet fragmentation [23]. Figure 12 summarizes the behaviour of jet substructure observables across all dimensions under study using the mean of each distribution. The upper plot in Fig. 12 shows the results from CMS data and the MG5+PYTHIA8 and HERWIG++ simulations. Across most observables and dimensions, the measurements are enveloped by the two generator predictions. The behaviour as a function of p T and jet radius parameter as well as the behaviour of charged and groomed observables discussed previously for LHA holds also for the other substructure observables. Figure 12 (lower) compares the same measured data to predictions from more recent generators. In the gluon-enriched sample, HERWIG7 CH3, PYTHIA8 CP5, PYTHIA8 CP2, and SHERPA generally provide a better description than either HERWIG++ or MG5+PYTHIA8. In the quarkenriched sample, MG5+PYTHIA8 provides the best description, followed by HERWIG7, PYTHIA8 CP2, SHERPA, HERWIG++, and PYTHIA8 CP5. Thus, the previous observations in Fig. 12 (upper) of more accurate modelling of quark jet substructure, and less accurate modelling of gluon jet substructure, does not necessarily hold for more recent generators and tunes under study. Improved modelling of gluon jets at the cost of poorer modelling of quark jets is observed. In addition to the more recent generator versions and tunes used for HERWIG and PYTHIA8, an  (   Figure 9: Particle-level distributions of (upper) ungroomed AK4 LHA (λ 1 0.5 ) in 408 < p T < 1500 GeV in the Z+jet region (left) and in 1 < p T < 4 TeV in the central dijet region (right) and (lower) ungroomed AK8 LHA (λ 1 0.5 ) in AK8 120 < p T < 150 GeV in the Z+jet region (left) and central dijet region (right). The error bars on the data correspond to the total uncertainties. For the NLO+NLL'+NP prediction, the theory uncertainty is displayed as a red hashed band. The coarse-grained blue hashed region in the ratio plot indicates the statistical uncertainty of the data, and the fine-grained grey hashed region represents the total uncertainty. The lowest bin extends down to λ κ β >= 0.  (   Figure 10: Particle-level distributions of (upper) ungroomed charged AK4 LHA (λ 1 0.5 ) and (lower) groomed AK4 LHA (λ 1 0.5 ) in 120 < p T < 150 GeV in the Z+jet region (left) and central dijet region (right). The error bars on the data correspond to the total uncertainties. For the NLO+NLL'+NP prediction, the theory uncertainty is displayed as a red hashed band. The coarse-grained blue hashed region in the ratio plot indicates the statistical uncertainty of the data, and the fine-grained grey hashed region represents the total uncertainty. The lowest bin extends down to λ κ β >= 0.  (1) ungroomed AK4 120 < p T < 150 GeV, (2) ungroomed AK4 1 < p T < 4 TeV, (3) ungroomed AK8 120 < p T < 150 GeV, (4) ungroomed charged AK4 120 < p T < 150 GeV, and (5) groomed AK4 120 < p T < 150 GeV; shown for each of the observables LHA (λ 1 0.5 ), width (λ 1 1 ), thrust (λ 1 2 ), multiplicity (λ 0 0 ), and (p D T ) 2 (λ 2 0 ). The central jet in the dijet region is used for the gluon-enriched sample, whereas for the quarkenriched sample the jet in the Z+jet region is used for 120 < p T < 150 GeV, and the forward jet in the dijet region is used for 1 < p T < 4 TeV. The upper and lower plots show the same data distribution compared with different generator predictions. The error bars on the data correspond to the total uncertainties. The error bars on the simulation correspond to the statistical uncertainties.
important difference among the predictions is the choice of α S (m Z ) in the final state shower, to which the jet substructure modelling is sensitive [41]. The PYTHIA8 CP2 prediction using α S (m Z ) = 0.130 predicts systematically larger values of LHA, width, thrust, and multiplicity, and smaller value of (p D T ) 2 , for both quarks and gluons than the PYTHIA8 CP5 prediction using α S (m Z ) = 0.118. Figure 13 shows the ratio of the means in the gluon-and quark-enriched samples, illustrating how well the generators model both the differences between quarks and gluons, and the relative quark/gluon composition of the samples. As previously mentioned, gluon jets exhibit, on average, larger values of LHA, width, thrust, and multiplicity, and smaller values of (p D T ) 2 , than quark jets. At low p T , the ratio of the means in the central dijet and Z+jet regions is significantly larger than unity for LHA, width, thrust, and multiplicity, and significantly smaller for (p D T ) 2 , indicating that these observables have significant separation power between quark and gluon jets. Strikingly, all generators overestimate the difference between quark and gluon jets at low p T when compared with experimental data. At high p T , however, the ratio of the means in the central and forward dijet regions is significantly closer to unity, and all generators give a reasonable description of the ratio that is consistent with that measured in data. The gluonand quark-enriched samples are each expected to have a more equal relative quark/gluon jet composition at this p T (see Fig. 2). The best overall data-to-simulation agreement for the ratio is achieved by SHERPA, followed by HERWIG++, MG5+PYTHIA8, HERWIG7 CH3, PYTHIA8 CP2, and PYTHIA8 CP5. The PYTHIA8 CP2 and CP5 predictions are very similar, showing that the quark and gluon discrimination power is not governed by the choice of α S (m Z ) in the final state shower, but rather by the model used for jet fragmentation. Comparing the behaviour of SHERPA LO to LO+jet, the effect of the additional outgoing partons is an improvement in the description of the ratio of the means in the gluon-and quark-enriched samples. Since the additional outgoing partons can affect the fraction of gluon jets in the signal regions, this demonstrates that the measurement of the ratio provides information not only about the differences between quarks and gluons, but also about the fraction of gluon jets. A higher order prediction of the angularities in the dijet region, matching the precision of the NLO+NLL'+NP predictions in the Z+jet region, may yield better understanding of the source of this disagreement.

Summary
Measurements of distributions of generalized jet angularities in proton-proton collision data taken by the CMS detector at √ s = 13 TeV in dijet and, for the first time, also in Z+jet topologies have been presented. Whereas the dijet topology allows access to a sample of jets that predominantly originate from gluon fragmentation, the Z+jet topology yields a sample enriched in quark-initiated jets. Five generalized angularities are measured to study different features in the modelling of jet substructure. Three infrared-and collinear-safe angularities are particularly sensitive to perturbative emissions in jets, whereas the other two have larger contributions from nonperturbative effects. For the first time, a measurement of angularities with different jet radii, both with and without the application of a grooming algorithm, was carried out to further discriminate between different features in the modelling. Although a subset of these distributions was discussed in this paper, the full range of measurements is made public in HEPData record [102]. The measurements for quark and gluon jets yield values in between the predictions from the MADGRAPH5 aMC@NLO+PYTHIA8 and HERWIG++ simulations. The quality of modelling for the infrared-and collinear-safe angularities is sensitive to the quark and gluon composition of the sample of jets. Although grooming is expected to reduce the influence from pileup, underlying event, and initial-state radiation, which are all  Figure 13: Ratio of the mean of substructure observables in regions with gluon-and quarkenriched jets, for the following configurations: (1) ungroomed AK4 120 < p T < 150 GeV, (2) ungroomed AK4 1 < p T < 4 TeV, (3) ungroomed AK8 120 < p T < 150 GeV, (4) ungroomed charged AK4 120 < p T < 150 GeV, and (5) groomed AK4 120 < p T < 150 GeV; for the observables LHA (λ 1 0.5 ), width (λ 1 1 ), thrust (λ 1 2 ), multiplicity (λ 0 0 ), and (p D T ) 2 (λ 2 0 ). The central jet in the dijet region is used for the gluon-enriched sample, whereas for the quark-enriched sample the jet in the Z+jet region is used for 120 < p T < 150 GeV, whereas the forward jet in the dijet region is used for 1 < p T < 4 TeV. The upper and lower plots show the same data distribution compared with different generator predictions. The error bars on the data correspond to the total uncertainties. The error bars on the simulation correspond to the statistical uncertainties.
difficult to model, we find no significant improvement in the description with grooming. A calculation based on analytic resummation of large logarithms of the collinear-safe angularities in the Z+jet topology at next-to-leading order + next-to-leading logarithm (NLO+NLL') accuracy including nonperturbative effects best describes the thrust with a χ 2 /N bins (where N bins is the number of bins of the distribution) as low as 4.1/5.0, whereas the Les Houches Angularity was described significantly worse with χ 2 /N bins up to 58/8. A comparison of the means of the angularities in quark-and gluon-enriched data samples demonstrated their discrimination power, which was overestimated by all generators under study, showing a clear need for improvements in the simulation.
knowledge the computing centres and personnel of the Worldwide LHC Computing Grid and other centres for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the follow-  [17] ATLAS Collaboration, "Measurement of the charged-particle multiplicity inside jets from √ s = 8 TeV pp collisions with the ATLAS detector", Eur. Phys. J. C 76 (2016) 322, doi:10.1140/epjc/s10052-016-4126-5, arXiv:1602.00988.