Measurement of b jet shapes in proton-proton collisions at √s = 5.02 TeV

We present the first study of charged-hadron production associated with jets originating from b quarks in proton-proton collisions at a center-of-mass energy of 5.02 TeV. The data sample used in this study was collected with the CMS detector at the CERN LHC and corresponds to an integrated luminosity of 27.4 pb−1. To characterize the jet substructure, the differential jet shapes, defined as the normalized transverse momentum distribution of charged hadrons as a function of angular distance from the jet axis, are measured for b jets. In addition to the jet shapes, the per-jet yields of charged particles associated with b jets are also quantified, again as a function of the angular distance with respect to the jet axis. Extracted jet shape and particle yield distributions for b jets are compared with results for inclusive jets, as well as with the predictions from the pythia and herwig++ event generators.


Introduction
Jets, the collimated showers of particles produced by fragmentation and hadronization of hard-scattered quarks or gluons, are long established experimental probes for studies of quantum chromodynamics (QCD) [1]. The internal structure of the jet, defined by the energy, momentum, and spatial distribution of its constituents, is sensitive to the details of the evolution from an initial hard scattering through fragmentation and hadronization into observable hadrons in the final state. The angular distributions of constituent particle yields and jet shapes, studied in this work, are affected by parton fragmentation and hadronization processes. At high transverse momenta (p T ) with respect to the beam direction in the core of the jet, the dominant contribution to these distributions is set by the initial branching of the hard scattered parton which is calculable in perturbative QCD (pQCD). However, for lower p T particles and those at larger radial distances from the jet direction, higher order corrections and nonperturbative processes become of major importance. Characterizing the effect of these additional contributions on the internal structure of jets remains challenging for theoretical calculations [2][3][4].
In this paper, the internal structure of jets is studied at the charged particle level using the data from proton-proton (pp) collisions at a center-of-mass energy of √s = 5.02 TeV. These data, corresponding to an integrated luminosity of 27.4 pb⁻¹, were collected by the CMS experiment in 2015. For this study, b jets are defined by the presence of at least one b quark, which is inferred from the properties of b hadron decays. A b jet sample selected via a combined secondary vertex (CSV) discriminator [5] is composed of jets initiated by a single bottom quark, as well as of a contribution from bb̄ pairs produced from gluon splitting.

JHEP05(2021)054
Jet-correlated charged particle transverse momentum distributions, referred to as jet shapes, are measured as a function of radial distance ∆r = √((∆η)² + (∆φ)²) from the jet axis. Here ∆η = η_jet − η_trk and ∆φ = φ_jet − φ_trk are the pseudorapidity and azimuthal differences between the jet axis and a given charged particle, respectively. To extend the jet shape measurements further into the region where nonperturbative effects dominate, we use a jet-track correlation technique [6,7]. This method has been shown to reliably subtract the part of the event unrelated to the hard scattering (the underlying event), as well as the contribution of additional pp interactions in the same or nearby bunch crossings (pileup). We study the p_T-differential distributions of jet shapes and particle yields for b jets. By comparing these measurements with the results for inclusive jets and with herwig++ [8] and pythia [9,10] simulations for the b jet and inclusive jet shapes at large angles from the jet axis, this study provides new constraints on pQCD calculations, as well as on the nonperturbative contribution to jet shapes. This measurement also constitutes a baseline for future measurements of the same observable at the same per-nucleon center-of-mass energy in PbPb collisions, which will probe the parton flavor dependence of the interaction of jets with the quark-gluon plasma [11] that is created in high energy heavy-ion collisions.
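The radial distance defined above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code: the function names are hypothetical, and wrapping ∆φ into [−π, π) is an assumption (the text only gives ∆φ = φ_jet − φ_trk), added so that a jet and a track on either side of the ±π boundary are not assigned a spuriously large separation.

```python
import math


def delta_phi(phi_jet, phi_trk):
    """Azimuthal difference phi_jet - phi_trk, wrapped into [-pi, pi).

    The wrapping is an assumption of this sketch, not stated in the text.
    """
    dphi = phi_jet - phi_trk
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta_jet, phi_jet, eta_trk, phi_trk):
    """Radial distance sqrt(deta^2 + dphi^2) between a jet axis and a track."""
    deta = eta_jet - eta_trk
    dphi = delta_phi(phi_jet, phi_trk)
    return math.hypot(deta, dphi)
```

For example, `delta_r(0.0, 3.0, 0.0, -3.0)` gives the short way around the azimuth (about 0.28) rather than 6.0.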

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of barrel and endcap sections. Two forward hadron (HF) steel and quartz-fiber calorimeters complement the barrel and endcap detectors, extending the calorimeter from the range |η| < 3.0 to |η| < 5.2. Events of interest are selected using a two-tiered trigger system [12].
In the region |η| < 1.74, the HCAL cells have widths of 0.087 in both pseudorapidity η and azimuth φ. Within the central barrel region of |η| < 1.48, the HCAL cells map onto 5×5 ECAL crystal arrays to form calorimeter towers projecting radially outwards from the nominal interaction point. Within each tower, the energy deposits in ECAL and HCAL cells are summed to define the calorimeter tower energies, which are subsequently used in the particle flow algorithm to reconstruct the jet energies and directions [13]. In this work, jets are reconstructed within the η range of |η| < 1.6.
The silicon tracker measures charged particles within |η| < 2.5. It consists of 1440 silicon pixel and 15 148 silicon strip detector modules. For nonisolated particles with 1 < p T < 10 GeV in the barrel region, the track resolutions are typically 1.5% in p T and 25-90 (45-150) µm in the impact parameter direction transverse (longitudinal) to the colliding beams [14]. A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [15].

Event selection and simulated event samples
The data used in this analysis were taken in a special low-luminosity running period in which there were reduced levels of pileup (approximately 1.5 events per bunch crossing, assuming a total inelastic cross section of 65 mb [16]). The jet samples are collected with a calorimeter-based trigger that uses the anti-k_T jet clustering algorithm with a distance parameter of R = 0.4 [17]. This trigger requires events to contain at least one jet with p_T > 80 GeV, and is fully efficient for events containing jets with reconstructed p_T > 90 GeV. The data selected by this trigger are referred to as "jet-triggered" and are used to study the jet-related particle yields and for data-driven estimation of acceptance effects via an event mixing technique as described in section 5. To reduce contamination from non-collision events, such as calorimeter noise and beam-gas collisions, vertex and noise reduction selections are applied as described in refs. [18,19]. These selections include a requirement for events to contain at least 3 GeV of energy in one of the calorimeter towers in the HF on each side of the interaction point, and to have a primary vertex (PV) with at least two tracks which are consistent with originating from the same vertex within 15 cm of the center of the nominal interaction region along the beam axis (|v_z| < 15 cm).
Monte Carlo (MC) simulated event samples are used to evaluate the performance of the event reconstruction, particularly the track reconstruction efficiency, and the jet energy response and resolution. MC samples from two pythia versions and tunes (version 6.424 with the Z2 tune [20] and version 8.230 with the CP5 tune [21]) are used to simulate the hard scattering, the parton showering, and the hadronization of the partons. A sample of b jets in MC simulations is obtained from the inclusive simulated QCD jet sample by selecting the jets that are matched to a generator-level b quark within a cone of radius ∆R = 0.3 [22]. Jets from gluon splitting to bb̄ are considered b jets under this flavor definition. The Geant4 (10.02p02) [23] toolkit is used to simulate the CMS detector response. An additional reweighting procedure is performed to match the simulated v_z distribution to that observed in data. Another QCD jet MC sample is generated using herwig++ 2.7.1 with the EE5C tune [20] and is also used as a theoretical reference.

Jet and track reconstruction
Jets are reconstructed offline from the particle-flow (PF) candidates [24], clustered using the anti-k_T algorithm [17,25] with a distance parameter of R = 0.4. The PF candidates are reconstructed by the PF algorithm, which aims to reconstruct and identify each particle in an event with an optimized combination of information from the various elements of the CMS detector. Simulation-derived corrections are applied to the reconstructed jets to bring the measured energy, which is distorted by the limited detector resolution, to the particle level [13,26]. Jets with p_T > 120 GeV and |η| < 1.6 are selected to be consistent with a previous study [7].
A widely used definition of the jet axis, the anti-k_T E-scheme axis, is obtained by simply adding the four-momenta of the input particles and of the intermediate jet candidates as they are merged during the anti-k_T clustering procedure [27]. The jet axis for this work, however, is recalculated with the winner-takes-all recombination scheme [28,29], which is applied to the constituents found by the nominal anti-k_T E-scheme algorithm for each jet.
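To illustrate how a winner-takes-all axis differs from a four-momentum sum, the sketch below reclusters the constituents of a single jet and, at each pairwise merge, gives the merged object the direction of the harder of the two inputs and the scalar sum of their transverse momenta. Several choices here are assumptions made for illustration only: the function name `wta_axis`, the use of the anti-k_T distance measure for the reclustering order, and the p_T-based winner rule. The text does not specify these details.

```python
import math


def wta_axis(constituents, radius=0.4):
    """Return (eta, phi) of a winner-takes-all axis for one jet.

    constituents: iterable of (pt, eta, phi) tuples belonging to the jet.
    Assumptions of this sketch: pairs are merged in order of the anti-kT
    distance d_ij = min(pt_i^-2, pt_j^-2) * dR_ij^2 / R^2, with no beam
    distance since all inputs already belong to the same jet.
    """
    parts = [tuple(c) for c in constituents]
    while len(parts) > 1:
        best = None
        for i in range(len(parts)):
            for j in range(i + 1, len(parts)):
                pti, etai, phii = parts[i]
                ptj, etaj, phij = parts[j]
                dphi = (phii - phij + math.pi) % (2.0 * math.pi) - math.pi
                dr2 = (etai - etaj) ** 2 + dphi ** 2
                dij = min(pti ** -2, ptj ** -2) * dr2 / radius ** 2
                if best is None or dij < best[0]:
                    best = (dij, i, j)
        _, i, j = best
        # Winner takes all: keep the direction of the harder input,
        # add the transverse momenta as scalars.
        hard, soft = (parts[i], parts[j]) if parts[i][0] >= parts[j][0] else (parts[j], parts[i])
        merged = (hard[0] + soft[0], hard[1], hard[2])
        parts = [p for k, p in enumerate(parts) if k not in (i, j)] + [merged]
    return parts[0][1], parts[0][2]
```

Because the axis always coincides with the direction of one of the input particles, it is insensitive to soft radiation recoiling against the hard core of the jet, which is the usual motivation for this scheme.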

The b jet candidates for this work are selected by the CSV discriminator [14,22]. The CSV discriminator is a multivariate classifier that makes use of information about reconstructed secondary vertices (SV) as well as the impact parameters of the associated tracks with respect to the primary vertex, to discriminate b jets from charm-flavor and light-flavor jets. The working point selected for this analysis leads to a 65% b jet selection efficiency and 69% purity (the b jet fraction of all jets that passed the CSV selection criteria) from the multijet sample (referring to the background of charm jets and light jets). Possible differences in the purity between data and MC are assessed using a negative-tag technique [5]. This technique selects non-b jets using the same variables and techniques as the standard CSV algorithm both in data and the simulation to extract a scale factor, which indicates the data-to-MC difference. A correction for a bias resulting from the discriminator is discussed in section 5.
In both data and simulation, charged particles are reconstructed using an iterative tracking method [14] based on the hit information from both the pixel and silicon strip subdetectors, permitting the reconstruction of charged particles within |η| < 2.4. The tracking efficiency ranges from approximately 90% at p T = 1 GeV to no less than 90% for p T > 10 GeV. Tracks with p T > 1.0 GeV and |η| < 2.4 are used in this study.

Jet-track angular correlations
To study the distributions of the charged particles associated with jets, a two-dimensional (2D) array of the ∆η and ∆φ values of the tracks relative to the jet axis is produced. This is computed for six bins of p_T^trk bounded by the values 1, 2, 3, 4, 8, 12, and 300 GeV. Each of these 2D correlations is normalized by N_jets, the number of jets in the sample. This procedure, the same one as used in ref. [30], creates a per-jet averaged ∆η-∆φ distribution of raw charged particle densities for each p_T^trk:

$$ RS(\Delta\eta, \Delta\phi) = \frac{1}{N_{\text{jets}}} \frac{\mathrm{d}^2 N^{\text{same}}}{\mathrm{d}\Delta\eta\, \mathrm{d}\Delta\phi}, $$

where N^same represents the yield of jet-track pairs from the same event. For the jet shape measurements, the 2D correlations are weighted by p_T^trk on a per-track basis, producing a per-jet averaged ∆η-∆φ distribution of p_T^trk with respect to the jet axis direction. An event mixing method [7] is applied following the construction of the raw 2D correlations RS(∆η, ∆φ) to account for the shape of single inclusive jet and track distributions and the effects of the detector acceptance for tracks. For this correction, a mixed-event pair distribution ME(∆η, ∆φ) is constructed by using the jets from one event and the tracks from a different event, matched in the vertex position along the beam axis in 1 cm bins:

$$ ME(\Delta\eta, \Delta\phi) = \frac{1}{N_{\text{jets}}} \frac{\mathrm{d}^2 N^{\text{mix}}}{\mathrm{d}\Delta\eta\, \mathrm{d}\Delta\phi}, $$

where N^mix represents the number of jet-track pairs from the mixed event.
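A per-jet normalized 2D correlation of this kind can be sketched with NumPy. The function below is a hypothetical helper, not the analysis code: the binning, the ∆φ range of (−π/2, 3π/2), and the division by bin area are assumptions chosen for illustration. Passing the track p_T values as `weights` gives the p_T-weighted version used for jet shapes, and the same function could be filled with mixed-event pairs (jets from one event, tracks from another).

```python
import numpy as np


def per_jet_correlation(deta, dphi, n_jets, weights=None, bins=(50, 50),
                        ranges=((-4.0, 4.0), (-np.pi / 2, 3 * np.pi / 2))):
    """Per-jet density of jet-track pairs in the (deta, dphi) plane.

    deta, dphi : arrays with one entry per jet-track pair.
    n_jets     : number of jets used for the per-jet normalization.
    weights    : optional per-pair weights, e.g. track pT for jet shapes.
    Returns the density (pairs per jet per unit area) and the bin edges.
    """
    hist, eta_edges, phi_edges = np.histogram2d(
        deta, dphi, bins=bins, range=ranges, weights=weights)
    # Divide by the bin area so the result approximates d^2N / (deta dphi).
    bin_area = np.outer(np.diff(eta_edges), np.diff(phi_edges))
    return hist / (n_jets * bin_area), eta_edges, phi_edges
```

In practice one such histogram would be filled for each of the six track p_T intervals.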

The per-jet associated yield is corrected for the jet-track pair efficiency via the following relation:

$$ S(\Delta\eta, \Delta\phi) = \frac{ME(0, 0)}{ME(\Delta\eta, \Delta\phi)}\, RS(\Delta\eta, \Delta\phi). $$

The ratio ME(0, 0)/ME(∆η, ∆φ) is the normalized correction factor and ME(0, 0) is the mixed-event yield for jet-track pairs that are approximately collinear and hence have the maximum pair acceptance. The signal of b-tagged jets S_tag(∆η, ∆φ) is then corrected for residual light-flavor jet contamination. We use an approach partially relying on data for the decontamination procedure, expressed via the following equation:

$$ S_{\text{decont}}(\Delta\eta, \Delta\phi) = \frac{1}{c_{\text{purity}}} \left[ S_{\text{tag}}(\Delta\eta, \Delta\phi) - (1 - c_{\text{purity}})\, S_{\text{mistagged}}(\Delta\eta, \Delta\phi) \right], $$

where S_decont(∆η, ∆φ) and S_mistagged(∆η, ∆φ) are the signals of the (decontaminated) b jets and the mistagged light-flavor jets, respectively. The S_mistagged(∆η, ∆φ) is approximated by the inclusive jet-track correlation signal S_inclusive(∆η, ∆φ) from the data, with a modification for simulating the jet multiplicity bias that is discussed later in this section. The purity c_purity is defined as the ratio of the number of tagged true b jets to the number of jets tagged by the CSV discriminator in simulations. The resulting decontaminated signal S_decont(∆η, ∆φ) still contains the residual underlying event contribution and uncorrelated backgrounds from tracks unrelated to the selected jets. These backgrounds are then removed in a data-driven manner by using the measured charged-particle yields far from the jet axis in a large-∆η region. The ∆φ distribution averaged over 1.5 < |∆η| < 2.5 is used to estimate the ∆φ dependence of the background contribution to the correlations over the entire |∆η| < 4.0 region and is subtracted from the signal S_decont(∆η, ∆φ). After that, the background-subtracted signal pair distribution S(∆r), as a function of radius ∆r, is obtained from the integration of the corrected signal pair distribution S(∆η, ∆φ) over a ring area with radius ∆r.
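The three corrections described in this paragraph (mixed-event acceptance correction, decontamination, and sideband background subtraction) can be sketched on binned 2D arrays as follows. This is an illustrative sketch under stated assumptions, not the analysis implementation: the function names are hypothetical, arrays are indexed as `[eta_bin, phi_bin]`, and the decontamination assumes the tagged signal is a two-component mixture, S_tag = c_purity · S_b + (1 − c_purity) · S_mistagged, which is then inverted for S_b.

```python
import numpy as np


def acceptance_correct(raw, mixed, eta_centers, phi_centers):
    """Scale the raw correlation by ME(0,0) / ME(deta, dphi)."""
    i0 = np.argmin(np.abs(eta_centers))
    j0 = np.argmin(np.abs(phi_centers))
    me00 = mixed[i0, j0]
    # Bins with no mixed-event pairs get a zero correction factor.
    with np.errstate(divide="ignore", invalid="ignore"):
        factor = np.where(mixed > 0, me00 / mixed, 0.0)
    return raw * factor


def decontaminate(s_tag, s_mistagged, purity):
    """Remove the mistagged-jet component from the tagged signal.

    Assumes s_tag = purity * s_b + (1 - purity) * s_mistagged.
    """
    return (s_tag - (1.0 - purity) * s_mistagged) / purity


def subtract_sideband(signal, eta_centers, lo=1.5, hi=2.5):
    """Subtract the dphi-dependent background estimated at large |deta|."""
    side = (np.abs(eta_centers) > lo) & (np.abs(eta_centers) < hi)
    background = signal[side, :].mean(axis=0)  # one value per dphi bin
    return signal - background[np.newaxis, :]
```

With a purity of 0.69, for instance, a tagged signal of 2.38 and a mistagged signal of 1.0 in some bin give a decontaminated value of 3.0 in that bin.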
The discriminator used for b tagging relies on the properties of the SVs associated with the jet as input, therefore biasing the jet selection towards jets with a better SV or tracking resolution. This bias, though slight, is present in distributions for both true b jets selected by the tagger, and in the mistagged light-flavor jets contaminating the sample. We calculate corrections for the tagging bias as a function of ∆r from MC simulation by constructing the following per-jet normalized ratios of radial distributions:

$$ \frac{S_{\text{mistagged}}(\Delta r)}{S_{\text{inclusive}}(\Delta r)}, \qquad \frac{S_{\text{all-b}}(\Delta r)}{S_{\text{tagged-b}}(\Delta r)}, $$

where S_mistagged(∆r), S_inclusive(∆r), S_all-b(∆r), and S_tagged-b(∆r) represent the signal of tracks correlated with the mistagged jets, the inclusive light-flavor jets, all b jets, and the tagged b jets, respectively. This bin-by-bin correction is applied to the background-subtracted signal S_decont to remove the tagging bias. Finally, simulation-based corrections are applied to account for the jet axis resolution, tracking reconstruction efficiency, and the bias in the charged particle yield and jet shapes that comes from the b-tagging discriminator. A large fraction of tracks associated with b jets originate from an SV and have a slightly different reconstruction efficiency from that of tracks originating from a PV. Therefore, we derive the efficiency corrections as a function of track p_T and radial distance ∆r from the MC b jet simulation by taking the ratio of correlated signals built with reconstructed tracks over those with generated tracks. This bin-by-bin correction is applied to the signal distributions in data obtained in the previous step. All of these procedures correct the data to the particle level, which can be compared directly with theoretical calculations.
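The fully corrected 2D correlations are integrated over annular rings in the ∆η-∆φ plane (as illustrated in ref. [31]) to study the distributions of charged-particle yields Y(∆r) with respect to the jet axis as a function of ∆r for the b and inclusive-jet samples:

$$ Y(\Delta r) = \frac{1}{N_{\text{jets}}} \frac{\mathrm{d}N_{\text{trk}}}{\mathrm{d}\Delta r}, $$

where N_trk is the number of charged particles from jets. The jet shape distributions ρ(∆r), defined as

$$ \rho(\Delta r) = \frac{1}{\delta r}\, \frac{\sum_{\text{jets}} \sum_{\text{trk} \in (\Delta r_a, \Delta r_b)} p_{\mathrm{T}}^{\text{trk}}}{\sum_{\text{jets}} \sum_{\text{trk} \in \Delta r < 1} p_{\mathrm{T}}^{\text{trk}}}, $$

where ∆r_a and ∆r_b define the annular edges of ∆r, δr = ∆r_b − ∆r_a, and p_T^trk stands for the p_T of the charged particles, are also examined.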
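The two radial observables can be illustrated with a short NumPy sketch that bins tracks in ∆r. It is a simplification made for illustration: the function name is hypothetical, it works on a flat list of already background-subtracted jet-track pairs rather than on the corrected 2D histograms, and it assumes the jet shape is normalized to unit integral over the binned ∆r range.

```python
import numpy as np


def yields_and_shapes(dr, pt, n_jets, edges):
    """Per-jet yield Y(dr) and normalized jet shape rho(dr) in annular bins.

    dr, pt : one entry per jet-track pair (radial distance, track pT).
    n_jets : number of jets in the sample.
    edges  : annulus boundaries in dr, e.g. np.linspace(0.0, 1.0, 11).
    """
    dr = np.asarray(dr, dtype=float)
    pt = np.asarray(pt, dtype=float)
    edges = np.asarray(edges, dtype=float)
    widths = np.diff(edges)  # delta-r of each annulus
    counts, _ = np.histogram(dr, bins=edges)
    pt_sums, _ = np.histogram(dr, bins=edges, weights=pt)
    yields = counts / (n_jets * widths)
    # Assumed normalization: total track pT inside the binned range.
    rho = pt_sums / (widths * pt_sums.sum())
    return yields, rho
```

With this normalization, `np.sum(rho * np.diff(edges))` equals one, so shapes from samples with different jet counts can be compared directly.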

Systematic uncertainties
A number of sources of systematic uncertainties are considered, including the tracking efficiency, tagging bias corrections, decontamination procedure, jet reconstruction, acceptance corrections, and background subtraction. The systematic uncertainties are summarized in table 1, and the evaluation of each source of uncertainty is discussed below.
The tracking reconstruction efficiencies for b jet and inclusive jet tracks have been compared to account for the uncertainty in reconstruction efficiency for displaced tracks, and a maximum difference of about 4% was observed. The full magnitude of the observed difference is assigned as a conservative estimation to cover the MC-based tracking reconstruction uncertainty. To study possible differences in track reconstruction between data and simulation, a study of D meson decays was used [32]. The D meson branching fraction ratio of 3-prong to 5-prong decays was calculated in data with MC-based efficiency corrections and compared with the world-average value [33]. The observed difference is used to derive a 4% systematic uncertainty for this source. For the full tracking-related uncertainty these two errors were added in quadrature.
The uncertainty for correcting the bias induced by the CSV discriminator is dominated by the uncertainties in the contributions from gluon-splitting and primary b quarks to the b jet sample. Jets originating from different mechanisms of b-quark production (i.e. flavor creation, flavor excitation, and gluon splitting) can be studied individually in pythia simulations. We note that the fraction of b jets from gluon splitting in the pythia simulation is less than that indicated by data. The corresponding systematic uncertainty has been evaluated by varying this fraction by 20% (as estimated in refs. [5,34]), and the observed 5% difference in the correction from this variation is propagated as an uncertainty.
The decontamination procedure is affected by the uncertainties in the purity estimation. Using the negative tagging method (described in section 4) we have derived the data-to-simulation scale factor, which amounted to about 7% difference in estimated contamination levels. We evaluate the related systematic uncertainty by comparing results obtained with and without the derived scale factor; less than 5% variation is observed in the correlation results. This 5% maximum variation is taken as a systematic uncertainty for the decontamination.
The overall jet energy scale (JES) is sensitive to the relative fraction of quark and gluon jets in the sample. The energy scale uncertainty is found to be 2% for jets in the study in ref. [26]. Therefore, we varied the energy threshold of selected jets by this amount in both directions and saw no statistically significant changes in the measured jet shapes. This is not unexpected since the in-jet multiplicity and the jet fragmentation function change slowly with the jet p T . We also investigated the effects of a more conservative 5% jet energy scale uncertainty by varying the energy threshold of selected jets by 5% in both directions and repeating the analysis to uncover any possible differences with respect to the nominal result. The resulting variations in the correlated track yields are found to be below 2%. Thus, we assigned a 2% uncertainty for this source. The jet energy resolution (JER) data-to-MC difference is about 15% based on the γ+jet studies [35]. The corresponding uncertainty in the reported measurements was evaluated in a data-driven way by smearing the reconstructed jet p T by 15% and repeating the study. The resulting variation in correlation distributions was found to be below 3.5%. In total, a systematic uncertainty of 4% is assigned for the JER-and JES-related effects.
The uncertainties from the mixed-event acceptance correction are estimated by looking for an asymmetry of the sideband regions, which is defined by the difference of the sideband value between the positive and negative ∆η. Additionally, the sideband regions (1.5 < |∆η| < 2.5) that are far away from the jet axis are expected to have no short-range correlation contributions and, thus, to be independent of ∆η. Any deviations from this expectation and the measured asymmetry are used to quantify the related systematic uncertainty, which was found to be between 1 and 2%.
Uncertainties associated with the background subtraction are evaluated by considering the average point-to-point difference between two sideband regions (1.5 < |∆η| < 2.0 and 2.0 < |∆η| < 2.5) following the background subtraction. The background subtraction uncertainty is found to be roughly 3% for the lowest p_T^trk bin, where the signal-to-background ratio is the lowest, and decreases to negligible levels with increasing p_T^trk. These systematic uncertainties are treated as uncorrelated, and the total systematic uncertainty is calculated by adding the individual sources in quadrature.

Table 1. Systematic uncertainties in percentage for the measurements of the jet-track correlations. Where an uncertainty range is given, the upper edge of the range corresponds to the bin with the smallest p_T^trk values. The sources from the decontamination and tagging bias apply only to b jets.
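Adding uncorrelated sources in quadrature can be shown with a short calculation. The numbers below are the upper-edge values quoted in the text of this section, used purely as an illustration; the helper name is hypothetical, and the result should not be read as the total reported in table 1.

```python
import math


def quadrature_sum(*percent):
    """Combine uncorrelated relative uncertainties (in percent)."""
    return math.sqrt(sum(u * u for u in percent))


# Tracking: MC-based difference (4%) and data-to-MC difference (4%).
tracking = quadrature_sum(4.0, 4.0)

# Illustrative b jet total for the lowest track-pT bin, using values quoted
# in the text: tagging bias 5%, decontamination 5%, JES/JER 4%,
# acceptance 2%, background subtraction 3%.
total_b_jets = quadrature_sum(tracking, 5.0, 5.0, 4.0, 2.0, 3.0)
```

Here `tracking` evaluates to about 5.7% and `total_b_jets` to about 10.5%.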

Results
Figure 1 presents the charged-particle yields for inclusive and b jets in proton-proton collisions as a function of the radial distance ∆r from the jet axis. The results are shown with stacked histograms to indicate the intervals in p trk T , and dots to denote the total summed yields in the region 1 < p trk T < 12 GeV. It illustrates that the high-p T charged particles are mostly distributed around the small ∆r region while the larger ∆r region is dominated by the low-p T charged particles. Figure 2 compares the radial distributions of the total charged-particle yields associated with the inclusive and b jets studied in data and in pythia simulations. Total uncertainties of the measurement are dominated by the systematic source. Statistical uncertainties of the signal correlations and data-driven mixed-event acceptance correction contribute to the total statistical uncertainties of the data. The total statistical uncertainties are negligible for most data points, except at large ∆r. Statistical uncertainties of the Monte Carlo samples are accounted for in the evaluation of relevant systematic sources and propagated as part of the assigned systematic errors. It is also worth noticing that the systematic uncertainties coming from the event mixing technique are important in the larger ∆r region. Charged-particle yield distributions for both b and inclusive jets are found to be generally described by pythia predictions, although pythia 6.424 shows a better agreement with the data than that found using the pythia 8.230 predictions. The herwig++ simulation predicts a smaller excess of hadron yields in b jets over inclusive jets compared to what is observed in the data and pythia simulations. Larger charged-particle yields are observed to be associated with b jets as compared with inclusive jets, particularly in the low-∆r region (see figure 2, right). 
This larger contribution in soft tracks at small radial distance ∆r implies the presence of different fragmentation patterns and decay kinematics between the b jets and inclusive jets.
Measurements of the jet shapes ρ(∆r) are presented in figures 3 and 4. The left and right panels of figure 3 show the p_T-differential ρ(∆r) distributions for inclusive and b jets, respectively. The comparison between data and simulations from both pythia and herwig++ is presented in figure 4. We note that, while small-∆r trends are mostly well described by the pythia and herwig++ simulations for both jet selections, the distributions at larger radial distances are well described only by herwig++, indicating a shortage of soft radiative contributions in the pythia simulations. The right panel of figure 4 shows the ratio of b to inclusive jet shapes for data and simulation.
Observed variations in the ratio of jet shapes indicate a shift of transverse momentum from small to large ∆r for the constituents of the b jets compared to that carried by the particles from inclusive jets. These differences may arise from the dead-cone effect, the suppression of radiation from a charged particle with mass m_q and energy E_q in the region with emission angle θ ≲ m_q/E_q [36,37], as this phenomenon is expected to be more apparent in b jets than in inclusive jets, which mostly originate from light partons.

Figure caption (partial): the shaded boxes represent the systematic uncertainties for p_T^trk > 1 GeV, although they are generally too small to be visible.

The herwig++ simulations perform better than the pythia simulations in capturing the details of the jet shapes for both inclusive and b jets in this region. Additionally, we observe that a higher fraction of transverse momentum is distributed towards larger radial distances from the center of the jet for the b jets than for the inclusive jet sample.
A similar tendency, albeit insufficient to fully capture this trend, is seen in pythia simulations. The pythia 8.230 simulations give a slightly better description than pythia 6.424 in the larger ∆r region. On the other hand, herwig++ 2.7.1 predictions capture this trend well, as illustrated in the right panel of figure 4. The observed data-to-pythia discrepancy in the b-to-inclusive jet shape ratios at large radii may arise from the difference in the gluon splitting contributions between data and simulation, as mentioned earlier [38]. We note that Monte Carlo studies show that b and b̄ jets from gluon splitting result in significantly broader jet shapes than those of inclusive jets.
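The size of the dead cone mentioned in the discussion above can be estimated from the ratio of quark mass to quark energy. The sketch below is only an order-of-magnitude illustration: the function name is hypothetical, the b quark mass of 4.18 GeV is an assumed input that does not appear in the text, and 120 GeV is used as a representative energy because it is the jet p_T threshold of this analysis.

```python
def dead_cone_angle(mass_gev, energy_gev):
    """Approximate opening angle (radians) below which radiation is suppressed."""
    return mass_gev / energy_gev


# Assumed inputs: b quark mass of about 4.18 GeV, quark energy of 120 GeV.
theta_b = dead_cone_angle(4.18, 120.0)
```

Under these assumptions `theta_b` is about 0.035, which is small compared with the jet distance parameter R = 0.4, so any dead-cone suppression would be confined to the innermost part of the jet.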

Summary
The first measurements of charged-particle yields and jet shapes for b jets in proton-proton collisions are presented, using data collected with the CMS detector at the LHC at a center-of-mass energy of √s = 5.02 TeV. The correlations of charged particles with jets are studied, using the particles with transverse momentum p_T^trk > 1 GeV and pseudorapidity |η| < 2.4, and the jets with p_T > 120 GeV and |η| < 1.6. Charged-particle yields associated with jets are presented as functions of the relative angular distance ∆r = √((∆η)² + (∆φ)²) from the jet axis. In these studies, a larger number of associated charged particles at low ∆r is found for b jets than for inclusive jets, which are produced predominantly by gluons and light flavor quarks. The trends observed in pp data for particle yield distributions associated with both types of jets are reproduced by pythia calculations (in versions 6.424 and 8.230). In addition to the charged-particle yields, we examine the jet transverse momentum profile variable ρ(∆r), defined using the distribution of charged particles in annular rings around the jet axis, with each particle weighted by its p_T^trk value. The measured shapes of b jets are broader than those of inclusive jets. The shapes for both types of jets are reproduced by the herwig++ and pythia calculations in the small ∆r region, with herwig++ 2.7.1 giving a better agreement. Moreover, the measured transverse momentum distributions at larger ∆r are consistent with the herwig++ simulations for b and inclusive jets, with at most 1.2σ data-to-simulation differences observed for b jets. However, this trend is generally underestimated by pythia simulations.
This result provides new constraints on perturbative quantum chromodynamics calculations for flavor dependence in parton fragmentation and gluon radiation, as well as the relative contributions of different processes to b quark production. These measurements are also expected to offer an important reference for future studies of flavor dependence for parton interactions with the quark-gluon plasma formed in relativistic heavy-ion collisions.

Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.

[32] CMS collaboration, Measurement of tracking efficiency, CMS-PAS-TRK-10-002 (2010).