A Fragmentation Approach to Jet Flavor

An intuitive definition of the partonic flavor of a jet in quantum chromodynamics is often only well-defined in the deep ultraviolet, where the strong force becomes a free theory and a jet consists of a single parton. However, measurements are performed in the infrared, where a jet consists of numerous particles and requires an algorithmic procedure to define their phase space boundaries. To connect these two regimes, we introduce a novel and simple partonic jet flavor definition in the infrared. We define the jet flavor to be the net flavor of the partons that lie exactly along the direction of the Winner-Take-All recombination scheme axis of the jet, which is safe to all orders under emissions of soft particles, but is not collinear safe. Collinear divergences can be absorbed into a perturbative fragmentation function that describes the evolution of the jet flavor from the ultraviolet to the infrared. The evolution equations are linear and a small modification to traditional DGLAP and we solve them to leading-logarithmic accuracy. The evolution equations exhibit fixed points in the deep infrared, we demonstrate quantitative agreement with parton shower simulations, and we present various infrared and collinear safe observables that are sensitive to this flavor definition.


Introduction
The idea of quark and gluon jets appears so intuitive that it is absolutely necessary to provide a robust, precise theoretical definition of what one means. Of course, jet flavor exists only in perturbation theory and cannot be measured directly from detected hadrons, but it can have significant consequences for fixed-order matching, parton shower tuning, parton distribution function extractions, new physics searches, etc., and anywhere else the notion of jet flavor is vital for interpretation. A first infrared and collinear (IRC) safe definition of jet flavor was proposed in Ref. [1] (BSZ), in which the $k_T$ jet clustering algorithm [2,3] was modified to ensure that soft particles could not affect a jet's flavor, as determined by summing over the flavors of particles in a jet. While this demonstrated features necessary to have a well-defined notion of jet flavor, it left a lot to be desired. In particular, experiments cannot find jets with a flavor-sensitive jet algorithm, and in any case they almost exclusively use the anti-$k_T$ algorithm [4].
In a companion paper [5], we propose exploiting jet grooming to define a flavor algorithm that is IRC safe through next-to-next-to-leading order, while being experimentally viable. In this paper, we propose a novel definition of jet flavor that can be applied to an arbitrary collection of partons as found by an arbitrary jet algorithm.
A robust, theoretically well-defined jet flavor definition is vital for connecting precision theoretical predictions to data. Numerous predictions for processes involving flavored jets at the Large Hadron Collider (LHC) exist at next-to-leading order (NLO) [6][7][8][9][10][11][12], with most relevant NLO predictions now automated in standard codes [13][14][15][16][17][18][19]. Recently, significant progress has been made on predictions at next-to-next-to-leading order for heavy flavored jets [20][21][22]. This is also the first perturbative order at which just defining the jet flavor to be the sum of the flavors of its constituent particles is ambiguous and highly sensitive to the clustering of soft quarks [1]. Recent measurements of heavy-flavored jets in 13 TeV collisions at the LHC have also been performed, see, e.g., Refs. [23][24][25], so a robust, particle-level definition of jet flavor is essential for maximal utilization of data.
Since the BSZ flavor algorithm was proposed, numerous other proposals for jet flavor have been introduced. Examples of jet flavor definitions (also referred to as jet flavor tagging procedures) include other modifications of the clustering algorithm [26]; a "jet topics" definition that extrapolates pure regions of phase space to mixed regions [27,28]; cuts on continuous, energy-weighted observables [29]; correlations that identify a gluon jet from initial state radiation [30,31]; jet flavor correlations between collections of jets nearby in angle [32,33]; and an IRC safe definition of particle multiplicity in a jet that separates jet flavors [34]. Through a Les Houches study [35], the community identified features of what is often meant by jet flavor, presenting a spectrum of ideas ranging from the desired output of event simulation software to a purified region of an experimental analysis. In this context, the CMS collaboration has recently presented measurements of jet angularities [36] as a probe of quark and gluon dynamics, and flavor-sensitive all-order calculations were performed and compared to these data in Refs. [37,38].
In many or even most phenomenological studies of flavored jets, however, the naive definition of jet flavor as the output of some simulation software is used. This definition of jet flavor was used in early jet substructure studies to identify features of quark and gluon jets [39]. More recently, this naive definition of jet flavor has been used to demonstrate that the likelihood ratio observable for discrimination of quark from gluon jets is itself IRC safe [40][41][42][43], which has been validated in machine learning studies of simulated data [44][45][46]. Clearly, significant understanding of flavored jets, distinctions between quark and gluon jets, their correlations, etc., has been established using this naive definition, but the definition itself lacks theoretical soundness. We aim to address this issue in this paper.
The first step is to re-evaluate what this naive flavor definition as output of simulation software actually means. In a collider event simulator, the user requests a particular short-distance process, as defined by a leading-order or perhaps next-to-leading-order matrix element. That is, the user defines the process in the deep ultraviolet (UV). Because quantum chromodynamics (QCD) is an asymptotically free theory, jet flavor is completely unambiguous in the deep UV: if the value of the strong coupling $\alpha_s \to 0$, then any jet consists of a single particle, which correspondingly defines the jet's flavor. However, measurements are not performed in the UV; they are performed in the infrared (IR), at long distances, and a useful definition of jet flavor must act on actual measured particles. Ideally, one would like to define the flavor in the IR as a procedure that returns the UV flavor, which might be expected because perturbative splittings in QCD preserve net flavor. However, it is impossible to have perfect correlation between IR and UV flavor definitions, for two reasons. First, any jet algorithm defines a restricted region of phase space, and QCD only preserves flavor when all of phase space is summed over. Second, particle creation in the flow from the UV to the IR is controlled by the renormalization group, and this procedure is not invertible. Having identified this feature of jet flavor, one is freed from the shackles of attempting to reproduce the UV in the IR, and can instead focus on providing a new flavor definition exclusively from IR physics. In other words, instead of tagging jets with an IR label and implying the existence of a "true" UV flavor, we should simply say that we give a jet flavor definition in the IR, which in principle could differ from the UV one. In this language, the bad jet flavor definitions pointed out in Ref. [35] arise only when we refer to jets defined in the IR using flavor definitions that live in the UV.
With this focus, we can also impose desired theoretical properties onto a potential jet flavor definition. To simplify classification, we would like a jet flavor definition that returns only those partons of QCD: gluons and any flavor of quark. For broad utility, we need a jet flavor definition that can be applied to any fixed set of particles, and does not require reassociating constituents of jets. We will demand that the jet flavor is IR safe, and completely insensitive to soft emissions that land in the jet. We can accomplish this by ignoring the contribution of soft particles to jet flavor. By contrast, we will not require collinear safety. By IRC safety of a jet algorithm used at the LHC, the deep collinear region of a jet is necessarily independent of the algorithm. Particle production in the deep collinear region is described by DGLAP evolution [47][48][49], and exactly collinear splittings conserve flavor. So, we desire a jet flavor definition that is inclusive over exactly collinear splittings, but for splittings at any finite angle can associate jet flavor differently than from a UV perspective.
These considerations motivate the following definition of jet flavor. We define a jet's flavor to be the sum of the flavors of all particles whose momenta lie exactly along the direction of the Winner-Take-All (WTA) clustering scheme axis [50][51][52]. In a pairwise jet clustering algorithm, the WTA scheme follows the harder of the two particles at any stage in the clustering, rendering the flavor definition completely insensitive to soft emissions in the jet. The sum of particle flavor along a direction of strictly 0 angular size is necessarily a well-defined quantity that coincides with the flavor of an individual parton of QCD. However, the flavor of the WTA axis does not have a smooth limit as the angle of a collinear splitting goes to 0. Collinear divergences are both universal and local, so this collinear unsafety can be remedied by the introduction of a flavor fragmentation function. Unlike other fragmentation functions used to define jets [53][54][55][56][57][58], this WTA flavor fragmentation function is completely perturbative, and is defined as the probability of pulling a parton in the IR out of an initial parton in the UV. Unlike the non-linear evolution equations for the hardest subjet of some radius $R \ll 1$ in a jet [59], the evolution equations for WTA flavor fragmentation are linear and a small modification to DGLAP evolution.
The outline of this paper is as follows. In Sec. 2, we provide a detailed definition of the WTA flavor algorithm. In Sec. 3, we derive the leading-logarithmic evolution equations for the flow of WTA flavor from the UV, where the hard process initiates the jet, to the IR, after the conclusion of the perturbative parton shower. We also explicitly solve the evolution equations and compare to parton shower Monte Carlo. In Sec. 4, we study a few observables measured about the WTA axis of a jet that are sensitive to its flavor, providing both some simple calculations as well as comparison to event simulation. We conclude in Sec. 5.

WTA Flavor Algorithm
The desires expressed in the introduction motivate the following definition of jet flavor in the IR, at the scale where measurements are made:
1. Cluster and find jets in your collision event with any desired jet algorithm.
2. On a given jet, recluster its constituents with a pairwise, IRC safe algorithm, using the WTA recombination scheme [50][51][52]. Specifically:
(a) For all pairs $i, j$ of particles in your jet, calculate the pairwise metric $d_{ij}$.
(b) For the pair $i, j$ that corresponds to the smallest $d_{ij}$, recombine their momenta into a new massless particle $ij$ such that $E_{ij} = E_i + E_j$, and the direction of $ij$ is along the direction of the harder of $i$ and $j$.
(c) Replace particles $i$ and $j$ with their combination $ij$ in the collection of particles in the jet.
(d) Repeat clustering until there is a single, combined particle that remains. The direction of this particle corresponds to the direction of the WTA axis of the jet.
3. The sum of the flavors of all particles in the jet whose momenta lie exactly along the WTA axis is defined to be the flavor of the jet.
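The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the implementation used for the results in this paper: particles are hypothetical dictionaries with keys `pt`, `y`, `phi`, and `pdg`, the reclustering metric is the $k_T$ distance $d_{ij} = \min(p_{\perp i}^2, p_{\perp j}^2)\,\Delta R_{ij}^2$ with no beam distance (the jet is clustered exclusively down to one particle), transverse momenta rather than energies are summed, and "exactly along the axis" is implemented as exact equality of `(y, phi)`.

```python
import math
from collections import Counter

GLUON = 21  # PDG id of the gluon; quarks are 1..6, anti-quarks are negative


def delta_r2(a, b):
    """Squared rapidity-azimuth distance between two particles."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (a["y"] - b["y"]) ** 2 + dphi ** 2


def wta_axis(particles):
    """Recluster with the kT metric and WTA recombination; return (y, phi)."""
    work = [dict(p) for p in particles]
    while len(work) > 1:
        # Step (a): find the pair with the smallest kT distance d_ij.
        best = None
        for i in range(len(work)):
            for j in range(i + 1, len(work)):
                d = min(work[i]["pt"], work[j]["pt"]) ** 2 * delta_r2(work[i], work[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Step (b): sum the pt, keep the direction of the harder particle.
        hard = work[i] if work[i]["pt"] >= work[j]["pt"] else work[j]
        merged = {"pt": work[i]["pt"] + work[j]["pt"], "y": hard["y"], "phi": hard["phi"]}
        # Step (c): replace i and j by their combination; step (d) is the loop.
        work = [p for k, p in enumerate(work) if k not in (i, j)] + [merged]
    return work[0]["y"], work[0]["phi"]


def wta_flavor(particles):
    """Net quark flavor of the particles lying exactly along the WTA axis.

    Returns a dict {quark flavor: net count}; an empty dict means gluon-like.
    """
    y, phi = wta_axis(particles)
    net = Counter()
    for p in particles:
        if p["y"] == y and p["phi"] == phi and p["pdg"] != GLUON:
            net[abs(p["pdg"])] += 1 if p["pdg"] > 0 else -1
    return {flavor: n for flavor, n in net.items() if n != 0}


jet = [
    {"pt": 100.0, "y": 0.0, "phi": 0.0, "pdg": 4},    # hard charm quark
    {"pt": 60.0, "y": 0.1, "phi": -0.1, "pdg": 21},   # hard gluon
    {"pt": 1.0, "y": 0.3, "phi": 0.25, "pdg": 21},    # soft gluon
]
print(wta_flavor(jet))
```

In this toy jet the soft gluon is absorbed first without moving the axis, and the charm quark then wins against the hard gluon, so the sketch reports net charm. A production analysis would more likely rely on the WTA recombination schemes available in existing jet software rather than this quadratic-time loop.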
Any pairwise jet algorithm can be used to recluster the jet, so that the WTA axis of the jet can be defined, and further the jet algorithm to find the jets initially does not in any way need to be related to the jet algorithm used to recluster the jet. For results presented later and comparison to analytic predictions, we will use the k T algorithm [2,3], but other algorithms can be used depending on one's own goals with the definition of jet flavor. Thus, unlike the BSZ flavor algorithm [1], for example, this WTA flavor algorithm in no way modifies the constituents of the jets, and so can be applied to jets in any experimental analysis.
This flavor algorithm is soft safe to any order in perturbation theory; the addition of arbitrarily soft particles into the jet cannot affect the particle that lies along the WTA axis. Thus the WTA flavor does not suffer from the infrared ambiguities that arise starting at next-to-next-to-leading order from simply defining jet flavor as the sum of the flavors of the constituents of the jet. On the other hand, this WTA flavor definition is collinear-unsafe, but in a controlled way. For exactly collinear splittings, the WTA flavor is well-defined because jet flavor is conserved by exactly collinear splittings in QCD. However, for any finite, non-zero, splitting angle, the WTA axis can lie on any particle produced from the splitting, according to their relative energies. This procedure is not collinear safe because in the limit that the splitting angle goes to 0, the WTA axis is not well-defined; or, that the WTA flavor for exactly 0 angle splitting is not the WTA flavor found by taking the small-angle limit.
However, due to collinear universality of QCD and the fact that WTA flavor is defined in the exactly collinear limit, this means that the collinear divergences can be absorbed into a fragmentation function, f i (Q 2 ), for WTA flavor i measured at scale Q 2 . Unlike familiar fragmentation functions which quantify the probability to pull a particular hadron out of a parton, this WTA flavor fragmentation function is completely perturbative, as it is defined as the probability to pull a perturbative parton in the infrared out of a perturbative parton in the ultraviolet. We will derive the leading-logarithmic evolution of the WTA flavor fragmentation function in the following section.

Derivation of UV to IR Flavor Evolution
In this section, we derive the evolution equations of the WTA flavor of a jet from the UV where jets are produced to the scale in the IR where measurements are made. We will forgo any formal factorization justification or proof of this evolution prediction, leaving a more robust construction for future work. As such, our analysis will be restricted to resummation of the ratio of the UV to the IR jet resolution scales to leading-logarithmic accuracy. Related fragmentation function evolution about a recoil-free jet axis was studied in Ref. [60], but not binned by IR flavor.

Leading-Logarithmic Evolution
We can derive evolution equations for the WTA flavor as a function of the dimensionful resolution scale $Q^2$. Call $f_q(x, Q^2)$ the fraction of jets for which a quark lies along the WTA axis, with momentum fraction $x$. To figure out how the WTA condition affects parton evolution and flavor, consider a quark whose initial momentum is a fraction $z$ of the center-of-mass energy. After a $q \to qg$ splitting, we denote by $x'$ the fraction of the initial quark's momentum taken by the offspring quark, so that $(1 - x')$ is the fraction that flows into the emitted gluon, as represented in Fig. 1. The WTA condition requires that the quark after the splitting is harder than the gluon, so that it remains along the WTA axis. In other words, it implies that $z x' > z(1 - x')$ and therefore, with $x = z x'$ the momentum fraction of the offspring quark, $2x > z$. Analogously, the quark has to be harder than the anti-quark in the $g \to q\bar q$ splitting, as in the second term of Eq. (3.1). Thus, moving from a resolution scale $Q^2 + \delta Q^2$ to $Q^2$, $f_q(x, Q^2)$ satisfies, to leading-logarithmic accuracy, the following evolution equation Here and in the following, $\alpha_s \equiv \alpha_s(Q^2)$ is the one-loop running coupling, while $P_{kj \leftarrow i}$ represents the splitting probability for the $i \to jk$ process (where $i$, $j$ and $k$ can be either quarks or gluons according to QCD splittings) as a function of the momentum fraction of the parent parton. Note that this differs from the usual DGLAP evolution [47][48][49] by the constraint $2x > z$, but since the cut leaves the singular $x \to 1$ region untouched, the cancellation between real and virtual contributions takes place as usual. This will be taken into account via the regularized splitting functions that at the leading order are and the moments are Here we denoted the splitting functions $P_{ij}$ with the standard notation, where $i$ and $j$ are partons that satisfy the process $j \to ik$ and $k$ is defined according to QCD.
Now, taking the $\delta Q^2 \to 0$ limit of Eq. (3.4) we find the evolution equation The evolution equation for the anti-quark as the WTA axis is found from interchanging $q \leftrightarrow \bar q$: For $n_f$ flavors of quarks with masses below the scale $Q^2$, the evolution equation for gluons along the WTA axis takes the form where we introduced the following regularized splitting functions and their moments Noting that the splitting function for gluon emission off of quarks and anti-quarks is identical, $P_{gq}(y) = P_{g\bar q}(y)$, the differential equation for the WTA gluon fraction simplifies to
For just determining the jet flavor according to the WTA axis, we do not care about the energy fraction $x$ and its evolution, so we can integrate over it. Focusing on the quark fraction for illustration, we define where now $f_q(Q^2)$ is just the fraction of jets at scale $Q^2$ that have a quark (of a specific flavor) that lies along the WTA axis. Integrating over energy fractions in the evolution equations, for the quark they simplify to Using the explicit form previously introduced for the regularized leading-order splitting functions, the WTA quark flavor fraction evolution becomes The WTA anti-quark fraction evolution is of identical form: The same integration can be carried out explicitly for the WTA gluon fraction evolution. $C_F$ is the fundamental quadratic color Casimir, which takes the value $C_F = 4/3$ in QCD, and $C_A = 3$ is the adjoint quadratic Casimir. $T_R = 1/2$ is the normalization of the Killing form of the fundamental representation of SU(3) color. Then, the WTA gluon fraction evolution equation is When summed over $n_f$ active quarks, we note that the total WTA flavor is conserved: So, in the evolution equation for gluon flavor, we can replace the sum over quark flavors according to: The gluon evolution equation then reduces to a linear, uncoupled, inhomogeneous differential equation
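As a hedged cross-check of the integrated evolution described above, the flavor-changing rates can be obtained by integrating the standard leading-order splitting functions over the region where the other parton is the harder one. The sketch below assumes the familiar forms $P_{gq}(x) = C_F[1 + (1-x)^2]/x$ and $P_{qg}(x) = T_R[x^2 + (1-x)^2]$ and integrates them over $x \in [1/2, 1]$ with a simple Simpson rule. The variable names and the closed forms $C_F(2\ln 2 - 5/8)$ and $T_R/3$ are derived here for illustration and are not quoted from the equations of this section.

```python
import math

CF, CA, TR = 4.0 / 3.0, 3.0, 0.5  # QCD color factors


def p_gq(x):
    """LO splitting function for the gluon carrying fraction x in q -> q g."""
    return CF * (1.0 + (1.0 - x) ** 2) / x


def p_qg(x):
    """LO splitting function for the quark carrying fraction x in g -> q qbar."""
    return TR * (x ** 2 + (1.0 - x) ** 2)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# Rate (in units of alpha_s / 2 pi per unit log Q^2) for the WTA axis to jump
# from a quark onto the emitted gluon: the gluon must carry more than half.
rate_q_to_g = simpson(p_gq, 0.5, 1.0)

# Rate for a WTA gluon to become one specific quark flavor: that quark must
# carry more than half of the gluon's momentum.
rate_g_to_q = simpson(p_qg, 0.5, 1.0)

print(rate_q_to_g, CF * (2.0 * math.log(2.0) - 5.0 / 8.0))
print(rate_g_to_q, TR / 3.0)
```

Because a WTA gluon can turn into any of $2 n_f$ quark or anti-quark flavors, the total gluon loss rate in this sketch is $2 n_f T_R / 3$, which is what makes the summed WTA flavor fractions conserved.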

Gluon Flavor Fraction Solution
This differential equation can be equivalently expressed in terms of the β-function, where Then, the evolution equation is To lowest order, the β-function is Then, the evolution of the gluon fraction to leading-logarithmic accuracy is This has a solution of where Q 2 0 > Q 2 is the scale of the hard process, and Q 2 is the scale where the parton shower ends. In perturbative QCD, Q 2 ∼ 1 GeV 2 , while Q 2 0 is set by the energy or transverse momentum of the jet.
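For orientation, the structure of this solution can be sketched as follows. This is a reconstruction under stated assumptions rather than a transcription of the equations of this section: the shorthands $A \equiv C_F(2\ln 2 - 5/8)$ and $B \equiv A + 2 n_f T_R/3$ are introduced here for illustration, and the one-loop running is written in the convention $Q^2\, d\alpha_s/dQ^2 = -\beta_0 \alpha_s^2$, which may differ from the normalization of the $\beta$-function used in the text by constant factors.

```latex
\begin{align}
Q^2 \frac{d f_g}{d Q^2} &= -\frac{\alpha_s}{2\pi}\left[A - B\, f_g\right],
\\
Q^2 \frac{d \alpha_s}{d Q^2} &= -\beta_0\, \alpha_s^2,
\qquad \beta_0 = \frac{11 C_A - 4 n_f T_R}{12\pi},
\\
\frac{d f_g}{d \ln \alpha_s} &= \frac{A - B\, f_g}{2\pi \beta_0},
\\
f_g(Q^2) &= \bar f_g + \left[f_g(Q_0^2) - \bar f_g\right]
\left(\frac{\alpha_s(Q_0^2)}{\alpha_s(Q^2)}\right)^{B/(2\pi\beta_0)},
\qquad \bar f_g = \frac{A}{B}.
\end{align}
```

Since $\alpha_s(Q_0^2) < \alpha_s(Q^2)$ for $Q_0^2 > Q^2$, the memory of the initial fraction $f_g(Q_0^2)$ is suppressed by a power of the ratio of couplings as the jet evolves to the IR.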
There are a few interesting things to note. First, there is an IR fixed point, where Eventually, the perturbative splitting washes out any information about the initial flavor fraction. However, the running is extremely slow. In 5-flavor QCD, the exponent of the running evaluates to The largest imaginable jet $p_\perp$ at the LHC is about 5 TeV, and the ratio of the couplings is Then, the largest suppression of the initial flavor fraction is about 0.35. It is important to emphasize the physical interpretation of this fixed point. Any individual jet has an unambiguous, unique flavor in the UV, and then as it flows to the IR, the flavor along the WTA axis can evolve to anything, according to the evolution equations. Thus, for an individual jet, there is no direct way to observe the fixed point except for evolving sufficiently long. However, given an ensemble of jets, one can observe the fixed point. In the UV, if the fraction of jets in the ensemble that are gluons is given by $\bar f_g$, the fixed-point value, then as the ensemble evolves to the IR, the fraction of jets in the ensemble that are gluons does not change. Which individual jets have a gluon along the WTA axis will change in evolving to the IR, but when summed over the ensemble, the total fraction is fixed. This may suggest a way to experimentally observe the fixed point, at least in principle, if the quark and gluon jet fractions in the UV can be controlled by other selection cuts on the events. We leave a detailed study of this possibility to future work.
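The size of this suppression can be checked with a short numerical sketch. It assumes one-loop running with $n_f = 5$ held fixed all the way down to 1 GeV (quark thresholds are ignored), $\alpha_s(m_Z) = 0.118$, and an exponent of the form $[C_F(2\ln 2 - 5/8) + 2 n_f T_R/3]/(2\pi\beta_0)$ with $\beta_0 = (11 C_A - 4 n_f T_R)/(12\pi)$. These inputs and variable names are assumptions made for illustration, not values quoted from the text.

```python
import math

CF, CA, TR, NF = 4.0 / 3.0, 3.0, 0.5, 5
MZ, ALPHA_MZ = 91.1876, 0.118  # GeV, assumed reference coupling


def alpha_s(q):
    """One-loop running coupling at scale q (GeV) with fixed n_f = 5."""
    beta0 = (11.0 * CA - 4.0 * NF * TR) / (12.0 * math.pi)
    return ALPHA_MZ / (1.0 + ALPHA_MZ * beta0 * math.log(q * q / (MZ * MZ)))


# Flavor-changing rates in units of alpha_s / (2 pi).
a = CF * (2.0 * math.log(2.0) - 5.0 / 8.0)   # quark -> gluon along the WTA axis
b = a + 2.0 * NF * TR / 3.0                  # relaxation rate of the gluon fraction

# Exponent of the coupling ratio, b / (2 pi beta0), simplified algebraically.
gamma = 6.0 * b / (11.0 * CA - 4.0 * NF * TR)

ratio = alpha_s(5000.0) / alpha_s(1.0)       # UV scale 5 TeV, IR scale 1 GeV
suppression = ratio ** gamma

print(f"exponent    = {gamma:.3f}")
print(f"alpha ratio = {ratio:.3f}")
print(f"suppression = {suppression:.3f}")
```

Under these assumptions the suppression factor comes out close to the value of about 0.35 quoted above, which illustrates how slowly the memory of the UV flavor is erased.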

Quark Flavor Fraction Solution
Using the solution for the WTA axis gluon fraction, the solution to the quark fraction evolution is: .
As observed for gluons, there is an IR fixed point, where for each flavor of quark or anti-quark, perturbative splittings equilibrate all quarks to contribute a fraction of With n f = 5, the total fraction of jets that have any quark along the WTA axis is That is, after enough running, about 2/3 of the jets in any ensemble will have a quark or anti-quark that lies along the WTA axis.
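A minimal numerical sketch of these fixed points follows, assuming that the fixed point balances the quark-to-gluon rate $C_F(2\ln 2 - 5/8)$ against the gluon-to-quark rate $T_R/3$ per quark or anti-quark flavor. The closed forms and variable names are derived for illustration and are not copied from the equations of this section.

```python
import math

CF, TR, NF = 4.0 / 3.0, 0.5, 5

a = CF * (2.0 * math.log(2.0) - 5.0 / 8.0)  # quark -> gluon rate
b = a + 2.0 * NF * TR / 3.0                 # a plus the total gluon -> quark rate

fg_bar = a / b                # fixed-point gluon fraction
fq_bar = TR / (3.0 * b)       # fixed-point fraction per quark or anti-quark flavor
quark_total = 2 * NF * fq_bar

print(f"gluon fraction      = {fg_bar:.4f}")
print(f"per-flavor fraction = {fq_bar:.4f}")
print(f"total quark share   = {quark_total:.4f}")
```

With $n_f = 5$ this sketch gives a total quark share of roughly 0.62, in line with the statement that about 2/3 of jets end up with a quark or anti-quark along the WTA axis.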

Relationship to Other Flavor Definitions
In Ref. [59], a study of jet flavor was briefly introduced within the context of small subjet radius R resummation. A jet can be reclustered into subjets of radius R, and then one could define the flavor of the jet as the flavor of the hardest such subjet. Unlike the WTA axis flavor fractions, the evolution equations for the hardest subjet are non-linear because which subjet is the hardest requires tracking energy fractions for all splittings. On the other hand, with the WTA axis flavor fractions we only follow a single particle at each splitting, namely the particle that takes higher energy. Hence, this memoryless process is described by linear equations. However, for a jet with at most two particles, the hardest subjet and the particle along WTA axis are identical, and so the probability that a gluon, say, lies along the WTA axis or is the hardest subjet from an initial quark jet p(g|q) has the lowest order expression where the UV and IR scales Q 2 0 and Q 2 are appropriately defined depending on the natural scale of the subjets. Beyond this leading order, the WTA flavor and the hardest subjet flavor differ, and further the non-linearity forbids an analytic understanding of the solutions of the evolution equations in Ref. [59]. Related non-linear evolution equations were also studied in Ref. [63].
A further interesting question presented in Ref. [59] was whether or not there was a fixed point in the subjet flavor evolution equations, given a sufficiently long running. Some numerical evidence for a fixed point was presented there as the flavor fraction solutions appear to approach a limit, but the running was very limited in range, and a clear asymptote was not observed. However, with the WTA flavor definition, with simple, linear evolution equations, it is immediately apparent that there is indeed a fixed point. It would be fascinating to determine if there was a fixed point of the non-linear evolution of the hardest subjet and further if the fixed point is identical to the WTA flavor fixed point. Additionally, as also noted in Ref. [59], the flavor of finite-radius subjets is not soft safe at next-to-next-to-leading order, and so even defining this subjet flavor is problematic. Again, because the WTA axis necessarily lies along the direction of a single particle, the WTA flavor definition is completely soft safe, to all orders.

Comparison to Parton Shower Simulation
These analytic predictions for the WTA flavor fraction evolution can be compared to parton shower simulation. To do this, we generated tree-level pp → gg and pp → cc events at the 14 TeV LHC and showered in Pythia 8.306 [64]. To directly access partonic jet flavor, all hadronization has been turned off, but otherwise, default settings are used. Anti-k T jets [4] with R = 1.0 are found with FastJet 3.4.0 [65] and we require that the pseudorapidity of the jets is less than 2.5. Then, on the jet with the highest transverse momentum, we determine the flavor of the particle that lies along the WTA axis, which we subsequently label as the jet flavor. We find the WTA axis by reclustering with the k T algorithm [2,3] with the WTA recombination scheme. In other contexts, reclustering with the Cambridge/Aachen algorithm [66,67] may be used, but k T is preferred here because of the simple connection between the dimensionful scale of the algorithm and the evolution scale Q 2 . We then plot the IR flavor fraction (determined by this reclustering procedure) for the different UV jet samples (determined by the short-distance scattering processes).
The results are shown in Figs. 2 and 3. In Fig. 2, we plot the WTA gluon flavor fractions as a function of jet transverse momentum from jets that are initially in the UV pure gluons or pure charm quarks. The initial high scale in our leading-logarithmic running expressions is set to be the jet transverse momentum, $Q_0 = p_\perp$. The Pythia parton shower terminates at a scale of about 1 GeV, and so we terminate the running of our leading-logarithmic resummation at a comparable scale. For all plots, we set the low scale to be $Q = 2.8$ GeV, which we find gives the best global agreement with Pythia, and the value of the strong coupling at the Z pole to be $\alpha_s(m_Z) = 0.118$. An initial gluon jet in the UV means that $f_g(Q_0^2) = 1$ and an initial quark in the UV means that $f_g(Q_0^2) = 0$. In Fig. 3, the corresponding plots for WTA quark flavor jets are presented, now with three different scenarios. The flavors of the quarks in the UV and IR can be the same, the jet in the UV could be a gluon, or the flavor of the jet in the UV and IR could be different. For the cases when the UV and IR flavors differ, we sum over all quark flavors.
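To make the setup of these curves concrete, the following sketch evaluates leading-logarithmic WTA flavor fractions as a function of jet $p_\perp$ with $Q_0 = p_\perp$, $Q = 2.8$ GeV, and $\alpha_s(m_Z) = 0.118$ as stated above. Everything else is an assumption made for illustration: one-loop running with fixed $n_f = 5$, and closed-form solutions of the linear rate equations derived from the rates $C_F(2\ln 2 - 5/8)$ and $T_R/3$. It is not the code used to produce Figs. 2 and 3.

```python
import math

CF, CA, TR, NF = 4.0 / 3.0, 3.0, 0.5, 5
MZ, ALPHA_MZ = 91.1876, 0.118
BETA0 = (11.0 * CA - 4.0 * NF * TR) / (12.0 * math.pi)

A = CF * (2.0 * math.log(2.0) - 5.0 / 8.0)  # quark -> gluon rate
B = A + 2.0 * NF * TR / 3.0                 # gluon relaxation rate
FG_BAR = A / B                              # gluon fixed point
FQ_BAR = TR / (3.0 * B)                     # per-flavor quark fixed point


def alpha_s(q):
    """One-loop running coupling at scale q (GeV), fixed n_f = 5."""
    return ALPHA_MZ / (1.0 + ALPHA_MZ * BETA0 * math.log(q * q / (MZ * MZ)))


def wta_fractions(pt, uv_is_gluon, q_ir=2.8):
    """Return (f_gluon, f_same_quark_flavor, f_all_other_quark_flavors)."""
    r = alpha_s(pt) / alpha_s(q_ir)
    e_a = r ** (A / (2.0 * math.pi * BETA0))
    e_b = r ** (B / (2.0 * math.pi * BETA0))
    dg = (1.0 if uv_is_gluon else 0.0) - FG_BAR
    f_g = FG_BAR + dg * e_b

    def f_q(f_q_uv):
        # Solution for one quark flavor that starts with fraction f_q_uv.
        return FQ_BAR - dg * e_b / (2 * NF) + (f_q_uv - FQ_BAR + dg / (2 * NF)) * e_a

    f_same = 0.0 if uv_is_gluon else f_q(1.0)
    n_other = 2 * NF if uv_is_gluon else 2 * NF - 1
    return f_g, f_same, n_other * f_q(0.0)


for pt in (100.0, 500.0, 1000.0, 2000.0):
    print(pt, wta_fractions(pt, uv_is_gluon=True), wta_fractions(pt, uv_is_gluon=False))
```

The three returned numbers sum to one by construction, and for a UV gluon sample the "same flavor" entry is set to zero so that all quark flavors are counted in the last entry.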
Surprisingly good agreement is observed between analytics and Pythia, especially in the general trends. The small disagreement, especially for non-charm quark flavor in the IR and initial charm quark jets in the UV, is likely due to finite charm mass effects that lead to an overproduction of light quarks as compared to charm quarks. However, note also the scale on this plot: the difference in this flavor fraction between our analytics and Pythia is no more than half of a percent, which is well below the expected theoretical uncertainty. Note that default Pythia terminates the shower when splittings have a relative transverse momentum of 0.5 GeV and uses a large value of $\alpha_s$, $\alpha_s(m_Z) = 0.1365$. Because the parton shower in Pythia contains numerous parameters that are tuned against one another, we do not attempt to vary

Flavor Evolution in QED
This definition of jet flavor has consequences for characterization of asymptotic states in any massless gauge theory. In quantum electrodynamics (QED), for example, infrared divergences associated with the emission of low-energy photons from initial-state electrons render the S-matrix ambiguous; see, e.g., Ref. [68]. WTA flavor for scattering in QED may help resolve this issue by associating physical, asymptotic states that, in the presence of no interactions, would have been identified as electrons but could instead be identified as photons. We leave a study of the consequences for rendering the S-matrix finite to future work, and here just identify the evolution equations for WTA flavor in QED.
We assume that the only electrically-charged particles are the electron and positron, and we will work at energies well above their mass. In QED, the lowest-order expression for the β-function of the fine-structure constant is The evolution equations for the WTA flavor in this theory of QED can be found from taking limits of the corresponding evolution equations in QCD, Eqs. (3.23) and (3.27), where C A → 0, and n f , T R , C F → 1. In this limit, the photon and electron (or positron) evolution equations have the solutions Just as in QCD, these evolution equations exhibit fixed-points in the deep infrared, though, because QED is not asymptotically-free, the identification of the flavor of states in the deep UV is more subtle. Again, we leave addressing this subtlety for future work. Regardless of the initial flavor in the UV, after sufficiently long running, the fraction of states in the IR that are identified as photons is
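As an illustrative check of this limit, the sketch below evaluates the photon fixed-point fraction by taking $n_f, T_R, C_F \to 1$ in a fixed-point expression of the form $A/(A + 2 n_f T_R/3)$ with $A = C_F(2\ln 2 - 5/8)$. This closed form and the function name `photon_fixed_point` are assumptions introduced here, derived from the leading-order splitting rates rather than taken from the equations of this section. $C_A$ does not enter the fixed point, so the $C_A \to 0$ limit only affects the running.

```python
import math


def photon_fixed_point(cf=1.0, tr=1.0, nf=1):
    """Fixed-point fraction of states with a gauge boson along the WTA axis."""
    a = cf * (2.0 * math.log(2.0) - 5.0 / 8.0)  # fermion -> boson rate
    b = a + 2.0 * nf * tr / 3.0                 # plus boson -> fermion pair rate
    return a / b


f_photon = photon_fixed_point()
f_electron = (1.0 - f_photon) / 2.0  # shared equally by electron and positron

print(f"photon fraction   = {f_photon:.4f}")
print(f"electron fraction = {f_electron:.4f}")
```

In this sketch slightly more than half of the states are identified as photons at the fixed point, with the remainder split evenly between electrons and positrons.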

Flavor-Sensitive Observables
In this section, we present a number of observables measured about the WTA axis that are sensitive to the WTA flavor prescription. In some cases, we will compare simple, lowest-order analytic calculations to the output of collider event simulation. For all of the simulated data that follows, we generated $pp \to gg$ and $pp \to c\bar c$ events at the 14 TeV LHC in Pythia 8.306 [64]. Hadronization is turned off, but otherwise all default settings are used for the perturbative parton shower. Jets are found with FastJet 3.4.0 [65], using the anti-$k_T$ algorithm [4] with a jet radius of $R = 1.0$. We demand that jets have a transverse momentum $p_{\perp J} > 1600$ GeV and pseudorapidity $|\eta| < 2.5$, and measure the corresponding observables on the jet in the event with largest transverse momentum. Unless the flavor is identical in the IR to the UV, we sum over WTA quark flavors and assume there are $n_f = 5$ active quark flavors.

Energy Fraction of the WTA Axis
The first observable we consider is the fraction of the total transverse momentum of the jet that is carried by the particle(s) that lie(s) along the WTA axis. At leading-logarithmic accuracy, these energy fractions can be calculated by solving the modified DGLAP equations of Eqs. (3.4) and (3.10) explicitly. However, we will not present these predictions here as the WTA evolution equations are in general just as complicated as the usual DGLAP equations and can only be solved numerically, in general. The standard procedure for doing this involves transformation to Mellin space, solving the evolution equations, and then transforming back. We present the Mellin moments of the splitting functions relevant for the energy fraction along the WTA axis in App. A. Instead, we will only present the results from Pythia8 simulation, shown in Fig. 4, separated into curves for the different jet flavors in the UV. In general, for initial UV quark jets whose WTA flavor is the same, very little energy is lost to emissions, due to the soft gluon singularity. However, a flavor change from the UV to the IR means that there must have been significant energy lost from the initiating parton, and so those WTA quark transverse momentum fractions are suppressed. For WTA gluons, the transverse momentum fractions are very similar between possible initiating partons in the UV.

Angularities about the WTA Axis
It has been explicitly demonstrated, first in Ref. [69], that IRC safe jet shape observables measured about the WTA axis are very sensitive to the quark versus gluon flavor of a jet. The reason for this is that the WTA axis is insensitive to recoil from soft, wide-angle emissions that displaces the hard, collinear core of a jet from its direction in the deep UV. In general recoil-free observables are good quark versus gluon discriminants for a deep UV definition of flavor, and it is an interesting question if this discrimination persists with the WTA flavor definition. In this section, we study a class of observables called angularities [70][71][72] measured about the WTA axis.
We define the angularities measured about the WTA axis as [51] where the sum runs over all particles in the jet $J$, and $\Delta R_{i\hat b}$ is the longitudinally boost-invariant angle of particle $i$ with respect to the WTA axis $\hat b$. For IRC safety, the angular exponent $\beta > 0$. We will focus on a particular angularity, referred to as the "Les Houches Angularity" (LHA) $\lambda_{0.5}$ [35], which corresponds to $\beta = 0.5$: It has been explicitly demonstrated in analytic calculations [51,69] that a smaller angular exponent is more sensitive for flavor discrimination.
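A minimal sketch of how such an angularity could be evaluated is given below. The normalization is an assumption: each particle is weighted by its transverse momentum fraction, and angles are measured in units of the jet radius, $\lambda_\beta = \sum_i z_i (\Delta R_i / R)^\beta$, which follows the common Les Houches convention and may differ from the definition used here by factors of $R$. Particles are hypothetical dictionaries with keys `pt`, `y`, and `phi`, and the WTA axis is passed in as a `(y, phi)` pair.

```python
import math


def angularity(particles, axis, beta=0.5, radius=1.0):
    """Angularity about a given axis (y, phi); beta = 0.5 gives the LHA."""
    if beta <= 0:
        raise ValueError("beta must be positive for IRC safety")
    axis_y, axis_phi = axis
    pt_sum = sum(p["pt"] for p in particles)
    total = 0.0
    for p in particles:
        dphi = abs(p["phi"] - axis_phi)
        if dphi > math.pi:
            dphi = 2.0 * math.pi - dphi
        delta_r = math.hypot(p["y"] - axis_y, dphi)
        total += (p["pt"] / pt_sum) * (delta_r / radius) ** beta
    return total


toy_jet = [
    {"pt": 75.0, "y": 0.0, "phi": 0.0},   # particle along the WTA axis
    {"pt": 25.0, "y": 0.25, "phi": 0.0},  # emission at Delta R = 0.25
]
print(angularity(toy_jet, axis=(0.0, 0.0)))
```

The particle that defines the WTA axis contributes nothing, so the value is set entirely by the emissions displaced from the axis, which is what makes these observables recoil-free.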
Because the WTA flavor depends on the structure of the jet in the IR, calculating the distribution of the LHA requires care in the identification of jet flavor. Our procedure here for calculating the distribution of the LHA binned by WTA flavor will be somewhat naive, and we leave a formal procedure or factorization theorem for systematic improvement to future work. What we will do here is as follows. Working in the collinear approximation, we will calculate the leading-order distribution for the LHA according to WTA flavor, where for the $k \to ij$ QCD splitting. For general angularities, the relevant distributions we need are given in Eq. (4.7).
Next, to account for the leading-logarithmic contributions from further soft gluon emission, we note the following. Soft, wide-angle emissions can only resolve the total color of the collinear region. Therefore, the Sudakov form factor $\Delta(\tau_\beta)$ that describes the no-emission probability at leading-logarithmic accuracy is independent of the WTA flavor after a collinear splitting, and we can simply multiply the fixed-order collinear distribution by the Sudakov form factor associated with the total color of the collinear region. To leading-logarithmic accuracy, the Sudakov form factor is where the radiator $R(\tau_\beta)$ is Here, $C_i$ is the color factor of the collinear region and $\alpha_s$ is evaluated at the jet's UV scale, $Q_0 = p_{\perp J} R$. Then, our simple prediction for the probability distributions $p_{i\leftarrow k}(\lambda_{0.5})$ of the LHA binned by WTA flavor is given in Eq. (4.10).
We compare our analytic predictions to the output of Pythia8 in Figs. 5 and 6, for WTA quark and gluon flavor, respectively. In general, good qualitative agreement is observed between calculation and simulation, demonstrating that the dominant effects are accounted for in our calculation. There are a few things of note. First, the LHA distribution of jets that are charm quark flavored in both the UV and the IR lies at significantly smaller values than the other distributions. This is expected because if the quark flavor is preserved from the UV to the IR, then the energy of the emissions off the quark must be relatively small, correspondingly ensuring that the value of the LHA is small, by IRC safety. Second, note that we have no analytic prediction for the LHA when the quark flavor is changed in flowing from the UV to the IR. To change quark flavor requires at least two emissions: a hard gluon must be emitted from the initial quark, and then that gluon must split to two quarks of a different flavor. As the fixed-order components of our calculations are only completed to leading order, no quark flavor changing effects are included.
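For orientation, the structure of a leading-logarithmic Sudakov factor for an angularity can be sketched as follows. This is a textbook-style estimate under stated assumptions (soft and collinear emission probability, fixed coupling $\alpha_s$, angles normalized to the jet radius so that $0 < \tau_\beta < 1$), and it may differ from the radiator used above by terms beyond this approximation.

```latex
% Assumptions: soft-collinear matrix element, fixed coupling,
% emission energy fraction z and normalized angle \theta both in (0,1).
\begin{align}
R(\tau_\beta) &\simeq \frac{2\alpha_s C_i}{\pi}
  \int_0^1 \frac{dz}{z} \int_0^1 \frac{d\theta}{\theta}\,
  \Theta\!\left(z\,\theta^\beta > \tau_\beta\right)
  = \frac{\alpha_s C_i}{\pi \beta}\,\ln^2 \tau_\beta \,, \\
\Delta(\tau_\beta) &= e^{-R(\tau_\beta)} \,.
\end{align}
```

In the variables $\ln(1/z)$ and $\ln(1/\theta)$ the vetoed region is a triangle of area $\ln^2\tau_\beta/(2\beta)$, which is the origin of the $1/\beta$ enhancement at small angular exponent. For the LHA, $\beta = 0.5$, the exponent in this approximation is $2\alpha_s C_i \ln^2\lambda_{0.5}/\pi$.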

The Jet Shape about the WTA Axis
The final interesting observable we will consider is the jet shape [73,74] measured about the WTA axis. The jet shape $\psi(\Delta R)$ is defined as the energy or transverse momentum fraction of the jet that lies at angle $\Delta R$ from the WTA axis: We note that because the WTA axis lies along the direction of a particle, this definition of the jet shape also coincides with the two-point energy-energy correlator [75], where we have fixed one of the particles in the pairwise correlation to be the WTA axis. A related quantity is the integrated jet shape $\Psi(\Delta R)$, which is the fraction of the energy that lies within $\Delta R$ of the WTA axis: When summed over flavor, the jet shape has been calculated about the WTA axis or other recoil-free axes in previous studies [76-78]. Correspondingly, the jet shape satisfies a DGLAP evolution equation that relates the energy fractions at different angular scales to one another. We point readers to the references for more details about factorization to all orders and resummation, but here we will just provide a naive implementation of the resummation.
The integrated jet shape for a quark along the WTA axis, $\Psi_q(\Delta R)$, is the average total energy fraction within some angle $\Delta R$ of the jet axis, which is exactly a moment of the fragmentation function $f_q(x, Q^2)$, for an appropriate definition of the scale $Q^2$. The integrated jet shape is then just where we identify $Q = p_{\perp J}\Delta R$. We use this scale here because it enables the simplest expression for the evolution equations and leave a more detailed analysis, justification, and factorization to future work. By taking the moment of the fragmentation function evolution, the integrated jet shape satisfies the evolution equation
$$\Delta R\,\frac{d\Psi_q(\Delta R)}{d\Delta R} = \frac{\alpha_s}{\pi}\int_{1/2}^{1} dy\, y\left[P_{qq}(y)\,\Psi_q(\Delta R) + P_{qg}(y)\,\Psi_g(\Delta R)\right]. \tag{4.14}$$
The moments of the splitting functions are Then, the evolution equation is The same exercise can be done for the gluon jet shape $\Psi_g(\Delta R)$ and we find, suppressing the details, the evolution equation
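Returning to the definition of the integrated jet shape as the fraction of the jet's transverse momentum within $\Delta R$ of the WTA axis, a minimal sketch of how it could be evaluated on a list of particles is given below. The `(pt, y, phi)` particle format, the externally supplied axis, and the function names are assumptions of this sketch.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Longitudinally boost-invariant angle between two directions."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(y1 - y2, dphi)

def integrated_jet_shape(particles, axis, dR):
    """Fraction of the jet pt carried by particles within dR of axis = (y, phi)."""
    total_pt = sum(p[0] for p in particles)
    inside = sum(
        p[0] for p in particles
        if delta_r(p[1], p[2], axis[0], axis[1]) <= dR
    )
    return inside / total_pt

def differential_jet_shape(particles, axis, edges):
    """Binned estimate of the differential jet shape.

    Returns, for each bin (edges[k], edges[k+1]], the pt fraction in the bin
    divided by the bin width. The on-axis particle itself sits at Delta R = 0
    and is therefore excluded when edges[0] >= 0.
    """
    values = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        frac = integrated_jet_shape(particles, axis, hi) - integrated_jet_shape(particles, axis, lo)
        values.append(frac / (hi - lo))
    return values
```

By construction the integrated jet shape is non-decreasing in `dR` and reaches 1 once `dR` encloses every particle in the jet.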
Unlike for the flavor fraction evolution equations, these differential equations have no obvious conserved quantities, so their resummation is a bit more complicated. However, we can reframe the evolution equations for the integrated jet shape as a set of $2n_f + 1$ coupled differential equations. For quark $q_i$, its evolution equation with the gluon can be expressed as: For a gluon or a charm quark in the UV, these evolution equations can be simplified when summed over all other quarks. For a gluon jet in the UV and summing over all WTA quark flavors, where we take .
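A coupled linear system of this kind can be integrated numerically from a boundary value at $\Delta R = R$ down to smaller angles. The sketch below does this with a fourth-order Runge-Kutta step in $\ln \Delta R$ for a generic system $\Delta R\, d\Psi_a/d\Delta R = (\alpha_s/\pi) \sum_b M_{ab} \Psi_b$. The matrix `M` is a placeholder to be supplied by the user and is not the set of splitting function moments of this section; a fixed coupling is also a simplifying assumption, and all names are illustrative.

```python
import math

def evolve_jet_shapes(psi_at_R, M, alpha_s, R, dR, steps=1000):
    """Integrate dPsi/dln(DeltaR) = (alpha_s/pi) * M . Psi from DeltaR = R to dR.

    psi_at_R : list of boundary values at DeltaR = R
    M        : placeholder coefficient matrix (list of lists), assumed constant
    alpha_s  : fixed coupling (simplifying assumption)
    """
    n = len(psi_at_R)

    def rhs(psi):
        return [
            alpha_s / math.pi * sum(M[a][b] * psi[b] for b in range(n))
            for a in range(n)
        ]

    h = (math.log(dR) - math.log(R)) / steps  # negative when dR < R
    psi = list(psi_at_R)
    for _ in range(steps):
        k1 = rhs(psi)
        k2 = rhs([p + 0.5 * h * k for p, k in zip(psi, k1)])
        k3 = rhs([p + 0.5 * h * k for p, k in zip(psi, k2)])
        k4 = rhs([p + h * k for p, k in zip(psi, k3)])
        psi = [
            p + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)
        ]
    return psi
```

For a constant matrix the result can be cross-checked against the closed-form matrix exponential, which is a useful sanity test before substituting physical coefficients or a running coupling.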
For an initial charm quark in the UV, we introduce two quark flavor integrated jet shapes: the charm quark's, $\Psi_c(\Delta R)$, and the sum over all other quarks Their evolution equations, along with the gluon, are The boundary conditions applied to the solutions of these differential equations are defined as follows. We fix the value of the integrated jet shape when $\Delta R = R$, the jet radius, by the relationship When evaluated at the highest scale $Q_0^2$, the WTA energy fraction fragmentation function takes the form because the initiating parton carries all of the energy of the jet in the UV. Therefore, when summed over all possible WTA flavors, the integrated jet shape is 1 at its boundary. Then, the boundary conditions for integrated jet shapes for individual WTA flavors are their relative fractions at the low scale:
The simulated distributions from Pythia8 of the (differential) jet shape $\psi(\Delta R)$ for different WTA flavor jets, compared to analytic predictions from solving the evolution equations, are shown in Figs. 7 and 8. Good qualitative agreement for UV charm quark jets separated into their individual WTA flavors is observed in Fig. 7, for charm and gluon WTA flavors down to the cutoff scale of the parton shower. All curves terminate at about $\Delta R \sim 10^{-2.5} \approx 0.0032$. With 1600 GeV jets and a cutoff scale of about 1 GeV, the minimum angle between any particle and the WTA axis is about $\Delta R_{\min} \sim 10^{-2.5}$. The analytic prediction for non-charm quarks is not shown because, with the boundary conditions we impose, its distribution happens to be negative, though very small in magnitude, for much of the plotted domain. By contrast, Pythia8 and the analytic predictions for the jet shape from UV gluon jets in Fig. 8 differ substantially, with even the signs of the slopes of the distributions disagreeing.
This disagreement for gluon jets may be a sign of mismodeling of gluon jets in the parton shower or of an incomplete description of these jets through the evolution equations. Ref. [78], which calculated the jet shape about the WTA axis, analytically studied only quark jets in the UV from electron-positron collisions, and demonstrated good qualitative agreement with both the Pythia8 and Herwig7 [15] parton showers. It has also been noted in Refs. [35,79] that different parton shower generators model quark jets similarly, as they can be finely tuned to LEP data. Gluon jets, by contrast, lack such a pure tuning sample, and distributions of observables measured on gluon jets can differ significantly between generators. In these plots, we have also turned off hadronization effects in Pythia, which may be responsible for some of the differences with our prediction. Pythia is of course tuned to hadron-level data, and the parameters of the perturbative parton shower and the hadronization model are coupled and cannot be tuned separately. We leave a detailed study of the description of UV gluon jets to future work.

Conclusions
In this paper, we re-evaluated formal and de facto definitions of perturbative, partonic jet flavor in QCD. The WTA jet flavor is a novel definition that is soft safe to all orders, can be applied to an arbitrary collection of particles, and has simple, linear collinear evolution from the UV to the IR. We presented explicit solutions to the evolution equations to leading-logarithmic accuracy, and demonstrated that they exhibit fixed points in the deep IR, where the WTA flavor of a jet is completely independent of the jet flavor in the UV. A number of observables are presented that are sensitive to WTA flavor, and we hope that more detailed calculations can be presented in the future.
There are a number of interesting directions to pursue further. In this paper, we have presented the collinear-unsafety of the WTA flavor algorithm as a virtue, as the WTA flavor fragmentation function that absorbs the collinear divergences exhibits a number of very nice properties. However, especially for applications of matching high-order fixed-order calculations with the parton shower, a fully soft and collinear safe definition of jet flavor may be desired or required. Constructing a fully IRC safe flavor algorithm in which the WTA flavor can be embedded would then potentially marry the nice properties of each. On the other hand, fixed-order calculations are matched with objects like parton distribution functions in every prediction at a hadron collider, so the WTA fragmentation function framework might already enable a straightforward procedure.
To do this, however, would likely require the construction of a robust factorization theorem for jet flavor for which the WTA fragmentation function would be one piece, convolved with short-distance matrix elements, parton distribution functions, etc. The derivation of a factorization theorem would potentially also render the calculations of observables on WTA flavored jets well-defined at a fixed accuracy, and more importantly, be systematically improvable. A central piece of a factorization theorem might be something like a fragmenting jet function [53][54][55][56][57][58] that describes the parton that initiates a jet from a parton produced in the hard process. Perhaps a fragmenting WTA flavor jet function could be constructed, and its evolution be described by the same modified DGLAP equations we presented here.
Of course, this jet flavor definition only exists for perturbative jets that consist of partons, but the WTA axis can be defined for any collection of any type of particles. On a physical jet that consists of hadrons, IRC safe observables could then be measured about the WTA axis and that information could potentially be used to provide an experimental definition of the WTA jet flavor. The WTA axis is remarkably robust to jet contamination [80]. It is expected that the direction of the WTA axis is displaced by hadronization effects by an amount of the order of $\Lambda_{\rm QCD}/p_{\perp J}$, although the effect might scale as a higher power of the QCD scale. Nevertheless, the perturbative WTA flavor should be imprinted in the distribution of hadrons, and exploring the correlations between the different regimes would be fascinating.
Along these lines, this problem might be ripe for machine learning from several directions. First, there are multiple possible perturbative flavors as defined through the WTA procedure, and so identification of the WTA flavor exclusively from observables measured on the jet is a multi-label classification problem. Unlike binary discrimination, for which the optimal observable is the likelihood ratio by the Neyman-Pearson lemma [81], there is no universal optimal discriminant for multi-label classification. Additionally, understanding the manifestation of WTA flavor on hadrons in jets could be studied on actual data, either in an experimental collaboration or through the CERN OpenData project. We hope that these directions for a deeper understanding of the properties of WTA flavor lead to a re-evaluation of what we want from jet flavor and uncover further rich structure of QCD jets.
We will express the WTA-modified moments of the splitting functions as a deviation from the moments integrated over all energy fractions. That is, we define $\gamma_{ik}(N) + \delta\gamma_{ik}(N) \equiv$ Note that all of these modifications to the moments of the splitting function vanish in the $N \to \infty$ limit. These results agree with the WTA anomalous dimensions calculated in Refs. [60,78]. The evolution equations in moment space can be compactly expressed through an evolution equation for each quark flavor $i$ coupled to gluons:
$$\begin{pmatrix} \gamma_{qq}(N) + \delta\gamma_{qq}(N) & 2n_f\left(\gamma_{qg}(N) + \delta\gamma_{qg}(N)\right) \\ \gamma_{gq}(N) + \delta\gamma_{gq}(N) & \gamma_{gg}(N) + \delta\gamma_{gg}(N) \end{pmatrix} \tag{A.14}$$
This is implicitly a coupled set of $2n_f + 1$ differential equations.