Calculating Pull for Non-Singlet Jets

The pull vector is a jet observable sensitive to the distribution of soft radiation controlled by the color flow in a collider event. We present calculations to leading order in the soft and collinear limits for the pull vector measured between pairs of jets that do not form a color-singlet dipole. Our calculations are presented within the context of $e^+e^-\to$ three jets events, on which pull is measured between the two subleading jets. A subset of these calculations can be re-interpreted as a bottom--anti-bottom quark jet pair in a color octet configuration, which can be a background to Higgs production at large boost. We also present a universal expression for the pull distribution in the high-boost and small jet radius limit. This distribution is controlled by color SU(3) quadratic Casimirs that arise from product representations of pairs of QCD jets.


Introduction
In the search for new physics at the Large Hadron Collider (LHC), a central goal is to measure all quantum numbers and couplings of known Standard Model particles as well as to observe as-of-yet undiscovered particles. At a collider experiment, measuring the mass of a particle is straightforward because detectors measure nearly all of the energy produced in a collision. Techniques exist for determining a particle's electric charge by weighting hits in the tracking system by their energy [1,2]. However, direct measurement of a particle's charge under quantum chromodynamics (QCD), or color, is subtle, as all of the particles that are actually detected by experiment are color-neutral. A particle's color, or at least whether it has non-zero color, is inferred from jet production and quantities measured on jets [3][4][5][6][7].
The observable pull [8] was introduced to be directly sensitive to the flow of color between pairs of jets. The distribution of soft radiation throughout a collision event is determined by the location and connections of color dipoles, the ends of which are the observed jets. Pull quantifies the location of the dominant soft radiation between two jets, thus providing a measure of their color connectedness or their color "pull" on one another. Two jets that form a color-singlet dipole, from the decay of a color-singlet resonance, for example, will dominantly emit soft radiation in the region between the jets, because gluons emitted at wide angles would only see the net zero color of the two jets. By contrast, two jets that are produced from standard QCD processes at the LHC would in general have color connections to the initial state, as well as to whatever other color objects were produced. Thus, radiation about these jets would have a much weaker correlation with the relative locations of the jets of interest.
While pull has a very nice physical interpretation, has been studied extensively in simulation, and has even been measured in experiment [9][10][11] and used in searches [12][13][14][15][16], there has been little theoretical analysis of the observable to honestly justify that it does what it is claimed to do. The first calculations of pull as measured on the two jets from color-singlet decay were presented in Ref. [17], which illustrated the challenges of the calculation and demonstrated that at least in this restricted case, pull acts as expected. Further, pull had been measured in experiment on jets from W boson decay, which enabled direct comparison to data.
Nevertheless, to demonstrate that pull is indeed sensitive to the color connection between two jets, we should demonstrate that there is a significant difference between the calculated distribution of pull for pairs of jets that do and do not form a color-singlet dipole. This is our goal in this paper. The simplest collision events in which pairs of jets do not form color singlets are $e^+e^-\to$ three jets events, which are what we consider first. For simplicity, we restrict our analysis to leading order in the strong coupling and to leading power in the soft or collinear limits. We will measure the color-connectedness of the two jets closest in angle with pull. With three jets in the final state, there are three distinct dipoles from which soft radiation can be emitted. We will show that the radiation from two of these dipoles can be accounted for from calculations presented in Ref. [17], with some re-interpretation. We present a new calculation for the distribution of radiation from the dipole that does not include the jet on which pull is directly measured.
Our calculations for pull in three-jet final states can then be leveraged to theoretically understand one of the original motivations for the observable. Identifying the decay of the Higgs boson to bottom quarks at high significance is a challenge at the LHC. Once jets have been tagged as containing bottom quarks, a major background to $H\to b\bar b$ decay is the gluon splitting process $g\to b\bar b$. Because this splitting lacks a soft singularity, once the invariant mass of the bottom quarks is selected for, the kinematics of the bottom quarks from Higgs decay and gluon splitting are nearly identical. However, because the Higgs boson is a color singlet, radiation from the bottom quarks is confined to lie between them, distinct from the gluon splitting case. This suggests that pull may provide discrimination power between these two processes, though this has not been observed in simulation [18][19][20].
With an explicit calculation, we are able to understand the discrimination properties of pull in a controlled, well-defined context. Working in the limits in which both the Higgs is highly boosted and the radii of the bottom quark subjets are small, we are able to explicitly calculate the discrimination power of pull for this problem, quantified in the signal vs. background efficiency curve. We conjecture that in these boosted and collinear limits the pull distribution exhibits a universality, depending exclusively on the color configuration of the two nearby jets. In these limits, we are able to write down a master formula for the distribution of pull, for any two jets in QCD on which it might be measured. We enumerate all possible irreps of color SU(3) that arise in the product representation of the color of two QCD jets and how each affects the corresponding pull distribution. This explicitly demonstrates that pull is indeed sensitive to the color flow between a pair of jets in a simplified limit. We leave validation of this observation in simulation to future work.
This paper is organized as follows. In Sec. 2, we first define the pull observable, presenting a slightly modified definition from that originally proposed that is more natural in the $e^+e^-$ collision case. In Sec. 3, we present the calculation of the distribution of the pull observable measured on the closest two jets in angle produced in $e^+e^-\to$ three jets events. Sec. 4 expands on these results, and applies them to the problem of discrimination of $H\to b\bar b$ and $g\to b\bar b$ in the highly boosted limit. We explore pull for all possible representations of SU(3) color that can arise in the product of the color of two QCD jets in Sec. 5, and conclude and discuss future directions in Sec. 6. Appendices contain details of calculations quoted in the body of the paper.

Observable Definitions
For application to jets produced in hadron collisions, Ref. [8] introduced the pull vector $\vec t$ as
$$\vec t = \sum_{i\in J} \frac{p_{\perp i}\,|\vec r_i|}{p_{\perp J}}\,\vec r_i\,. \tag{2.1}$$
Here, the sum runs over the particles $i$ in a jet $J$ of interest, $p_\perp$ is the momentum transverse to the collision beam, and the vector $\vec r_i$ is
$$\vec r_i = (y_i - y_J,\ \phi_i - \phi_J)\,. \tag{2.2}$$
The jet center is located at rapidity-azimuth $(y_J, \phi_J)$ and particle $i$ is located at $(y_i, \phi_i)$. The jet center is defined by the direction of the vector sum of the momenta of all particles that compose the jet. Pull is therefore defined as a two-dimensional vector in the plane of the cylindrical detector. As used to probe color connections, pull can be measured on two jets and the directions of their vectors compared. Jets with a strong color connection (i.e., jets that form a color-singlet dipole) will have pull vectors that point toward one another, while weakly color-connected jets will have pull vectors with a random relative orientation.
For the calculations presented in this paper, however, we use a slightly different definition of the pull vector, motivated both by our study of jets in $e^+e^-$ collisions and by the simplification of analytical calculations. This modified definition of the pull vector was introduced in Ref. [17] and is
$$\vec t = \sum_{i\in J} \frac{E_i \sin^2\theta_i}{E_J}\,(\cos\phi_i,\ \sin\phi_i)\,. \tag{2.3}$$
Now, $E$ is the energy, $\theta_i$ is the angle from particle $i$ to the jet center, and $\phi_i$ is the azimuthal angle of the particle about the jet center. In the collinear limit, these two definitions are identical, but in general they differ at finite angle. As our investigation is mainly concerned with the limit where the jet radius $R \ll 1$, this modified definition of the pull vector simplifies the calculation without losing the limiting behavior of interest. For all results presented in this paper, the azimuthal angle $\phi_i$ will be defined with respect to the location of a neighboring jet's center. Specifically, if a particle $i$ lies on the line between the jet $J$ and the reference jet, $\phi_i = 0$, while if it is on the other side of jet $J$, $\phi_i = \pi$.
This is what we will mean by measuring pull on a pair of jets: we calculate the pull vector of one jet whose components are defined with respect to the location of the second jet.
Rather than the Cartesian components of the pull vector, we will typically express it as its magnitude $t$ and azimuthal angle $\phi_p$. We call $\phi_p$ the pull angle and it is defined as
$$\phi_p = \cos^{-1}\frac{t_x}{\sqrt{t_x^2 + t_y^2}}\,. \tag{2.4}$$
The pull vector is infrared and collinear (IRC) safe, and so its distribution can be calculated order-by-order in perturbation theory. However, the pull angle $\phi_p$ alone is not IRC safe. We will have to deal with this later, as the pull angle is the aspect of the pull vector that is most sensitive to color connections between jets.
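As an illustration of these definitions, the following minimal sketch evaluates the pull vector, its magnitude, and the pull angle for a list of jet constituents. It assumes the energy-weighted form $\vec t = \sum_i (E_i \sin^2\theta_i / E_J)(\cos\phi_i, \sin\phi_i)$ for the modified pull vector and $\phi_p = \cos^{-1}(t_x/t)$ for the pull angle; the function names and the tuple layout of the input are hypothetical choices made for this example only.

```python
import math

def pull_vector(particles, jet_energy):
    """Return (t_x, t_y) for constituents given as (E_i, theta_i, phi_i).

    Assumed form: t = sum_i E_i sin^2(theta_i) / E_J * (cos phi_i, sin phi_i),
    with phi_i measured about the jet center from the reference-jet direction.
    """
    tx = sum(E * math.sin(th) ** 2 * math.cos(ph) for E, th, ph in particles)
    ty = sum(E * math.sin(th) ** 2 * math.sin(ph) for E, th, ph in particles)
    return tx / jet_energy, ty / jet_energy

def pull_magnitude_and_angle(tx, ty):
    """Return (t, phi_p) with phi_p in [0, pi]; phi_p is undefined at t = 0."""
    t = math.hypot(tx, ty)
    if t == 0.0:
        raise ValueError("pull angle is undefined for vanishing pull magnitude")
    # Clamp against rounding so that acos never sees |argument| > 1.
    cos_phi = max(-1.0, min(1.0, tx / t))
    return t, math.acos(cos_phi)
```

For example, a hard core on the jet axis plus one soft particle emitted toward the reference jet gives $\phi_p = 0$, while the same soft particle on the far side of the jet gives $\phi_p = \pi$. The explicit error at $t = 0$ mirrors the statement that the pull angle alone is not IRC safe.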

Pull in $e^+e^-\to$ 3 jet events
To illustrate the form of the pull distribution for pairs of jets that do not form a color-singlet dipole, we will study pull as measured on pairs of jets in $e^+e^-\to$ three jets events. The setup of how we measure pull on this final state is illustrated in Fig. 1. For a three-jet final state in the center-of-mass frame, the three jets lie in a plane and the most energetic will be isolated in a hemisphere about the collision point. We refer to this most energetic jet as jet 3. The two lower-energy jets are the pair closest in angle. Of these two jets, jet 1 is the more energetic and is the jet on which we measure pull. Jet 2, the lowest-energy jet, defines the axis along which the pull angle is defined to be 0. Fig. 1 shows a configuration in which the quark is the most energetic jet and the anti-quark is the second most energetic jet. Thus, in this configuration, we measure pull on the anti-quark jet with respect to the gluon jet's direction.
Fig. 1 also shows the strength of color correlation between the pairs of jets, as measured by the product of each particle's color matrix. Because the total color of the final state is 0, the sum of the color of the quark, anti-quark, and gluon is 0:
$$\mathbf{T}_q + \mathbf{T}_{\bar q} + \mathbf{T}_g = 0\,. \tag{3.1}$$
The square of any of the color matrices is just the quadratic Casimir for that particular color representation. In QCD, we have
$$\mathbf{T}_q^2 = \mathbf{T}_{\bar q}^2 = C_F = \frac{4}{3}\,, \qquad \mathbf{T}_g^2 = C_A = 3\,.$$
By dotting the individual color matrices with Eq. (3.1), we can solve for the values of the dot products of pairs of color matrices, which correspondingly determine the strength of color connectedness of two jets. We have
$$\mathbf{T}_q\cdot\mathbf{T}_{\bar q} = \frac{C_A}{2} - C_F\,, \qquad \mathbf{T}_q\cdot\mathbf{T}_g = \mathbf{T}_{\bar q}\cdot\mathbf{T}_g = -\frac{C_A}{2}\,.$$
With this setup, we would like to calculate the differential cross section of the pull vector, or equivalently, the double differential cross section of the pull magnitude $t$ and the pull angle $\phi_p$. To do this calculation, we will work to lowest order in the strong coupling $\alpha_s$ and in the leading soft or collinear limits, where the pull magnitude is small, $t \ll 1$.
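The color algebra just described can be checked with exact rational arithmetic. The sketch below uses only color conservation, $\mathbf{T}_a + \mathbf{T}_b + \mathbf{T}_c = 0$, which implies $\mathbf{T}_a\cdot\mathbf{T}_b = (\mathbf{T}_c^2 - \mathbf{T}_a^2 - \mathbf{T}_b^2)/2$; the function name and the parton labels are hypothetical and chosen for this example.

```python
from fractions import Fraction

C_F = Fraction(4, 3)  # quadratic Casimir of the fundamental representation
C_A = Fraction(3)     # quadratic Casimir of the adjoint representation

def pairwise_color_products(casimirs):
    """Return {(a, b): T_a . T_b} for three partons whose total color is zero.

    Color conservation gives T_a + T_b = -T_c, so squaring both sides yields
    T_a . T_b = (C_c - C_a - C_b) / 2.
    """
    names = list(casimirs)
    if len(names) != 3:
        raise ValueError("exactly three partons are required")
    products = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (c,) = [n for n in names if n not in (a, b)]
            products[(a, b)] = (casimirs[c] - casimirs[a] - casimirs[b]) / 2
    return products

if __name__ == "__main__":
    result = pairwise_color_products({"q": C_F, "qbar": C_F, "g": C_A})
    for pair, value in result.items():
        print(pair, value)
```

The quark and anti-quark come out with a positive product, $C_A/2 - C_F = 1/6$, while each is connected to the gluon with $-C_A/2 = -3/2$, which is the pattern of color connectedness that the soft function inherits.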
With these approximations, the double differential cross section decomposes into a sum of contributions from soft and collinear emissions:

Here, we refer to $S_{q\bar q g}(t, \phi_p)$ as the soft function for pull and $J(t, \phi_p)$ as the jet function for pull, which encode the contribution to pull from soft and collinear emissions, respectively. Note that no rigorous, all-orders factorization of the pull cross section is assumed or implied here; this decomposition simply follows from the factorization of QCD matrix elements into these components. The separation of the emission phase space into soft and collinear regions is arbitrary, so one needs to use some regularization scheme to do it. For pull, dimensional regularization is sufficient to uniquely define the soft and jet functions individually. Thus, calculation of the soft function, for example, proceeds by calculating the distribution of pull on the dimensionally-regulated phase space with the single-emission, eikonal matrix element. There are three possible dipoles off of which a soft particle can be emitted, and we need to sum together each of their contributions. We do not present explicit calculations in the text here, but will discuss how the pieces fit together for the complete soft function.
Following the identification of jets in Fig. 1, one of the dipoles that can emit a soft gluon is the dipole formed from jets 1 and 2. The calculation of this contribution to the soft function was done in Ref. [17]; the only change needed for the case at hand is to replace the Casimir $C_F$ in that calculation with $(-\mathbf{T}_1\cdot\mathbf{T}_2)$ for the three-jet case. Next, the soft gluon could be emitted off of the dipole formed by jets 1 and 3. Note that, relative to jet 2, jet 3 has an azimuthal angle $\pi$ about jet 1, by momentum conservation. Also, this dipole still contains jet 1, on which pull is measured, and so we can use the results of Ref. [17] again, with two modifications. First, the Casimir $C_F$ in the soft function result of Ref. [17] should be replaced by $(-\mathbf{T}_1\cdot\mathbf{T}_3)$ for the three-jet case. Second, the pull angle $\phi_p$ should be rotated by $\pi$ to represent the orientation of jet 3 with respect to jet 2. The contribution of soft gluons emitted off of the dipole formed from jets 2 and 3 is novel, and requires a new calculation. For this case, the dipole that emits the soft gluon does not contain the jet on which pull is measured. As such, this contribution to the soft function lacks a collinear singularity. This calculation is presented in App. A.
Adding together all three possible sources of soft radiation, the leading-order soft function for pull when $t > 0$ measured on $e^+e^-\to$ three jet events is:

Here, $\mu$ is the dimensional regularization scale and the function $f(\phi_p, \theta)$ was calculated in Ref. [17] and is

The function $g(\phi_p)$ is calculated in App. A and is
$$g(\phi_p) = \frac{\left(\tan\frac{\theta_{12}}{2} + \tan\frac{\theta_{13}}{2}\right)^2}{\tan^2\frac{\theta_{12}}{2} + \tan^2\frac{\theta_{13}}{2} + 2\tan\frac{\theta_{12}}{2}\tan\frac{\theta_{13}}{2}\cos(2\phi_p)}\,,$$
where $\theta_{ij}$ is the angle between jets $i$ and $j$. For the coefficient of the logarithmic term in Eq. (3.5) above, we have used the conservation of color to express

For a general configuration of the three final state jets, jet 1, on which pull is measured, can be any of the quark, anti-quark, or gluon jets. For the collinear emission contribution to the cross section, we then need to calculate pull as measured on either collinear emissions from quark jets or from gluon jets. The calculation of the quark jet function for pull was presented in Ref. [17], while the gluon jet function is novel, and its calculation is presented in App. B. We can express the jet function for either quarks or gluons as

where $B_1$ comes from hard collinear splittings and is

for quark and gluon jets, respectively. Other than the (trivial) $\sin^2\phi_p$ dependence in the logarithm, collinear emissions are flat in $\phi_p$: they are at too small of an angle to know the specific direction of jet 2 and are uncorrelated with any other jets in the event.
Adding the soft and jet functions together, we find the cross section for pull at leading order to be

Note that the dimensional regularization scale $\mu$ has dropped out; the physical cross section is independent of this unphysical scale.

Inclusive Prediction for Pull
With the differential cross section of the pull vector, we can assume that the soft and collinear contribution of the pull vector factorizes from an $e^+e^-\to q\bar q g$ event. Then, the soft and jet functions are regarded as a conditional probability density of the pull vector for observing an additional soft or collinear emission off of one of the three particles in the final state. Thus, the inclusive differential cross section of the pull vector for an $e^+e^-\to q\bar q g + X$ event can be found by integrating over the phase space of the final state particles:

Here, $d^2\sigma/(dx_q\,dx_{\bar q})$ denotes the cross section for $e^+e^-\to q\bar q g$ in terms of the energy fractions $x_q$ and $x_{\bar q}$, and $\Theta_\text{pull}$ encodes the phase space constraints for identifying the two jets on which pull is measured. The differential cross section for $e^+e^-\to q\bar q g$ is
$$\frac{d^2\sigma}{dx_q\,dx_{\bar q}} = \sigma_0\,\frac{\alpha_s C_F}{2\pi}\,\frac{x_q^2 + x_{\bar q}^2}{(1-x_q)(1-x_{\bar q})}\,, \tag{3.13}$$
with $\sigma_0$ the leading-order $e^+e^-\to q\bar q$ cross section, in which the energy fraction $x_i$ of particle $i$ is defined as
$$x_i = \frac{2E_i}{E_\text{cm}}\,, \tag{3.14}$$
where $E_\text{cm}$ is the center-of-mass collision energy. Note that the angular dependence in the jet and soft functions frequently appears as $\tan(\theta_{ij}/2)$. A straightforward algebraic manipulation with dot products shows that

With $\Theta_\text{pull}$, we aim to describe the following algorithm for the pull measurement: (1) identify the two lower-energy jets, (2) identify the one of them with the higher energy, and (3) ensure that those two jets are separated by at least $2R$ such that they are recognized as distinct jets. Therefore, we demand that the following three inequalities hold:

These can correspondingly be expressed in terms of the three-body final state energy fractions $x_i$:

The rightmost $\Theta$-function enforces $x_2 > x_\text{cut}$ and ensures that the cross section is IRC safe. With these phase space restrictions and matrix elements identified, we can then perform the integrals in Eq. (3.12), summing over all possible orderings of the $q$, $\bar q$, and $g$ jets. The result of this calculation is shown on the left in Fig. 2.
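The kinematic ingredients of this selection can be sketched numerically. For three massless partons with $x_i + x_j + x_k = 2$, momentum conservation gives $1 - \cos\theta_{ij} = 2(1-x_k)/(x_i x_j)$, from which $\tan^2(\theta_{ij}/2) = (1-x_k)/[(1-x_i)(1-x_j)]$ follows; this is our own rearrangement and may differ in form from the expression used in the text. The selection function below is a hypothetical rendering of the three steps of the algorithm, with the opening angle itself used as the separation measure.

```python
import math

def tan_half_angle(x_i, x_j):
    """tan(theta_ij / 2) for massless three-body kinematics, 0 < x < 1."""
    x_k = 2.0 - x_i - x_j
    return math.sqrt((1.0 - x_k) / ((1.0 - x_i) * (1.0 - x_j)))

def pull_selection(x, jet_radius, x_cut):
    """Return (jet1, jet2, jet3) labels if the event passes, else None.

    x maps a parton label to its energy fraction. Jet 3 is the most
    energetic, jet 1 is the jet on which pull is measured, and jet 2 is
    the reference jet. The cuts are an assumed reading of the algorithm.
    """
    jet3, jet1, jet2 = sorted(x, key=x.get, reverse=True)
    if x[jet2] <= x_cut:
        return None  # lowest-energy jet fails the energy cut
    theta_12 = 2.0 * math.atan(tan_half_angle(x[jet1], x[jet2]))
    if theta_12 <= 2.0 * jet_radius:
        return None  # jets 1 and 2 would not be resolved as distinct
    return jet1, jet2, jet3
```

As a check of the half-angle relation, the symmetric configuration $x_q = x_{\bar q} = x_g = 2/3$ returns $\theta_{ij} = 2\pi/3$ for every pair, as it must for three equal-energy coplanar jets.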
In that figure, we have plotted the relative differential cross section in terms of the components of the pull vector,
$$t_x = t\cos\phi_p\,, \qquad t_y = t\sin\phi_p\,.$$
Recall that $\phi_p = 0$ is in the direction of the nearby reference jet. To make this plot, we have set the jet radius $R = 0.5$, the jet energy cut value $x_\text{cut} = 0.2$, and the number of active fermions to $n_f = 5$. The cross section is highly peaked about the positive $t_x$ axis, demonstrating that most radiation is present in the region between the two jets. This is to be expected: because of the soft and collinear singularities of the gluon in the $e^+e^-\to q\bar q g$ matrix element, the dominant configuration of the jets is with the gluon as the lowest energy jet. Therefore, pull is measured about the quark or anti-quark jet, with respect to the direction of the gluon jet, and this pair of jets lives in the $\mathbf{3}$ or $\bar{\mathbf{3}}$ representation of SU(3) color. The product of color factors in this configuration is $-\mathbf{T}_1\cdot\mathbf{T}_2 = C_A/2 > 0$, and so the cross section peaks around $t_y = 0$ and $t_x > 0$ ($\phi_p = 0$) because the functions $f(\phi_p, \theta)$ and $g(\phi_p)$ have maxima at $\phi_p = 0$.
It is also interesting to restrict to studying the non-dominant jet configuration, by forcing the gluon jet to have the largest energy of all three final state jets. In this configuration, the jets on which pull is measured, the quark and anti-quark, are in the $\mathbf{8}$ representation of SU(3) color. As such, the product of their color matrices enters with a negative sign, $-\mathbf{T}_q\cdot\mathbf{T}_{\bar q} = C_F - C_A/2 = -1/6$, and so the radiation is dominantly outside of the two jets on which pull is measured. This is illustrated in the plot on the right of Fig. 2. The cross section in this configuration is peaked about the negative $t_x$ axis, as expected.

H → bb vs. g → bb
The results presented in the previous section, along with prior calculations [17], can inform the use of pull for identification of $H\to b\bar b$ decays. One of the original motivations for pull presented in Ref. [8] was that it could be used to identify Higgs decays to bottom quarks from the dominant background of gluon splitting to bottom quarks. Because there is no soft singularity for $g\to b\bar b$ splitting, the kinematics of $H\to b\bar b$ and $g\to b\bar b$ are nearly identical, once the mass of the pair of bottom quarks is fixed. Thus, observables sensitive to jet kinematics, such as subjet energy fractions, are not useful for this problem. However, the Higgs boson is a color singlet, while the gluon is a color octet, and this distinction is imprinted on the distribution of soft radiation within and about the pair of bottom quarks. Pull is explicitly sensitive to the orientation of soft radiation, and so can be used to improve identification of Higgs decays.
In this section, we will study the discrimination power of the pull angle $\phi_p$ for Higgs decays to bottom quarks. We will work in the highly-boosted limit in which the energy or transverse momentum of the Higgs boson is much larger than its mass $m_H$, so that the bottom quark jets are relatively collimated. Further, we assume that the radii $R$ of the individual bottom quark jets are significantly smaller than the angular separation of the bottom quarks. These limits are relevant and can easily be borne out in practice. For example, for a Higgs boson with $p_\perp = 250$ GeV, the angular separation of the bottom quarks $\theta_{12}$ is approximately
$$\theta_{12} \simeq \frac{2 m_H}{p_\perp} \simeq 1\,.$$
Correspondingly, subjet radii for individual bottom quarks of $R \lesssim 0.4$ are reasonable, as subjet radii as small as $R = 0.2$ are now used in experiment [21]. Importantly, for the subjets to be well-defined and non-overlapping, their jet radius $R$ should be less than half of the angular separation of the bottom quarks, $\theta_{12}$.
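The size of the boosted-limit approximation can be gauged with a short numerical sketch. It compares the collinear estimate $\theta_{12} \simeq 2m_H/p_\perp$ with the opening angle of a symmetric two-body decay into massless quarks, for which $\sin(\theta_{12}/2) = m_H/E_H$. The value $m_H = 125$ GeV, the assumption of a central Higgs (so that $|\vec p\,| = p_\perp$), and the restriction to the symmetric decay are illustrative assumptions of this example.

```python
import math

def opening_angle_collinear(mass, pt):
    """Small-angle estimate theta_12 ~ 2 m / p_perp."""
    return 2.0 * mass / pt

def opening_angle_symmetric(mass, pt):
    """Opening angle for a symmetric decay to two massless quarks.

    Each quark carries E/2, so m^2 = 2 (E/2)^2 (1 - cos theta), which gives
    sin(theta/2) = m / E with E = sqrt(pt^2 + m^2) for a central resonance.
    """
    energy = math.hypot(pt, mass)
    return 2.0 * math.asin(mass / energy)

if __name__ == "__main__":
    m_h, pt = 125.0, 250.0  # GeV; m_h is an assumed input for illustration
    print(opening_angle_collinear(m_h, pt))
    print(opening_angle_symmetric(m_h, pt))
```

At $p_\perp = 250$ GeV the two estimates differ at the level of ten percent, and the agreement improves as the boost increases, so the requirement $R < \theta_{12}/2$ is not very sensitive to which estimate is used.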
Then, taking the $R \ll \theta_{12} \ll 1$ limits of the expression for the pull distribution for a color singlet from Ref. [17], we find

For the color octet configuration, we use the more general expression presented in Eq. (3.11). The configuration of jets we consider produced from $e^+e^-$ collisions is illustrated in Fig. 3, where we restrict to the configuration in which the quark and anti-quark jets are closest in angle. To expand to linear order in the jet radius $R$, we first note that the expansion of the function $g(\phi_p)$ is proportional to $R^2$. This contribution comes from a dipole that does not include the jet of interest, so for radiation from this dipole to land in the jet, it must hit an uncorrelated region of area $R^2$. Continuing, the contribution proportional to $\mathbf{T}_1\cdot\mathbf{T}_3$ expands to linear order in $R$ as
$$f(\pi + \phi_p, \theta_{13})\Big|_{R\ll\theta_{13}} = 2R\cot\frac{\theta_{13}}{2}\cos(\pi + \phi_p) + \mathcal{O}(R^2) = -2R\cot\frac{\theta_{13}}{2}\cos\phi_p + \mathcal{O}(R^2)\,. \tag{4.3}$$
In the high-boost limit of the two bottom quark jets, $\theta_{13}\to\pi$, and so $\cot(\theta_{13}/2)\to 0$. Therefore, this term is also ignorable to linear order in the jet radius $R$. The only relevant term in the cross section in these limits is proportional to $\mathbf{T}_1\cdot\mathbf{T}_2$, which is just the product of the color matrices of the bottom quarks. The relevant color factor is
$$-\mathbf{T}_1\cdot\mathbf{T}_2 = C_F - \frac{C_A}{2} = -\frac{1}{6}\,.$$
Therefore, the pull distribution for the color octet configuration in the small $R$ limit is

This has been written with an explicit negative sign in front of the term linear in $R$ because $C_A/(2C_F) - 1 = 1/8 > 0$. Going forward, we will drop the remainder $\mathcal{O}(R^2)$, as it will be implicit in the following.

Discrimination Power of Pull
These distributions are already informative, but we would like to determine the distribution of the pull angle $\phi_p$ alone to identify its power as a discrimination observable. Because the pull angle is not IRC safe, we cannot determine the distribution of $\phi_p$ by just integrating these distributions over $t$. However, $\phi_p$ is Sudakov safe [17,22,23], and so we can calculate its distribution by marginalizing against the probability distribution of the pull magnitude, $t$. That is, for probability distribution $p(t)$ and conditional probability distribution $p(\phi_p|t)$, the distribution of the pull angle is
$$p(\phi_p) = \int dt\, p(t)\, p(\phi_p|t)\,. \tag{4.6}$$
As long as $p(t)$ has no support around $t = 0$, this integral is finite. This is indeed the case for the physical (or resummed) distribution, so Eq. (4.6) provides a way to define the distribution of $\phi_p$. Here, we will just calculate the conditional probability distribution $p(\phi_p|t)$ to lowest order in the limits we have discussed.
To lowest order, the conditional distribution $p(\phi_p|t)$ is just the ratio of the double differential cross section of $t$ and $\phi_p$ to the cross section for $t$ alone:
$$p(\phi_p|t) = \frac{d^2\sigma/(dt\,d\phi_p)}{d\sigma/dt}\,.$$
Above, we had calculated the distributions for both the $H\to b\bar b$ and $g\to b\bar b$ configurations.
To determine the distribution for $t$ exclusively, we can just integrate over $\phi_p$. For either the singlet or octet color configurations, the result is the same in the limits in which we work:

Note crucially that the term linear in $R$ integrates to 0. From the expressions for the double differential distributions above, the conditional distributions for the Higgs decay and gluon splitting are:

We can then determine the distribution of the pull angle. For the Higgs decay, for example, we have
$$p_{H\to b\bar b}(\phi_p) = \int dt\, p_{H\to b\bar b}(t)\, p_{H\to b\bar b}(\phi_p|t)\,. \tag{4.11}$$
We are restricting our analysis to linear order in the jet radius $R$, which will dramatically simplify what follows. As shown above, the pull magnitude distribution $p(t)$ actually has no contribution to it that is linear in $R$. Thus, in this integral, we only need to keep the terms in $p(t)$ at leading order in the $R\to 0$ limit. This is correspondingly the collinear limit in which the only thing that the pull magnitude depends on is the flavor of the jet of interest.
For both $H\to b\bar b$ and $g\to b\bar b$, the jet of interest is always a quark, and so the distribution of $t$ is identical for the two processes, up to corrections of order $R^2$:
$$p_{H\to b\bar b}(t) = p_{g\to b\bar b}(t) + \mathcal{O}(R^2)\,. \tag{4.12}$$
To the order we work, we can then safely set $p_{H\to b\bar b}(t) = p_{g\to b\bar b}(t) \equiv p(t)$, independent of production process. With this simplification, it follows that the pull angle distribution for Higgs decay is

Note that the distribution $p(t)$ is normalized and integrates to 1, by definition. We define $d_0$ as the corresponding moment of the pull magnitude distribution:

With this notation, it then follows that the gluon splitting distribution is

We can estimate the value of $d_0$ by determining the mean value of the pull magnitude, $\langle t\rangle$. In the collinear limit for a quark jet, this is at lowest order

Further, the angle between the two quark jets $\theta_{12}$ can be approximated from the mass and energy of the singlet resonance. In the collinear or high-boost limit, we have
$$\theta_{12} \simeq \frac{2 m_H}{p_\perp}\,,$$
where $m_H$ is the mass of the Higgs and $p_\perp$ is its transverse momentum, assuming it is central in the detector.
Using these results and assuming that the distribution $p(t)$ is highly peaked around its mean, we then have that

Evaluating everything except for the jet radius, transverse momentum, and mass, this is approximately

where we have used $\alpha_s = 0.1$. Recall that for the two bottom quark jets to be non-overlapping, we require that $R \lesssim m_H/p_\perp$, less than approximately half of the angle between the bottom quark jets. So, $d_0$ is bounded from above by about 0.2. From these distributions, we can then quantify the discrimination power of the pull angle by making a sliding cut on the value of $\phi_p$. The distributions of the pull angle from Higgs decay and gluon splitting are plotted in Fig. 4, where we use $d_0 = 0.2$. On the right we plot the signal versus background efficiency curve, or receiver operating characteristic (ROC) curve, found from keeping those events that have pull angle below a sliding cut. Because the pull angle peaks at small values for $H\to b\bar b$, this procedure amplifies the signal over the background, as manifest by the ROC curve lying below the diagonal.
To quantify the absolute power of the pull angle to discriminate Higgs from gluon splitting, we can integrate under the ROC curve. This area-under-the-curve (AUC) vanishes for perfect discrimination and takes value 1/2 for identical distributions. The AUC can be calculated as an ordered integral over the two distributions and we find

The numerical value on the second line was found from setting $R = m_H/p_\perp$ and using the value of $d_0$ identified earlier. For comparison, this value of the AUC is comparable to the value of the AUC for other jet discrimination problems, such as discriminating quark- from gluon-initiated jets. For that problem, the discrimination power of the jet mass as quantified by the AUC at leading logarithmic accuracy is [24]
$$\text{AUC}\,(q \text{ vs. } g) = \frac{1}{1 + C_A/C_F} \simeq 0.31\,. \tag{4.21}$$
We also note, however, that these theoretical predictions of metrics may not be borne out in simulation or experiment, but are at least representative of the possible information available in the pull distribution for discrimination.
This simple calculation of course ignores many relevant physical effects that would exist in a real jet and would affect discrimination power. The largest such effect would likely be from soft radiation uncorrelated or only weakly correlated with the direction of the jet. Radiation that lands in the jet from color dipoles that are not color connected to the jet of interest would be approximately uniformly distributed over the area of the jet, with no preferred azimuthal direction. The profile of such radiation in the pull angle $\phi_p$ would therefore be flat, and work to wash out the distinction between color singlet and octet dipoles studied in this section. To mitigate this effect, one might groom the jet, removing soft, wide-angle radiation in the jet, but preserving radiation collinear to the jet axis. However, collinear radiation is also flat in pull angle $\phi_p$ to lowest order in the collinear angular size, so this is also likely to wash out these subtle differences.
These considerations demonstrate the fragility of color correlations and possibly explain why pull has not been observed to be a useful discriminant in simulation [18][19][20].
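The sliding-cut construction of the ROC curve and its AUC can be sketched numerically. The densities used here, $p(\phi_p) = (1 + a\cos\phi_p)/\pi$ on $[0,\pi]$, are toy stand-ins chosen only because they are normalized and carry a term linear in a small modulation parameter $a$; they are an assumption of this example and should not be read as the distributions derived above. The function names are likewise hypothetical.

```python
import math

def density(a, phi):
    """Toy pull-angle density on [0, pi]: (1 + a cos phi) / pi."""
    return (1.0 + a * math.cos(phi)) / math.pi

def efficiency(a, cut):
    """Fraction of events with pull angle below `cut` for the toy density."""
    return (cut + a * math.sin(cut)) / math.pi

def auc(a_signal, a_background, steps=20000):
    """Area under the background-versus-signal efficiency curve.

    Uses the midpoint rule for the integral of eff_bkg(c) * p_sig(c) over
    the cut value c, i.e. the ordered integral over the two distributions.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        c = (k + 0.5) * h
        total += efficiency(a_background, c) * density(a_signal, c) * h
    return total

def auc_closed_form(a_signal, a_background):
    """Analytic result for the toy densities: 1/2 + 2 (a_bkg - a_sig) / pi^2."""
    return 0.5 + 2.0 * (a_background - a_signal) / math.pi ** 2
```

For identical modulations the AUC is exactly 1/2, and it drops below 1/2 when the signal is more strongly peaked at small pull angle than the background, consistent with a ROC curve lying below the diagonal. Because the AUC shifts only linearly in the small modulation parameters, a modest flat contamination that dilutes them reduces the discrimination power proportionally.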
Beyond these practical considerations, one would like to have a formal understanding of the accuracy of the distributions and discrimination metrics derived in this section. While the pull angle $\phi_p$ is not an IRC safe observable, its Sudakov safety means that one can vary renormalization scales in the integrand of Eq. (4.6) to have some estimate of theoretical uncertainties. Such a procedure was also used in Ref. [25] to estimate theoretical uncertainties for the groomed energy fraction $z_g$ [23]. As observed in that case as well, we expect that this scale variation underestimates theoretical uncertainties on the calculated pull angle distribution. Moving away from the small jet radius $R$ limit, we expect that increasing $R$ will likely improve discrimination power up to a point. More soft radiation that is sensitive to the dipole configuration will be included in the jets, but so too will more uniform contamination radiation, as mentioned earlier. The effect of contamination radiation scales like the area of the jet, $R^2$, while the leading color correlations between pairs of jets scale like $R$, so we expect there is some range of $R$ where contamination is small but color correlations are relatively large. Further, higher-order effects like non-global logarithms [26] will likely increase the color correlations present in the pull angle distribution. Non-global effects will pull radiation in the jet toward ends of the dipoles that lie outside of the jet, in principle enhancing differences between the color singlet and octet configurations.

Other Color Representations
The analysis of the previous section suggests a more general result for the pull distribution, appropriate for any combination of a pair of jets on which pull is measured. We denote jet 1 as the jet on which pull is measured and jet 2 as the reference jet that defines the origin of the pull angle $\phi_p$. In the limits in which the boost of the jet pair is large ($\theta_{12} \ll 1$) and the jet radius is small ($R \ll \theta_{12}$), the leading-order expression for the pull distribution is

The coefficient of the term proportional to the jet radius $R$ is just another way to express the product of color matrices:
$$C_1 + C_2 - C_{12} = -2\,\mathbf{T}_1\cdot\mathbf{T}_2\,,$$
where $C_1$ and $C_2$ are the quadratic Casimirs of the two jets and $C_{12} = (\mathbf{T}_1 + \mathbf{T}_2)^2$ is the quadratic Casimir of their product representation. We conjecture that this distribution is universal, in the limits $R \ll \theta_{12} \ll 1$ described above. A unique aspect of this distribution is that there is a non-trivial term linear in the jet radius, $R$. For many (if not nearly all) other observables, the first corrections to the distribution due to a finite jet radius are quadratic in $R$. This is true of the jet mass, for example, and the magnitude of the pull vector, $t$. The term proportional to $R$ integrates to 0 on $\phi_p \in [0, \pi]$, and so does not contribute to the pull magnitude's distribution.
The universality of this distribution along with the simplicity of color representations of pairs of jets in QCD enables us to explicitly enumerate all possible values for the quadratic Casimir difference, the coefficient of the $R$ term. This is a concrete manifestation and justification for the name "pull": the difference of quadratic Casimirs explicitly corresponds to how soft radiation is pulled around the jets. If the Casimir of the product representation is smaller than the sum of the jets' Casimirs, then radiation is pulled between the pair of jets. By contrast, if the Casimir of the product representation is larger than the sum of the jets' individual Casimirs, radiation is pushed out of the pair.
A general irreducible representation of SU(3) can be labeled by two non-negative integers $m_1$ and $m_2$ and denoted as $D(m_1, m_2)$. The dimension of such a representation is

$$\dim D(m_1, m_2) = \frac{1}{2}\,(m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2)\,.$$

The quadratic Casimir of this representation is

$$C\big(D(m_1, m_2)\big) = \frac{1}{3}\left(m_1^2 + m_2^2 + m_1 m_2 + 3 m_1 + 3 m_2\right).$$

The fundamental and adjoint representations are $3 = D(1, 0)$ and $8 = D(1, 1)$, respectively, and these formulae give the correct values for the dimension and Casimir of these representations: $C_F = 4/3$ and $C_A = 3$. Exhaustive information about the representation theory of SU(3) can be found in Ref. [27]. The jets that form the pair on which pull is measured can only be some combination of quarks and gluons in QCD, so enumerating the possible product representations of color SU(3) that can appear is a simple exercise with SU(3) Clebsch-Gordan coefficients.
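The dimension and quadratic Casimir of $D(m_1, m_2)$ can be checked numerically. The following is a minimal sketch that assumes the standard closed forms for SU(3) in the normalization where the fundamental Casimir is 4/3 and the adjoint Casimir is 3; the function names are chosen here for illustration and do not come from the paper.

```python
from fractions import Fraction


def su3_dimension(m1, m2):
    """Dimension of the SU(3) irrep D(m1, m2): (m1+1)(m2+1)(m1+m2+2)/2."""
    # The product of the three factors is always even, so integer division is exact.
    return (m1 + 1) * (m2 + 1) * (m1 + m2 + 2) // 2


def su3_casimir(m1, m2):
    """Quadratic Casimir of D(m1, m2), normalized so that C_F = 4/3 and C_A = 3."""
    return Fraction(m1 * m1 + m2 * m2 + m1 * m2 + 3 * m1 + 3 * m2, 3)


# Print dimension and Casimir for a few low-lying irreps, given as (m1, m2) labels.
for labels in [(0, 0), (1, 0), (1, 1), (2, 0), (3, 0), (2, 1), (2, 2)]:
    print(labels, su3_dimension(*labels), su3_casimir(*labels))
```

Exact rational arithmetic is used so that values such as 4/3 are not obscured by floating-point rounding.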
In Table 1, we list all possible QCD jet pairs, the irreps of SU(3) color that appear in their corresponding product representation, and the value of the difference of quadratic Casimirs, using the formula presented earlier. For most of the representations in the table, the difference of Casimirs is positive, indicating that the pull angle distribution peaks at $\phi_p = 0$; that is, most radiation lies between the jet pair. Only the highest-dimension product representations produce negative Casimir differences, indicating that most radiation in this case is emitted outside of the region between the jet pair. Intriguingly, the $10$ representation of the color of a pair of gluons exhibits a perfectly flat pull angle distribution in this limit. Apparently, in this representation exactly as much radiation is emitted between the pair of gluon jets as outside of it.
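The pattern of signs described here can be reproduced with a short script. The sketch below assumes the standard SU(3) product decompositions ($3\otimes\bar 3 = 1\oplus 8$, $3\otimes 3 = \bar 3\oplus 6$, $3\otimes 8 = 3\oplus\bar 6\oplus 15$, $8\otimes 8 = 1\oplus 8\oplus 8\oplus 10\oplus\overline{10}\oplus 27$) and takes the "pull" to be the plain difference $C_1 + C_2 - C_{12}$, where $C_{12}$ is a label introduced here for the Casimir of the product representation. The normalization used in Table 1 may differ, and all names in the code are illustrative.

```python
from fractions import Fraction


def su3_casimir(m1, m2):
    """Quadratic Casimir of D(m1, m2), normalized so that C_F = 4/3 and C_A = 3."""
    return Fraction(m1 * m1 + m2 * m2 + m1 * m2 + 3 * m1 + 3 * m2, 3)


# (m1, m2) labels of the irreps that occur; conjugate irreps share the same Casimir.
IRREP_LABELS = {"1": (0, 0), "3": (1, 0), "6": (2, 0), "8": (1, 1),
                "10": (3, 0), "15": (2, 1), "27": (2, 2)}

# Color irrep carried by each jet flavor (the anti-quark has the quark's Casimir).
JET_IRREP = {"q": "3", "qbar": "3", "g": "8"}

# Irreps with distinct Casimirs appearing in each product (assumed decompositions).
PRODUCTS = {
    ("q", "qbar"): ["1", "8"],
    ("q", "q"): ["3", "6"],
    ("q", "g"): ["3", "6", "15"],
    ("g", "g"): ["1", "8", "10", "27"],
}


def casimir_difference(jet1, jet2, product_irrep):
    """Return C_1 + C_2 - C_12 for two jets in the given product irrep."""
    c1 = su3_casimir(*IRREP_LABELS[JET_IRREP[jet1]])
    c2 = su3_casimir(*IRREP_LABELS[JET_IRREP[jet2]])
    c12 = su3_casimir(*IRREP_LABELS[product_irrep])
    return c1 + c2 - c12


for (jet1, jet2), irreps in PRODUCTS.items():
    for irrep in irreps:
        print(jet1, jet2, irrep, casimir_difference(jet1, jet2, irrep))
```

With these assumptions, only the highest-dimension irrep in each product gives a negative difference, and the $10$ of two gluons gives exactly zero, in line with the discussion of the table.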

Conclusions
The pull observable was designed to be sensitive to the flow of color between pairs of jets and thus sensitive to their product representation of SU(3) color. This sensitivity has been studied extensively in simulation and has motivated measurements, but it had not been justified theoretically.
In this paper, we demonstrated that pull, especially the pull angle, takes on a different distribution for pairs of jets in distinct product representations of color. We performed explicit calculations at leading order for pull measured on $e^+e^- \to$ three jets events, studied the discrimination power of pull for identification of $H \to b\bar{b}$ decays, and presented a conjecture for the pull distribution in the high-boost, small jet radius limit.

Table 1: The irreps of SU(3) that appear in the corresponding product representation are presented in the middle column. We have only listed those irreps that correspond to unique values of the quadratic Casimir, e.g., $10$ and $\overline{10}$ have the same quadratic Casimir. In the final column, we calculate the "pull" of the product representation: the difference between the individual quadratic Casimirs of the two jets and that of their product representation.
The results presented in this paper suggest a number of extensions. Observables that are more sensitive to color flow between jets can be designed, motivated by recent work in machine learning for particle physics [28][29][30]. In particular, soft, wide-angle radiation in a jet is most sensitive to the colors of the other jets in an event, and to leading order, the distribution of this radiation is described by eikonal matrix elements. With these eikonal matrix elements, one can construct the theoretically-optimal observable for discrimination of, say, a pair of jets that form a color singlet from a pair of jets that do not. This optimal observable is the likelihood ratio by the Neyman-Pearson lemma [31], and in general is not the pull angle. Designing such observables may also resolve issues regarding residual color flow information in machine learning studies, even for jets on which pull is measured.
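As a toy illustration of the likelihood-ratio construction, the sketch below compares two hypothetical pull-angle densities of the form $(1 + a\cos\phi_p)/\pi$ on $[0, \pi]$. This functional form, the coefficients, and all function names are assumptions made for illustration only; they are not the eikonal matrix elements or the distributions calculated in this paper.

```python
import math


def toy_density(phi, a):
    """Hypothetical normalized density on [0, pi]: (1 + a*cos(phi)) / pi, |a| <= 1."""
    return (1.0 + a * math.cos(phi)) / math.pi


def likelihood_ratio(phi, a_signal, a_background):
    """Neyman-Pearson test statistic: signal density over background density."""
    return toy_density(phi, a_signal) / toy_density(phi, a_background)


def efficiency_below_cut(phi_cut, a):
    """Fraction of events with phi < phi_cut, i.e. the integral of toy_density."""
    return (phi_cut + a * math.sin(phi_cut)) / math.pi


# Illustrative coefficients (assumed): signal peaked at phi = 0, background mildly away.
A_SIGNAL, A_BACKGROUND = 0.5, -0.1

for phi_cut in (0.5, 1.0, 1.5, 2.0):
    print(phi_cut,
          likelihood_ratio(phi_cut, A_SIGNAL, A_BACKGROUND),
          efficiency_below_cut(phi_cut, A_SIGNAL),
          efficiency_below_cut(phi_cut, A_BACKGROUND))
```

In this one-variable toy the ratio is monotonic in $\phi_p$, so cutting on it is equivalent to cutting on the angle itself. The statement that the optimal observable is in general not the pull angle concerns the full radiation pattern, which depends on more than one variable.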
Prospects for observation of other hadronic decays of the Higgs boson could potentially be improved by using pull or related color flow observables. The Standard Model Higgs boson decays to pairs of gluons nearly 10% of the time, and yet the $H \to gg$ decay mode is extremely challenging to observe. Because gluons carry more color individually than quarks, the strength of the color connection between the gluons in $H \to gg$ decays is substantially larger than that between the bottom quarks in $H \to b\bar{b}$ decays, as shown in Table 1. This may suggest that it is easier to discriminate the $gg$ color singlet representation from non-singlet color representations; however, it may also mean that the $H \to gg$ decay at high boost is more challenging to identify because the two hard prongs in the jet are less well-defined.
Finally, as the pull angle is not IRC safe, its calculation relies on resummed multi-differential cross sections to be well-defined. Thus, ideally one would like the two-dimensional resummed cross section for the pull vector, from which the pull angle distribution can be obtained by marginalization. How this resummation would proceed for different color configurations of jets would be interesting to determine. Further, measurements of pull on $g \to b\bar{b}$ decays, for example, would test the extent to which the results derived in this paper accurately describe reality. This may then point to a whole new class of observables that can be used to study global correlations in particle collision events.
The integrals that remain are finite for $\epsilon \to 0$, so we can just set $\epsilon = 0$ to calculate the corresponding pull distribution for $t > 0$. We find

$$S_{23}(t, \phi_p) = (-T_2 \cdot T_3)\,\frac{\alpha_s}{\pi^2}\,\frac{1}{t}\,\frac{\left(\tan\frac{\theta_{12}}{2} + \tan\frac{\theta_{13}}{2}\right)^2}{\tan^2\frac{\theta_{12}}{2} + \tan^2\frac{\theta_{13}}{2} + 2\tan\frac{\theta_{12}}{2}\tan\frac{\theta_{13}}{2}\cos(2\phi_p)}\,.$$

In this expression, we have also symmetrized the first two terms of the splitting function, to isolate the divergence at $z = 0$. To integrate over $z$, we can expand the first factor in $+$-functions:

The integral with the $\delta(z)$ is just $2C_A$. For the integral with the $+$-function, we can set $\epsilon = 0$ and we have

With these results, the jet function is

Only keeping those terms that contribute for $t > 0$, the jet function is then