Jet Dipolarity: Top Tagging with Color Flow

A new jet observable, dipolarity, is introduced that can distinguish whether a pair of subjets arises from a color singlet source. This observable is incorporated into the HEPTopTagger and is shown to improve discrimination between top jets and QCD jets for moderate to high pT.


Introduction
The impressive resolution of the ATLAS and CMS detectors means that a typical QCD jet at the LHC deposits energy in O(10-100) calorimeter cells. Such fine-grained calorimetry allows jets to be studied in much greater detail than previously, with sophisticated versions of current techniques making it possible to measure more than just the bulk properties of jets (e.g. event jet multiplicities or jet masses). One goal of the LHC is to employ these techniques to extend the amount of information available from each jet, allowing for a broader probe of the properties of QCD. The past several years have seen significant progress in developing such jet substructure techniques. A number of general purpose tools have been developed, including: (i) top-tagging algorithms designed for use at both lower [1,2] and higher [3] p_T as well as (ii) jet grooming techniques such as filtering [4], pruning [5], and trimming [6], which are designed to improve jet mass resolution. Jet substructure techniques have also been studied in the context of specific particle searches, where they have been shown to substantially extend the reach of traditional search techniques in a wide variety of scenarios, including for example boosted Higgses [4,7], neutral spin-one resonances [8], searches for supersymmetry [9], and many others [10][11][12][13][14]. Despite these many successes, there is every reason to expect that room remains for refinement of jet substructure techniques.
Top tagging algorithms have reached a mature level of development in recent years [1][2][3][15][16][17][18][19][20]. A variety of different algorithms employ primarily kinematic observables like jet masses and the W± helicity angle in increasingly sophisticated ways, allowing for efficient discrimination between top jets and ordinary QCD jets. One direction that has received less attention is the use of observables that do not map onto the kinematics of hard partons. This article introduces a new non-kinematic observable that can be used in top tagging to gain additional background rejection and should have applications outside of top tagging.
A distinguishing aspect of hadronic top decays is that the jets from the W± decay belong to a color singlet configuration. For p_T^W ≫ m_W these jets become close together and will often be clustered within a single jet. The radiation pattern of the W± decay products, which is controlled by the color configuration, has a distinctive form with most of the radiation clustered in between the two jets. This QCD analog of the Chudakov effect offers an additional handle for top discrimination on top of kinematic observables.
The organization of this article is as follows. First the concept of color flow is briefly reviewed, and connection is made to the jet observable pull [21]. Next the proposed "dipolarity" observable is introduced and explored in a general context. Dipolarity is then incorporated into the HEPTopTagger, and the performance of the modified tagger is tested on Monte Carlo event samples. Finally, there is some discussion of the results as well as possible applications of dipolarity beyond top tagging.

Color flow and pull
Within the context of top tagging, several jet observables have been defined that go beyond the kinematics of hard partons. These include a number of jet shape observables such as spherocity [18], planar flow [20,22], N-subjettiness [23], and template overlap [24]. The jet observable defined in the next section draws from the complementary information offered by color flow. In a QCD event, radiation is controlled by the kinematics of the hard partons as well as by how color indices are contracted together (color flow). Partons whose color indices are contracted together are color-connected, with a color string stretching between the two color sources. For example, the two quarks in the hadronic decay of a color singlet like the Higgs form a color dipole whose radiation pattern is contained primarily within a pair of cones around the two quarks, with a tendency for more radiation to occur in the region between the two quarks [25].
Color flow arguments of this sort have motivated attempts to use QCD radiation patterns for event discrimination, e.g. mini-jet vetoes in Higgs searches [26]. More recently, the authors of [21] introduced a jet observable dubbed pull, which is a p_T-weighted vector in rapidity-phi space that is constructed so as to point from a given jet to its color-connected partner(s). Although pull has been shown to offer some discrimination in particle searches [27], it does not seem well suited to tagging boosted hadronic tops. The most straightforward way to incorporate pull into a top tagging algorithm is to measure the pull of two subjets that reconstruct the W± and check whether each subjet's pull vector points towards the other subjet. A problem with this approach is that the pull vectors are sensitive to how the W± jet is broken down into two subjets. For a lopsided distribution of the W± into two subjets, one of the subjets will consist of only a small handful of calorimeter cells, and as a consequence its pull will be sensitive to statistical fluctuations and contamination. Even for a W± broken down into two subjets more symmetrically, the pull vectors can depend sensitively on the precise boundary drawn between the two subjets, which itself is a noisy function of the particular jet clustering algorithm being used. Anticipating the potential difficulties of incorporating pull into a top tagging algorithm, we explore an alternative approach in which the entire radiation pattern of the W± is considered simultaneously. This simple idea leads us to jet dipolarity, which we now define.

Dipolarity
Consider a jet, J, with two subjets, j_1 and j_2, whose centers are located at pseudorapidities η_1 and η_2 and azimuthal angles φ_1 and φ_2, respectively. For each calorimeter cell (η_i, φ_i) with transverse momentum p_Ti, let R_i be the (minimum) Euclidean distance in the η-φ plane between (η_i, φ_i) and the line segment that runs from (η_1, φ_1) to (η_2, φ_2). Dipolarity is defined as the p_T-weighted sum

$$ D \equiv \frac{1}{R_{12}^2} \sum_{i \in J} \frac{p_{T_i}}{p_{T_J}} R_i^2 , \qquad (1) $$

where R_12^2 ≡ (η_1 − η_2)^2 + (φ_1 − φ_2)^2 and p_TJ is the transverse momentum of J. Dipolarity is an infrared and collinear (IRC) safe observable so long as the algorithm used to identify J, j_1 and j_2 is IRC safe. Notice that dipolarity, which is essentially a two-subjet observable, requires the centers of j_1 and j_2 as input, although it does not require that the constituents of J be partitioned between j_1 and j_2. The centers of j_1 and j_2 can be determined by whatever procedure is convenient for the particular application. For example, one could choose the centers of j_1 and j_2 so as to minimize the sum in (1).
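The definition above can be illustrated with a minimal Python sketch. It assumes that cells are given as (p_T, η, φ) tuples, that the sum is normalized by the squared separation of the two subjet centers, and that the jet p_T in the normalization is approximated by the scalar sum of the cell p_T; the function names are hypothetical and are not taken from any published implementation.

```python
import math

def delta_phi(a, b):
    """Signed azimuthal difference a - b, wrapped into (-pi, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def dist2_to_segment(eta, phi, c1, c2):
    """Squared minimum eta-phi distance from (eta, phi) to the segment c1-c2."""
    # Coordinates relative to c1, so phi wrapping is applied consistently.
    px, py = eta - c1[0], delta_phi(phi, c1[1])
    sx, sy = c2[0] - c1[0], delta_phi(c2[1], c1[1])
    seg2 = sx * sx + sy * sy
    if seg2 == 0.0:
        raise ValueError("subjet centers coincide")
    # Projection parameter, clamped so the closest point stays on the segment.
    t = max(0.0, min(1.0, (px * sx + py * sy) / seg2))
    dx, dy = px - t * sx, py - t * sy
    return dx * dx + dy * dy

def dipolarity(cells, c1, c2):
    """cells: iterable of (pT, eta, phi); c1, c2: (eta, phi) subjet centers."""
    cells = list(cells)
    seg2 = (c2[0] - c1[0]) ** 2 + delta_phi(c2[1], c1[1]) ** 2
    if seg2 == 0.0:
        raise ValueError("subjet centers coincide")
    # Assumption: scalar pT sum of the cells stands in for the jet pT.
    pt_jet = sum(pt for pt, _, _ in cells)
    weighted = sum(pt * dist2_to_segment(eta, phi, c1, c2)
                   for pt, eta, phi in cells)
    return weighted / (pt_jet * seg2)
```

A cell lying on the segment between the two centers contributes nothing, while a cell displaced sideways or beyond an endpoint contributes in proportion to its p_T and squared distance.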
Dipolarity will be small when most of the radiation within the jet J occurs in the region between the two subjets j_1 and j_2 and will be large whenever a substantial amount of radiation is found elsewhere. As a consequence of the weighting with respect to R_i^2 in (1), D receives large contributions from semisoft radiation away from the cores of j_1 and j_2. It is this semisoft radiation away from the cores of j_1 and j_2 that is expected to reflect the color configuration of J. The weighting in (1) does not know about the exact radiation pattern of a color singlet; nevertheless, we expect that color singlets that decay into two jets will have small D, while radiation emitted by colored objects will tend to yield larger values of D.
This expectation can be fleshed out more explicitly by considering the emission pattern of a third parton with energy ω from a pair of partons in a particular color configuration, see e.g. [25]. In the eikonal approximation (ω → 0) one finds the radiation function for a color singlet to be

$$ W_{\text{singlet}} \propto \frac{p_1 \cdot p_2}{(p_1 \cdot k)(p_2 \cdot k)} , \qquad (2) $$

while for two partons color-connected to the beam we have instead

$$ W_{\text{beam}} \propto \frac{p_1 \cdot p_L}{(p_1 \cdot k)(p_L \cdot k)} + \frac{p_2 \cdot p_R}{(p_2 \cdot k)(p_R \cdot k)} . \qquad (3) $$

Here p_1 and p_2 are the momenta of the two partons, k is the momentum of the emitted parton, and p_L and p_R are the momenta of the left and right beams, to which the left-going and right-going partons are respectively connected. The resulting radiation patterns are depicted in FIG. 2. One sees explicitly that the color singlet has its radiation clustered in the region between the two partons, whereas for partons color-connected to the beam, a substantial amount of radiation is emitted towards the beam. Using the expressions in (2) and (3) one can estimate the dipolarity of each color configuration. While the same qualitative behavior is found from Monte Carlo calculations, expressions (2) and (3) do not yield dipolarity distributions in quantitative agreement with the Monte Carlo. Given the crudeness of the approximations that went into these expressions, this discrepancy is not surprising; a more accurate estimate of D for various color configurations could be obtained by using antenna patterns as in [28].

FIG. 2. Top: Eikonal radiation pattern dp_T/dηdφ for a color singlet with ΔR = 0.9, typical for a W± originating from a top with p_T ∼ 300 GeV. Bottom: As above with the partons instead color-connected to the beam (left/right-going parton connected to the left/right beam). Contours differ by powers of e. For the color singlet the radiation is mostly found in the region between the two subjets. For the background-like color configuration, the radiation is pulled towards the beam. Note that an absolute comparison cannot be made between the figures, since the collinear singularities in (2) and (3) are not regulated.

Dipolarity can be used within the context of top tagging to reduce QCD backgrounds. Consider a collection of fat QCD jets originating from parton branchings with identical kinematics but different color configurations as illustrated in FIG. 3. If one of the QCD jets fakes the kinematics of a top quark decay, then each of the different color configurations fakes the kinematics equally well. The dipolarities of the subjets, however, will be broadly distributed in accord with their different color configurations. For instance, gluon jets are known to give the largest fake rates for top jets as a consequence of their larger Casimirs, which more often result in wide angle branchings with significant mass drops. FIG. 3 illustrates how gluon jets, with their distinct color configurations, radiate differently from top jets. All of this suggests that the dipolarity of the W± in a hadronic top decay is well suited as a discriminant in top tagging algorithms.
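The statement that a color singlet radiates preferentially into the region between its two partons can be checked numerically with a short sketch. The code below assumes the standard eikonal dipole (antenna) form p_1·p_2 / ((p_1·k)(p_2·k)) for the singlet radiation function, with all partons massless and specified by (p_T, η, φ); the function names are hypothetical. It compares emission at the midpoint between the partons with emission at an equally close point on the far side of one parton.

```python
import math

def dot_massless(a, b):
    """Minkowski product of two massless four-vectors given as (pT, eta, phi)."""
    return a[0] * b[0] * (math.cosh(a[1] - b[1]) - math.cos(a[2] - b[2]))

def singlet_antenna(k, p1, p2):
    """Eikonal dipole pattern for soft emission k from the color dipole (p1, p2).

    Assumption: standard antenna form, up to an overall normalization.
    """
    return dot_massless(p1, p2) / (dot_massless(p1, k) * dot_massless(p2, k))

# Two partons separated by Delta R = 0.9 in phi, as for a W at top pT ~ 300 GeV.
p1 = (1.0, 0.0, -0.45)
p2 = (1.0, 0.0, 0.45)
k_between = (1.0, 0.0, 0.0)   # midpoint, 0.45 from each parton
k_outside = (1.0, 0.0, 0.90)  # 0.45 from p2, on the side away from p1

ratio = singlet_antenna(k_between, p1, p2) / singlet_antenna(k_outside, p1, p2)
print(f"between/outside emission ratio: {ratio:.2f}")
```

Because both points are equally close to the nearest parton, the ratio isolates the coherence effect: the second parton suppresses emission on the outside relative to the inside.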

HEPTopTagger
To test whether dipolarity makes an effective discriminant, cuts on dipolarity are incorporated into the HEPTopTagger [1,2], which is designed to work effectively at intermediate boost, with 200 GeV ≲ p_T ≲ 800 GeV. The high efficiency of the HEPTopTagger at these p_T makes it a good candidate for such a modification because dipolarity cuts are expected to be most effective at intermediate p_T. This is because at lower p_T contamination from pile-up and the underlying event becomes more of a concern as the top jets become fatter and fatter, while at higher p_T the finite resolution of the detector makes it difficult to get an accurate handle on radiation patterns. Furthermore, the multibody filtering implemented by the HEPTopTagger results in accurate reconstruction of the W±. The HEPTopTagger algorithm is defined as follows:
1. Using the Cambridge/Aachen algorithm, cluster the event into fat R = 1.5 jets.
2. Break each fat jet j into hard subjets using the following mass-drop criterion: undo the last stage of clustering to yield two subjets j_1 and j_2 (with m_j1 > m_j2), keeping both j_1 and j_2 if m_j1 < 0.8 m_j and otherwise dropping j_2; repeat this procedure recursively, stopping when the m_ji drop below 30 GeV.
3. Consider in turn all possible triplets of hard subjets. First, filter¹ each triplet with a resolution R_filter = min(0.3, ΔR_ij/2). Next, using the five hardest constituent subjets of the filtered triplet, calculate the jet mass m_filt. Finally, choose the triplet whose m_filt lies closest to m_t.
4. Recluster the five filtered constituents chosen in step 3 into exactly three subjets j_1, j_2, and j_3 ordered in descending p_T. Accept the fat jet as a top candidate if it passes any of the following three pairs of mass cuts:

$$
\begin{aligned}
&\text{i)}\;\; 0.2 < \arctan\frac{m_{13}}{m_{12}} < 1.3 , &&\text{i')}\;\; R_{\min} < \frac{m_{23}}{m_{123}} < R_{\max} , \\
&\text{ii)}\;\; R_{\min}^2\left(1+\frac{m_{13}^2}{m_{12}^2}\right) < 1-\frac{m_{23}^2}{m_{123}^2} < R_{\max}^2\left(1+\frac{m_{13}^2}{m_{12}^2}\right) , &&\text{ii')}\;\; \frac{m_{23}}{m_{123}} > 0.35 , \\
&\text{iii)}\;\; R_{\min}^2\left(1+\frac{m_{12}^2}{m_{13}^2}\right) < 1-\frac{m_{23}^2}{m_{123}^2} < R_{\max}^2\left(1+\frac{m_{12}^2}{m_{13}^2}\right) , &&\text{iii')}\;\; \frac{m_{23}}{m_{123}} > 0.35 .
\end{aligned}
$$

Here m_ij is the invariant mass of j_i + j_j, m_123 is the invariant mass of all three subjets, R_min = 85% × m_W/m_t and R_max = 115% × m_W/m_t.
5. Finally, require that the total p_T of the three subjets defined in step 4 be greater than 200 GeV.²
Dipolarity cuts are introduced into the HEPTopTagger by modifying step 4 above. For a top candidate that has passed one of the three pairs of mass cuts we calculate the dipolarity of the W± as identified by the mass cut: e.g. for a top candidate that satisfies ii) and ii') the W± is identified as j_1 + j_2. If more than one of the pairs of mass conditions is satisfied in step 4, we choose the pair with the smaller dipolarity. We find that this procedure performs better than calculating the dipolarity of the pair of subjets that reconstructs m_W most accurately.
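The modified step 4 can be sketched in Python as follows. The explicit form of the three pairs of mass cuts used here (the default HEPTopTagger cuts with thresholds 0.2, 1.3 and 0.35) and the assignment of pair i) to j_2 + j_3 and pair iii) to j_1 + j_3 are assumptions of this sketch, as are the mass values and the function names; only the identification of pair ii) with j_1 + j_2 and the choice of the smaller dipolarity are taken directly from the text.

```python
import math

M_W, M_T = 80.4, 172.3  # GeV; illustrative values, not taken from the text

def w_pairs_passing(m12, m13, m23, m123, r_min, r_max):
    """Subjet pairs identified as the W by each pair of mass cuts that passes.

    Assumption: default HEPTopTagger mass cuts; pair (a, b) means j_a + j_b.
    """
    passing = []
    r23 = m23 / m123
    lhs = 1.0 - r23 ** 2
    # i) and i'): W reconstructed by j_2 + j_3
    if 0.2 < math.atan(m13 / m12) < 1.3 and r_min < r23 < r_max:
        passing.append((2, 3))
    # ii) and ii'): W reconstructed by j_1 + j_2
    f = 1.0 + (m13 / m12) ** 2
    if r_min ** 2 * f < lhs < r_max ** 2 * f and r23 > 0.35:
        passing.append((1, 2))
    # iii) and iii'): W reconstructed by j_1 + j_3
    g = 1.0 + (m12 / m13) ** 2
    if r_min ** 2 * g < lhs < r_max ** 2 * g and r23 > 0.35:
        passing.append((1, 3))
    return passing

def w_dipolarity(passing, dipolarity_of_pair):
    """Smallest dipolarity among the passing pairs, or None if no pair passes.

    dipolarity_of_pair: dict mapping a pair such as (1, 2) to its dipolarity.
    """
    if not passing:
        return None
    return min(dipolarity_of_pair[pair] for pair in passing)
```

A cut D ≤ D_max would then be applied to the value returned by `w_dipolarity`.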
In addition to introducing dipolarity cuts, we also make cuts on the filtered mass of the reconstructed top, m_filt, which is not done in the original HEPTopTagger, where the cuts have been chosen so as to avoid any explicit mass scales. We introduce cuts on m_filt for two reasons. The first is to improve background rejection. The second and main reason is that we are interested in determining whether dipolarity cuts are essentially orthogonal to cuts on kinematic observables. To do this we must ensure that the HEPTopTagger is using a full complement of kinematic cuts, including cuts on m_filt. In a particular application, cuts on m_filt may be undesirable. In that case, the inclusion of dipolarity cuts would result in a larger improvement of background rejection.
¹ That is, recluster the constituents of the triplet using the Cambridge/Aachen algorithm with jet radius R_filter. Filtering was introduced in [4].
² The HEPTopTagger does not make use of b-tagging, which is a natural extension to the algorithm that can result in significant improvements in background rejection. Since dipolarity cuts are orthogonal to b-tagging, we do not explore the use of b-tagging in this paper.
Note that the j_i selected in step 4 contain only the hard substructure of the fat jet. Some amount of soft radiation has been thrown out by filtering and the mass drop criterion. To effectively gauge whether the radiation pattern of the reconstructed W± is consistent with the expected dipole radiation pattern, it is important to include some of the discarded soft radiation. We find that the criterion used to select the radiation included in calculating the dipolarity of the W± has a significant impact on the ultimate utility of dipolarity as a discriminant. In particular, different criteria will lead to dipolarity distributions that are more or less correlated with the kinematic observables considered by the HEPTopTagger. Applying dipolarity in another context would likely require this criterion to be carefully reworked so as to maximize performance. In the present case we find the following criterion, which aims to capture as much of the radiation emitted by the W± as possible while minimizing possible contamination, to be most effective. In addition to the hard radiation from the two W± subjets, we include all soft radiation contained within the pair of cones centered around the two hard W± subjets, fixing the radius of the cones to be ΔR/√2, where ΔR is defined between the two hard W± subjets. Furthermore we exclude any radiation contained within the smallest cone that encloses the hard b subjet. Note that angular ordering implies that the majority of the radiation emitted by the W± is within the pair of cones of radius ΔR. We choose our cones to be somewhat smaller to minimize contamination from underlying event/pile-up as well as the b subjet. See FIG. 1 to see this selection criterion at work on a sample top jet. The orange cells in the figure correspond to radiation that has been discarded by mass-drop filtering but that is included in calculating the dipolarity of the candidate W±.
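The selection criterion described above can be sketched as a simple geometric filter. The sketch assumes cells given as (p_T, η, φ) tuples and takes the radius of the smallest cone enclosing the hard b subjet as an input; it also applies the same test to every cell, without treating the hard W± constituents separately, which is a simplification. All names are hypothetical.

```python
import math

def delta_r(a, b):
    """Distance in the eta-phi plane between two (eta, phi) points."""
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)

def select_w_cells(cells, w1, w2, b_axis, b_radius):
    """Cells used for the W dipolarity.

    cells:    list of (pT, eta, phi)
    w1, w2:   (eta, phi) axes of the two hard W subjets
    b_axis:   (eta, phi) axis of the hard b subjet
    b_radius: radius of the smallest cone around b_axis enclosing the b subjet
    """
    cone = delta_r(w1, w2) / math.sqrt(2.0)  # cone radius Delta R / sqrt(2)
    kept = []
    for pt, eta, phi in cells:
        pos = (eta, phi)
        in_w_cones = delta_r(pos, w1) <= cone or delta_r(pos, w2) <= cone
        in_b_cone = delta_r(pos, b_axis) <= b_radius
        if in_w_cones and not in_b_cone:
            kept.append((pt, eta, phi))
    return kept
```

The returned cells, together with the two W± subjet axes, are the natural inputs to a dipolarity calculation.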

Testing Dipolarity
In order to make a meaningful comparison between the performance of the HEPTopTagger with and without dipolarity cuts, it is not enough to leave the kinematic cuts employed by the HEPTopTagger fixed at their default values. Instead it is important to optimize the cuts to yield the largest background rejection at each given signal efficiency. This optimization is performed by a custom Monte Carlo code that finely samples the space of cuts. Specifically, a scan is performed at discrete values of R_min in the range 70% × m_W/m_t ≤ R_min ≤ 98% × m_W/m_t with a step size of 1% × m_W/m_t. For each value of R_min, R_max is chosen to be R_max = 2.0 × m_W/m_t − R_min. In addition we simultaneously optimize over cuts m_t^min ≤ m_filt ≤ m_t^max, D ≤ D_max, and, in step 4 of the HEPTopTagger, l_cut ≤ arctan(m_13/m_12) and b_cut ≤ m_23/m_123. The remaining cuts in step 4 of the HEPTopTagger are left at their default values, since these are less important for background rejection. Additionally, the two parameters that define the mass drop criterion remain fixed at their default values. The hard substructure cutoff of 30 GeV is sensitive to detector effects, which can only be crudely mocked up without a full detector simulation. See TABLE I for a sample set of cuts.
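As a rough illustration of the scan described above, the following sketch builds the grid of (R_min, R_max) windows and picks, from a list of evaluated cut sets, the lowest mistag rate at or above a target signal efficiency. The mass values and function names are assumptions made for illustration; the actual optimization code is not described in more detail in the text.

```python
M_W, M_T = 80.4, 172.3  # GeV; illustrative values, not taken from the text

def r_window_grid(m_w=M_W, m_t=M_T):
    """(R_min, R_max) windows: R_min from 70% to 98% of m_W/m_t in 1% steps."""
    ratio = m_w / m_t
    grid = []
    for percent in range(70, 99):  # 70, 71, ..., 98
        r_min = percent / 100.0 * ratio
        r_max = 2.0 * ratio - r_min  # window symmetric about m_W/m_t
        grid.append((r_min, r_max))
    return grid

def best_mistag(points, target_eff):
    """Lowest mistag rate among cut sets with signal efficiency >= target_eff.

    points: list of (signal_efficiency, mistag_rate), one entry per cut set.
    Returns None if no cut set reaches the target efficiency.
    """
    allowed = [mistag for eff, mistag in points if eff >= target_eff]
    return min(allowed) if allowed else None
```

In a full scan each grid point would be combined with the other optimized cuts (on m_filt, D, l_cut and b_cut) before the signal efficiency and mistag rate are evaluated on the event samples.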
We use three different event samples for evaluating the performance of the modified HEPTopTagger. These event samples (with center of mass energy of 7 TeV) belong to a set of benchmark event samples that have been made publicly available by participants of BOOST 2010 [29]. The first event sample is generated by HERWIG 6.510 [30] with the underlying event simulated by JIMMY [31], which has been configured with a tune used by ATLAS. The second is generated by PYTHIA 6.4 [32] with Q^2-ordering and the 'DW' tune for the underlying event. The third is generated by PYTHIA 6.4 [32] with p_T-ordering and the 'Perugia' tune for the underlying event. See [33] for more details. For signal jets we use the hardest jet in each event of a Standard Model hadronic tt̄ sample, excluding jets with |η| > 2.5. For background jets we use the hardest jet in each event of a Standard Model dijet sample, again excluding jets with |η| > 2.5. For jet clustering we use the Cambridge-Aachen (CA) algorithm [34,35] as implemented by FastJet 2.4.2 [36]. In order to simulate the finite resolution of the ATLAS or CMS calorimeters, particles in each event are clustered into 0.1 × 0.1 cells in η-φ space and then combined into massless four-vector pseudoparticles that are fed into FastJet. We have also checked that imposing a low energy cutoff of 1 GeV on each cell results in only a mild degradation of background rejection.
[...] performance that results from including dipolarity cuts in the HEPTopTagger as a function of p_T at a fixed signal efficiency of S = 20%. Finally, FIG. 6 shows how, for p_T ∈ [400 GeV, 500 GeV], dipolarity cuts improve background rejection at signal efficiencies S ≲ 35%. At a fixed signal efficiency of S = 20%, dipolarity cuts lead to a sizable decrease in the mistag rate for 350 GeV ≲ p_T ≲ 800 GeV, with the largest decrease at intermediate p_T.
For the HERWIG event samples the mistag rate decreases by as much as ∼ 50%, whereas for the PYTHIA samples the mistag rate decreases by a more modest amount, ∼ 30%. Differences in the underlying event (UE) model do not explain this disagreement; for instance, repeating the analysis with the UE turned off results in background mistag rates that are only somewhat smaller, with both HERWIG and PYTHIA dropping by similar amounts. This suggests that the difference between the HERWIG and PYTHIA mistag rates most likely arises from the parton shower.
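The calorimeter mock-up used for the event samples in this section can be sketched as follows. The text specifies only the 0.1 × 0.1 cell size, the massless pseudoparticles, and the optional 1 GeV cell cutoff; summing particle energies per cell, placing each pseudoparticle at the cell center, and letting the last φ cell be truncated at 2π are assumptions of this sketch, and the function name is hypothetical.

```python
import math
from collections import defaultdict

def to_cells(particles, cell=0.1, e_min=0.0):
    """Cluster particles into eta-phi cells and return massless pseudoparticles.

    particles: iterable of (E, eta, phi)
    Returns a list of (pT, eta, phi), one per cell with summed energy >= e_min.
    """
    energy = defaultdict(float)
    for e, eta, phi in particles:
        i_eta = math.floor(eta / cell)
        i_phi = math.floor((phi % (2.0 * math.pi)) / cell)
        energy[(i_eta, i_phi)] += e  # assumption: energies add within a cell
    cells = []
    for (i_eta, i_phi), e in sorted(energy.items()):
        if e < e_min:
            continue  # optional low-energy cutoff on each cell
        eta_c = (i_eta + 0.5) * cell
        phi_c = (i_phi + 0.5) * cell
        # Massless four-vector pointing at the cell center: E = pT * cosh(eta).
        cells.append((e / math.cosh(eta_c), eta_c, phi_c))
    return cells
```

The resulting pseudoparticles would then be passed to the jet clustering algorithm in place of the individual particles.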

Discussion
This article has introduced a new jet substructure observable that is useful in discriminating among different color configurations in jets that have large mass drops. This discrimination is of interest, since such jets often arise from decays of boosted heavy particles. Incorporating this discriminant into a top tagging algorithm results in QCD background mistag rates that are lower by as much as 50%; the exact mistag rate, however, shows considerable sensitivity to the details of the parton shower. Specifically, HERWIG event samples result in a larger improvement in background rejection than is found for PYTHIA. We suspect that HERWIG, which uses angular ordering, does a better job of simulating the effects of color coherence than PYTHIA, which uses Q^2- or p_T-ordering in combination with angular vetoes. This could explain why the dipolarity of the W± is a more discriminating observable in the case of the HERWIG event samples. With an understanding of the origin of this difference, comparisons to measurements at the LHC could help improve the description of QCD radiation. It would be interesting to understand how this difference arises from the details of the parton shower; doing so, however, lies outside the scope of this paper.
Validating how well color flow effects as modeled by Monte Carlo event generators match what is observed in collider experiments is only beginning to be studied actively. Understanding color flow in detail is a difficult problem; for example, QCD predictions for radiation patterns can be affected by non-global logarithms, see e.g. [40]. Therefore validating theoretical predictions against data will be critical in reducing the theoretical uncertainty associated with how dipolarity and other color flow observables are modeled by Monte Carlo calculations. A few color coherence studies performed at the Tevatron showed spatial correlations between the third and second hardest jets in pp̄ collisions, and HERWIG was shown to provide a better description of the data than PYTHIA [37]. More recently, the color of the W± in tt̄ events was studied, and agreement between theory predictions for jet pull and data was shown [38].
Jet dipolarity should be useful in a broader set of applications to both Standard Model and beyond the Standard Model physics. Possible directions for future research include: (i) applications of dipolarity to a collider search for heavy color singlet resonances that decay to tt̄; (ii) applications to Standard Model W±/Z^0 physics; (iii) applications to heavy color singlet resonances that decay to W+W− or Z^0 Z^0; (iv) applications to cascade decays of supersymmetric particles; (v) inclusion of dipolarity into other top-tagging algorithms; (vi) applications to the decay of new particles into novel color configurations such as in the decay of the LSP in supersymmetric models with baryonic R-parity violation; and (vii) modifying D to more closely correspond to the exact radiation pattern of a color singlet. Each of these directions could make an interesting laboratory for further development of jet substructure techniques.