Boosted objects: a probe of beyond the Standard Model physics

We present the report of the hadronic working group of the BOOST2010 workshop held at the University of Oxford in June 2010. The first part contains a review of the potential of hadronic decays of highly boosted particles as an aid for discovery at the LHC and a discussion of the status of tools developed to meet the challenge of reconstructing and isolating these topologies. In the second part, we present new results comparing the performance of jet grooming techniques and top tagging algorithms on a common set of benchmark channels. We also study the sensitivity of jet substructure observables to the uncertainties in Monte Carlo predictions.

Introduction
The LHC has started to explore the multi-TeV regime. The production of presently unknown particles is perhaps the most exciting prospect for the general purpose experiments ATLAS [1] and CMS [2]. In both experiments, searches for Physics Beyond the Standard Model form a key element of the rich physics programme.
At the LHC, many of the particles that we considered to be heavy at previous accelerators will be frequently produced with a (transverse) momentum greatly exceeding their rest mass. Good examples are the electro-weak gauge bosons W ± and Z 0 , the top quark, the Higgs boson or bosons and possibly other new particles in the same mass range. The abundant presence of heavy SM particles will yield promising signatures for searches for physics Beyond the Standard Model (BSM physics). When these boosted objects decay they form a highly collimated topology in the detector. Algorithms and techniques developed for the reconstruction and isolation of objects produced at rest are often inadequate for their boosted counterparts. New tools must be developed to fully benefit from the potential of these states.
In recent years, a fruitful dialogue has developed between theorists and the LHC and Tevatron experimentalists. A number of workshops have fuelled collaboration in the investigation of new signatures and the development of experimental techniques. The series started with the BOOST09 [3] workshop at the SLAC National Accelerator Laboratory and continued with the Jet Substructure workshop at the University of Washington [4] in January 2010. BOOST2010 [5] was held at the University of Oxford from 22 to 25 June 2010.
At the BOOST2010 workshop, two working groups were set up to concentrate on the leptonic and hadronic decays of boosted objects. Mixed decays to quarks and leptons (e.g. top decay to W± b followed by W± → ℓ± ν_ℓ) were also covered by the hadronic working group. Both working groups met in several parallel sessions during the workshop and organised follow-up meetings in the subsequent months. In this paper we present the report of the hadronic working group.
Hadronic boosted objects have received considerable attention recently and the available literature is steadily increasing. We start this report with three brief sections that provide a review of the most important developments.
Many groups have studied the phenomenology of boosted hadronic topologies, discovering novel ways of performing SM measurements and BSM searches. In section 2 we present a review of the results published to date.
The reconstruction of hadronic decays of boosted W, Z bosons and top quarks (and new particles with similar mass) is particularly challenging. The partons formed in the decay are typically too close to be resolved by a jet algorithm 1 . In this case only an analysis of the substructure of the fat-jet can reveal its heavy-particle origin. We give an overview, in section 3, of the increasingly sophisticated tools developed for this purpose.
Section 4 closes the review with a brief account of the experimental status of jet substructure in past experiments.
In the sections thereafter, we present new results obtained in studies initiated during the workshop. In section 5, we present the Monte Carlo samples generated in the hadronic working group. We make these samples available to serve as a benchmark test for the performance of new techniques.
In section 6, we return to the jet grooming techniques introduced in section 3. We present an estimate of their performance on the benchmark samples.
Jet substructure may be subject to considerable uncertainties in the predictions of popular Monte Carlo models. The sensitivities of the most important observables to variations in the parton shower model, the underlying event and detector effects are investigated in section 7.
Finally, we compare the performance of several top-tagging algorithms in section 8.
The review sections, benchmark samples and results are intended to foster the exciting new developments that boost our discovery potential. We hope that this report will be an incentive for further work and in particular for studies of the substructure of highly energetic jets in the earliest LHC data.

1 To be quantitative, consider the following rule of thumb for a two-body decay: to resolve the two partons of an X → q q̄ decay, a radius (or more generally a jet size) of R < 2 m_X / p_T must be chosen. For p_T ≫ m_X, the value of R must be chosen exceedingly small. For m_X = 80 GeV, the limiting value of R is 0.4 for a transverse momentum of 400 GeV. To set the scale: 95% of the energy in a 400 GeV jet is contained in a jet of size 0.4 [6,7].
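The rule of thumb quoted in footnote 1 is easy to evaluate numerically. The following minimal sketch (the function name is hypothetical and introduced only for illustration) computes the limiting jet size R = 2 m_X / p_T for a W-like mass at a few transverse momenta.

```python
def limiting_radius(mass_gev, pt_gev):
    """Jet size below which the two partons of X -> q qbar are resolved,
    using the rule of thumb R < 2 m_X / p_T."""
    if pt_gev <= 0.0:
        raise ValueError("pT must be positive")
    return 2.0 * mass_gev / pt_gev


# Illustrative scan for m_X = 80 GeV (roughly the W boson mass).
for pt in (200.0, 400.0, 800.0):
    print(f"pT = {pt:6.1f} GeV  ->  R < {limiting_radius(80.0, pt):.2f}")
```

The limiting radius halves each time the transverse momentum doubles, which is why standard jet sizes stop resolving the decay products at high boost.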

Models and signatures
Many groups have studied the phenomenology of boosted hadronic decays, considering two major sources:
- In BSM physics scenarios, where a heavy resonance decays to particles with intermediate masses that then decay to light quarks, i.e. X_heavy → Y_interm. → jets, where X_heavy is an unknown heavy resonance and Y_interm. may be a SM particle with intermediate mass (W, Z, top) or a BSM particle. In scenarios where m_X ≫ m_Y the intermediate particles are naturally boosted (i.e. p_T ≫ m).
- Even if only a relatively small fraction of signal events are produced with large transverse momentum, focusing on those events can be a superior way of disentangling the signal from the backgrounds. Phenomenological studies [8,9,10,11,12,13] have indeed shown that, because of the kinematic features, event reconstruction efficiencies, b-tagging efficiencies [9,14] and the jet energy resolution can be improved and that combinatorial problems in the identification of the decay products of Y_interm. are reduced.
Without going into the details of each study, in this section we briefly review which specific scenarios have been addressed in phenomenological studies. We establish the extent to which the use of the techniques of section 3 has been shown to increase the discovery potential of the LHC experiments. The four subsections correspond to four categories of boosted objects: boosted Higgs bosons, boosted top quarks, electroweak gauge bosons and finally boosted BSM particles.

Boosted Higgs bosons
The main purpose of running the LHC is to understand the nature of electroweak symmetry breaking, e.g. by confirming or modifying the minimal mechanism of the SM. The detection of a light SM Higgs boson (m H < 130 GeV) is particularly difficult and until recently relied mainly on two channels: The dominant gluon-initiated production mechanism followed by Higgs decay to photons, pp → H → γγ, or production through vector boson fusion followed by Higgs decay to τ leptons, pp → Hjj → jjτ − τ + .
Butterworth, Davison, Rubin and Salam [8] (BDRS) studied the case of a light Higgs boson (m_H ∼ 120 GeV) produced in association with an electroweak gauge boson. The leptonic decay of the associated vector boson provides an efficient trigger for these events. Cuts on the leptonic decay products ensure that the electroweak gauge boson and the recoiling Higgs boson are produced with a large transverse momentum. The Higgs boson decays into a collimated bb̄ pair with a large branching fraction. The analysis employs the Cambridge/Aachen (C/A) jet algorithm [15,16] to investigate the substructure of the single, merged jet produced by the two b quarks. This study demonstrated that VH production can be a discovery channel, with a significance S/√B ≃ 4.5, assuming 30 fb^-1 of data and √s = 14 TeV. The ATLAS collaboration was able to reproduce this study in a full-detector simulation with only slightly smaller statistical significance [9].
The ZH channel was used by Soper and Spannowsky [12] to show that the combined use of the jet grooming techniques discussed in Section 3 can improve the confidence level of a Higgs detection.
One of the major discovery channels for a light SM Higgs boson in ATLAS and CMS reports [17,18,19] was the ttH production channel with subsequent Higgs decay to bottom quarks. Further studies revealed a very poor signal-to-background ratio of 1/9 [20,21], making the channel very sensitive to systematic uncertainties which might prevent it from reaching a 5σ significance for any luminosity. However, at high transverse momentum, after reconstructing the boosted, hadronically decaying top quark as well as the Higgs boson and requiring 3 b-tags, Plehn, Salam and Spannowsky [11] find a signal-to-background ratio of roughly 1/2, while keeping S/ √ B at a similar value to that in Ref. [20].
For some scenarios with non-SM decays, the Higgs boson may evade the constraints from the LEP experiments. Two recent studies of a light Higgs boson (m H ∼ 80 − 120 GeV) in the vector boson associated (V H) and ttH channels have used subjet techniques to "un-bury" its decay, via two pseudo-scalars, to four gluons from the large QCD backgrounds [22,23].
The reconstruction of a boosted Higgs boson from H → bb in BSM models has been discussed in Ref. [24] and [46], with a specific application to cascade decays in the Minimal Supersymmetric Standard Model (MSSM) [25]. For the lightest CP-even Higgs boson, a high statistical significance has been found with as little as 10 fb −1 at √ s = 14 TeV, provided its production from neutralino or chargino decays is not too rare.

Boosted top quarks
The reconstruction of boosted top quarks was one of the first applications of subjet techniques. Due to the three-pronged decay of a hadronically decaying top quark and the known W boson mass, the radiation pattern of its decay products provides strong discriminating power against QCD-induced light-parton jets. Many different top-tagging approaches have been proposed, mainly focusing on scenarios where the boosted tops are decay products of much heavier resonances. These top quarks are naturally boosted (p_T,top > 500 GeV).
The tools and algorithms proposed for top-tagging are discussed in more detail in Section 3 and the performance of these algorithms is discussed in Section 8; here we only review briefly some uses in the literature.
Early ATLAS studies used the so-called "YSplitter" tool [26], proposed in [27] and further detailed in [28], for the identification of hadronically decaying top quarks. These studies established that hadronic decays of boosted top quarks can be identified efficiently amongst the background from QCD jet production. In a later study [29], a likelihood analysis was used to improve the tagging efficiency further.
The so-called "Hopkins" top-tagger of Kaplan, Rehermann, Schwartz and Tweedie [30] has been used in phenomenological studies [31] and full detector simulation studies have been performed by CMS with a slightly modified implementation [32]. In a further CMS study [33] this tagger was applied to the tt̄ all-hadronic channel, showing a very high background rejection while keeping a reasonable signal efficiency for high-p_T objects. Using this tagger, the C/A algorithm outperforms both the k_T and anti-k_T algorithms in reconstructing the top [32].
Taggers to reconstruct hadronic tops with moderate transverse momentum (p_T ≳ 200 GeV), proposed by Plehn, Salam, Spannowsky and Takeuchi [11,34], were used in Ref. [34] to reconstruct the light top squark of the MSSM in a final state with only jets and missing transverse energy.
Top quarks are predominantly produced in pairs at hadron colliders. The lepton + jets and fully leptonic final states are easier to isolate from the QCD background at hadron colliders than the fully hadronic final state. Non-boosted top quarks decaying to bℓν_ℓ provide an isolated charged lepton suitable for triggering, large missing transverse energy and a b quark: three good handles for reducing the QCD background.
For boosted top quarks decaying leptonically, however, QCD jet production may again be a dangerous background. The rejection achieved by flavour tagging is severely degraded for very high p T jets [35] and the E miss T resolution may be insufficient to reveal the presence of the relatively low p T neutrino. Finally, the lepton from W ± boson decay and the b-jet often merge and traditional lepton isolation criteria result in a significant loss of signal efficiency.
In Ref. [36], several alternatives to lepton isolation were proposed with better performance in lepton + jets events. Rehermann and Tweedie [37] propose a "mini-isolation" cut at the tracker level for the lepton, which results in a very high background rejection rate.
A full simulation study [38] investigated the sensitivity of the ATLAS experiment to resonant production of tt pairs. The lepton + jets final state was selected using a combination of the leptonic observables developed in Ref. [36] and [37] and a hadronic top-tagger based on an evolution of the work of Ref. [29] and Ref. [36]. This study was specifically aimed at early ATLAS data (200 pb −1 at √ s = 10 TeV or 1 fb −1 at √ s = 7 TeV) and the algorithm was adapted to perform well for tops with only moderate boost. Its performance was found to compare favourably with that of a more traditional approach for a resonance mass as low as 1 TeV, showing that boosted tops are an interesting probe for new physics even in the earliest stages of the experiment.
The CMS collaboration also investigated top pair production in the muon+jets decay channel (tt̄ → µνb b̄qq′) [39,40]. In both CMS analyses, jets were reconstructed with the SISCone algorithm [41] and no top-tagging algorithm was applied to the hadronically decaying top quark. Instead, either only the lepton isolation criterion was relaxed [40] or both the lepton isolation and the number of reconstructed jets criteria were relaxed [39]. To estimate the background from QCD multi-jet events, a data-driven method was developed. The method described in [40] focuses on good mass resolution and is suitable for searches for massive tt̄ resonances at the lower end of the mass spectrum (around 1 TeV) in the very early stages of data-taking. The analysis of Ref. [39] takes the boosted topology of the decay products into account and achieves significantly better cross section limits for massive resonances (2-3 TeV).

Boosted electro-weak gauge bosons
Longitudinal vector boson scattering can help to unravel the nature of electroweak symmetry breaking, in particular, if no Higgs is found. In the vector boson fusion process, the two vector bosons are produced in association with two forward jets and tend to be central and have high transverse momentum in the kinematic region relevant to boosted studies. Allowing one of the two vector bosons to decay to quarks enhances the branching ratio and subjet techniques can be used to reconstruct this vector boson from the collimated decay products. This was one of the first applications of jet substructure by Butterworth, Cox and Forshaw [27], and was further treated in [42], including the use of polarization.
Building on Ref. [27], the ATLAS simulation study described in Ref. [43], investigated a chiral Lagrangian model, in which a scalar or a vector resonance decays to two vector bosons that decay semi-leptonically. The hadronically decaying vector boson is found by investigating the jet mass of k T jets [44,45] with a jet-size parameter of R = 0.6. By also considering the specific phenomenology of vector boson scattering, i.e. the presence of high-|η| jets and the absence of other central jets due to the lack of colour flow between the initial protons, it is shown that a semi-leptonically decaying 800 GeV W Z resonance with a production cross section of 0.65 fb can be discovered in 60 fb −1 of integrated luminosity at √ s = 14 TeV.
The reconstruction of boosted electroweak gauge bosons can also be used in SUSY cascade decay chains, as shown by Butterworth, Ellis and Raklev [46], to obtain information about the masses and branching ratios of SUSY particles.
Searches for a heavy Standard Model Higgs boson focus on the so-called 'gold plated mode' where the Higgs decays to two leptonically decaying Z bosons, which are naturally boosted. By requiring one of the Z bosons to decay hadronically, Hackstein and Spannowsky [47] found the semi-hadronic channel to be at comparable significance to the purely leptonic channel for detecting the Higgs boson. A combination of several subjet techniques was deployed, as suggested in Ref. [12], to reconstruct the boosted, hadronically decaying Z boson. Assuming the existence of a chiral fourth generation, the Higgs boson could be detected or ruled out with only 1 fb^-1 of data at √s = 7 TeV in this channel. Englert, Hackstein and Spannowsky [48] showed that the semi-leptonic channel also provides sensitivity to the CP property of a heavy scalar resonance.
Recently, Katz, Son and Tweedie [49] studied boosted electroweak bosons, e.g. W, Z or Higgs, from a Z′ resonance. They showed that reconstructing these bosons using the BDRS approach yields significant improvements in the Z′ discovery potential compared to previous analyses.

Boosted BSM physics particles
Currently, the published literature on the reconstruction of boosted BSM particles with unknown mass is fairly sparse. One exception is the study from Butterworth, Ellis, Raklev and Salam [10]. If baryon-number-violating couplings are present in supersymmetry, the lightest neutralino can decay into three quarks. These neutralinos can be produced highly boosted from a squark or gluino decay. By investigating the jet substructure, it was shown that a signal can be extracted from the large light-jets background without making any assumptions on the presence of charged leptons. The neutralino mass was determined to O(10 GeV) precision.
In Ref. [50], these methods were tested in a full ATLAS simulation. Using the k_T jet algorithm with a size parameter of R = 0.7, the neutralino decay was shown to produce a single fat-jet when the neutralino transverse momentum exceeded a few hundred GeV. The technique was further tested, at the Tevatron, in Ref. [13] to investigate the possibility of reconstructing very light gluinos (m_g̃ ∼ 150 GeV) decaying into three quarks.

Tools and techniques
Jet substructure analyses are able to distinguish the fat-jets, which form when highly boosted particles decay to quarks or gluons, from the large QCD jet background.
A sophisticated set of tools has been developed to try to answer the following two questions: Firstly, given a massive jet, is its mass due to the presence of a decayed massive object (signal) or simply a consequence of the QCD emissions that always occur within jets produced by light quarks or gluons (background)? Secondly, assuming a jet does come from a decayed massive object, can one establish which particles are most likely to have come from that massive object and which ones are more likely to be due to initial-state radiation, underlying event (UE) or pileup (PU)? This second question is important because the addition of UE/PU particles to a jet can severely degrade mass resolution.
Three (somewhat overlapping) sets of methods have been developed to help address these questions: identification of subjets within the candidate jet, dedicated "grooming" away of uncorrelated radiation within a jet and energy flow techniques.

Two-body subjet methods
Subjet methods are mostly based on the k_T [44,45] and Cambridge/Aachen (C/A) [15,16] jet clustering algorithms, applied either directly to jets found with these algorithms or to the result of reclustering some other jet's constituents. These algorithms sequentially merge (by four-vector addition) the pair of particles that are closest in a distance measure

$$ d_{ij} = \min\left(p_{Ti}^{2n},\, p_{Tj}^{2n}\right) \frac{\Delta R_{ij}^2}{R^2} , \qquad (1) $$

with n = 1 for k_T and n = 0 for C/A, unless there is a distance d_iB = p_Ti^{2n} which is smaller than all d_ij, in which case particle i is called a jet and the clustering proceeds with the remaining particles in the event 2. Many jet substructure methods undo one or more steps of the clustering so as to identify subjets that correspond (approximately) to the individual decay products of the massive object 3. Utilities for clustering and for studying the clustering history are available in FastJet [51,52]. Additional analysis tools are supplied by SpartyJet [53,54].
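The sequential recombination procedure described above can be sketched in a few lines. This is a deliberately naive illustration, not the FastJet implementation: it assumes the generalised distance d_ij = min(p_Ti^{2n}, p_Tj^{2n}) ΔR_ij²/R² with beam distance d_iB = p_Ti^{2n} (n = 1 for k_T, n = 0 for C/A, n = −1 for anti-k_T), massless input particles, and an O(N³) search over pairs. All function names are hypothetical.

```python
import math


def four_vector(pt, y, phi):
    """Massless four-vector (E, px, py, pz) from pT, rapidity and azimuth."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))


def pt_y_phi(p):
    """Transverse momentum, rapidity and azimuth of a four-vector."""
    E, px, py, pz = p
    return (math.hypot(px, py),
            0.5 * math.log((E + pz) / (E - pz)),
            math.atan2(py, px))


def delta_r2(a, b):
    """Squared distance in the y-phi plane, with phi wrapped to [0, pi]."""
    _, ya, phia = pt_y_phi(a)
    _, yb, phib = pt_y_phi(b)
    dphi = abs(phia - phib)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (ya - yb) ** 2 + dphi ** 2


def cluster(particles, R, n):
    """Inclusive sequential recombination; returns jets ordered by pT."""
    pseudojets = list(particles)
    jets = []
    while pseudojets:
        best = None  # (distance, i, j); j is None for a beam distance
        for i, p_i in enumerate(pseudojets):
            w_i = pt_y_phi(p_i)[0] ** (2 * n)
            if best is None or w_i < best[0]:
                best = (w_i, i, None)
            for j in range(i + 1, len(pseudojets)):
                w_j = pt_y_phi(pseudojets[j])[0] ** (2 * n)
                d_ij = min(w_i, w_j) * delta_r2(p_i, pseudojets[j]) / R ** 2
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            # The beam distance is smallest: promote particle i to a jet.
            jets.append(pseudojets.pop(i))
        else:
            # Merge the closest pair by four-vector addition (j > i, pop j first).
            p_j = pseudojets.pop(j)
            p_i = pseudojets.pop(i)
            pseudojets.append(tuple(a + b for a, b in zip(p_i, p_j)))
    return sorted(jets, key=lambda p: pt_y_phi(p)[0], reverse=True)


# Toy event: two nearby particles and one well-separated particle.
event = [four_vector(100.0, 0.0, 0.0),
         four_vector(50.0, 0.1, 0.1),
         four_vector(80.0, 0.0, 2.5)]
for name, n in (("kT", 1), ("C/A", 0), ("anti-kT", -1)):
    jets = cluster(event, R=0.6, n=n)
    print(name, [round(pt_y_phi(jet)[0], 1) for jet in jets])
```

For this toy event all three choices of n give the same two jets; the algorithms differ in the order of the mergings, which is what substructure methods exploit when they undo clustering steps.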
In the k_T algorithm, the final step in the clustering of a jet usually corresponds to the merging of the two decay products of the massive object. This was exploited in an early study by Seymour [55] of boosted W boson decays, which involved undoing the last stage of a (R = 1) k_T-jet's clustering to obtain two subjets. In one analysis, this was followed by angular cuts on the separation between those two subjets, to reduce backgrounds. In particular, QCD jets with masses near m_W tend to acquire much of their mass from relatively soft parton branchings at wide angles. In another analysis in [55] the subjet separation was used to set a smaller radius for a more refined, second stage of k_T clustering on the W-jet constituents. Particles from the leading two refined jets were used to reconstruct the W boson, thereby ignoring wider-angle radiation, usually dominated by UE/PU. 4

2 The parameter ∆R_ij is the (angular) distance between constituents i and j in y-φ space, where y is the rapidity and φ is the azimuthal angle of the constituents transverse to the beam direction. The parameter R controls the size of the jets in y-φ coordinates and can be roughly referred to as the jet radius (even though these jets are not usually circular). In particular, the criteria above ensure that particles separated by ∆R > R at a given clustering stage cannot be combined and that a particle can only be promoted to a jet if there are no other particles within ∆R < R.

3 Though the anti-k_T algorithm is formulated similarly to k_T and C/A, simply with n = −1 in Eqn. 1, where n governs the relative power of the energy versus geometrical scales, its intrinsic hierarchy is in general unsuitable for direct substructure studies. This is because when it merges a softer subjet with a main hard subjet, the merging often takes place across a multitude of clustering steps. For the k_T and C/A algorithms, such a merging happens in a single step.
In a procedure dubbed "YSplitter", Butterworth, Cox and Forshaw [27] (see also [28,46,56]) also used the k_T algorithm and simply cut on the value of the d_ij distance (equivalently, the k_T scale) in the final merging. QCD backgrounds tend toward small values (depending in detail on the jet definition and p_T scale), whereas W-jets tend toward values correlated with m_W. 5, 6

Butterworth, Davison, Rubin and Salam (BDRS) [8] pointed out that the C/A algorithm, which repeatedly clusters the closest pair of particles in angle, produces a much more "angular-ordering aware" organisation of jet substructure. This observation served as the basis for a method that combines both improvement of mass resolution and reduction of internal phase space in a relatively scale-free manner. It is particularly well-suited for performing unbiased searches for undiscovered boosted 2-body resonances, such as the Higgs, since QCD-initiated jets processed by this method produce a relatively featureless mass spectrum.

In contrast to the k_T algorithm, it is not useful just to undo the last stage of C/A clustering: the absence of any momentum scale in its distance measure means that the last clustering often involves soft radiation on the edges of the jet and so is unrelated to the heavy object's decay. C/A-based approaches must therefore continue to work backwards through the jet clustering and stop when the clustering meets some specific hardness requirement. BDRS require a substantial "mass-drop", i.e. max(m_i, m_j)/m_ij significantly below 1, and symmetry between the momentum fractions of the two subjets, expressed as a cut on

$$ \frac{\min\left(p_{Ti}^2,\, p_{Tj}^2\right)}{m_{ij}^2}\, \Delta R_{ij}^2 . $$

These criteria are controlled by two dimensionless thresholds, µ and y_cut, respectively. While wide-angle UE/PU radiation is actively removed by the mass-drop procedure, this removal is not sufficient in the moderately-boosted regime, ∆R ≃ 1, studied by BDRS. The subjets obtained in this way may still be quite large and contaminated. 9 To refine the subjets further, BDRS apply a "filtering" procedure, close in spirit to the reclustering method of Seymour [55]. The constituents of the two subjets are reclustered with C/A, using R = min(0.3, ∆R_subjets/2). The three hardest subjets are taken, facilitating the capture of possible gluon radiation in the heavy particle decay, while still eliminating much of the UE/PU. The procedure was adapted for top-taggers in [11,34] and found to be beneficial in normal dijet studies too [58]. A discussion of its optimisation for two-body decays is given in Ref. [59].

4 The angular-ordering property of QCD tells us that to catch the QCD radiation from the decay products of a colour-singlet such as the W boson, the subjets need only extend out as far as the subjet angular separation. Any particles found beyond this tend to be uncorrelated.

5 This strategy is similar to Seymour's subjet angular cut method, if we restrict our attention to jets with mass near m_W. In the approximation of massless subjets, k_T and ∆R are strictly related for fixed originator mass.

6 The possible degradation of mass resolution due to UE/PU is not addressed by this procedure, but YSplitter can readily be combined with dedicated grooming approaches if necessary.

7 Alternate hardness measures are also possible, such as the softer subjet's p_T divided by the total jet p_T, the angular separation between the subjets and Jade-type distance measures, e.g. d_ij = p_Ti p_Tj ∆R_ij^2.

8 The mass-drop criterion identifies a localised region within the fat-jet that looks like two distinct cores of energy. The asymmetry cut essentially serves the same purpose as the k_T scale cut in YSplitter, to eliminate energy-sharing configurations that look more QCD-like. But, as phrased here, the cut no longer refers to an absolute mass scale. Declustering can continue indefinitely, down to arbitrarily small ∆R and arbitrarily small masses, until suitable substructure is identified.
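The two BDRS declustering criteria and the filtering radius can be illustrated for a single declustering step. The sketch below assumes the asymmetry variable y = min(p_Ti², p_Tj²) ∆R_ij² / m_ij², and uses µ = 0.67 and y_cut = 0.09 as default thresholds; these values are the ones commonly associated with the BDRS analysis and are not specified in the text above. The helper names are hypothetical, and a full implementation would walk the whole C/A clustering history (for example with FastJet) rather than test one pair of subjets.

```python
import math


def four_vector(pt, y, phi, m=0.0):
    """Four-vector (E, px, py, pz) from pT, rapidity, azimuth and mass."""
    mt = math.sqrt(pt * pt + m * m)
    return (mt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), mt * math.sinh(y))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def mass(p):
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))


def pt(p):
    return math.hypot(p[1], p[2])


def delta_r(a, b):
    ya = 0.5 * math.log((a[0] + a[3]) / (a[0] - a[3]))
    yb = 0.5 * math.log((b[0] + b[3]) / (b[0] - b[3]))
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(ya - yb, dphi)


def passes_bdrs(j1, j2, mu=0.67, y_cut=0.09):
    """True if the splitting parent -> j1 + j2 shows a mass drop and is
    not too asymmetric (thresholds are illustrative defaults)."""
    m12 = mass(add(j1, j2))
    if m12 <= 0.0:
        return False
    mass_drop = max(mass(j1), mass(j2)) / m12
    y = min(pt(j1), pt(j2)) ** 2 * delta_r(j1, j2) ** 2 / m12 ** 2
    return mass_drop < mu and y > y_cut


def filter_radius(j1, j2, r_max=0.3):
    """Filtering radius R = min(0.3, Delta R_subjets / 2)."""
    return min(r_max, delta_r(j1, j2) / 2.0)


# A symmetric, Higgs-like splitting and a soft, wide-angle QCD-like one.
higgs_like = (four_vector(150.0, 0.4, 0.0, m=10.0),
              four_vector(150.0, -0.4, 0.0, m=10.0))
qcd_like = (four_vector(290.0, 0.4, 0.0, m=20.0),
            four_vector(10.0, -0.4, 0.0))
print(passes_bdrs(*higgs_like), passes_bdrs(*qcd_like))
print(filter_radius(*higgs_like))
```

In the QCD-like example the mass-drop condition alone would be satisfied; it is the asymmetry cut that rejects the soft wide-angle emission.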

Three-body subjet methods: top-taggers
A variety of three-body subjet methods have been developed, building on the two-body methods. These have mainly been tailored for boosted tops, but have also been adapted for generic heavy particle searches.
One of the simplest top-taggers is an extension of YSplitter by Brooijmans [26]. Further substructure is revealed by repeated k T declustering and reading off the k T scales of the next-to-last (and next-to-next-to-last) clusterings. Top-jets can be discriminated from QCD by placing cuts in the multidimensional space of jet mass and k T scales, or by using a likelihood ratio built in this space [29]. In subsequent sections, we refer to this as the "ATLAS" tagger.
Thaler and Wang [36] utilise a similar approach. A jet is reclustered with k T , until, depending on the analysis, exactly two or three subjets are formed. Internal kinematic variables in addition to k T scales are utilised. For example, in the three-subjet analysis, a W boson candidate is identified by forming the minimum pairwise mass between subjets and a minimum cut is placed on its mass. Relative energy sharings between the subjets are also studied.
The "Hopkins" top-tagger of Kaplan, Rehermann, Schwartz and Tweedie [30,60] is a descendant of the two-body approach of BDRS. The original version is specialised for a perfect ∆η × ∆φ = 0.1 × 0.1 calorimeter, as a straw-man detector. The quantities |∆η| + |∆φ| and min(p_T1, p_T2)/p_T,jet are used to categorise clusterings, rather than relative mass drop and pairwise energy asymmetry. Thresholds are set by parameters δ_r and δ_p, respectively. When an interesting declustering is found, the two subjets are used as "jets" for a secondary stage of two-body substructure searches. The original jet is a good top candidate if at least one of these secondary declusterings succeeds, so that there are three or four final subjets. 10 Kinematic cuts are then applied to these subjets (without filtering): the summed subjet mass should lie near m_t, there should be a subjet pair that reconstructs near m_W and the reconstructed W boson helicity angle should not be too shallow. 11
The Hopkins tagger has been modified by CMS [32,33,61] in two ways. First, as the actual LHC detectors have better resolution than the 0.1 × 0.1 grid model, declustering uses ∆R to determine adjacency and the parameter δ_r shrinks with p_T: δ_r = 0.4 − 0.0004 · p_T,jet. Second, the W boson mass window and helicity angle cuts are replaced by a single cut on the minimum pairwise subjet mass (excluding any fourth subjet), as in Thaler and Wang 12.
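The two CMS modifications lend themselves to a short numerical illustration. The sketch below evaluates the p_T-dependent adjacency threshold δ_r = 0.4 − 0.0004 · p_T,jet (with p_T assumed to be in GeV) and the minimum pairwise mass of the three hardest subjets. The function names and the toy subjets are hypothetical, and no cut value is implied.

```python
import math
from itertools import combinations


def four_vector(pt, y, phi):
    """Massless four-vector (E, px, py, pz)."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))


def pt_of(p):
    return math.hypot(p[1], p[2])


def mass(p):
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))


def delta_r_threshold(pt_jet_gev):
    """Adjacency threshold delta_r = 0.4 - 0.0004 * pT(jet).
    Note that the linear form turns negative above 1000 GeV."""
    return 0.4 - 0.0004 * pt_jet_gev


def min_pairwise_mass(subjets):
    """Smallest invariant mass among pairs of the three hardest subjets;
    a fourth (softer) subjet, if present, is ignored."""
    if len(subjets) < 3:
        raise ValueError("need at least three subjets")
    leading = sorted(subjets, key=pt_of, reverse=True)[:3]
    return min(mass(tuple(a + b for a, b in zip(p, q)))
               for p, q in combinations(leading, 2))


# Toy subjets: three hard ones and a soft fourth close to the leading one.
subjets = [four_vector(200.0, 0.0, 0.0),
           four_vector(100.0, 0.0, 0.5),
           four_vector(50.0, 0.0, -0.6),
           four_vector(10.0, 0.0, 0.05)]
print(delta_r_threshold(500.0), min_pairwise_mass(subjets))
```

Excluding the fourth subjet matters here: paired with its hard neighbour it would give a very small mass and dominate the minimum.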
A method more closely tied to BDRS and building on the top-Higgsstrahlung study of [11], appears in a paper by Plehn, Spannowsky, Takeuchi and Zerwas [34], the Heidelberg-Eugene-Paris (HEP) top-tagger. A fat-jet is declustered using a fractional mass-drop criterion, with no asymmetry requirement. Subjets-within-subjets are searched for indefinitely, until subjets with masses below 30 GeV are encountered. As the boosts in this study are modest and the ∆R scales are quite large, multibody filtering is also applied. For every set of three subjets, one reclusters the constituents with C/A, R = min(0.3, (∆R jk /2)) (j, k run over the three subjet indices) and takes the invariant mass of the 5 hardest resulting filtered subjets as a filtered mass. The set of three initial subjets that gives a filtered mass closest to the top mass is retained as the sole top candidate. The set of filtered constituent particles is then reclustered yet again to yield exactly three subjets and non-trivial cuts are placed in the two-dimensional subspace of (m 23 /m 123 , arctan(m 13 /m 12 )), where the numbers label the subjets in descending p T . For tops, one of the pairings will have a relative mass m ij /m 123 ≃ m W /m t and the cuts are designed to capture this region. All cuts for the present tagger have been designed to be free of explicit mass scales by normalising with respect to the jet mass.
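The mass-ratio coordinates used by the HEP top-tagger can be computed directly from three subjet four-vectors. This sketch returns the point (m_23/m_123, arctan(m_13/m_12)) together with the pairwise ratios m_ij/m_123, and reports which pairing lies closest to m_W/m_t. The reference masses (80.4 GeV and 172 GeV) are illustrative assumptions, the helper names are hypothetical, and the actual cut contours of the tagger are not reproduced.

```python
import math


def four_vector(pt, y, phi):
    """Massless four-vector (E, px, py, pz)."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def mass(p):
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))


def hep_mass_variables(subjets):
    """Return (m23/m123, arctan(m13/m12), pairwise ratios) for exactly
    three subjets, labelled 1, 2, 3 in descending pT."""
    j1, j2, j3 = sorted(subjets, key=lambda p: math.hypot(p[1], p[2]),
                        reverse=True)
    m12, m13, m23 = mass(add(j1, j2)), mass(add(j1, j3)), mass(add(j2, j3))
    m123 = mass(add(add(j1, j2), j3))
    ratios = {"12": m12 / m123, "13": m13 / m123, "23": m23 / m123}
    return m23 / m123, math.atan2(m13, m12), ratios


def closest_to_w(ratios, m_w=80.4, m_t=172.0):
    """Label of the pairing whose m_ij/m_123 is nearest to m_W/m_t."""
    target = m_w / m_t
    return min(ratios, key=lambda k: abs(ratios[k] - target))


subjets = [four_vector(200.0, 0.0, 0.0),
           four_vector(100.0, 0.0, 0.5),
           four_vector(50.0, 0.0, -0.6)]
x, angle, ratios = hep_mass_variables(subjets)
print(round(x, 3), round(angle, 3), closest_to_w(ratios))
```

Because every quantity is normalised to m_123, the coordinates are unchanged if all subjet momenta are rescaled by a common factor, which reflects the scale-free design described above.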
Tagging of boosted 3-body decays can also be applied to search for as-yet undiscovered particles. Of particular note are the R-parity-violating neutralino taggers of Butterworth, Ellis, Raklev and Salam [10], which generalise the idea of 3-body tagging beyond the special kinematic situation of the top. The paper explores two tagging methods. One is similar to the YSplitter top-tag in Refs. [26,29] but with cuts on dimensionless ratios such as d_12/m^2. The other is C/A based and searches the entire clustering history of a jet, paying attention only to clusterings that are not too locally p_T-asymmetric (using a single asymmetry parameter z_min) and recording the associated Jade distances, p_T1 p_T2 ∆R_12^2. The clustering with the largest Jade distance defines the neutralino candidate and, to ensure 3-body kinematics, a cut is placed on the ratio of the masses of the subjets with second-largest and largest Jade distances. An adaptation of this method has been used in an unpublished C/A-based top-tagger by Salam [62], which takes the subjet with the largest Jade-type distance as the top candidate. Then, in the top candidate rest frame, the top constituents are reclustered into exactly three jets using the e+e− style k_T algorithm (see [63] for a description). The hardest subjet (in absolute energy) is assumed to be the b-jet and the other two constitute the W boson decay products. Three-body kinematics can then be constrained as desired.

Jet grooming methods
Jet grooming (the elimination of uncorrelated UE/PU radiation from a target jet) is useful irrespective of the specific boosted particle search and can even be applied for slow-moving heavy particles that decay to well-separated jets. While many of the methods discussed above incorporate some form of grooming, notably Seymour and BDRS (filtering), two other methods specifically dedicated to grooming have been developed.
Pruning was introduced by Ellis, Vermilion and Walsh [64,65]. The idea is to take a jet of interest and then to recluster it using a vetoed sequential clustering algorithm. Clustering nominally proceeds as usual, but a recombination is vetoed if 1) the two particles are too far apart in ∆R, and 2) the energy sharing, defined by min(p T 1 , p T 2 )/p T (1+2) , is too asymmetric. If both criteria are met, the softer of the two particles is thrown away and all d ij 's and d iB 's are recalculated. The ∆R and energy-sharing thresholds are set by adjusting the two parameters D cut and z cut , respectively.

Trimming is a technique that ignores regions within a jet that fall below a minimum p T threshold. It was introduced by Krohn, Thaler and Wang in [66]. (Similar ideas also appear in [8,55].) Trimming reclusters the jet's constituents with a radius R sub and then accepts only the subjets that have p T,sub > f cut , where f cut is taken proportional either to the jet's p T or to the event's total H T . The small jet radius and energy threshold are the only parameters.
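The two selection rules can be written down compactly. The sketch below is only an illustration of the criteria as stated in the text, not of the full reclustering; it assumes the p T of the merged pair is supplied by the caller, and it treats the trimming threshold as a dimensionless fraction times a hard scale (the jet p T or the event H T ). The function names are hypothetical.

```python
def pruning_vetoes(pt1, pt2, pt_merged, delta_r, z_cut, d_cut):
    """Return True if the recombination of branches 1 and 2 is vetoed.

    The veto fires only when BOTH conditions hold: the branches are too
    far apart (delta_r > d_cut) and the energy sharing
    z = min(pt1, pt2) / pt_merged is too asymmetric (z < z_cut).
    When it fires, the softer branch would be discarded.
    """
    z = min(pt1, pt2) / pt_merged
    return delta_r > d_cut and z < z_cut

def trim(subjet_pts, fraction, hard_scale):
    """Keep the subjets whose pT exceeds fraction * hard_scale."""
    threshold = fraction * hard_scale
    return [p for p in subjet_pts if p > threshold]
```

For example, a soft wide-angle branch with p T = 5 GeV merging at ∆R = 0.6 with a 100 GeV branch is vetoed for z cut = 0.1 and D cut = 0.3, whereas the same branch at ∆R = 0.2 is kept.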
While different grooming methods have the same goal, it may be possible to combine approaches for greater effectiveness. For a first study along these lines, see [12].

Event-shape and energy-flow methods
Top decays often feature a triangular structure, transverse to the boosted top-quark axis. Event-shape type measures (widely used at e + e − colliders) can be applied to the jet constituents to help establish whether such a triangular structure is present. Refs. [36,67] both proposed planar-flow type observables, for which one establishes the eigenvalues of a matrix such as

$$I^{kl} = \sum_i E_i \, \frac{p_{\perp i}^{k}}{E_i} \, \frac{p_{\perp i}^{l}}{E_i},$$

where $p_{\perp i}^{k}$ is the k-th component of p i transverse to the jet axis and E i is its energy. The matrix has two eigenvalues, λ 1 ≥ λ 2 , and the second eigenvalue provides a measure of the jet planarity, for example through the combination

$$Pf = \frac{4 \lambda_1 \lambda_2}{(\lambda_1 + \lambda_2)^2}.$$

Another method of making use of the energy flow [68] involves the construction of energy-flow "templates". These take the energy flow, discretised in θ and φ (templates), for each possible orientation of top-decay products (possibly with cuts to limit backgrounds). Then, for a given jet in an event, the method finds the template that provides the best match to that jet's energy flow pattern, with a measure of the match quality that involves Gaussians of the difference between actual and template energy flows.
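A planar-flow type observable of the kind described above can be computed without an explicit eigenvalue decomposition, since 4λ 1 λ 2 /(λ 1 + λ 2 ) 2 equals four times the determinant divided by the squared trace of the 2×2 matrix. The sketch below assumes the energy-weighted matrix I kl = Σ i p ⊥i k p ⊥i l /E i and that the transverse components relative to the jet axis have already been computed; the input format and function name are hypothetical.

```python
def planar_flow(constituents):
    """Planar-flow type observable for one jet.

    constituents: iterable of (E, p1, p2), where p1 and p2 are the two
    components of the constituent momentum transverse to the jet axis.
    Returns 4*det(I)/tr(I)^2, which is 0 for a linear energy pattern
    and 1 for an isotropic one.
    """
    i11 = i12 = i22 = 0.0
    for energy, p1, p2 in constituents:
        i11 += p1 * p1 / energy
        i12 += p1 * p2 / energy
        i22 += p2 * p2 / energy
    trace = i11 + i22
    if trace == 0.0:
        raise ValueError("no momentum transverse to the jet axis")
    det = i11 * i22 - i12 * i12
    return 4.0 * det / (trace * trace)
```

Three equal-energy constituents at 120° to each other in the transverse plane give a value close to 1, whereas two back-to-back constituents give 0, which is the distinction between three-prong and two-prong structure that the observable is meant to capture.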
Two other energy-flow/event-shape type methods introduced recently aim to isolate signals based on the absence of energy flow [22,23]. They both involve the (same) case of colour-neutral particles with p T ≫ m that decay to two coloured particles i and j. Because of the colourless nature of the parent, the emissions from the coloured particles are highly collimated within an angle ∆R ij (once again, due to angular ordering). In contrast, QCD backgrounds involve emission on all angular scales. Vetoing on energy-flow and/or subjets outside ∆R ij therefore allows a significant reduction in background while retaining much of the signal.
In this context it is worth commenting also on the use of the colour-structure dependence of energy flow in the non-boosted limit, to help distinguish signals of colour-neutral heavy-object decays (two jets colour connected to each other) from backgrounds (two jets, each colour connected to the beams) through an observable named "pull" [69].

Experimental status of jet substructure
The previous BOOST workshops have highlighted the importance of understanding jet substructure and have spurred numerous groups to use existing data sets to perform studies that address some of the key uncertainties raised in these previous meetings. This section provides a brief review of pioneering studies of jet substructure using data collected at the DESY HERA ep Collider and the Fermilab Tevatron Collider, as well as the recent work reported for the first time at BOOST2010.

Jet substructure measurements performed at HERA
One of the earliest studies looked at the mean number of subjets in a recoil jet produced in the photoproduction of jets in ep collisions [70]. From jets produced at large angles to the proton beam and with transverse energies E T > 17 GeV, the average number of subjets in the recoil jet was used to measure the strong coupling constant and to confirm the general picture of QCD radiation within the perturbative parton shower believed to be responsible for the jet. A subsequent study [71] employed a sample of jets produced through photoproduction and deep-inelastic scattering to study the kinematics of jet production as well as the distribution of energy flow within the jet. The data are well-described by QCD calculations as implemented in PYTHIA and an extraction of the strong coupling constant was made. Most recently, the ZEUS collaboration reported a study of subjets in jets produced in neutral-current deep-inelastic scattering [72]. The jets were clustered with the k T algorithm and the subjet structure was obtained in the laboratory frame, by running a variant of the exclusive k T cluster algorithm with y cut = 0.05 for jets with E T > 14 GeV and pseudorapidity from -1 to 2.5. The dimensionless parameter y cut is related to the k T distance metric through the following formula:

$$d_{ij} = \min(p_{T,i}, p_{T,j})^2 \, \frac{\Delta R_{ij}^2}{R^2} > y_{\rm cut} \, p_T^2 ,$$

where R is the resolution parameter of the k T algorithm, ∆R ij is the separation of the two subjets, and p T,i , p T,j and p T denote the transverse momentum of the two subjets and of the parent jet, respectively. Focusing on the kinematic distributions of those jets with exactly two subjets, the study found that QCD predictions were in good agreement with the data, again confirming the general picture of QCD radiation in the showering process.

D0 jet substructure measurements
D0 studied the k T subjet multiplicity for central (|η| < 0.5) jets reconstructed with the k T algorithm with R = 0.5 and 55 GeV < p T < 100 GeV in data collected during Run I of the Fermilab Tevatron Collider at √s = 0.63 and 1.8 TeV [73]. The analysis selects subjets based on y cut = 10 −3 . This choice corresponds to a minimum subjet p T of approximately 3% of the total jet p T .
The subjet p T distribution shows that jets are composed of a soft and a hard component. The soft component has a threshold at 1.75 GeV set by the value of y cut and the minimum jet p T , whereas the hard component peaks at 55 GeV driven by single-subjet jets. Exploiting the two centre-of-mass energies and taking the fraction of gluon jets at each of these from simulation, the subjet multiplicity for quark and gluon jets is extracted from the data. After correcting for subjets originating from showering in the calorimeter, as well as other small effects, the ratio of the average number of extra subjets (i.e. average minus one) in gluon relative to quark-originated jets is measured to be 1.84±0.15(stat.)±0.20(syst.), confirming that gluon jets radiate more than quark jets.
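The numbers quoted for the D0 analysis follow from simple arithmetic if one assumes a resolution criterion of the form min(p T,i , p T,j ) 2 ∆R ij 2 /R 2 > y cut p T 2 and takes the two subjets to be separated by roughly ∆R ij ≈ R. Both of these are assumptions of this sketch rather than statements from [73], and the function name is hypothetical.

```python
import math

def min_subjet_pt_fraction(y_cut, delta_r_over_r=1.0):
    """Smallest resolvable subjet pT as a fraction of the jet pT.

    Follows from min(pT_i, pT_j) * (DeltaR_ij / R) > sqrt(y_cut) * pT_jet.
    delta_r_over_r = 1 corresponds to subjets separated by about R.
    """
    return math.sqrt(y_cut) / delta_r_over_r

fraction = min_subjet_pt_fraction(1e-3)   # about 3% of the jet pT
soft_threshold = fraction * 55.0          # GeV, for the minimum jet pT of 55 GeV
```

Under these assumptions the fraction is about 0.03 and the corresponding threshold for a 55 GeV jet is about 1.7 GeV, in line with the 3% and 1.75 GeV quoted in the text.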

CDF jet substructure measurements
CDF performed an early Run II measurement of jet substructure using 0.17 fb −1 of data collected at √ s = 1.96 TeV.
Measurements were carried out on jets with p T up to 380 GeV, with the key variable being the average fraction of jet transverse momentum that lies inside a cone of radius r concentric to the jet cone, as a function of r [74]. These measurements showed that PYTHIA [75] (v. 6.203) calculations with Tune A settings provided a reasonable description of the observed data. The HERWIG 6.4 [76] MC calculations also gave a reasonable description of the measured jet shapes, but tended to produce jets that were too narrow at low jet p T values.
CDF presented new results at the BOOST2010 meeting of measurements of jet mass, angularity and planar flow for jets with p T > 400 GeV from a sample of 5.95 fb −1 [77]. The measured distributions were compared with analytical expressions from NLO QCD calculations, as well as PYTHIA 6.1.4 predictions incorporating full detector simulation. The theory predictions for jet mass were in good agreement with the data, whereas the angularity and planar flow predictions by PYTHIA showed disagreement in detail (primarily at low angularity and low planar flow). Subsequent to the meeting, CDF presented results of a search for boosted top quarks in this sample, setting a preliminary upper limit of 54 fb at 95% confidence level on the Standard Model production cross section for top quarks with p T > 400 GeV [78].

Benchmark samples
Over the years the community involved in studies of boosted objects has grown considerably. A large number of different tools exists and it is often hard to gauge their relative performance from published results. Moreover, as we will show in section 7, the choice of Monte Carlo tools used to model the jets can have a pronounced impact on the results. We have therefore generated a number of common samples to provide a benchmark for the performance comparisons in the following sections.
We simulated LHC proton-proton collisions at a centre-of-mass energy of 7 TeV. The samples consist of QCD dijet events, representing the most important background for many searches, and Standard Model tt events as signal, serving as a typical example of heavy, boosted particle production. Both samples were produced with HERWIG 6.510 [76]. All samples are divided into equally-sized subsamples with parton p T ranges of 200-300 GeV, 300-400 GeV, . . . , 1.5-1.6 TeV, thus covering the full range from topologies with moderate boost to extremely energetic events. We generated 10,000 events for each parton p T bin. Combining all samples yields an approximately flat p T distribution.
The generated events include a description of the underlying event (UE). HERWIG is used in conjunction with JIMMY [81], which takes care of the underlying event generation. For this study we rely on a tune from ATLAS [82] 15.
The subleading terms in parton generators are often constrained by tuning. Extensive tuning has been performed at LEP, mostly constraining quark jets. However, little tuning of substructure observables has been performed at hadron colliders, where, relative to LEP, various new elements arise: there are many more gluon jets, there are colour connections with the initial state, and there is the question of how to handle recoil from hard emissions in a context where the partonic centre-of-mass energy is no longer fixed. We therefore expect that the description of jet substructure in different MC tools may differ significantly.
For comparison we generated identical samples with PYTHIA 6.4 [75]. A number of tunes for the UE description were considered, which we label DW, DWT and Perugia0. The parton shower model of the DW and DWT samples is Q 2 -ordered. Both yield identical results for the underlying event at the Tevatron. However, the two tunes extrapolate differently to the LHC, where DWT leads to a more active underlying event 16.
The Perugia tune [80] uses a p T -ordered parton shower. To disentangle the impact of the parton shower and that of the underlying event, we generated an additional set of PYTHIA samples with the UE generation switched off. Samples with UE switched off were also produced with HERWIG.
The different generators and tunes used should give a rough measure of the uncertainties in the parton shower and underlying event modelling at the start of data taking at the LHC. We note that, with more refined tunes to LHC data, these uncertainties are expected to decrease considerably over the coming years.
We propose using these samples as a benchmark for MC studies investigating the prospects of searches using boosted objects. The samples are publicly available for future work at:
- http://www.lpthe.jussieu.fr/~salam/projects/boost2010events/
- http://tev4.phys.washington.edu/TeraScale/boost2010/

We also agreed on a standard definition of the primary jet reconstruction. For each event, jets are clustered with the anti-k T algorithm with an R-parameter of 1.0. As input to the jet clustering, all stable particles with |η| < 5.0 except neutrinos and muons are used. In order to exclude soft additional jets that do not originate from hadronic top quark decay, at most 2 jets per event with p T > 200 GeV are considered.

15 ... HERWIG's internal soft UE turned off).
16 The value of the PYTHIA parameter PARP(90) governing the energy dependence is set to 0.16, the value used by ATLAS, in DWT, while the Tevatron tune A value of 0.25 is chosen in tune DW. For a detailed description, the reader is referred to Ref. [79].
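The standard input and jet selection described above can be sketched as follows. The particle record format (a dict with `pdg_id` and `eta`) and the function names are hypothetical, the jet clustering itself is not shown, and "at most 2 jets" is interpreted here as the two hardest jets above threshold, which is an assumption.

```python
NEUTRINO_IDS = {12, 14, 16}   # PDG codes of nu_e, nu_mu, nu_tau
MUON_ID = 13                  # PDG code of the muon
EXCLUDED_IDS = NEUTRINO_IDS | {MUON_ID}

def select_inputs(particles, eta_max=5.0):
    """Stable particles passed to the jet clustering.

    particles: iterable of dicts with keys 'pdg_id' and 'eta'.
    Neutrinos and muons (and their antiparticles) are dropped, as are
    particles outside |eta| < eta_max.
    """
    return [
        p for p in particles
        if abs(p["eta"]) < eta_max and abs(p["pdg_id"]) not in EXCLUDED_IDS
    ]

def select_jets(jet_pts, pt_min=200.0, max_jets=2):
    """Keep at most max_jets jets with pT > pt_min, hardest first."""
    passing = sorted((pt for pt in jet_pts if pt > pt_min), reverse=True)
    return passing[:max_jets]
```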

Impact of jet grooming tools
Having described many jet substructure tools in section 3, we now consider how they perform in reconstructing the hadronic decays of heavy particles. We begin with three grooming tools: pruning, trimming, and filtering. Although initially formulated in different contexts, in practice they rely on the same phenomena: Contamination from the underlying event and pile-up will have characteristically lower energy than the core(s) of a high-p T jet, and most of the energy of the "uncontaminated" jet is located in some small number of small regions. Each of the grooming techniques differs in the way this broad idea is implemented and as a result, differences in performance may be expected.
For the purpose of this report, we consider their performance in identifying top jets on the benchmark samples described in Section 5. For simplicity we only consider two narrow p T ranges: 300-400 GeV and 500-600 GeV.
For each groomer, the components of anti-k ⊥ , R = 1.0 jets are reclustered with Cambridge/Aachen. Each groomer then acts on the C/A substructure. For pruning, which was intended to be used in this manner, we use the "standard" parameters from Ref. [64], {z cut = 0.1, D cut = 0.5 × (2m/p T )}. Trimming was originally proposed for use on QCD jets (while, for example, looking at a dijet mass), so the original parameters are obviously not sensible. Likewise, filtering has typically been used in the context of further grooming the already identified subjets of a decay. For trimming and filtering, we have chosen reasonable values based on a superficial exploration of the parameter space, requiring good performance in the higher of the two p T bins. A careful optimisation of parameters requires a more thorough study. For trimming, we take {R sub = 0.35, f cut = 0.03 × p jet T }. For filtering, we take {R filt = 0.35, N subjets = 3}.
In Fig. 1, we compare the mass distribution for groomed jets with that for ungroomed jets, for QCD dijets (a,c) and for hadronic top decays (b,d).
The most striking difference between the two p T intervals is the pronounced peak at the W boson mass for the 300 < p T < 400 GeV sample. The results in Figs. 1 (a) and (c) show that all three grooming techniques affect the shape of the mass distribution of the dijet background. The effect is most clearly visible in the intermediate mass regime, where jet mass is typically dominated by relatively soft radiation around a single, hard core. For the 500 < p T < 600 GeV dijet sample the fraction of jets in the mass window from 50-150 GeV is 73 % for ungroomed jets, and 27 %, 48 % and 72 % for pruned, trimmed and filtered jets, respectively.
For the high mass tail of the QCD jet mass distribution, large jet masses often come from hard, perturbative emissions that will not be "groomed away", so grooming diminishes in effectiveness and the differences between the various techniques are less pronounced. For the same sample, the fraction of QCD jets in the mass window from 150-200 GeV is 14.6 %, 7.8 %, 9.9 % and 10.6 %, respectively.
Turning to the signal distributions of Fig. 1 (b) and (d), we find that grooming clearly improves the mass resolution when compared to the raw jet mass. While the resolution of the three grooming methods is similar, the fraction of events in the top quark mass peak differs. More aggressive grooming leads to a larger number of signal events that migrate out of the signal peak. For the tt sample with 500 < p T < 600 GeV, the fraction of jets with 150 < m j < 200 GeV is 66 %, 54 %, 64 % and 69 % for ungroomed, pruned, trimmed and filtered jets, respectively.
The findings discussed above indicate that while the three grooming techniques have qualitatively similar effects, there are important differences. For our choice of parameters, pruning acts most aggressively on the signal and background, followed by trimming and filtering. These differences can be explained by a more detailed look at the internals of the algorithms.
Filtering is normally used after finding the substructure of the jet and selecting the hardest subjets. In this analysis, however, the number of subjets is fixed to three and even very soft subjets can be included. For this reason filtering is not expected to be as effective in reducing the background in the intermediate mass region.
Trimming, on the other hand, uses a relative p T threshold to determine which subjets to keep, so soft subjets are discarded. To a good approximation, low m/p T QCD jets consist of a single hard core surrounded by soft radiation. Trimming will keep just radiation within R sub of the core; filtering will keep the core as well as two other soft subjets. In figure 1 (a) and (c) we can indeed see that trimming shifts the jet mass distribution further down. At the same time, the fraction of boosted tops that is found in the top mass interval is slightly reduced.
Like trimming, pruning can strip the jet to a single hard core. The key difference is that the angular cutoff, D cut , is adaptive, scaling with the m/p T of the jet. This means that at low m/p T , the angular size of the hard subjet(s) kept by pruning gets smaller, making pruning more aggressive. Again we find that this expectation is confirmed by the strong reduction of QCD jets in the intermediate mass regime, and the more pronounced migration of signal events from the top quark mass peak to lower mass.
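The adaptive nature of the pruning cutoff can be made concrete with the parameter choice D cut = 0.5 × (2m/p T ) used in this study. In the sketch below the jet masses and p T values are purely illustrative numbers, not results from the benchmark samples, and the function name is hypothetical.

```python
def pruning_d_cut(mass, pt, factor=0.5):
    """Angular cutoff D_cut = factor * (2 m / pT) for a jet of given mass and pT."""
    return factor * 2.0 * mass / pt

R_SUB_TRIMMING = 0.35  # fixed subjet radius used for trimming in this study

# Illustrative jets at pT = 550 GeV (hypothetical values):
d_cut_top_like = pruning_d_cut(175.0, 550.0)  # about 0.32, close to R_sub
d_cut_qcd_like = pruning_d_cut(50.0, 550.0)   # about 0.09, far below R_sub
```

For a top-like jet the pruning cutoff is comparable to the trimming subjet radius, whereas for a low-mass QCD-like jet it is several times smaller, which is consistent with pruning acting more aggressively in the intermediate mass regime.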
We conclude that all three grooming methods lead to a significantly improved mass resolution for jets containing a hadronic top quark decay. Aggressive grooming, as implemented in the pruning algorithm, is also very effective in reducing the background population in the intermediate mass regime from 50-150 GeV. For higher masses, dominated by perturbative QCD, grooming techniques are less effective in reducing the QCD background.

Sensitivity of jet substructure to the MC description
In this section, we study the reliability of Monte Carlo predictions for the substructure of jets. To this end, we compare the response of a sub-jet analysis to events generated with several different Monte Carlo tools and UE  tunes described in section 5. In particular, we establish the sensitivity of jet mass and related observables to the parton shower model and to the UE. We also perform a simulation that mimics a number of important detector effects. Data collected at the LHC in 2010-2011 should enable a more thorough understanding than we can hope to achieve at this stage.
We reconstruct the jet invariant mass distribution for anti-k T jets with R = 1. The grooming techniques described in section 6 select relatively hard events and are therefore expected to reduce the sensitivity to soft and diffuse energy deposits. We apply the three grooming procedures and determine the invariant mass of the resulting groomed jet. We present the result of trimming, but the conclusions hold for all three techniques. We moreover recluster the jet constituents with the k T algorithm and unwind the sequence to retrieve the i → j splitting scales d ij . We note that the splitting scales are determined on the ungroomed cluster sequence.
To establish the impact of different parton shower models we compare the response to two of the most popular Monte Carlo tools for jet formation, HERWIG and PYTHIA. We moreover vary the order of the emissions in PYTHIA, using two schemes known as p T -ordering (used in the Perugia0 tune) and Q 2 ordering (used in DW and DWT). In Fig. 2, we compare the jet mass distribution for these three setups, along with the k T scales corresponding to the 1 → 2 and 2 → 3 splits. For the sake of a clean comparison we disabled UE activity for these samples.
We find the p T ordered shower in PYTHIA yields a significantly softer spectrum than the Q 2 ordered shower model. This is true for the jet invariant mass and the scales of the hardest splittings in the shower. The results obtained for the HERWIG shower are in good agreement with the Q 2 ordered shower for both the jet mass and the 1 → 2 splitting scale.
We expect larger differences between Monte Carlos in the region of larger masses and splitting scale, as these probe less collinear regions of the jet structure, where the codes are less constrained. Unfortunately, matching to fixed-order predictions (such as with Alpgen or Madgraph) may not necessarily be of immediate help in this context, given the recent results [83] which show that matching does not necessarily improve the description of the event structure near the dijet limit.

Fig. 2. Jet invariant mass m j before (a-c) and after grooming (d-f), and (ungroomed) splitting scales √d 12 (g-i) and √d 23 (j-l) for anti-k T jets with R = 1 reconstructed on dijet samples with an approximately flat distribution in jet p T . The three histograms in the plots of the leftmost column correspond to three different shower models: Q 2 and p T ordered showers in PYTHIA with the DW and Perugia tune, respectively, and the default HERWIG shower model. In the central column, two PYTHIA underlying event tunes are compared to default HERWIG/JIMMY. In the rightmost column particle-level jets are compared to cluster-level jets. In the small inset underneath each histogram, the relative deviations from a reference histogram are given ((data-ref)/ref), where the result for PYTHIA Q 2 -ordered showers (leftmost column), the PYTHIA DW tune (central column) and particle-level jets (rightmost column) is taken as the reference.
For the 2 → 3 splitting scale we observe relative differences of up to 20 %. The greater robustness of the 1 → 2 splitting scale compared to the 2 → 3 splitting scale might have been expected. None of the Monte Carlos explicitly includes an exact collinear 2 → 3 splitting kernel, whereas they do all include the 1 → 2 kernel.
In the experimental environment, jet observables are affected by UE activity and energy flow due to pile-up events. These effects are particularly important for the large jet sizes envisaged for many searches. It is therefore important to establish the sensitivity of substructure analyses to such effects. In Fig. 2 we compare the distributions for the same three observables for three different UE tunes.
The larger UE activity in DWT with respect to the DW tune in PYTHIA is reflected in a (slightly) increased jet mass. The HERWIG+JIMMY jet mass spectrum is significantly harder than that of either PYTHIA tune. Although the UE activity is typically soft, it can have a sizable effect on the invariant mass of the jet. This deviation is clearly observed in all p T bins from 200 GeV to 1.5 TeV. For the first splitting scale, on the other hand, we find excellent agreement between the three tunes. Our interpretation is that this observable corresponds to the hardest event in the shower development and is therefore least sensitive to unrelated, soft activity. This is consistent with our observation that consecutive, softer, splittings (2 → 3 and 3 → 4) exhibit an increasing discrepancy between the PYTHIA DW and HERWIG distributions.
Finally, the measurement of substructure observables will be affected by detector limitations. We study two important effects here by comparing particle and cluster level results and leave the remainder for future studies. In our simple setup, the detector granularity is simulated by forming massless clusters that contain the energy of all particles in a y − φ region of 0.1 × 0.1. A 1 GeV threshold is applied to the resulting cluster E T .
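The simple granularity model described above can be sketched in a few lines. The sketch assumes that particles are given as (E, y, φ) tuples, that each cluster is placed at the centre of its cell, and that the transverse energy of a massless cluster is E/cosh(y); the placement at the cell centre and the function name are assumptions of this illustration.

```python
import math
from collections import defaultdict

def make_clusters(particles, cell=0.1, et_min=1.0):
    """Sum particle energies in y-phi cells and return massless clusters.

    particles: iterable of (E, y, phi).
    Returns a list of (E, y_centre, phi_centre) for cells whose
    transverse energy exceeds et_min (in the same units as E).
    """
    cells = defaultdict(float)
    for energy, y, phi in particles:
        key = (math.floor(y / cell), math.floor(phi / cell))
        cells[key] += energy
    clusters = []
    for (iy, iphi), energy in cells.items():
        y_c = (iy + 0.5) * cell
        phi_c = (iphi + 0.5) * cell
        e_t = energy / math.cosh(y_c)  # massless cluster: E_T = E / cosh(y)
        if e_t > et_min:
            clusters.append((energy, y_c, phi_c))
    return clusters
```

Two particles falling in the same 0.1 × 0.1 cell are merged into one cluster, while an isolated 0.5 GeV particle is removed by the 1 GeV threshold, which is how soft, diffuse radiation is lost at cluster level in this model.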
The jet mass, in Fig. 2, is found to be quite sensitive to detector effects. The peak of the distribution for the QCD jet background is shifted down by several tens of GeV. The same is found to be true for the tt signal. The groomed mass distribution is much less affected. The splitting scales are found to be sensitive to detector effects in the region d ij < 20 GeV.
To summarise, we have investigated the sensitivity of some of the most popular jet substructure observables to uncertainties in the MC description of the parton shower, the UE and the detector response. We find that observables envisaged to be used in the selection of new physics are strongly affected by some of these effects. The MC samples for evaluating the performance of new algorithms must therefore be chosen carefully. We recommend the benchmark samples presented in section 5 be used to provide a comparison under equal conditions. The significant difference we observe between different MC tools suggests there is benefit to be had from more extensive measurements of a range of jet substructure and shape observables at the LHC, where the very high statistics available at moderate p T could provide strong additional constraints on the generators. For early comparisons of different shower models and tunes to LHC data the reader is referred to recent studies of jet shapes by ATLAS [6] and CMS [7].
We find that the investigated observables show rather differing sensitivities. We find the invariant mass of the jet to be quite sensitive to UE activity and detector effects. The grooming techniques investigated in this paper greatly improve the robustness of the jet mass. Also the k T splitting scales are quite robust, provided their use is limited to the region above approximately 20 GeV. For the two most commonly used shower models, all observables are in good agreement, but the p T ordered shower in PYTHIA yields significantly different results.

Comparison of top-tagging tools
We have performed a study to compare the different top-tagging algorithms. The benchmark QCD dijet (background) and tt (signal) samples, produced using HERWIG as described in section 5, were used. However, for this study, only the subsamples with parton p T ranges up to and including the 700-800 GeV bin were used. For each event, jets were clustered with the anti-k T algorithm with an R-parameter of 1.0.
For each anti-k T jet, top-tagging algorithms are run on the constituents of the jet. As the final step of top-tagging, all algorithms apply selection criteria on kinematic variables such as jet mass and jet substructure. Applying the top-tagging algorithms with their default cut values yields different mistag rates and efficiencies, which makes it difficult to compare them directly. In our study, for each top-tagging algorithm, the cuts were optimised for each efficiency by minimising the mistag rate while keeping the efficiency fixed. In this context, the overall mistag rate and overall efficiency are defined as the number of top tags divided by the total number of anti-k T jets in the background and signal sample, respectively. The normalisation uses anti-k T jets above 200 GeV and at most two per event.
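The optimisation step can be illustrated with a deliberately simplified sketch: a single mass-window cut, scanned over a coarse grid of edges, with the window of lowest mistag rate chosen among those that reach a target efficiency. The real taggers scan several variables simultaneously, and the study keeps the efficiency fixed rather than merely bounded from below; the grid, the use of "at least the target efficiency" and the function names are assumptions of this illustration.

```python
def tag_rate(values, lo, hi):
    """Fraction of jets whose variable falls inside the open window (lo, hi)."""
    return sum(1 for v in values if lo < v < hi) / len(values)

def best_window(signal, background, target_eff, edges):
    """Scan all windows built from edges and minimise the mistag rate.

    Only windows with signal efficiency >= target_eff are considered.
    Returns (mistag, efficiency, lo, hi), or None if no window reaches
    the target efficiency.
    """
    best = None
    for i, lo in enumerate(edges):
        for hi in edges[i + 1:]:
            eff = tag_rate(signal, lo, hi)
            if eff < target_eff:
                continue
            mistag = tag_rate(background, lo, hi)
            if best is None or mistag < best[0]:
                best = (mistag, eff, lo, hi)
    return best
```

Repeating the scan for a series of target efficiencies traces out a curve of optimal mistag rate versus efficiency for this one-variable toy tagger.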
As can be seen from Fig. 1 in Section 6, jets at low p T values often have an invariant mass inconsistent with the top quark mass. These jets include only some decay products of the hadronic top quark decay, e.g., the quarks of the hadronic W boson decay. Running the optimisation procedure in this p T region would hardly result in a top-tagger but possibly rather a "W -tagger". Therefore, we impose an additional cut on the anti-k T jet mass of m jet > 120 GeV for all top-taggers. This implies a maximum overall tagging efficiency of 75%. Curves with the optimal mistag rate versus signal efficiency are shown in Fig. 3. The optimisation was repeated on the p T subsamples and can be compared to the overall optimisation applied on the subsample to evaluate the potential benefit of using p T -dependent cut values. Curves for the 300 < p T < 400 GeV (c) and 500 < p T < 600 GeV (d) subsamples are also shown.
While these curves can be used to compare the overall performance of the top-tagging algorithms, they do not reflect the p T -dependence of the tag rate. We expect that, at least initially, the experiments are likely to choose a single set of parameters across the whole p T range in order to keep their analyses of these new tools as simple as possible. It is therefore instructive to look at the tag rate as function of jet p T for specific working points. We chose two working points defined by their overall signal efficiency of 20% and 50%.
Firstly, we investigated the performance of two taggers that do not incorporate any grooming procedures. The first one is referred to as the ATLAS tagger [38,29,26], the second one as the Thaler/Wang (T/W) tagger [36]. Both of them exploit the inherent hierarchical nature of the k T jet algorithm by reclustering the initial jet's constituents. The final and penultimate stages of this process correspond on average to the merging of the top quark decay products and hence jet substructure can be probed via the first few k T splitting scales.
The ATLAS tagger 18 relies on m jet , m W 19 and a variant of the first three splitting scales that gives dimensionless observables 20 . In order to ease subsequent analysis, we used a projective (one dimensional) likelihood estimator to discriminate signal from background events. The likelihood classifier was built with the TMVA toolkit [85]. The Thaler/Wang tagger makes use of m jet , m W and a dimensionless energy sharing observable among the last two subjets 21 . In this study, we optimised rectangular cuts on the variables used by the Thaler/Wang algorithm with TMVA for the classification of events. The resulting efficiencies as a function of jet p T are shown in Fig. 4. The efficiencies are relatively flat for p T ≳ 500 GeV after a turn-on for lower p T . We also indicate the maximum possible efficiency after applying the m jet > 120 GeV cut in the same figure.

Fig. 4. Efficiency and mistag rate as function of jet p T for working points with overall efficiency of 20% (uppermost row) and 50% (lowermost row). Results correspond to the ATLAS and Thaler/Wang taggers (a,d), the Hopkins and CMS taggers (b,e) and the pruning tagger (c,f). The mistag rate has been multiplied by a factor 5 to make it visible on the same scale.

Table 1. Tagger parameters at the two working points (table body not recovered; one surviving entry reads 162 < m jet < 265 GeV).
a The optimal z cut found is near the "standard" value of 0.1, but much smaller values of D cut are found (the original value was 0.5). This is due to a trade-off between the pruning and mass cut parameters. With wide mass windows and high efficiencies, it turns out to be better to "over-prune". The fact that D cut decreases from the 20% efficiency point to the 50% point is likely an artifact of the low resolution of the parameter scan (cf. Fig. 3).
b The variant of the ATLAS tagger in these proceedings is based on a cut on the likelihood value from TMVA and hence parameter values are not applicable.

18 The ATLAS studies of the variables used in this tagger only became public after BOOST2010 [38].
19 The W boson mass is defined as the lowest pairwise mass among the three subjets obtained by undoing the two last stages of the k T clustering.
20 $z_{\rm cut} \equiv d_{\rm cut} / (d_{\rm cut} + m_{\rm jet}^2)$, where d cut is the k T distance between the merging subjets and m jet is the mass of the merged jet.
21 $z_{\rm cell} \equiv \min(E_1, E_2) / (E_1 + E_2)$, where E i is the energy of the i-th subjet when undoing the last stage of the k T clustering process [36].
To optimise the Hopkins tagger and its close cousin, the CMS tagger, we varied the lower cut for the jet mass, m jet , and the cut window for m min , m W , yielding the curves in Fig. 3. In addition, the two taggers are compared in Fig. 4 for two working points chosen to yield 20% and 50% overall top-tagging efficiency. We find the tag rate to be relatively flat for p T ≳ 500 GeV after a steep turn-on for lower p T . The small p T dependence at higher p T in the taggers which do not employ grooming is further reduced in these grooming-based taggers.
We finally consider a top-tagger that employs pruning to groom the jets (described in detail in Section 3.3). For the purposes of this study, we included an additional step: to identify the W boson subjet, the final jet is unclustered to three subjets (by undoing the last two mergings) and the minimum-mass pairing is chosen to be the W boson, as in the CMS tagger.
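The minimum-mass pairing can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes the three subjets are already available as four-momenta `(E, px, py, pz)` in GeV, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pair_mass(p, q):
    """Invariant mass of the sum of two four-momenta (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p, q))
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against small negative values from rounding.
    return sqrt(max(m2, 0.0))


def w_candidate_mass(subjets):
    """Lowest pairwise invariant mass among exactly three subjets."""
    if len(subjets) != 3:
        raise ValueError("expected exactly three subjets")
    return min(pair_mass(a, b) for a, b in combinations(subjets, 2))
```

For example, three massless subjets `(50, 50, 0, 0)`, `(50, 0, 50, 0)` and `(100, 0, 0, 100)` have pairwise masses of about 70.7, 100 and 100 GeV, so the first two would be taken as the W boson candidate.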
To generate the pruning tagger efficiency curves in Fig. 3, the parameters $z_{\text{cut}}$ and $D_{\text{cut}}$ are scanned over the ranges 0.01-0.2 and $(0.1$-$0.85) \times (2m/p_T)_{\text{jet}}$, respectively. We then scan the cuts on the jet and W boson subjet masses, with the only constraint being that the top jet mass is always required to be greater than 120 GeV. We define two working points that yield average efficiencies of 20% and 50%. The tagger parameters of both working points are given in Table 1. The tagging rates for signal and background as functions of anti-$k_T$ jet $p_T$ are shown in Fig. 4. The tag rates are relatively flat for $p_T \gtrsim 400$ GeV, after a turn-on for lower $p_T$.
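A scan of this kind can be sketched in a few lines of Python. The ranges are those quoted above; the number of grid points, the function names and the example jet are hypothetical. The veto condition shown (discard the softer branch when its momentum fraction is below $z_{\text{cut}}$ and the opening angle exceeds $D_{\text{cut}}$) is our reading of the standard pruning procedure and is an assumption here, since the definition itself is given in Section 3.3.

```python
def scan_grid(n_z=5, n_d=4):
    """(z_cut, d_factor) pairs on a regular grid; n_z, n_d are illustrative."""
    z_lo, z_hi = 0.01, 0.2    # z_cut range quoted in the text
    f_lo, f_hi = 0.1, 0.85    # prefactor range multiplying (2m/pT)_jet
    zs = [z_lo + i * (z_hi - z_lo) / (n_z - 1) for i in range(n_z)]
    fs = [f_lo + j * (f_hi - f_lo) / (n_d - 1) for j in range(n_d)]
    return [(z, f) for z in zs for f in fs]


def d_cut(d_factor, m_jet, pt_jet):
    """Angular cut D_cut = d_factor * 2 m / pT for a given jet."""
    return d_factor * 2.0 * m_jet / pt_jet


def prune_veto(pt_i, pt_j, pt_merged, delta_r, z_cut, d_cut_value):
    """True if the merging i + j is vetoed (assumed pruning condition)."""
    z = min(pt_i, pt_j) / pt_merged
    return z < z_cut and delta_r > d_cut_value
```

For a jet with $m = 175$ GeV and $p_T = 700$ GeV, a prefactor of 0.5 gives $D_{\text{cut}} = 0.25$, which shows how the angular cut tightens as the jet becomes more boosted.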
In general, all grooming-based taggers that we tested have a flatter efficiency above a $p_T$ of 400 GeV than the ungroomed approaches. This reflects the relative stability of the groomed variables as a function of $p_T$. Splitting scales, in particular, are sensitive to the $p_T$ of the initial jets, whereas groomed masses correspond closely to physical quantities and hence are Lorentz-boost invariant.
The overall mistag rates for the different taggers at the different working points are summarised in Table 2. Statistical errors are quoted for all measurements.
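For orientation, a tagging or mistag rate and its statistical error can be estimated from event counts as in the sketch below. The binomial error formula is an assumption of this example (the text does not state how the quoted errors were computed), and the function name and counts are hypothetical.

```python
from math import sqrt


def rate_with_error(n_tagged, n_total):
    """Tag rate and its binomial statistical error (assumed error model)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    rate = n_tagged / n_total
    # Simple binomial approximation; unreliable when rate is near 0 or 1.
    err = sqrt(rate * (1.0 - rate) / n_total)
    return rate, err
```

With 50 tagged jets out of 1000 background jets, this gives a mistag rate of 5% with a statistical error of about 0.7%, corresponding to a background rejection factor of 20.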
Before examining these results in more detail, it is useful to consider the reliability of such performance estimates. In Section 7 we found that the distributions of jet substructure observables are rather sensitive to the choice of parton shower and underlying event model tune. To quantify the effect on the tagging performance, we measured the performance of the taggers with the parameters of Table 1 on a second sample generated with PYTHIA. As might be expected from the results of Section 7 and earlier studies [30], we find that this ad hoc choice has a profound impact on the performance. The performance of all taggers as measured on the PYTHIA sample is significantly better than for the default HERWIG samples, with the rate of fake tags in the di-jet sample (for equal efficiency) dropping by up to a factor of 2. If we take the difference between the two samples as an indication of the systematic error, the absolute value of the fake rate is uncertain to a level that makes comparison very hard, if not impossible. In Section 7 we found, moreover, that even simple detector effects have a profound impact on substructure observables. The absence of a detailed detector simulation thus further undermines the reliability of the absolute performance measurements.
The relative performance of the taggers, however, is not affected by this large systematic error. The use of common benchmark samples ensures that the tagging approaches are compared on a level playing field. When we compare the PYTHIA and HERWIG results we indeed find that, despite the large changes in the absolute fake rate, the relative performance of the different taggers is preserved. To enable direct comparison with the existing taggers, we recommend that future taggers be tested using these samples.
We can now turn to the relative performance of the taggers. For the 20% working point it is clear that the grooming-based taggers perform strongly, suppressing the background by a factor of 20-100. For the samples we chose, the pruning approach performs best. The ungroomed tagging approaches are more competitive at the 50% working point, which is often at the limit of the applicable range for the grooming-based approaches. The pruning-based approach actually performs worst at this working point.
This seems to reflect the fact that grooming approaches produce a narrow top mass peak, typically containing around 60% of the signal for top jets. To reach an overall efficiency of around 50%, in combination with the $m_{\text{jet}} > 120$ GeV requirement, we must then choose a large mass window. This partly negates the advantages of the grooming approaches and leads to worse relative performance compared to techniques without grooming.

Conclusions
At the LHC, many of the particles that we have considered heavy so far (W and Z bosons, the top quark, the Higgs boson and possible BSM particles in the same mass range) will be produced with a transverse momentum that greatly exceeds their mass. The topologies that form in the decay of such highly boosted particles are expected to play an important role in searches for BSM physics. The BOOST2010 workshop brought together leading theorists and experimentalists in this field of study. In this paper, we present the report from the hadronic working group.
Many groups have studied the use of boosted objects in a range of Standard Model and new physics scenarios, demonstrating that these topologies can increase the experiments' potential in many different areas of the LHC physics programme, from searches for the Higgs boson to the reconstruction of SUSY cascade decays and heavy resonances. We hope that the review section may provide a starting point for people interested in this exciting subject. For experiments to benefit fully from the opportunities offered by boosted objects an extensive set of novel tools is required. We have prepared a number of samples to study the particle-level performance of these tools. We propose these be used as a benchmark for future analyses.
Jet grooming methods like pruning, trimming and filtering are particularly promising and the full deployment of these tools in the LHC experiments should be pursued actively. The comparison of the mass distributions for raw, ungroomed jets and after applying the pruning, filtering or trimming procedure reveals a clear improvement in the mass resolution for composite fat-jets formed in the hadronic decay of boosted top quarks. Consequently, the signal-to-background ratio in a given mass window is greatly improved. The different approaches to jet grooming yield rather similar results for the observables studied.
We have furthermore investigated the sensitivity of jet substructure observables to the Monte Carlo description. This study demonstrates that variations in the parton shower model, the underlying event activity or the detector model can have a non-negligible impact, especially on the jet mass. The result underlines the need for standard benchmark samples. Jet grooming is a very effective means to reduce the dependence of the jet mass on soft unrelated activity (underlying event, pile-up). The splitting scales are found to exhibit a much less pronounced sensitivity to such activity. These first results call for a much more detailed evaluation of the systematics in a realistic experimental environment.
Finally, we have compared different top tagging algorithms in identical conditions, i.e. using common samples, primary jet reconstruction algorithms and performance estimators. We find that boosted hadronic top decays can be tagged with an efficiency of well over 50%, while rejecting jets from QCD background by more than a factor of 20. For 20% efficiency the groomed taggers outperform their ungroomed counterparts, reaching a QCD jet rejection of over a factor of 200.