Boosted objects and jet substructure at the LHC

This report of the BOOST2012 workshop presents the results of four working groups that studied key aspects of jet substructure. We discuss the potential of the description of jet substructure in first-principle QCD calculations and study the accuracy of state-of-the-art Monte Carlo tools. Experimental limitations of the ability to resolve substructure are evaluated, with a focus on the impact of additional proton-proton collisions on jet substructure performance in future LHC operating scenarios. A final section summarizes the lessons learnt during the deployment of substructure analyses in searches for new physics in the production of boosted top quarks.


Introduction
With a centre-of-mass energy of 7 TeV in 2010 and 2011 and of 8 TeV in 2012, the LHC has pushed the energy frontier well into the TeV regime. Another leap in energy is expected with the start of the second phase of operation in 2014, when the centre-of-mass energy is to be increased to 13-14 TeV. For the first time, experiments produce large samples of W and Z bosons and top quarks with a transverse momentum p_T that considerably exceeds their rest mass m (p_T ≫ m). The same is true also for the Higgs boson and, possibly, for as yet unknown particles with masses near the electroweak scale. In this new kinematic regime, well-known particles are observed in unfamiliar ways. Classical reconstruction algorithms that rely on a one-to-one jet-to-parton assignment are often inadequate, in particular for hadronic decays of such boosted objects.
A suite of techniques has been developed to fully exploit the opportunities offered by boosted objects at the LHC. Jets are reconstructed with a much larger radius parameter to capture the energy of the complete (hadronic) decay in a single jet. The internal structure of these fat jets is a key signature to identify boosted objects among the abundant jet production at the LHC. Many searches use a variety of recently proposed substructure observables. Jet grooming techniques improve the resolution of jet substructure measurements, help to reject background, and increase the resilience to the impact of multiple proton-proton interactions.
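As a rough illustration of why a larger radius parameter is needed, the angular separation of the products of a two-body decay is often estimated as ΔR ≈ 2m/p_T. This rule of thumb is not taken from this report; the function name and the example values below are illustrative assumptions, and the estimate is only meaningful for p_T well above m.

```python
def decay_opening_angle(mass_gev, pt_gev):
    """Rule-of-thumb separation dR ~ 2 m / pT of two-body decay products.

    Assumption of this sketch: pT >> m, so the approximation is reasonable.
    """
    if pt_gev <= 0:
        raise ValueError("pT must be positive")
    return 2.0 * mass_gev / pt_gev


# Illustrative values: a W boson (m ~ 80.4 GeV) at two transverse momenta.
for pt in (200.0, 400.0):
    print(f"pT = {pt:.0f} GeV -> dR ~ {decay_opening_angle(80.4, pt):.2f}")
```

The decay products move closer together as p_T grows, so a single jet of fixed radius eventually contains the whole decay.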
In July 2012 IFIC Valencia organized the 2012 edition [4] of the BOOST series of workshops, the main forum for the physics of boosted objects and jet substructure. Working groups formed during the 2010 and 2011 workshops prepared reports [9,10] that provide an overview of the state of the field, an entry point to the now quite extensive literature, and new material prepared by participants. In this paper we present the report of the working groups set up during BOOST2012. Each contribution addresses an important aspect of jet substructure as a tool for the study of boosted objects at the LHC.
A good understanding of jet substructure is a prerequisite to further progress. Predictions of jet substructure based on first-principle, analytical calculations may provide a more precise description of jet substructure and allow deeper insight. However, resummation of the leading logarithms in this case is notoriously difficult and the predictions may be subject to considerable uncertainties. In fact, one might ask:
- Can jet substructure be predicted by first-principle QCD calculations and compared to data in a meaningful way?
The findings of the working group that was set up to evaluate the limitations and potential of the most popular approaches are presented in Section 2. While progress toward analytical predictions continues, searches for boosted objects that employ jet substructure rely on the predictions of mainstream Monte Carlo models. It is therefore vital to answer this question:
- How accurately is jet substructure described by state-of-the-art Monte Carlo tools?
The BOOST2010 report [9] provided a partial answer, based on pre-LHC tunes of several popular leading-order generators. After the valuable experience gained in the first three years of operation of the LHC, it seems appropriate to revisit this question in Section 3.
A further potential limitation to the performance of jet substructure is the level to which the detector response can be understood and modelled. Again, the first years of LHC operation have provided valuable experience on how well different techniques work in a realistic experimental environment. In particular, the impact of multiple proton-proton interactions (pile-up) on substructure measurement has been evaluated exhaustively and mitigation schemes have been developed. Anticipating a sharp increase in the pile-up activity in future operating scenarios of the LHC, one might worry that the detector performance will be degraded considerably for the sensitive substructure analyses. A third working group was therefore given the following charge:
- How does the impact of additional proton-proton collisions limit jet substructure performance at the LHC, now and in future operating scenarios?
Section 4 presents the contributions regarding jet reconstruction performance under extreme conditions, with up to 200 additional proton-proton collisions in each bunch crossing. We present the prospects for fake jet rates and the impact of pile-up on jet mass measurements under these conditions.
In the first years of operation of the LHC several groups in ATLAS and CMS have deployed techniques specifically developed for the study of boosted objects in several analyses. Jet substructure has become an important tool in many searches for evidence for new physics. In Section 5 we present the lessons learnt in several studies of boosted top quark production that have been the first to apply these techniques and answer the following question:
- How powerful is jet substructure in studies of boosted top production, and how can it be made even more powerful?
We hope that the answers to the above questions prepared by the working groups may shed some light on this rapidly evolving field.
The internal structure of jets has traditionally been characterized in jet shape measurements. A detailed introduction to the current theoretical understanding and to the calculations needed for observables that probe jet substructure is provided in last year's BOOST report [10]. Here, rather than give a comprehensive review of the literature relevant to the myriad of developments, we focus on the progress made in the last year in calculations of jet substructure at hadron colliders. Like the Tevatron experiments, ATLAS and CMS have performed measurements of the energy flow within the jet [11,12]. Both collaborations have moreover performed dedicated jet substructure measurements on large-R jets, which are briefly reviewed before we introduce analytical calculations and summarize the status of the two main approaches.

Jet Substructure Measurements by ATLAS
The first measurement of jet mass for large-radius jets (R = 1.0, 1.2) and several substructure observables was performed by ATLAS on data from the 2010 run of the LHC [13]. These early studies also include a first measurement of the jet mass distribution for filtered [1] Cambridge-Aachen jets. A number of further jet shapes were studied with the same data set in Reference [14]. These early studies were crucial to establish the jet substructure response of the experiment and validate the Monte Carlo description of substructure. They are moreover unique, as the impact of pile-up could be trivially avoided by selecting events with a single primary vertex. The results, fully corrected for detector effects, are available for comparison to calculations.
Since then, the ATLAS experiment has performed a direct and systematic comparison of the performance of several grooming algorithms on inclusive jet samples, purified samples of high-p_T W bosons and top quarks, and Monte Carlo simulations of boosted W and top-quark signal samples [15]. The parameters of large-radius (R = 1.0) trimmed [2], pruned [3] and mass-drop filtered jet algorithms were optimized in the context of Standard Model measurements and new physics searches using multiple performance measures, including efficiency and jet mass resolution.
For a subset of the jet algorithms tested, dedicated jet energy scale and mass scale calibrations were derived and systematic uncertainties evaluated for a wide range of jet transverse momenta. Relative systematic uncertainties were obtained by comparing ratios of track-based quantities to calorimeter-based quantities in the data and MC simulation. In situ measurements of the mass of jets containing boosted hadronically decaying W bosons further constrain the jet mass scale uncertainties for this particular class of jets to approximately ±1%.

Jet Substructure Measurements by CMS
The CMS experiment measured jet mass distributions with approximately 5 fb⁻¹ of data at a center-of-mass energy of √s = 7 TeV [16]. The measurements were performed in several p_T bins and for two processes, inclusive jet production and vector boson production in association with jets. For inclusive jet production, the measurement corresponds to the average jet mass of the two highest-p_T jets. In vector boson plus jet (V + jet) production the mass of the jet with the highest p_T was measured. The measurements were performed primarily for jets clustered with the anti-k_t algorithm with distance parameter R = 0.7 (AK7). The masses of ungroomed, filtered, trimmed, and pruned jets are presented in bins of p_T. Additional measurements were performed for anti-k_t jets with smaller and larger radius parameters (R = 0.5, 0.8), after applying pruning [3] and filtering [1] to the jet, and for Cambridge-Aachen jets with R = 0.8 and R = 1.2.
The jet mass distributions are corrected for detector effects and can be compared directly with theoretical calculations or simulation models. The dominant systematic uncertainties are jet energy resolution effects, pileup, and parton shower modeling.
The study finds that, for the grooming parameters examined, the pruning algorithm is the most aggressive grooming algorithm, leading to the largest average reduction of the jet mass with respect to the original jet mass. As a consequence, CMS also finds that, among the grooming algorithms studied, pruning reduces the pileup dependence of the jet mass the most.
The jet mass distributions are compared against different simulation programs: Pythia 6 [17,18] (version 424, tune Z2), Herwig++ [19,20] (version 2.4.2, tune 23), and Pythia 8 (version 145, tune 4C), in the case of inclusive jet production. In general the agreement between simulation and data is reasonable, although Herwig++ appears to have the best agreement with the data for more aggressive grooming algorithms. The V + jet channel appears to have better agreement overall than the inclusive jet production channel, which indicates that quark jets are modeled better in simulation. The largest disagreement with data comes from the low jet mass region, which is more affected by pileup and soft QCD effects.
The jet energy scale and jet mass scale of these algorithms were validated individually. The jet energy scale was investigated in MC simulation, and was found to agree with the ungroomed energy scale within 3%, which is assigned as an additional systematic uncertainty. The jet mass scale was investigated in a sample of boosted W bosons in a semileptonic tt̄ sample. The jet mass scale derived from the mass of the boosted W jet agrees with MC simulation within 1%, which is also assigned as a systematic uncertainty.

Analytical predictions for jet substructure
Next-to-leading order (NLO) calculations in the strong coupling constant have been performed for multi-jet production, even in association with an electroweak boson. This means that substructure observables, such as the jet mass, can be computed to NLO accuracy using publicly available codes [21,22]. However, whenever multiple scales, e.g. a jet's transverse momentum and its mass, are involved in a measurement, the prediction of the observables will contain logarithms of ratios of these scales at each order in perturbation theory. These logarithms are so important for jet shapes that they qualitatively change the shapes as compared to fixed order. Resummation yields a more efficient organization of the perturbative expansion than traditional fixed-order perturbation theory. Accurate calculations of jet shapes are impossible without resummation. In general one can moreover interpolate between, or merge, the resummed and fixed-order result.
In resummation techniques the perturbative expansion of cross-sections for generic observables v is schematically organized in the form of Eq. (1), where σ_0^(δ) is the corresponding Born cross-section and L = ln v is a logarithm of the observable in question.
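For orientation, resummed cross-sections of this kind are commonly written in the exponentiated form below. This is a sketch of the standard form found in the resummation literature, assumed here to correspond to Eq. (1); the sum over channels δ and the exact arguments of the functions g_i^(δ) are assumptions of this sketch rather than a transcription of the original equation.

```latex
\sigma(v) = \sum_{\delta} \sigma_0^{(\delta)} \, g_0^{(\delta)}(\alpha_s)\,
\exp\!\left[ L\, g_1^{(\delta)}(\alpha_s L) + g_2^{(\delta)}(\alpha_s L)
+ \alpha_s\, g_3^{(\delta)}(\alpha_s L) + \dots \right],
\qquad L = \ln v .
```

In this form each additional function g_i^(δ) in the exponent is suppressed by one power of α_s relative to the previous one when α_s L is counted as order one.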
The notation used in traditional fixed-order perturbation theory refers to the lowest-order calculation as leading order (LO) and higher-order calculations as next-to-leading order (NLO), next-to-next-to-leading order (NNLO), and so on (with N^nLO referring to the O(α_s^n) correction to the LO result). When organized instead in resummed perturbation theory as in Eq. (1), the lowest order, in which only the function g_1^(δ) is retained, is referred to as the leading-log (LL) approximation. Similarly, the inclusion of all g_i^(δ) with 1 ≤ i ≤ k + 1 and of g_0 up to order α_s^(k−1) gives the next^k-to-leading log (N^kLL) approximation to ln σ; this corresponds to the resummation of all the contributions of the form α_s^n ln^m N with 2(n − k) + 1 ≤ m ≤ 2n in the cross section σ. This can be extended to 2(n − k) ≤ m ≤ 2n by also including the order α_s^k contribution to g_0^(δ).
Typical Monte Carlo event generators such as Pythia, Herwig++ and Sherpa [23] are correct at LL. NLL accuracy has also been achieved for some specific observables, but it is difficult to say whether this can be generally obtained. Analytic calculations provide a way of obtaining precise calculations for jet substructure. Multiple observables have been resummed (most often at least to NLL but not uncommonly to NNLL and as high as NNNLL accuracy for a few cases) and others are actively being studied and calculated in the theory community.
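The counting of logarithmic powers quoted above can be made concrete with a few lines of code. The sketch below simply enumerates the range 2(n − k) + 1 ≤ m ≤ 2n (optionally extended down to 2(n − k)); the function name and the choice to drop negative powers are assumptions made for illustration.

```python
def captured_log_powers(n, k, include_g0_order_k=False):
    """Powers m of the logarithm at order alpha_s^n resummed at N^k LL.

    Implements the range quoted in the text: 2(n-k)+1 <= m <= 2n, extended
    down to 2(n-k) when the order-alpha_s^k term of g_0 is also included.
    Negative powers are dropped (an assumption of this sketch).
    """
    lower = 2 * (n - k) + (0 if include_g0_order_k else 1)
    return list(range(max(lower, 0), 2 * n + 1))


# Order alpha_s^2 at NLL (k = 1), without and with the extra g_0 term.
print(captured_log_powers(2, 1))
print(captured_log_powers(2, 1, include_g0_order_k=True))
```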
Often for observables of experimental interest, non-global logarithms (NGLs) arise [24], in particular whenever a hard boundary in phase-space is present (such as a rapidity cut or a geometrical jet boundary). These effects enter at NLL level and therefore modify the structure of the function g_2^(δ) in Eq. (1). Until very recently [25], the resummation of NGLs was confined to the limit of large number of colours N_C [24,26,27].
Moreover, we should stress that another class of contributions, usually referred to as clustering logarithms, affects the g_2^(δ) series of Eq. (1) if an algorithm other than anti-k_t is used to define the jets [28,29]. The analytic structure of these clustering effects has been recently explored in Refs. [30,31] for the case of the Cambridge-Aachen and k_t algorithms.
Furthermore, recent studies have shown that strict collinear factorization is violated if the observable considered is not sufficiently inclusive [32,33]. As a consequence, coherence-violating (or super-leading) logarithms appear, which further complicate the resummation of certain observables. These contributions affect, for example, non-global dijet observables [34,35] but also some classes of global event shapes [36].
Of course, to fully compare to data one needs to incorporate the effects of hadronization and multiple parton interactions (MPI). Progress on this front has also been made, both in purely analytical approaches (especially for hadronization effects [37]) and in interfacing analytical results with parton showers that incorporate these effects.
The two main active approaches to resummation are referred to as traditional perturbative QCD resummation (pQCD) and Soft Collinear Effective Theory (SCET). They describe the same physical effects, which are captured by Eqs. (1) and (2). However, the techniques employed in the pQCD and SCET approaches often differ. Calculations in pQCD exploit factorization and exponentiation properties of QCD matrix elements and of the phase-space associated with the observable at hand, in the soft or collinear limits. The SCET approach is based on factorization at the operator level and exploits the renormalization group to resum the logarithms. The two approaches also adopt different philosophies for the treatment of NGLs. A more detailed description of these differences is given in the next Sections.

Resummation in pQCD
Jet mass was calculated in pQCD in Ref. [39]. A more extensive study can be found in Ref. [40], where the jet mass distributions for Z+jet and inclusive jet production, with jets defined with the anti-k_t algorithm, were calculated at NLL accuracy and matched to LO. In particular, for the Z+jet case, the jet mass distribution of the highest-p_T jet was calculated, whereas for inclusive jet production, essentially the average of the jet mass distributions of the two highest-p_T jets was calculated. For the Z+jet case, one has to consider soft wide-angle emissions from a three hard parton ensemble, consisting of the incoming partons and the outgoing hard parton. For three or fewer partons, the colour structure is trivial. Dijet production, on the other hand, involves an ensemble of four hard partons and the consequent soft wide-angle radiation has a non-trivial colour matrix structure. The rank of these matrices grows quickly with the number of hard partons, making the calculations for multi-jet final states a formidable challenge⁵.
The jet mass is a non-global observable and NGLs of m_J/p_T for jets with transverse momentum p_T are induced. Their effect was approximated using an analytic formula with coefficients fit to a Monte Carlo simulation valid in the large N_C limit, obtained by means of a dipole evolution code [24]. It was found that in inclusive calculations⁶ the effects of both the soft wide-angle radiation and the NGLs, both of which affect the g_2^(δ) series in Eq. (1), play a relevant role even at relatively small values of jet radius such as R = 0.6 and hence in general cannot be neglected.
A restriction on the number of additional jets could be implemented, for instance, by vetoing additional jets with p_T > p_T^cut. The presence of a jet veto modifies the calculation in several ways. First of all, it affects the argument of the non-global logarithms: ln^n(m_J^2/p_T^2) → ln^n(m_J^2/(p_T p_T^cut)). Thus p_T^cut could in principle be used to tame the effect of NGLs. However, if the veto scale is chosen such that p_T^cut ≪ p_T, logarithms of this ratio must also be resummed. Depending on the specific details of the definition of the observable, this further resummation can be affected by a new class of NGLs [45,46].
⁵ The colour structure of soft gluon resummation in a multi-jet environment has been studied in [41,42] and resummed calculations for the case of five hard partons in the context of jet production with a central jet veto can be found in Refs. [34,35,43,44].
⁶ We refer to inclusive calculations if no requirements were made on the number of additional jets in the selection of the event.
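The trade-off introduced by a jet veto can be checked numerically: the veto shrinks the argument of the non-global logarithm, but the difference reappears as a logarithm of p_T^cut/p_T. The sketch below evaluates both for hypothetical values of the jet mass, jet p_T and veto scale; the function name and the numbers are illustrative assumptions.

```python
import math


def ngl_log(m_jet, pt_jet, pt_cut=None):
    """ln(m^2/pT^2) without a veto, ln(m^2/(pT * pT_cut)) with a veto."""
    if pt_cut is None:
        return math.log(m_jet ** 2 / pt_jet ** 2)
    return math.log(m_jet ** 2 / (pt_jet * pt_cut))


# Hypothetical kinematics in GeV, chosen only for illustration.
m_jet, pt_jet, pt_cut = 50.0, 500.0, 30.0

inclusive = ngl_log(m_jet, pt_jet)
vetoed = ngl_log(m_jet, pt_jet, pt_cut)
veto_log = math.log(pt_cut / pt_jet)  # the new logarithm that appears

print(f"inclusive: {inclusive:.2f}, with veto: {vetoed:.2f}, ln(cut/pT): {veto_log:.2f}")
```

The two forms differ exactly by ln(p_T^cut/p_T), which is why a veto scale far below the jet p_T replaces one large logarithm with another.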
An obstacle to inclusive predictions in the number of jets is that the constant term g_0^(δ) in Eq. (1) receives contributions from higher jet topologies that are not related to any Born configurations. For instance, the jet mass in the Z+jet process would receive contributions from Z+2jet configurations, which are clearly absent in the exclusive case. The full determination of the constant term to O(α_s) and the matching to NLO is ongoing.

Resummation in SCET
There have been several recent papers in SCET directly related to substructure in hadron collisions. Ref. [47] discusses the resummation of jet mass by expanding around the threshold limit, where (nearly) all of the energy goes into the final state jets. Expanding around the threshold limit has proven effective for other observables, see Ref. [48] and references in Ref. [47]. The large logarithms for jet mass are mainly due to collinear emission within the jet and soft emission from the recoiling jet and the beam. These same logarithms are present near threshold and the threshold limit automatically prevents additional jets from being relevant, simplifying the calculation. The study in Ref. [47] performs resummation at the NNLL level, but does not include NGLs. Instead, their effect is estimated and found to be subdominant in the peak region, where other effects, such as nonperturbative corrections, are comparable. Thus NGLs could be safely ignored where the calculation was most accurate.
An alternative approach using SCET is found in Ref. [49]. Beam functions are used to contain the collinear radiation from the beam remnants. The jet mass distribution in Higgs+1jet events is studied via the factorization formula for 1-jettiness, that is calculated to NNLL accuracy. Using 1-jettiness, the jet boundaries are defined by the distance measure used in 1-jettiness itself, instead of a more commonly employed jet algorithm, although generalizations to arbitrary jet algorithms are possible.
For a single jet in hadron collisions, 1-jettiness can be used as a means to separate the in-jet and out-of-jet radiation (for a review, see the BOOST2011 report [10]). The observable studied in Ref. [49] is separately differential in the jet mass and the beam thrust. The in-jet component is related to the jet mass, and can be converted directly up to corrections that become negligible for higher p_T (up to about 3% for p_T = 300 GeV in the peak of the distribution of the in-jet contribution to 1-jettiness, which is smaller than the NNLL uncertainties). The beam thrust is a measure of the out-of-jet contributions, equivalent to a rapidity-weighted veto scale p^cut on extra jets. The calculation can be made exclusive in the number of jets by making the out-of-jet contributions small. Where Ref. [47] ensures a fixed number of jets by expanding around the threshold limit, Ref. [49] includes an explicit jet veto scale.
Exclusive calculations in the number of jets avoid some of the issues mentioned in Sec. 2.4. An important property of 1-jettiness is that, when considering the sum of the in- and out-of-jet contributions, no NGLs are present, and when considering these contributions separately, only the ratio p^cut/m_J of these two scales is non-global. A smart choice of the veto scale may then minimize the NGLs and make their resummation unnecessary. This corresponds to the NGLs discussed in Sec. 2.4 that are induced in going from the inclusive to the exclusive case. These are the only NGLs present; the additional NGLs of the ratio of the measured jet p_T to the jet mass, discussed for the observable of Sec. 2.4, are absent in this case. By using an exclusive observable, with an explicit veto scale, NGLs are controlled. For comparison with inclusive jet mass measurements, such as those discussed in Sections 2.1 and 2.2, the uncertainty associated with the veto scale can be estimated in a similar fashion as the NGL estimate in Ref. [47].
It was argued in Ref. [49] that the NGLs induced by imposing a veto on both the p_T and jet mass are smaller than the resummable logarithms of the measured jets over a range of veto scales. In contrast, in the inclusive case the corresponding p_T value that appears in the NGLs is of the order of the measured jet p_T (since all values less than this are allowed), making it a large scale and the NGLs as large as other logarithms. For a fixed veto cut, it was argued that the effect of these NGLs (at least of those that enter at the first non-trivial order, O(α_s^2)) can be considered small enough to justify avoiding resummation for a calculation up to NNLL accuracy for 1/√8 < m_J^cut/p^cut < √8 (cf. Ref. [53]) in the peak region where a majority of events lie. It is also worth noting that normalizing the distribution by the total rate up to a maximum m_J^cut and p^cut has several advantages: in particular, the normalized distribution has a smaller perturbative uncertainty than the unnormalized one, in addition to smaller experimental uncertainties.
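The window quoted above is easy to test for a given pair of cuts. The helper below is a minimal sketch; its name and the example values are hypothetical, and it encodes only the numerical range 1/√8 < m_J^cut/p^cut < √8, not the underlying argument of Ref. [49].

```python
import math


def inside_ngl_window(m_cut, p_cut):
    """True if 1/sqrt(8) < m_cut / p_cut < sqrt(8), the range quoted in the text."""
    ratio = m_cut / p_cut
    return 1.0 / math.sqrt(8.0) < ratio < math.sqrt(8.0)


# Hypothetical cut values in GeV.
for m_cut, p_cut in [(100.0, 100.0), (300.0, 100.0), (30.0, 100.0)]:
    print(m_cut, p_cut, inside_ngl_window(m_cut, p_cut))
```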
We also note that while jet mass is now the best-understood substructure observable, it is also clearly much simpler than the more complicated techniques often employed by experimentalists in boosted studies. There has also been progress in understanding more complicated measurements using SCET, and in particular a calculation of the signal distribution in H → bb̄ was performed in Ref. [54]. While it is probably fair to say that our theoretical understanding (or at least the numerical accuracy) of such measurements is currently not at the same level as that of the jet mass, this is a nice demonstration that reasonably accurate calculations of realistic substructure measurements can be performed with the current technologies and that it is not unreasonable to expect related studies in the near future.

Discussion and recommendations for further substructure measurements
We have presented a status report for the two main approaches to the resummation of jet substructure observables, with a focus on their potential to predict the jet invariant mass at hadron colliders. In both approaches recent work has shown important progress. We hope that providing predictions beyond the accuracy of parton showers may help both discovery and measurement. Beyond the scope of improving our understanding of QCD, gaining intuition for which treatments work best is an important step towards adopting such predictions as an alternative to parton showers. Non-perturbative corrections like hadronization are more complicated at the LHC due to the increased colour correlations. Entirely new perturbative and semi-perturbative effects such as multiple parton interactions appear. Monte Carlo simulations suggest that these have a significant impact.
The treatments of non-perturbative corrections and NGLs are often different in pQCD and SCET and this leads to slight differences in which measurements are best suited for comparison to predictions. The first target for the next year should be a phenomenological study of the jet mass distribution in Z+jet, for which we encourage ATLAS and CMS measurements. Ideally, since the QCD and SCET literature have emphasized a difference in preference for inclusive or exclusive measurements (in the number of jets), both should be measured to help our understanding of the two techniques.
The importance of boosted-object taggers in searches for new physics will increase strongly in the near future in view of the higher-energy and higher-luminosity LHC runs. However, the theoretical understanding of these tools is in its infancy. Analytic calculations must be performed in order to understand the properties of the different taggers and establish which theoretical approaches (MC, resummation or even fixed order) are needed to accurately compute these kinds of observables.

Monte Carlo Generators for Jet Substructure Observables
Section prepared by the Working Group: 'Monte Carlo predictions for jet substructure', A. Arce, D. Bjergaard, A. Buckley, M. Campanelli, D. Kar, K. Nordstrom.
In order to use boosted objects and substructure techniques for measurements and searches, it is important that Monte Carlo generators describe the jet substructure with reasonable precision, and that variations due to the choice of parton shower models and their parameters are characterized and understood. We study jet mass, before and after several jet grooming procedures, a number of popular jet substructure observables, colour flow and jet charge. For each of these we compare the predictions of several parton shower and hadronisation codes, not only in signal-like topologies, but also in background or calibration samples.

Monte Carlo samples and tools
Three processes in pp collisions are considered at √s = 7 TeV: semileptonic tt̄ decays, boosted semileptonic tt̄ decays, and (W± → μν)+jets. These processes provide massive jets coming from hadronic decays of a colour-neutral boson as well as jets from heavy and light quarks.
Like Z+jets, the (W → µν)+jets process provides a well-understood source of quarks and gluons, and additionally allows an experimentally accessible identification ("away-side-tag") of the charge of the leading jet.
Assuming that the charge of this jet is opposite to the muon's charge leads to the same charge assignment as a conventional parton matching scheme in approximately 70% of simulated events in leading order Monte Carlo simulation; in the remaining 30% of cases, the recoiling jet matches a (charge-neutral) gluon.
The selection of t, W±, and quark jet candidates for the distributions compared below includes event topologies that can be realistically collected in the LHC experiments, with typical background rejection cuts, so that these studies, based on simulation, could be reproduced using LHC data.
The most commonly used leading order (LO) Monte Carlo simulation codes are the Pythia and Herwig families. Here, predictions from the Perugia 2011 [58] tune with CTEQ5L [59] [65] is also included in comparisons. The Pythia6 generator with the Perugia 2011 tune is taken as a reference in all comparisons. For each generator, tune and process, 1 million proton-proton events at √s = 7 TeV are produced. The analysis relies on the FastJet 3.0.3 package [66,67] and the Rivet analysis framework [68]. All analysis routines are available on the conference web page [69].
In the boosted semileptonic tt̄ analysis, large-radius jets were formed using the anti-k_t algorithm [70] with a radius parameter of 1.2 using all stable particles within pseudorapidity |η| < 4. The jets are selected if they passed the following cuts: p_T^jet > 350 GeV, 140 GeV < m_jet < 250 GeV. Only the leading and subleading jets were selected if more than two jets passed the cuts. The subjets were formed using the Cambridge-Aachen algorithm [71,72] with radius 0.3.
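The jet selection described above can be sketched as a simple filter. This is not the Rivet routine used in the analysis: jets are represented here as plain dictionaries with hypothetical keys "pt" and "mass" (in GeV), and the jet clustering itself is assumed to have been done beforehand.

```python
def select_top_candidates(jets, pt_min=350.0, m_min=140.0, m_max=250.0):
    """Apply the pT and mass window cuts, then keep at most the two hardest jets.

    jets: list of dicts with keys "pt" and "mass" in GeV (hypothetical layout).
    """
    passing = [j for j in jets if j["pt"] > pt_min and m_min < j["mass"] < m_max]
    passing.sort(key=lambda j: j["pt"], reverse=True)
    return passing[:2]


# Illustrative event with five large-radius jets.
event = [
    {"pt": 500.0, "mass": 170.0},
    {"pt": 400.0, "mass": 100.0},  # fails the mass window
    {"pt": 380.0, "mass": 180.0},
    {"pt": 360.0, "mass": 200.0},  # passes, but is the third passing jet
    {"pt": 300.0, "mass": 175.0},  # fails the pT cut
]
selected = select_top_candidates(event)
print([j["pt"] for j in selected])
```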

Jet mass
The jet mass distribution for the leading jet in the boosted semileptonic tt̄ sample is shown in Fig. 1. The parton shower models in Pythia6, Pythia8, Herwig++ and Sherpa yield significantly different predictions. Important differences are observed in the location and shape of the top quark mass peak. The largest deviations of the normalized cross section in a given jet mass bin amount to approximately 20%. Much better agreement is obtained for predictions with different tunes of a single generator.
The effect of different grooming techniques on jet mass is also shown in Fig. 1. For filtering, the three hardest subjets with R_sub = 0.3 are used. The trimming uses all subjets over 3% of p_T^jet and R_sub = 0.3. For pruning, z = 0.1 and D = m_jet/p_T^jet are used. As expected, a much narrower top quark mass peak is obtained, with a particularly strong reduction of the high-mass tail. The grooming procedure improves the agreement among the different Monte Carlo tools, as expected from previous Monte Carlo studies with a more limited set of generators [9] and comparison with data [13].
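The trimming step with the 3% threshold quoted above can be illustrated on a toy jet. The sketch below assumes the subjets have already been found (with R_sub = 0.3 in the analysis) and are given as four-vectors (px, py, pz, E); the helper names and the example momenta are assumptions, and the analysis itself relies on FastJet rather than on code like this.

```python
import math


def pt(p):
    """Transverse momentum of a four-vector (px, py, pz, E)."""
    return math.hypot(p[0], p[1])


def mass(p):
    """Invariant mass; a negative m^2 from rounding is clipped to zero."""
    m2 = p[3] ** 2 - p[0] ** 2 - p[1] ** 2 - p[2] ** 2
    return math.sqrt(max(m2, 0.0))


def add(vectors):
    """Component-wise sum of four-vectors."""
    return tuple(sum(c) for c in zip(*vectors))


def trim(subjets, f_cut=0.03):
    """Keep subjets whose pT exceeds f_cut times the pT of the ungroomed jet."""
    jet_pt = pt(add(subjets))
    return [s for s in subjets if pt(s) > f_cut * jet_pt]


def massless(px, py, pz):
    """Build a massless four-vector from its three-momentum."""
    return (px, py, pz, math.sqrt(px * px + py * py + pz * pz))


# Toy jet: two hard subjets and one soft subjet (GeV).
subjets = [massless(200.0, 0.0, 0.0), massless(150.0, 50.0, 0.0), massless(5.0, -3.0, 0.0)]
kept = trim(subjets)
print(len(kept), round(mass(add(subjets)), 1), round(mass(add(kept)), 1))
```

Removing the soft subjet lowers the jet mass, which is the behaviour behind the narrower peak and the reduced high-mass tail described in the text.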

Jet substructure observables
We investigate the spread among generators for a number of other substructure observables on the market:
-The Angular Correlation Function [73] measures the ∆R scale of a jet's radiation. It is defined as:
$$G(R) = \frac{\sum_{i \neq j} p_{T,i}\, p_{T,j}\, \Delta R_{ij}^{2}\, \Theta(R - \Delta R_{ij})}{\sum_{i \neq j} p_{T,i}\, p_{T,j}\, \Delta R_{ij}^{2}}$$
where the sum runs over all pairs of particles in the jet, and Θ(x) is the Heaviside step function. The Angular Structure Function is defined as the following derivative:
$$\Delta G(R) = \frac{d \log G(R)}{d \log R}$$
Peaks in ∆G(R) 11 can then be found which correspond to ∆R scales with excess radiation in the jet. The variable r 1 * is the point in the ∆R spectrum at which the first peak in the angular structure function appears, and n p is the total number of peaks in the jet's angular structure function. The prominence h of the highest peak is defined as its height. The prominence of any lower peak is defined as the minimum vertical descent that is required in descending from that peak before ascending a higher, neighboring peak. A prominence of h > 4 for peaks in the angular structure function is required and the partial mass and ∆R scale of the most prominent peaks are retained.
11 In this analysis the derivatives are smoothed using a Gaussian in the numerator and an error function in the denominator, both with σ = 0.06.
-N-subjettiness [74,75] measures how much of a jet's radiation is aligned along N subjet axes in the y − φ plane. It is defined as:
$$\tau_N = \frac{\sum_{k} p_{T,k}\, \min\left(\Delta R_{1,k}^{\beta}, \ldots, \Delta R_{N,k}^{\beta}\right)}{\sum_{k} p_{T,k}\, R_{jet}^{\beta}}$$
where ∆R n,k is the distance from k to the nth subjet axis in the y − φ plane, R jet is the radius used for clustering the original jet, and β is an angular weighting exponent 12 .
12 To improve the performance of N-subjettiness it is possible to use a k-means clustering algorithm to find (locally) optimal locations for the subjet axes. In this analysis β = 1 is used to find the subjet axes by reclustering with the kt algorithm. The k-means clustering algorithm is run once, as with this angular weighting exponent it finds a local minimum immediately. No attempt is made to find the global minimum.
-Angularity [76] introduces an adjustable parameter a that interpolates between the well-known event shapes thrust and jet broadening. Jet angularity is an IRC safe variable (for a < 2) that can be used to separate multijet background from jets containing boosted objects [77]. It is defined as:
$$\tau_a = \frac{1}{m_{jet}} \sum_{i \in jet} \omega_i\, \sin^{a}\theta_i\, \left(1 - \cos\theta_i\right)^{1-a}$$
where ω i is the energy of a constituent of the jet and θ i is its angle with respect to the jet axis.
-Eccentricity is computed from v max and v min , the maximum and minimum values of the variances of jet constituents along the principal and minor axes.
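As an illustration of the N-subjettiness definition above, the following minimal Python sketch evaluates τ N for a list of constituents and a set of candidate subjet axes. It assumes the axes are supplied by the caller (the kt reclustering and the k-means refinement are not reproduced), and the function names and toy inputs are hypothetical.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Distance in the y-phi plane, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

def n_subjettiness(constituents, axes, r_jet, beta=1.0):
    """tau_N for constituents [(pt, y, phi), ...] and N axes [(y, phi), ...]."""
    numerator = sum(
        pt * min(delta_r(y, phi, ay, aphi) for ay, aphi in axes) ** beta
        for pt, y, phi in constituents
    )
    denominator = sum(pt for pt, _, _ in constituents) * r_jet ** beta
    return numerator / denominator

# Toy two-prong jet: two equally hard constituents separated by 0.4 in rapidity.
toy_jet = [(50.0, 0.0, 0.0), (50.0, 0.4, 0.0)]
tau1 = n_subjettiness(toy_jet, [(0.2, 0.0)], r_jet=1.0)
tau2 = n_subjettiness(toy_jet, [(0.0, 0.0), (0.4, 0.0)], r_jet=1.0)
print(tau1, tau2)
```

For this toy jet τ 2 vanishes while τ 1 does not, which is the behaviour that ratios such as τ 2/1 or τ 3/2 exploit.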
Most models predict very similar behavior for angularity, eccentricity and the ∆R scale of the peak in the n p = 1 bin for the angular structure function. Deviations are typically below 10% for these observables. The harder jet mass distribution in SHERPA and the softer spectrum in Pythia8 are reflected in the edges of the τ 3/2 distribution.

Colour flow
Colour flow observables offer a complementary way to probe boosted event topologies. Pull [79] is a p T -weighted vector in η − φ space that is constructed so as to point from a given jet to its colour-connected partner(s). The pull is measured with respect to the other W ± daughter jet. The W boson is selected kinematically in 4-jet events with 2 b-quarks, and flavors are labelled using the highest p T cone. In Fig. 3, the top left plot shows this variable for a background-like distribution. The comparisons demonstrate that Herwig produces a different colour flow structure.
Dipolarity [80] can distinguish whether a pair of subjets arises from a colour singlet source. In the top right plot of Fig. 3, the dipolarity predictions are seen to be similar for all models considered.

Jet charge
Jet charge [81,82,83] is constructed in an attempt to associate a jet-based observable to the charge of the originating hard parton. The p T -weighted jet charge, $Q_j = \frac{1}{p_{T,j}^{\kappa}} \sum_{i \in j} q_i\, (p_{T,i})^{\kappa}$ with q i the charge of constituent i and κ a weighting exponent, is shown in Fig. 3, using anti-kt 0.6 jets. The comparison displays the most relevant distributions for typical quark tagging and boson tagging analyses. Different MC models are seen to have very similar predictions for this observable too.

Summary
We have prepared the Rivet routines to evaluate the predictions of Monte Carlo generators for the internal structure of large area jets. The normalized predictions from several mainstream Monte Carlo models are compared. Several aspects of jet substructure are evaluated, from basic jet invariant mass to colour flow observables and jet charge.
We find that for jet mass large variations are observed between the various MC models. However, for groomed jets the deviations between different model predictions are smaller. The differences between several recent tunes of the Pythia generator are much smaller. The MC model predictions are similar for N-subjettiness, angularity and eccentricity. The Herwig++ model gives different predictions from the other models for colour flow observables; since the implementation of colour connection in Herwig++ is very recent, this comparison may lead to improvement of the model.

The impact of multiple proton-proton collisions on jet reconstruction
Section prepared by the Working Group: 'Jet substructure performance at high luminosity', P. Loch, D. Miller, K. Mishra, P. Nef, A. Schwartzman, G. Soyez.
The first LHC analyses exploring the experimental response to jet substructure demonstrated that the highly granular ATLAS and CMS detectors can yield excellent performance. They also confirmed the susceptibility of the invariant mass of large-area jets to the energy flow from the additional proton-proton interactions that occur each bunch crossing. And, finally, they provided a first hint that jet grooming could be a powerful tool to mitigate the impact of pile-up. Since then, the LHC collaborations have gained extensive experience in techniques to correct for the impact of pile-up on jets. In this Section these tools are deployed in an extreme pile-up environment. We simulate pile-up levels as high as µ = 200, such as may be expected in a future high-luminosity phase of the LHC. We evaluate the impact on jet reconstruction, with a focus on the (substructure) performance.

Pile-up
Each LHC bunch crossing gives rise to a number of proton-proton collisions and typically the hard scattering (signal) interaction is accompanied by several additional pile-up proton-proton collisions. The total proton-proton cross-section is about σ tot = 98 mb (inelastic σ inel = 72.9 mb) at √ s = 7 TeV [84], and even slightly higher at √ s = 8 TeV. Pile-up manifests itself mostly in additional hadronic transverse momentum flow, which is generated by overlaid and statistically independent, predominantly soft proton-proton collisions that we refer to as "minimum bias" (MB). This diffuse transverse energy emission interferes with the signal of hard scattering final state objects like particles and particle jets, and typically requires corrections, in particular for particle jets. In addition, it can generate particle jets (pile-up jets) either from any of the individual MB collisions (QCD jets), or by stochastically forming jets in the high density particle flow generated by the multitude of them (stochastic jets).

Monte Carlo event generation
We model the pile-up with MB collisions at √ s = 8 TeV and a bunch spacing of 50 ns, generated with the Pythia Monte Carlo (MC) generator [85,86], with its 4C tune [61]. All inelastic, single diffractive, and double diffractive processes are included, with the default fractions as provided by Pythia (tune 4C).
Overall 100 × 10^6 MB events are available for pile-up simulation. The corresponding data are generated in samples of 25000 MB collisions, with the largest possible statistical independence between samples, including new random seeds for each sample. To model pile-up for each signal interaction, the stable particles generated in a number µ of MB collisions, with µ being sampled from a Poisson distribution around the chosen ⟨µ⟩, are added to the final state stable particles from the signal. This is done dynamically by an event builder in the analysis software, and is thus not part of the signal or MB event production. All analysis is then performed on the merged list of stable particles to model one full collision event at the LHC.
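The event-building step described above can be sketched as follows. This is a simplified illustration, not the event builder used in the study: the function names are hypothetical, particles are opaque objects, and MB collisions are drawn with replacement from a small in-memory pool, which is weaker than the sample independence described in the text.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def build_event(signal_particles, mb_pool, mean_mu, rng):
    """Merge the signal with mu minimum-bias collisions, mu ~ Poisson(mean_mu)."""
    mu = sample_poisson(mean_mu, rng)
    merged = list(signal_particles)
    for _ in range(mu):
        # Each pool entry is the stable-particle list of one MB collision.
        merged.extend(rng.choice(mb_pool))
    return mu, merged

rng = random.Random(2012)
mu, event = build_event(["signal"], [["a", "b"], ["c"]], mean_mu=30, rng=rng)
print(mu, len(event))
```

All subsequent jet finding would then run on the merged list, so that the signal and pile-up particles are treated identically.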
The example signal chosen for the Monte Carlo simulation based studies presented in this Section is the decay of a possible heavy Z′ boson with a chosen mass M Z′ = 1.5 TeV to a (boosted) top quark pair, at √ s = 8 TeV. The top and anti-top quarks then decay fully hadronically. The Pythia generator [85,86] is used to generate the signal samples. The soft physics modeling parameters in both cases are from the pre-LHC-data tune 4C [61]. The pile-up is simulated by overlaying generated minimum bias proton-proton interactions at √ s = 8 TeV using Poisson distributions with averages ⟨µ⟩ = {30, 60, 100, 200}, respectively, thus focusing on the exploration of future high intensity scenarios at the LHC.
All analysis utilizes the tools available in the FastJet [66] package for jet finding and jet substructure analysis. The large jets used to analyze the final state are reconstructed with the anti-k T algorithm [70] with R = 1.0, to ensure that most of the final state top-quark decays can be collected into one jet. This corresponds to top quarks generated with p T ≳ 400 GeV. The configurations for jet grooming are discussed in Section 4.6.

Investigating jets from pile-up
Stable particles emerging from the simulated proton-proton collisions are clustered into anti-k T jets [70] with a radial distance parameter R = 0.4, using the FastJet [66] implementation:
-Truth jets are obtained by clustering all stable particles from a given individual MB interaction. For an event containing µ pile-up interactions, jet finding is therefore executed µ times. The resulting truth jets are required to have p T ≥ 5 GeV.
-Pile-up jets are obtained by clustering the stable particles from all MB interactions forming the pile-up event. They are subjected to the kinematic cuts described below.
Jets with rapidity |y| < 2 are accepted. The contribution of pile-up to jets can be corrected using the jet area based method in Ref. [87]. It employs the median transverse momentum density ρ, which here is determined using k T jets with R = 0.4 within |y| < 2.
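The area-based correction can be illustrated with a few lines of Python. This is a sketch under simplifying assumptions: the k T jets used as patches are taken as already clustered and are represented only by their transverse momentum and area, and the function names are hypothetical rather than part of FastJet.

```python
from statistics import median

def median_rho(patches):
    """rho = median of pT / area over the patches [(pt, area), ...]."""
    return median(pt / area for pt, area in patches)

def corrected_pt(pt, area, rho):
    """Area-based pile-up correction: pT_corr = pT - rho * A."""
    return pt - rho * area

# Three toy patches; the hard one (80 GeV per unit area) does not bias the median.
patches = [(10.0, 0.5), (12.0, 0.5), (40.0, 0.5)]
rho = median_rho(patches)
print(rho, corrected_pt(50.0, 0.5, rho))
```

Using the median rather than the mean keeps the density estimate insensitive to the few patches that contain hard jets.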
To evaluate the effect of this correction, the transverse momentum ratio R p T is introduced as
$$R_{p_T} = \frac{p_T^{match}}{p_T - \rho A}$$
Here A is the catchment area [67] of the pile-up jet, p T is its uncorrected transverse momentum, and p match T is the matching truth jet p T . The denominator is the corrected transverse momentum, p corr T = p T − ρA. The matching criterion is similar to the one suggested in Ref. [88], where the truth jet matching uses the constituents shared between the truth jet and the pile-up jet. The jets are considered matched if the constituents of the truth jet that are also contained in the pile-up jet contribute at least 50% of the truth jet p T . In the following, pile-up jets are only considered if their corrected transverse momentum is p corr T ≥ 20 GeV, and they are matched to at least one truth jet. The contribution of particles from any vertex to a given pile-up jet can be measured using the jet vertex fraction (F jvf ). It is defined as
$$F_{jvf}(V_i) = \frac{\sum_{k=1}^{N_{part}(V_i)} p_{T,k}(V_i)}{\sum_{j=1}^{N_{coll}} \sum_{k=1}^{N_{part}(V_j)} p_{T,k}(V_j)}$$
where N part (V i ) is the number of particles from a given vertex V i , and N coll is the number of collision vertices contributing particles to the jet. F jvf is calculated for each of these vertices. Note that p T here corresponds to the uncorrected transverse momentum and, consequently, the value of each component F jvf (V i ) depends on µ.
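The matching criterion, the ratio R p T and the jet vertex fraction can be sketched as below. The data layout (dictionaries with `id`, `pt` and `vertex` keys) and all function names are hypothetical, and two simplifications are assumed: the truth-jet p T is approximated by the scalar sum of its constituents, and F jvf is normalised to the scalar sum of the constituent p T of the pile-up jet.

```python
def shared_pt_fraction(truth_jet, pileup_jet):
    """Fraction of the truth-jet scalar pT carried by constituents also in the pile-up jet."""
    pileup_ids = {p["id"] for p in pileup_jet}
    total = sum(p["pt"] for p in truth_jet)
    shared = sum(p["pt"] for p in truth_jet if p["id"] in pileup_ids)
    return shared / total

def is_matched(truth_jet, pileup_jet, min_fraction=0.5):
    """Matching criterion: shared constituents carry at least 50% of the truth-jet pT."""
    return shared_pt_fraction(truth_jet, pileup_jet) >= min_fraction

def r_pt(pt_match, pt_jet, rho, area):
    """R_pT = pT(matched truth jet) / (pT - rho * A)."""
    return pt_match / (pt_jet - rho * area)

def jet_vertex_fractions(pileup_jet):
    """Share of the constituent scalar pT contributed by each vertex."""
    total = sum(p["pt"] for p in pileup_jet)
    fractions = {}
    for p in pileup_jet:
        fractions[p["vertex"]] = fractions.get(p["vertex"], 0.0) + p["pt"] / total
    return fractions

truth = [{"id": 1, "pt": 10.0}, {"id": 2, "pt": 10.0}]
pileup = [{"id": 1, "pt": 10.0, "vertex": 0}, {"id": 3, "pt": 30.0, "vertex": 1}]
print(is_matched(truth, pileup), r_pt(20.0, 50.0, 20.0, 0.5), jet_vertex_fractions(pileup))
```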

Evaluation of the pile-up jet nature
It follows from the definition of R p T that pile-up jets with values of R p T close to unity are matched to a truth jet with p T ≈ p corr T of the pile-up jet itself. Consequently, there is a single MB interaction which predominantly contributes to the jet. On the other hand, jets with a small value of R p T are mostly stochastic, as no single minimum-bias collision contributes in a dominant way to the pile-up jet. We characterize jets as stochastic if R p T is smaller than 0.8. This threshold value is arbitrary and the fraction of QCD-like and stochastic jets depends on the exact choice. The conclusion of our study holds for a broad range of cut values.
The fractions of QCD-like and stochastic pile-up jets change as a function of pile-up jet p T and µ. This can be seen in Fig. 4, where QCD jet-like samples are defined by R p T > 0.8 for each pile-up level. The fraction of these jets at a given p corr T decreases exponentially with µ. The exponential decrease is slower for larger p corr T . At a pile-up activity of µ = 100, the fraction of pile-up jets that are QCD-like is about 40% (20%) for p corr T > 40 GeV (20 < p corr T < 30 GeV). At µ = 150, these numbers decrease to about 25% and 15%, respectively.

Pile-up jet multiplicity
The mean number of pile-up jets per event, as a function of jet p corr T and N PV , is indicative of the efficiency of the jet area based method to suppress jets generated by pile-up. It is shown in Fig. 5 for the inclusive pile-up jets and separately for the subsample of QCD-like pile-up jets satisfying R p T > 0.8. It is observed that the average inclusive number ⟨N⟩ of low-p T (p corr T > 20 GeV) pile-up jets per event increases approximately linearly with µ, i.e. ∂⟨N⟩/∂µ ≈ const. For higher pile-up jet p T , ∂⟨N⟩/∂µ is significantly smaller, and increases with increasing µ.
The sub-sample of QCD-like jets in the inclusive pile-up jet sample shows a different behavior, as indicated in the rightmost panel of Fig. 5. In this case ∂⟨N⟩/∂µ decreases with increasing µ in all considered bins of p corr T . This contradicts the immediate expectation of an increase following the inclusive sample, but can be understood from the fact that with increasing µ the likelihood of QCD-like jets to overlap with (stochastic) jets increases as well. The resulting (merged) pile-up jets no longer display features consistent with QCD jets (e.g., loss of single energy core), and thus fail the R p T > 0.8 selection.
The pile-up jet multiplicity shown in Fig. 5
-Jet trimming: Trimming is described in detail in Ref. [2]. In this approach the constituents of the large anti-k T jet formed with R = 1.0 are re-clustered into smaller jets with R trim = 0.2, using the anti-k T algorithm again. The resulting sub-jets are only accepted if their transverse momentum is larger than a fraction f (here f = 0.03) of a hard scale, which was chosen to be the p T of the large jet. The surviving sub-jets are recombined into a groomed jet.
-Jet filtering: Filtering was introduced in the context of a study to enhance the signal from the Higgs boson decaying into two bottom-quarks, see Ref. [1]. In its simplified configuration without mass-drop criterion [89] applied in this study it works similarly to trimming, except that in this case the sub-jets are found with the Cambridge-Aachen algorithm [72,90] with R filt = 0.3, and only the three hardest sub-jets are retained. The groomed jet is then constructed from these three sub-jets.
-Jet pruning: Pruning was introduced in Ref. [91]. Contrary to filtering and trimming, it is applied during the formation of the jet, rather than based on the recombination of sub-jets. It dynamically suppresses soft and large-distance contributions to the jet using two parameters, Z cut for the momentum-based suppression, and D cut = D cut,fact × 2m/p T (here m and p T are the mass and transverse momentum of the original jet) for the distance-based suppression. Pruning vetoes recombinations between two objects i and j for which the geometrical distance between i and j is more than D cut and the p T of one of the objects is less than Z cut × p i+j T , where p i+j T is the combined transverse momentum of i and j. In this case, only the harder of the two objects is kept. Typical values for the parameters are: Z cut = 0.1 and D cut,fact = 0.5.
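The selection steps of the three grooming methods can be sketched as follows. The sketch assumes that the sub-jets have already been found by a jet algorithm and are given as four-vectors (px, py, pz, E); only the keep-or-drop logic is shown, and the function names are hypothetical rather than FastJet calls.

```python
import math

def pt(p):
    """Transverse momentum of a four-vector (px, py, pz, E)."""
    return math.hypot(p[0], p[1])

def add(vectors):
    """Component-wise sum of a non-empty list of four-vectors."""
    return tuple(sum(component) for component in zip(*vectors))

def trim(subjets, f_cut=0.03):
    """Keep sub-jets with pT above f_cut times the pT of the full jet, then recombine."""
    jet_pt = pt(add(subjets))
    return add([s for s in subjets if pt(s) > f_cut * jet_pt])

def filter_hardest(subjets, n_keep=3):
    """Keep the n_keep hardest sub-jets, then recombine."""
    return add(sorted(subjets, key=pt, reverse=True)[:n_keep])

def pruning_vetoes(pt_i, pt_j, pt_sum, delta_r_ij, z_cut=0.1, d_cut=0.5):
    """True if the recombination of i and j is vetoed (too wide and too asymmetric)."""
    return delta_r_ij > d_cut and min(pt_i, pt_j) < z_cut * pt_sum

subjets = [(100.0, 0.0, 0.0, 100.0), (0.0, 50.0, 0.0, 50.0),
           (2.0, 0.0, 0.0, 2.0), (0.0, 1.0, 0.0, 1.0)]
print(trim(subjets), filter_hardest(subjets))
```

Here `d_cut` is passed directly; in the text it would be computed per jet as D cut,fact × 2m/p T .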
In this study, trimming and filtering are applied to the original anti-k T jets with size R = 1.0. We study the interplay between jet grooming and area-based pile-up correction. The subtraction is applied directly on the 4-momentum of the jet using:
$$p^{\mu}_{sub} = p^{\mu} - \left[\rho A_x,\ \rho A_y,\ (\rho + \rho_m) A_z,\ (\rho + \rho_m) A_E\right]$$
with
$$\rho = \underset{patches}{\mathrm{median}}\left\{\frac{p_{t,patch}}{A_{patch}}\right\}, \qquad \rho_m = \underset{patches}{\mathrm{median}}\left\{\frac{m_{\delta,patch}}{A_{patch}}\right\}, \qquad m_{\delta,patch} = \sum_{i \in patch}\left(\sqrt{m_i^2 + p_{t,i}^2} - p_{t,i}\right)$$
and A µ is the active area of the jet as defined in Ref. [67] and computed by FastJet. The ρ term, mentioned above, is the standard correction, typically applied to the transverse momentum of the jet. The ρ m term corrects for the contamination of the total jet mass due to the pile-up particles. When applying this subtraction procedure, we discard jets with negative transverse momentum or (squared) mass of the jet.
The estimation of ρ and ρ m is performed with FastJet using k t jets with R = 0.4. Corrections for the rapidity dependence of the pile-up density ρ are applied using a rapidity rescaling.
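A minimal Python sketch of the four-vector subtraction is given below. It assumes that the patches and the jet area four-vector are provided by the jet-finding step, omits the rapidity rescaling, and uses hypothetical function names. The test for a "negative transverse momentum" is implemented as a flip of the transverse direction, which is one possible reading of the criterion in the text.

```python
import math
from statistics import median

def m_delta(particles):
    """m_delta = sum_i (sqrt(m_i^2 + pt_i^2) - pt_i) for particles [(pt, m), ...]."""
    return sum(math.sqrt(m * m + p * p) - p for p, m in particles)

def densities(patches):
    """Median densities (rho, rho_m) from patches [(pt_patch, area, particles), ...]."""
    rho = median(p / area for p, area, _ in patches)
    rho_m = median(m_delta(parts) / area for _, area, parts in patches)
    return rho, rho_m

def subtract(p, area4, rho, rho_m):
    """p_sub = p - [rho*Ax, rho*Ay, (rho+rho_m)*Az, (rho+rho_m)*AE]."""
    px, py, pz, e = p
    ax, ay, az, ae = area4
    return (px - rho * ax, py - rho * ay,
            pz - (rho + rho_m) * az, e - (rho + rho_m) * ae)

def keep_jet(subtracted, original):
    """Discard jets with negative squared mass or a flipped transverse direction."""
    px, py, pz, e = subtracted
    m2 = e * e - px * px - py * py - pz * pz
    same_direction = px * original[0] + py * original[1] > 0.0
    return m2 >= 0.0 and same_direction

patches = [(10.0, 1.0, [(3.0, 4.0)]), (20.0, 1.0, [(3.0, 4.0)]), (30.0, 1.0, [(3.0, 4.0)])]
rho, rho_m = densities(patches)
jet = (100.0, 0.0, 0.0, 120.0)
jet_sub = subtract(jet, (0.5, 0.0, 0.0, 0.6), rho, rho_m)
print(rho, rho_m, jet_sub, keep_jet(jet_sub, jet))
```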
When we apply this background subtraction together with trimming or filtering, the subtraction is performed directly on the subjets, before deciding which subjets should be kept, so as to limit the potential effects of pileup on which subjets are to be kept.

Jet substructure performance
The various methods and configurations discussed in the previous section are applied to the jets reconstructed with the anti-k T algorithm with R = 1.0 in the Z′ → tt̄ final state in the presence of pile-up. For the studies presented in this report we require jet p T before grooming and pile-up subtraction to be greater than 100 GeV and consider the two hardest jets in the event. We further require that the rapidity difference between the two jets |y 1 − y 2 | is less than one. The immediate expectation for the reconstructed jet mass m is the top mass, i.e. m ≈ 175 GeV, with no residual dependence on the pile-up activity given by ⟨µ⟩ after the pile-up subtraction. The two plots in the upper row of Figure 6 show the distributions of the reconstructed jet masses without any grooming and with the pile-up subtraction discussed in Section 4.6 applied. The effect of pile-up on the mass scale and resolution is clearly visible. Applying only the pile-up subtraction, without changing the composition of the jets, already improves the mass reconstruction significantly. The µ dependence of the mass scale is largely removed from the jet mass spectrum, as shown in Fig. 6. In particular, the position of the mass peak is recovered. With increasing pile-up, however, the mass peak becomes increasingly smeared, because the pile-up is not perfectly uniform. These point-to-point fluctuations within an event lead to a smearing ±σ √ A in Eq. (5). For very large pile-up, this smearing extends all the way to m = 0, as seen in Fig. 6.
The effect of the other grooming techniques on the reconstructed jet mass distributions is summarized in Fig. 6, with and without the pile-up subtraction applied first. The spectra show that both trimming and filtering can improve the mass reconstruction. The application of the pile-up subtraction in addition to trimming or filtering further improves the mass reconstruction performance.
The findings from the spectra in Fig. 6 are quantitatively summarized in Fig. 7 for the mass scale and resolution. Here the resolution is measured in terms of the mass range in which 67% of all jet masses can be found (Q 67% (m jet ) quantile). Maintaining the jet mass scale around the expectation value of 175 GeV works well for trimming and filtering with and without pile-up subtraction, see Fig. 7. The same figure indicates that for very high pile-up (⟨µ⟩ = 100 − 200), the jet mass after trimming and filtering without pile-up subtraction shows increasing sensitivity to the pile-up. The additional pile-up subtraction tends to restore the mass scale with better quality.
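The text does not spell out how the 67% mass window is constructed. The sketch below assumes one common choice, the narrowest interval that contains 67% of the entries; both this choice and the function name are assumptions made for illustration.

```python
import math

def smallest_window(masses, fraction=0.67):
    """Narrowest interval [low, high] containing at least `fraction` of the entries."""
    values = sorted(masses)
    n_in = max(1, math.ceil(fraction * len(values)))
    starts = range(len(values) - n_in + 1)
    best = min(starts, key=lambda i: values[i + n_in - 1] - values[i])
    return values[best], values[best + n_in - 1]

# Toy jet masses in GeV with a peak near 175 and a few outliers.
toy_masses = [50.0, 165.0, 170.0, 172.0, 175.0, 178.0, 180.0, 185.0, 300.0, 400.0]
low, high = smallest_window(toy_masses)
print(low, high, high - low)
```

The width `high - low` then serves as the resolution measure, and it is insensitive to the far tails of the distribution.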
Both trimming and filtering improve the mass resolution to different degrees, but in any case better than pile-up subtraction alone, as expected. Applying the additional pile-up subtraction to trimming yields the least sensitivity to the pile-up activity in terms of mass resolution and scale.
These effects can be explained as follows. As discussed earlier, pile-up has mainly two effects on the jet: a constant shift proportional to ρA and a smearing effect proportional to σ √ A, with σ a measure of the fluctuations of the pile-up within an event. In that language, the subtraction corrects for the shift, leaving the smearing term untouched. Grooming, in contrast, since it selects only part of the subjets, acts as if it was reducing the area of the jet 16 . This reduces both the shift and the dispersion. Combining grooming with subtraction thus allows one to correct for the shift left over by grooming and to reduce the smearing effects at the same time. All these effects are observed in Fig. 7.
16 Note that grooming techniques do more than reduce the catchment area of a jet. Notably, the selection of the hardest subjets introduces a bias towards including upwards fluctuations of the background. This positive bias is balanced by a negative one related to the perturbative radiation discarded by the grooming. These effects go beyond the generic features explained here.
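The scaling argument can be made concrete with a short numerical example. The values of ρ and σ below are hypothetical and chosen only for illustration, and the groomed jet is idealised as three non-overlapping circular sub-jets of radius 0.3.

```python
import math

def pileup_shift(rho, area):
    """Average pile-up contribution to the jet pT: rho * A."""
    return rho * area

def pileup_smearing(sigma, area):
    """Size of the fluctuations of the pile-up contribution: sigma * sqrt(A)."""
    return sigma * math.sqrt(area)

RHO, SIGMA = 100.0, 10.0                 # hypothetical pile-up density and fluctuation
full_area = math.pi * 1.0 ** 2           # ungroomed R = 1.0 jet
groomed_area = 3 * math.pi * 0.3 ** 2    # three R = 0.3 sub-jets, overlap neglected

for label, area in (("ungroomed", full_area), ("groomed", groomed_area)):
    print(label, round(pileup_shift(RHO, area), 1), round(pileup_smearing(SIGMA, area), 1))
```

In this idealisation grooming reduces the shift by the area ratio (0.27) and the smearing by its square root (about 0.52), while subtraction alone would remove the shift and leave the smearing unchanged.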

Concluding remarks
The source of jets produced in minimum bias collisions in the presence of pile-up is analyzed using a technique relating the single collision contribution in the jet to its transverse momentum after pile-up correction in particle level Monte Carlo. The rate of pile-up jets surviving after application of the jet area based pile-up subtraction is about two with p corr T > 20 GeV and within |y| < 2, at a pile-up activity of µ = 100. It rises about linearly with increasing pile-up for this particular selection. Higher p T jets occur at a much reduced rate, but with a steeper than linear rise with increasing µ.
The rate of QCD-like jets is significantly smaller, and shows a less-than-linear increase with increasing µ even for p min T = 20 GeV. This can be understood as a sign of increased merging between QCD-like jets and stochastic jets. The merged jets are less likely to display features characteristic for QCD-like jets, and therefore fail the selection.
The fraction of QCD-like jets with a core of energy arising from a single proton-proton interaction of at least 0.8p corr T is found to decrease rapidly with increasing µ. At µ = 50 about 60% of the pile-up jets with p corr T > 50 GeV are found to be QCD-like, whereas at µ = 200 this number is decreased to about 20%.
A brief Monte Carlo study of the effect of jet grooming techniques on the jet mass reconstruction in Z′ → tt̄ final states has been conducted. Jet trimming and filtering are used by themselves, or in combination with the pile-up subtraction using the four-vector area, to reconstruct the single jet mass and evaluate the stability of the mass scale and resolution at pile-up levels of 30, 60, 100, and 200 extra proton-proton collisions, in addition to the signal event. It is found that for this particular final state trimming and filtering work well for maintaining the mass scale and resolution, provided they are applied together with pile-up subtraction so as to benefit both from the average shift correction from subtraction and noise reduction from the grooming.
The studies presented here are performed with Monte Carlo simulated signal and pile-up (minimum bias) interactions. No consideration has been given to detector sensitivities and other effects deteriorating the stable particle level kinematics and flows exploited here. In this respect the conclusions of this study are limited and can be considered optimistic until shown otherwise. Note also that comparing the performance of filtering and trimming would require varying their parameters and that this goes beyond the scope of this study.
Among the many applications of the strategies for boosted objects discussed in the literature (the bibliography of References [9,10] is a good starting point to navigate the extensive literature), the study of highly energetic top quarks forms the case that has been studied in greatest detail by the experiments. Several studies of the production of boosted top quarks have set limits on new physics scenarios. The first sample of boosted top quarks has also been used to understand the modelling of the parton shower and the detector response. In this section we present a summary of achievements so far, discuss how existing analyses could benefit from an improved understanding of jet substructure, and explore possible directions for future work.

Boosted top quark production
The top quark decay topology observed in the detector depends strongly on the kinematic regime. The decay products of top quarks produced nearly at rest (p T < 200 GeV/c) are well separated, leading to experimental signatures such as isolated leptons and a relatively large number of clearly resolved jets. With increasing transverse momentum, the decay products of the top quark will become collimated and possibly reconstructed in the same final state object. For intermediate boosts (200 < p T < 400 GeV), the daughters of the W boson from a fully-hadronic top decay will be close enough to be clustered into the same jet. At this point, the use of jet substructure techniques becomes important to efficiently identify these decay signatures. At even larger p T top quarks become truly boosted objects: all decay products of the top will be strongly collimated, with an angular separation ∆R ∼ 2m top /p T . Hadronic top quarks can be reconstructed in a single jet, and top quarks with leptonic decays generally contain non-isolated leptons due to the overlap with the b-quark jet.
Table 1 presents the expected numbers of boosted top quark pairs according to the Standard Model at past, present and future colliders. The numbers show clearly how the study of boosted top quarks becomes viable only with the start of the LHC. The first phase of operation yields a sample of several tens of thousands of boosted top quark pairs. The next-to-last column indicates the size of the sample expected in a 13 or 14 TeV run of the LHC, that is to start by the end of 2014. The increase in the centre-of-mass energy and the larger integrated luminosity each bring an increase of an order of magnitude in the production of boosted top quarks.
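The rule of thumb ∆R ∼ 2m top /p T can be turned into a quick estimate of the jet radius needed at a given transverse momentum, or conversely of the p T above which a jet of given radius contains the full decay. The sketch below uses m top = 175 GeV as quoted elsewhere in this report; the function names are hypothetical and the relation is only approximate.

```python
M_TOP = 175.0  # GeV

def decay_cone(pt, mass=M_TOP):
    """Approximate angular size of the decay: Delta R ~ 2 m / pT."""
    return 2.0 * mass / pt

def min_pt_for_radius(radius, mass=M_TOP):
    """Approximate pT above which the decay fits into a jet of the given radius."""
    return 2.0 * mass / radius

for radius in (0.8, 1.0, 1.5):
    print(radius, round(min_pt_for_radius(radius), 1))
```

For R = 1.0 this gives 350 GeV, and for R = 0.8 and R = 1.5 it gives roughly 440 GeV and 230 GeV, respectively.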
We expect, therefore, that boosted topologies will gain considerable importance as the LHC program develops. To exploit the LHC data to their full potential it is critical that existing experimental strategies are adapted to this challenging kinematical regime. Before we turn to the results of analyses of boosted object production, we discuss a number of new tools that were developed to identify and reconstruct boosted top quarks efficiently.

Top Tagging.
Excellent reviews of top tagging algorithms exist [93]. Previous BOOST reports have compared their performance for simulated events (at the particle level). In this Section we present a very brief review for completeness.
The Johns Hopkins (JHU) tagger [94] identifies substructure by reversing through the iterative clustering process used to form jets. Subjets are found using several criteria -the ratio of their individual p T to the original jet p T must be above a given threshold, and the subjets must be spatially separated from each other to give a valid decomposition. In this way, a jet can be deconstructed into up to four subjets, and jets with three or more subjets are analyzed further, requiring the invariant mass of the identified subjets to be in the range [145,205] GeV, and two of the subjets to be consistent with m W , in the range [65,95] GeV. There is an additional cut on the W boson helicity angle, cos θ h < 0.7.
The variant of the JHU tagger used by CMS [95] uses a similar jet decomposition, with slight differences in the selections of top quark and W boson masses from the subjets. Additionally, the CMS top tagger does not apply the W boson helicity angle requirement, but instead selects jets with the minimum pairwise mass of the subjets larger than 50 GeV. The JHU and CMS top tagging algorithms have been developed with jet distance parameters up to R = 0.8, and therefore are only efficient for top quarks with p T above approximately 400 GeV/c.
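The mass requirements quoted above for the JHU tagger and its CMS variant can be sketched as follows. The sketch assumes that the subjet decomposition has already been performed and that the W helicity angle is computed elsewhere and passed in; the function names and the toy top decay used as input are hypothetical.

```python
import math
from itertools import combinations

def mass(vectors):
    """Invariant mass of the sum of four-vectors (px, py, pz, E)."""
    px, py, pz, e = (sum(component) for component in zip(*vectors))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def jhu_cuts(subjets, cos_theta_h):
    """Top mass window, one W-like subjet pair, and the helicity-angle cut."""
    if len(subjets) < 3:
        return False
    top_ok = 145.0 <= mass(subjets) <= 205.0
    w_ok = any(65.0 <= mass(pair) <= 95.0 for pair in combinations(subjets, 2))
    return top_ok and w_ok and cos_theta_h < 0.7

def cms_min_pairwise_mass_cut(subjets, threshold=50.0):
    """CMS variant: minimum pairwise subjet mass above the threshold."""
    return min(mass(pair) for pair in combinations(subjets, 2)) > threshold

# Toy top decay at rest (m_t = 175 GeV, m_W = 80 GeV) with massless decay products.
e_b = (175.0 ** 2 - 80.0 ** 2) / (2.0 * 175.0)
e_q = (175.0 - e_b) / 2.0
toy_top = [(-e_b, 0.0, 0.0, e_b), (e_b / 2.0, 40.0, 0.0, e_q), (e_b / 2.0, -40.0, 0.0, e_q)]
print(mass(toy_top), jhu_cuts(toy_top, cos_theta_h=0.3), cms_min_pairwise_mass_cut(toy_top))
```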
The HEP top tagger [96] is designed to use jets with distance parameter R = 1.5, thereby extending the reach of the tagging algorithm to lower jet p T values. The algorithm uses a mass drop criterion to identify substructure within the jet, but also uses a filtering algorithm to remove soft and large-angle constituents from the individual subjets. The three subjets with a combined mass closest to m t are then chosen for further consideration. Cuts are then applied to masses of subjet combinations to ensure consistency with m W and m t . Specifically, for the three subjets sorted in order of subjet p T , having masses m 1 , m 2 , m 3 , the quantities m 23 /m 123 and arctan(m 13 /m 12 ) are computed. Geometrical cuts can be applied in the phase space defined by these two quantities to select top jets and reject quark or gluon jets.
Table 1 The top pair production rate at past, present and future colliders, calculated with the MCFM code [92]. The inclusive production rate is given in the first row. The expected number of events with boosted top quarks (M tt̄ > 1 TeV) and highly boosted top quarks (M tt̄ > 2 TeV) is given in the second and third row, respectively.
The HEP top tagger obtains tagging efficiencies of up to 37% for lower p T top quarks (p T > 200 GeV/c), with an acceptable mistag rate. It has been used by the ATLAS tt̄ resonance search in the fully hadronic channel [97], where no resolved analysis has been performed. At high jet p T , the efficiencies for the HEP Top Tagger and JHU Top Tagger selections are comparable.
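The two quantities that span the HEP top tagger selection plane, m 23 /m 123 and arctan(m 13 /m 12 ), can be computed from three subjet four-vectors as sketched below. The subjet finding, mass drop and filtering steps are assumed to have been done already, and the function names and toy input are hypothetical.

```python
import math

def pt(p):
    """Transverse momentum of a four-vector (px, py, pz, E)."""
    return math.hypot(p[0], p[1])

def mass(vectors):
    """Invariant mass of the sum of four-vectors (px, py, pz, E)."""
    px, py, pz, e = (sum(component) for component in zip(*vectors))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def heptop_plane(subjets):
    """Return (arctan(m13/m12), m23/m123) for exactly three subjets, ordered by pT."""
    j1, j2, j3 = sorted(subjets, key=pt, reverse=True)
    m12, m13, m23 = mass([j1, j2]), mass([j1, j3]), mass([j2, j3])
    return math.atan2(m13, m12), m23 / mass([j1, j2, j3])

# Toy top decay at rest (m_t = 175 GeV, m_W = 80 GeV): the two softer subjets form the W.
e_b = (175.0 ** 2 - 80.0 ** 2) / (2.0 * 175.0)
e_q = (175.0 - e_b) / 2.0
toy_top = [(-e_b, 0.0, 0.0, e_b), (e_b / 2.0, 40.0, 0.0, e_q), (e_b / 2.0, -40.0, 0.0, e_q)]
angle, ratio = heptop_plane(toy_top)
print(angle, ratio)
```

In this idealised decay the two softer subjets come from the W, so m 23 /m 123 equals m W /m t ≈ 0.46, which is the kind of structure the geometrical cuts are designed to select.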
Boosted top quarks were also studied using both R = 1.0 anti-k t jets and jets identified by the HEPTopTagger [96] algorithm as candidate "top-jets." Kinematic and substructure distributions were compared between data and MC simulation and were found to be in agreement. Furthermore, the efficiency with which top quarks were identified as such was found to be significantly increased in both cases, and the HEPTopTagger was shown to reduce the backgrounds to such searches dramatically, even with a relatively relaxed transverse momentum selection.
Overall, the results from ATLAS suggest that, among the jet grooming configurations tested, the trimming algorithm exhibited an improved mass resolution and a smaller dependence of jet kinematics and substructure observables (such as N-subjettiness [74,75] and the k t splitting scales [98]) on pile-up compared to the pruning configurations examined. For boosted top quark studies, the anti-k t algorithm with a radius parameter of R = 1.0 and trimming parameters f cut = 0.05 and R sub = 0.3 was found to be optimal, where a minimum p T requirement of 350 GeV is typical. It is important to note that only k t -pruning for R = 1.0 jets was tested and that, since the performance does depend somewhat on this parameter, further studies are necessary to optimize for other jet sizes. Lastly, Cambridge-Aachen jets with R = 1.2 using the mass-drop filtering parameter µ frac = 0.67 were found to perform well for boosted two-pronged analyses such as H → bb̄ or searches involving boosted W → qq̄ decays.
A final algorithm that is currently being investigated is the N -subjettiness algorithm [74] presented in Section 3.
Several new techniques and ideas are emerging that aim to improve boosted top identification and reconstruction.
One such technique is that of shower deconstruction [99]. This method aims to identify boosted hadronic top quarks by computing the probability for a top quark decay to produce the observed jet, including its distribution of constituents. The probability for the same jet to have originated from a background process is also computed. These probabilities are computed by summing over all possible shower formations resulting in the observed final state, accounting for different gluon splittings and radiations, among other processes. This is done both for the signal shower processes and background shower processes. A likelihood ratio is formed from the signal and background probabilities and used to discriminate boosted top quarks from generic QCD jets. The process of evaluating all shower histories can be computationally intensive, so certain requirements are made on the number of constituents used in the method to make the problem tractable. The results presented in Ref. [100] show an improvement on the top taggers described previously. Specifically, the shower deconstruction method reduces the top mistag rate by a factor of 3.6 compared to the JHU top tagger, while maintaining the same signal acceptance. This method is also applicable to the lower p T regime, and there improves upon the top mistag rate from the HEP top tagger by a factor of 2.6, again keeping identical signal efficiency.
Another algorithm under development is the template overlap method [115]. The template overlap method is designed for use in boosted top identification as well as boosted Higgs identification. The method is similar to that of shower deconstruction, in that it attempts to quantify how well a given jet matches a certain expectation such as a boosted top quark or boosted Higgs decay. However, this method uses only final state configurations, whereas the shower deconstruction method takes into account the showering histories. A catalog of templates is formed by analyzing signal events. Once this is in place, individual jets can be analyzed by evaluating an overlap function that quantifies how well the current jet matches the templates from the signal process of interest. For example, a template for hadronic boosted top quark decays would consist of three energy deposits within the jet. In studies with high-p T jets, the rejection factor of QCD jets compared to jets from boosted top quark decays is of order 10². One additional feature of this template overlap method is the automatic inclusion of additional parton radiation into the template catalog, such as for Higgs decays to bottom quark pairs, where there is commonly an additional gluon radiated, resulting in three energy deposits instead of the two from the b quarks.
Fig. 8 Overview of evolution of the sensitivity of tt resonance searches in the first years of LHC operation. The sensitivity is presented in terms of the lower limit on the mass of a narrow Z′ boson. The production rate for this new state is given by a benchmark model that is common to all experiments (a leptophobic topcolor Z′ boson).
Finally, the Q-jets [116] scheme could be used for top-tagging. This is a method to remove dependence of analysis results on the choice of clustering algorithm used to reconstruct jets. For example, one could use either the Cambridge-Aachen algorithm or the kT algorithm to cluster jets, and may obtain significantly different results in the jet masses. The Q-jets algorithm attempts to use all possible "trees" to cluster constituents, rather than using the single tree provided by the specific clustering algorithm used. In this way, each jet now has a distribution of possible masses instead of a single mass value. This provides additional information which can enhance signal discrimination. For example, the variance of the jet mass between individual clustering trees can be examined, rather than relying on just a single value. The statistical stability is also enhanced when using the Q-jets algorithm.
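The use of the mass spread across clustering trees can be illustrated with a minimal sketch. It assumes the jet masses from the individual trees are already available as a plain list; the function name `mass_volatility` and the input values are hypothetical, and the clustering itself is not implemented here. The quantity computed is the spread of the per-tree masses relative to their mean.

```python
import statistics


def mass_volatility(tree_masses):
    """Relative spread of the jet mass over clustering trees.

    tree_masses: jet masses (GeV), one per clustering tree of the same jet.
    Returns the population standard deviation divided by the mean mass.
    """
    if not tree_masses:
        raise ValueError("need at least one clustering tree")
    mean_mass = statistics.fmean(tree_masses)
    spread = statistics.pstdev(tree_masses)
    return spread / mean_mass


# Hypothetical per-tree masses for two jets (illustrative values only).
stable_jet = [171.0, 173.5, 172.2, 170.8, 174.1]
unstable_jet = [95.0, 160.0, 210.0, 130.0, 185.0]

print(f"stable jet volatility:   {mass_volatility(stable_jet):.3f}")
print(f"unstable jet volatility: {mass_volatility(unstable_jet):.3f}")
```

A jet whose mass barely depends on the clustering tree yields a small value, while a jet whose mass changes strongly from tree to tree yields a large one; a cut on this quantity is one possible way to exploit the additional information.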

Searches with boosted top quarks
The first area where new tools developed specifically for the selection and reconstruction of boosted top quarks have shown their value is in searches for massive new states decaying to top quark pairs. The first application of techniques specifically aimed at boosted top decays was the CMS tt resonance search in the all-hadronic channel [110]. The evolution of the mass reach 17 of tt resonance searches in the more sensitive "lepton+jets" channel is shown in Fig. 8. By the start of the LHC program the Tevatron experiments had excluded a Z′ boson mass lower than 700 GeV [101,102]. In the course of 2011 and 2012 the limit was extended to 800 GeV by a D0 search on nearly 5 fb⁻¹ [104] and to approximately 900 GeV by a CDF analysis of the complete Tevatron data set [105]. An ATLAS search on 2.4 fb⁻¹ of 7 TeV LHC data [106] collected in 2011 reached a similar precision. All these analyses followed the conventional, resolved approach that is based on the assumption that the six fermions from the decay of the top quark pair (t → W⁺ b → l⁺ ν_l b and the charge conjugate process) can be resolved individually.
17 The sensitivity to massive particles is expressed in terms of the observed 95% CL lower limit on the mass of a leptophobic topcolor Z′ boson. The motivation of this particular model may not have survived recent advances in particle physics, but to monitor the sensitivity of searches it is still the best benchmark on the market.
In some cases ATLAS and CMS analyses specifically designed for boosted top quarks [107, 111] scrutinized the same data set that had been used by the resolved approach. A direct comparison of these results demonstrates that the novel approach has considerably better sensitivity for massive states [107]. The final analyses on 2011 data [108, 111] combine resolved and boosted methods to attain good sensitivity over the complete mass spectrum. The excluded mass range is pushed up to 1.74 TeV.
Searches in the "lepton+jets" channel are complemented by analyses of the fully hadronic (tt → 6 jets) and di-lepton (tt → bb l⁺ν l⁻ν) decay chains. Only one fully hadronic tt resonance search was performed at the Tevatron [103]. At the LHC, with a daunting multijet background, these searches are even more challenging. The advent of new algorithms has, however, greatly boosted their potential. The mass reach of the CMS [110] and ATLAS [97] searches is compared to that of the "lepton+jets" searches in Table 2.
The prospects for progress are good. Preliminary results on the 2012 data set [109, 113, 114] have significantly extended previous limits.

Jet substructure performance and searches
The results in the previous Section demonstrate the proof-of-principle: the addition of jet substructure to the experimentalists' tool-box boosts the sensitivity of searches for new physics at the LHC. It is clear, however, that these tools are still in their infancy. In all searches discussed in the previous Section large systematic uncertainties are assigned to the large-R jets. It is natural to suspect that further progress could be made with better (and, especially, better understood) tools.
To quantify the impact of the jet-related systematics on the sensitivity we have evaluated expected limits on the narrow Z′ boson with all sources of systematic uncertainty except one (so-called N − 1 limits) in several iterations of the ATLAS searches in the lepton+jets final state. The uncertainties associated with the large-R jet that captures the hadronic top decay are always the dominant source of uncertainty. Their impact is considerably larger than that of systematics associated with the narrow jets, even at relatively low resonance mass. The limits over a large mass range (1-2 TeV) would improve by approximately 5-10% if only the uncertainty on the scale and resolution of mass and energy of anti-k t jets with R = 1 were removed.
If we apply an ad hoc scale factor of two to this uncertainty (representing a failure to bring these uncertainties under control) we find that the sensitivity is further degraded. A significant reduction of large-R jet uncertainties, on the other hand, brings the N − 1 limits with no jet-related systematics and the limits with reduced large-R jet systematics to within 2%.
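The logic of the N − 1 procedure can be sketched in a few lines. This is a toy under strong assumptions, not the procedure used in the searches: the real analyses recompute the expected limit with a full statistical model for each removed source, whereas here a simple sum in quadrature stands in for the limit-setting step. The source names and fractional uncertainties are hypothetical placeholders, not values from any ATLAS or CMS result.

```python
import math


def total_uncertainty(sources):
    """Combine independent fractional uncertainties in quadrature."""
    return math.sqrt(sum(value ** 2 for value in sources.values()))


def n_minus_one_impact(sources):
    """Relative reduction of the total when each source is removed in turn."""
    full = total_uncertainty(sources)
    impact = {}
    for name in sources:
        reduced = {k: v for k, v in sources.items() if k != name}
        impact[name] = 1.0 - total_uncertainty(reduced) / full
    return impact


# Hypothetical fractional uncertainties (placeholders for illustration only).
sources = {
    "large_R_jet": 0.20,
    "narrow_jet": 0.05,
    "b_tagging": 0.08,
    "luminosity": 0.03,
}

impact = n_minus_one_impact(sources)
dominant = max(impact, key=impact.get)
for name, value in sorted(impact.items(), key=lambda item: -item[1]):
    print(f"{name:12s} removed: total shrinks by {100 * value:.1f}%")

# Ad hoc doubling of one source, mirroring the scale-factor-of-two test.
inflated = dict(sources, large_R_jet=2 * sources["large_R_jet"])
print(f"total with doubled large-R term: {total_uncertainty(inflated):.3f}")
```

Ranking the sources by the change they induce when removed identifies the dominant one, and rescaling a single entry mimics the test of what happens if that uncertainty is not brought under control.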
CMS has not published the N − 1 results for their searches, but qualitatively the same picture emerges. In the fully hadronic searches the jet-related uncertainties have the largest impact on the limits.
We conclude that further progress in understanding jet substructure still has substantial potential to increase our sensitivity to massive new states decaying to top quarks.

Further applications
The selections for boosted top quarks, in the lepton+jets and fully hadronic channels, have proven their value in tt resonance searches, but are more generally applicable.
The obvious direction to extend the range of applications is to other searches with boosted top quarks. The W′ → tb searches are currently performed in the channel where the top quark decays to a charged lepton, neutrino and b-jet. We expect, however, that ultimately the highest mass reach should be obtained in the hadronic decay (with a branching ratio that is larger by a factor of two if τ leptons are not considered).
We expect differential cross-section measurements for tt to benefit from these techniques at large transverse momentum and invariant mass of the tt pair. Apart from the better selection efficiency in algorithms designed for this kinematic regime, the better truth-to-reconstructed mapping of p T and m tt is expected to be an important advantage. We are looking forward to such measurements from the ATLAS and CMS experiments. Analyses that rely strongly on the reconstruction of the top quark direction, such as the charge asymmetry measurement, should also benefit.
Finally, several authors [117] have commented on the potential of events with mildly boosted top quarks for the observation of ttH and a measurement of the production rate.

Summary
Over the last five years, many ideas have been proposed to cope with the challenge of boosted top quark reconstruction. Since then, these ideas have been implemented by the experiments and put to the test, primarily in searches for massive new states decaying to tt pairs. The overview we presented in Fig. 8 and Table 2 is a testimony to the increase of sensitivity for such states fuelled by the performance of the LHC. Such progress would not have been possible if novel techniques for the study of boosted top quarks had not been developed. We expect the selections developed for the lepton+jets and fully hadronic channels to find further applications in searches and measurements.

Summary & Conclusions
This report of the BOOST2012 workshop provides answers to a number of important questions concerning the use of jet substructure for the study of boosted object production at the LHC.
We evaluated the current limitations in the description of jet substructure, both at the analytical level and in Monte Carlo generators. Impressive progress is being made for the former and we expect a meaningful comparison to LHC data to be a reality soon. Two approaches to a first-principle resummation of the jet invariant mass, perturbative QCD and Soft Collinear Effective Theory, are producing mature results. Measurements of the jet mass in Z+jet events are proposed, both inclusively and exclusively in the number of jets. We hope that in the not-too-distant future these calculations can enhance our understanding of the internal structure in jets.
Monte Carlo predictions remain crucial to searches and measurements employing jet substructure. We have compared the predictions of several mainstream generators for a number of substructure observables and for several signal and background topologies. While jet mass is still poorly described by several generators, several ways of introducing the inherent uncertainties become evident. Jet grooming reduces the spread among Monte Carlo models, as do several alternative jet substructure observables.
We also studied potential experimental limitations that could check further progress, in particular the impact of the large number of simultaneous proton-proton interactions. We find that, even if the substructure of large-radius jets is quite sensitive to pile-up, a combination of a state-of-the-art correction technique and jet grooming can effectively restore the jet mass scale and strongly mitigate the impact on the jet mass resolution.
Finally, we reviewed top-tagging techniques deployed in the LHC experiments and assessed their impact on the sensitivity to new physics. A series of tt resonance searches performed by ATLAS and CMS provide clear proof of the power of techniques specifically designed for boosted top quarks. Through an evaluation of the impact of all sources of systematic uncertainties, we show that further progress can still be made with an enhanced understanding of jet substructure. We expect to see these techniques applied in further searches involving boosted top quarks and in measurements of the boosted top production rate.