Performance of jet substructure techniques for large-R jets in proton-proton collisions at sqrt(s) = 7 TeV using the ATLAS detector

This paper presents the application of a variety of techniques to study jet substructure. The performance of various modified jet algorithms, or jet grooming techniques, for several jet types and event topologies is investigated for jets with transverse momentum larger than 300 GeV. Properties of jets subjected to the mass-drop filtering, trimming, and pruning algorithms are found to have reduced sensitivity to multiple proton-proton interactions, are more stable at high luminosity and improve the physics potential of searches for heavy boosted objects. Studies of the expected discrimination power of jet mass and jet substructure observables in searches for new physics are also presented. Event samples enriched in boosted W and Z bosons and top-quark pairs are used to study both the individual jet invariant mass scales and the efficacy of algorithms to tag boosted hadronic objects. The analyses presented use the full 2011 ATLAS dataset, corresponding to an integrated luminosity of 4.7 +/- 0.1 /fb from proton-proton collisions produced by the Large Hadron Collider at a center-of-mass energy of sqrt(s) = 7 TeV.


JHEP09(2013)076
The dominant feature of high-energy proton-proton (pp) collisions at the Large Hadron Collider (LHC) is the production of highly collimated sprays of energetic hadrons, called jets, that originate from the quarks and gluons in the primary collisions. The large centre-of-mass energy at the LHC enables the production of Lorentz-boosted heavy particles, whose decay products can be reconstructed as one large-area jet. The study of the internal structure of jets goes beyond the four-momentum description of a single parton and yields new approaches for testing Quantum Chromodynamics (QCD) and for searching for new physics in hadronic final states. However, many of the new tools developed for the study of jet substructure at the LHC have only recently been validated with data in a hadron-hadron collider environment. For example, the effect of multiple pp interactions on large-area jet measurements has not been extensively studied experimentally. This paper presents a comprehensive set of studies designed to establish the efficacy, accuracy, and precision of several of the tools available for determining and analysing the internal structure of jets at the LHC. New jet algorithms and strategies, referred to as jet grooming, that refine the definition of a jet in a high-luminosity environment, are studied using data taken at a centre-of-mass energy of $\sqrt{s} = 7$ TeV during 2011. A variety of techniques and tagging algorithms intended to improve the mass resolution in the reconstruction of boosted objects that decay hadronically are studied in the data both in inclusive jet samples and in samples enriched in events containing boosted W/Z bosons and top quarks. Evaluations of the systematic uncertainties for jet mass measurements are presented for a variety of jet algorithms.
Comparisons of the discrimination between signal and background provided by various observables are also evaluated for a selection of models of new physics containing boosted hadronic particle decays.
The organization of the paper is as follows. In this section, motivation for the use of new jet reconstruction techniques for Lorentz-boosted particles is given and the jet algorithms and jet substructure variables that are used in the analyses presented here are defined. Section 2 provides descriptions of the ATLAS detector and the Monte Carlo simulations, and section 3 defines the jet reconstruction and calibration procedures that are used throughout. The latter section includes a discussion of the jet mass scale and the subjet energy scale, which are important ingredients in the jet grooming algorithms. Section 4 describes studies of the effect of jet grooming on jet properties in the presence of pile-up, which represents a major experimental challenge at the present and future LHC machine.
Studies of the performance of the various jet algorithms are conducted with three classes of event samples in both data and Monte Carlo simulation in section 5: inclusive jet events, which are dominated by light-quark or gluon jets whose properties are defined primarily by soft gluon emission; boosted hadronically decaying W and Z bosons, which form jets that are dominated by two high-$p_{\mathrm{T}}$ components; and top-quark decays, where the W boson decays hadronically, which form jets that have three prominent components (due to the b-quark in addition to the W). In section 5.1, the effect on jet resolution of the various jet grooming algorithms is compared in simulated events separately for signal (W, Z, and top jets) and background from light-quark and gluon jets. The discrimination between background and signal is then studied using a number of grooming configurations by comparing jet properties for the different types of events before and after grooming. This is followed in section 5.2 by a direct comparison of multiple Monte Carlo predictions and inclusive jet data. Lastly, section 5.3 presents jet grooming studies on boosted top-quark events. Finally, conclusions are drawn in section 6.

Motivation
The centre-of-mass energy of the LHC has opened new kinematic regimes to experimental study. The new phase space available for the production of Standard Model (SM) particles with significant Lorentz boosts, or even new massive particles that decay to highly boosted SM particles, necessitates new techniques to conduct measurements in novel final states. For example, when sufficiently boosted, the decay products of W bosons [1–5], top quarks [6–8], and Higgs bosons [9] can become collimated to the point that standard reconstruction techniques begin to fail. When the separation of the quarks in these boosted topologies becomes smaller than the radius parameter of the jets, they often fail to be individually resolved by standard jet algorithms and configurations. Moreover, the high-luminosity conditions at the LHC can further degrade even the most complex procedures for reconstructing decays of boosted hadronic objects. Multiple pp interactions per bunch crossing (pile-up) produce soft particles unrelated to the hard scattering that can contaminate jets in the detector considerably more than at previous hadron-hadron colliders. In events where boosted particle decays are fully contained within individual large-radius jets, a diminished mass resolution due to pile-up may dramatically weaken sensitivity to new physics processes. It is crucial that the above issues be addressed together, as the efficacy of a given technique for boosted object reconstruction may depend critically on its vulnerability to experimental conditions. One example of a new physics process that may produce heavy objects with a significant Lorentz boost is the decay of a new heavy gauge boson, the $Z'$, to top-quark pairs. Figure 1 shows the angular separation between the W and b decay products of a top quark in simulated $Z' \to t\bar{t}$ ($m_{Z'} = 1.6$ TeV) events, as well as the separation between the light quarks of the subsequent hadronic decay of the W boson.
In each case, the angular separation of the decay products is approximately

$$\Delta R \approx \frac{2m}{p_{\mathrm{T}}}, \qquad (1.1)$$

where $\Delta R = \sqrt{(\Delta y)^2 + (\Delta\phi)^2}$, and $p_{\mathrm{T}}$ and $m$ are the transverse momentum and the mass, respectively, of the decaying particle (see footnote 1). For $p_{\mathrm{T}}^{W} > 200$ GeV, the ability to resolve the individual hadronic decay products using standard narrow-radius jet algorithms begins to degrade, and when $p_{\mathrm{T}}^{t}$ is greater than 300 GeV, the decay products of the top quark tend to have a separation $\Delta R < 1.0$. Techniques designed to recover sensitivity in such cases focus on large-R jets in order to maximize efficiency. In this paper, large-R refers to jets with a radius parameter $R \geq 1.0$. At $\sqrt{s} = 7$ TeV, nearly one thousand SM $t\bar{t}$ events per fb$^{-1}$ are expected with $p_{\mathrm{T}}^{t}$ greater than 300 GeV. New physics may appear in this region of phase space, the study of which was limited by integrated luminosity and available energy at previous colliders.

Footnote 1: The ATLAS coordinate system is a right-handed system with the x-axis pointing to the centre of the LHC ring and the y-axis pointing upwards. The polar angle $\theta$ is measured with respect to the LHC beam-line. The azimuthal angle $\phi$ is measured with respect to the x-axis. The rapidity is defined as $y = 0.5 \times \ln[(E + p_z)/(E - p_z)]$, where $E$ denotes the energy and $p_z$ is the component of the momentum along the beam direction. The pseudorapidity $\eta$ is an approximation for rapidity $y$ in the high-energy limit, and it is related to the polar angle $\theta$ by $\eta = -\ln\tan(\theta/2)$. Transverse momentum and energy are defined as $p_{\mathrm{T}} = p \times \sin\theta$ and $E_{\mathrm{T}} = E \times \sin\theta$, respectively.
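The kinematic definitions above can be tried out numerically. The following is a minimal Python sketch, not part of the analysis software; the function names and the W boson mass value of 80.4 GeV are assumptions introduced only for illustration.

```python
import math


def rapidity(energy, pz):
    """Rapidity y = 0.5 * ln[(E + pz) / (E - pz)]."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


def pseudorapidity(theta):
    """Pseudorapidity eta = -ln tan(theta / 2)."""
    return -math.log(math.tan(theta / 2.0))


def delta_r(y1, phi1, y2, phi2):
    """Angular distance sqrt(dy^2 + dphi^2), with dphi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def approx_separation(mass, pt):
    """Rule-of-thumb opening angle of a two-body decay: 2 m / pT."""
    return 2.0 * mass / pt


M_W = 80.4  # GeV; illustrative value assumed here, not taken from the text

for pt in (200.0, 400.0, 800.0):
    print(f"pT = {pt:5.0f} GeV -> Delta R ~ {approx_separation(M_W, pt):.2f}")
```

The loop shows how quickly the expected separation shrinks with increasing transverse momentum, which is the motivation for large-R jets given above.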
A single jet that contains all of the decay products of a massive particle has significantly different properties than a jet of the same p T originating from a light quark. The characteristic two-body or three-body decays of a high p T vector boson or top quark result in a hard substructure that is absent from typical high p T jets formed from gluons and light quarks. These subtle differences in substructure can be resolved more clearly by removing soft QCD radiation from jets. Such adaptive modification of the jet algorithm or selective removal of soft radiation during the process of iterative recombination in jet reconstruction is generally referred to as jet grooming [9,11,12].
Recently many jet grooming algorithms have been designed to remove contributions to a given jet that are irrelevant or detrimental to resolving the hard decay products from a boosted object (for recent reviews and comparisons of these techniques, see for example refs. [13,14]). The structural differences between jets formed from gluons or light quarks and individual jets originating from the decay of a boosted hadronic particle form the basis for these tools. The former are characterized primarily by a single dense core of energy surrounded by soft radiation from the parton shower, hadronization, and underlying event (UE) remnants [15–17]. Jets containing the decay products of single massive particles, on the other hand, can be distinguished by hard, wide-angle components representative of the individual decay products that result in a large reconstructed jet mass, as well as typical kinematic relationships among the hard components of the jet [4,7,9,18–23]. Grooming algorithms are designed to retain the characteristic substructure within such a jet while reducing the impact of the fluctuations of the parton shower and the UE, thereby improving the mass resolution and mitigating the influence of pile-up. These features have only recently begun to be studied experimentally [24–31] and have been exploited heavily in recent studies of the phenomenological implications of such tools in searches for new physics [9,32–42]. A groomed jet can also be a powerful tool to discriminate between the often dominant multi-jet background and the heavy-particle decay, which increases signal sensitivity. Figure 2 demonstrates this by comparing the invariant mass distribution of single jets in events containing highly boosted hadronically decaying Z bosons before and after the application of a grooming procedure referred to as mass-drop filtering. In this simulated $Z \to q\bar{q}$ sample described in section 2.2, pile-up events are also included. Prior to the application of this procedure, no distinct features are present in the jet mass distribution, whereas afterwards, a clear mass peak that corresponds to the Z boson is evident.

used infrared and collinear-safe jet algorithms available for hadron-hadron collider physics today. Furthermore, in the case of the k t and C/A algorithms, the clustering history of the algorithm -that is, the ordering and structure of the pair-wise subjet recombinations made during jet reconstruction -provides spatial and kinematic information about the substructure of that jet. The anti-k t algorithm provides jets that are defined primarily by the highest-p T constituent, yielding stable, circular jets. The compromise is that the structure of the jet as defined by the successive recombinations carried out by the anti-k t algorithm carries little or no information about the p T ordering of the shower or wide angular-scale structure. It is, however, possible to exploit the stability of the anti-k t algorithm and recover meaningful information about the jet substructure: anti-k t jets are selected for analysis based on their kinematics (η and p T ), and then the jet constituents are reclustered with the k t algorithm to enable use of the k t -ordered splitting scales described in section 1.2.2. The four-momentum recombination scheme is used in all cases and the jet finding is performed in rapidity-azimuthal angle (y-φ) coordinates. Jet selections and corrections are made in pseudorapidity-azimuthal angle (η-φ) coordinates.

Jet properties and substructure observables
Three observables are used throughout these studies to characterize jet substructure and distinguish massive boosted objects from gluons or light quarks: mass, k t splitting scales, and N -subjettiness.
Jet mass: the jet mass is defined as the mass deduced from the four-momentum sum of all jet constituents. Depending on the input to the jet algorithm (see section 3.1), the constituents may be considered as either massive or massless four-momenta.

$k_t$ splitting scales: the $k_t$ splitting scales [2] are defined by reclustering the constituents of a jet with the $k_t$ recombination algorithm, which tends to combine the harder constituents last. At the final step of the jet recombination procedure, the $k_t$ distance measure, $d_{ij}$, for the two remaining proto-jets (intermediate jet-like objects at each stage of clustering), referred to as subjets in this case, can be used to define a splitting scale variable as:

$$\sqrt{d_{ij}} = \min(p_{\mathrm{T}i}, p_{\mathrm{T}j}) \times \Delta R_{ij}, \qquad (1.2)$$

where $\Delta R_{ij}$ is the distance between the two subjets in $\eta$-$\phi$ space. With this definition, the subjets identified at the last step of the reclustering in the $k_t$ algorithm provide the $\sqrt{d_{12}}$ observable. Similarly, $\sqrt{d_{23}}$ characterizes the splitting scale in the second-to-last step of the reclustering. The parameters $\sqrt{d_{12}}$ and $\sqrt{d_{23}}$ can be used to distinguish heavy-particle decays, which tend to be reasonably symmetric when the decay is to like-mass particles, from the largely asymmetric splittings that originate from QCD radiation in light-quark or gluon jets. The expected value for a two-body heavy-particle decay is approximately $\sqrt{d_{12}} \approx m^{\mathrm{particle}}/2$, whereas jets from the parton shower of gluons and light quarks tend to have smaller values of the splitting scales and to exhibit a steeply falling spectrum for both $\sqrt{d_{12}}$ and $\sqrt{d_{23}}$ (see for example figure 30).
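The jet mass and splitting-scale definitions above can be illustrated with a small toy reclustering. This is a sketch under stated assumptions rather than the reconstruction used in the paper: the `P4` class and helper names are hypothetical, all constituents are reclustered into a single jet with beam distances ignored, and distances are computed in rapidity-azimuth coordinates (which coincide with $\eta$-$\phi$ only for massless objects).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class P4:
    """Four-momentum (px, py, pz, E) in GeV; hypothetical helper class."""
    px: float
    py: float
    pz: float
    e: float

    def __add__(self, other):
        # Four-momentum recombination scheme: component-wise sum.
        return P4(self.px + other.px, self.py + other.py,
                  self.pz + other.pz, self.e + other.e)

    @property
    def pt(self):
        return math.hypot(self.px, self.py)

    @property
    def rapidity(self):
        return 0.5 * math.log((self.e + self.pz) / (self.e - self.pz))

    @property
    def phi(self):
        return math.atan2(self.py, self.px)

    @property
    def mass(self):
        m2 = self.e**2 - self.px**2 - self.py**2 - self.pz**2
        return math.sqrt(max(m2, 0.0))


def massless(pt, y, phi):
    """Build a massless four-momentum from (pT, y, phi)."""
    return P4(pt * math.cos(phi), pt * math.sin(phi),
              pt * math.sinh(y), pt * math.cosh(y))


def delta_r(a, b):
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a.rapidity - b.rapidity, dphi)


def jet_mass(constituents):
    """Mass of the four-momentum sum of all constituents."""
    total = constituents[0]
    for c in constituents[1:]:
        total = total + c
    return total.mass


def kt_splitting_scales(constituents):
    """Return {n: sqrt(d)} for the kt merge that takes n proto-jets to n - 1.

    Key 2 is sqrt(d12) (last merge); key 3 is sqrt(d23) (second-to-last).
    """
    protojets = list(constituents)
    scales = {}
    while len(protojets) > 1:
        best = None
        for i in range(len(protojets)):
            for j in range(i + 1, len(protojets)):
                a, b = protojets[i], protojets[j]
                sqrt_d = min(a.pt, b.pt) * delta_r(a, b)
                if best is None or sqrt_d < best[0]:
                    best = (sqrt_d, i, j)
        sqrt_d, i, j = best
        scales[len(protojets)] = sqrt_d
        merged = protojets[i] + protojets[j]
        protojets = [p for k, p in enumerate(protojets) if k not in (i, j)]
        protojets.append(merged)
    return scales


# Toy jet: two hard prongs plus one soft, nearly collinear constituent.
toy = [massless(100.0, 0.0, 0.0), massless(100.0, 0.8, 0.0),
       massless(5.0, 0.1, 0.0)]
scales = kt_splitting_scales(toy)
print("mass:", jet_mass(toy), "sqrt(d12):", scales[2], "sqrt(d23):", scales[3])
```

In this toy configuration the soft constituent is merged first, so the last merge combines the two hard prongs and $\sqrt{d_{12}}$ is much larger than $\sqrt{d_{23}}$.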

N-subjettiness: the N-subjettiness variables $\tau_N$ [18,51] are observables related to the subjet multiplicity. The $\tau_N$ variable is calculated by clustering the constituents of the jet with the $k_t$ algorithm and requiring that exactly $N$ subjets be found. This is done using the exclusive version of the $k_t$ algorithm [48] and is based on reconstructing clusters of particles in the jet using all of the jet constituents. These $N$ final subjets define axes within the jet. The variables $\tau_N$ are then defined in eq. (1.3) as the sum over all constituents $k$ of the jet, such that

$$\tau_N = \frac{1}{d_0} \sum_k p_{\mathrm{T}k} \times \min(\delta R_{1k}, \delta R_{2k}, \ldots, \delta R_{Nk}), \qquad d_0 \equiv \sum_k p_{\mathrm{T}k} \times R, \qquad (1.3)$$

where $R$ is the jet radius parameter in the jet algorithm, $p_{\mathrm{T}k}$ is the $p_{\mathrm{T}}$ of constituent $k$ and $\delta R_{ik}$ is the distance from subjet $i$ to constituent $k$. Using this definition, $\tau_N$ describes how well jets can be described as containing $N$ or fewer $k_t$ subjets by assessing the degree to which constituents are aligned with the axes of these subjets for a given hypothesis $N$. The ratios $\tau_2/\tau_1$ and $\tau_3/\tau_2$ can be used to provide discrimination between jets formed from the parton shower of initial-state gluons or light-quarks and jets formed from two hadronic decay products (from Z-bosons, for example) or three hadronic decay products from boosted top quarks. These ratios are herein referred to as $\tau_{21}$ and $\tau_{32}$ respectively. For example, $\tau_{21} \simeq 1$ corresponds to a jet that is very well described by a single subjet whereas a lower value implies a jet that is much better described by two subjets than one.
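Equation (1.3) can be evaluated directly once the subjet axes are known. The sketch below assumes the axes are supplied by the caller (in the paper they come from exclusive $k_t$ clustering, which is not reproduced here); the function names and the toy inputs are hypothetical.

```python
import math


def delta_r(y1, phi1, y2, phi2):
    """Angular distance with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def tau_n(constituents, axes, radius):
    """N-subjettiness for N = len(axes).

    constituents: list of (pT, y, phi) tuples
    axes:         list of (y, phi) subjet axes, assumed given
    radius:       jet radius parameter R
    """
    d0 = sum(pt for pt, _, _ in constituents) * radius
    total = 0.0
    for pt, y, phi in constituents:
        # Each constituent contributes its distance to the nearest axis.
        total += pt * min(delta_r(y, phi, ay, aphi) for ay, aphi in axes)
    return total / d0


# Toy two-prong jet with R = 1.0.
constituents = [(100.0, 0.0, 0.0), (100.0, 0.8, 0.0)]
tau1 = tau_n(constituents, axes=[(0.4, 0.0)], radius=1.0)
tau2 = tau_n(constituents, axes=[(0.0, 0.0), (0.8, 0.0)], radius=1.0)
print("tau1:", tau1, "tau2:", tau2, "tau21:", tau2 / tau1)
```

For this idealized two-prong input the two-axis hypothesis describes the jet exactly, so $\tau_{21}$ is small, in line with the interpretation given above.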

Jet grooming algorithms
Three jet grooming procedures are studied in this paper. Mass-drop filtering, trimming, and pruning are described, and performance measures related to each are defined. The different configurations of the grooming algorithms described in this section are summarized in table 1. Additionally, a technique to tag boosted top quarks using the mass-drop filtering method is introduced. Unless otherwise specified, the jet p T reported for a groomed jet is that which is calculated after the grooming algorithm is applied to the original jet.
Mass-drop filtering: the mass-drop filtering procedure (see footnote 2) seeks to isolate concentrations of energy within a jet by identifying relatively symmetric subjets, each with a significantly smaller mass than that of the original jet. This technique was developed and optimized using C/A jets in the search for a Higgs boson decaying to two b-quarks: $H \to b\bar{b}$ [9]. The C/A algorithm is used because it provides an angular-ordered shower history that begins with the widest combinations when reversing the cluster sequence. This provides useful information regarding the presence of potentially large splittings within a jet (see section 4 and section 5). Although the mass-drop criterion and subsequent filtering procedure are not based specifically on soft-$p_{\mathrm{T}}$ or wide-angle selection criteria, the algorithm does retain the hard components of the jet through the requirements placed on its internal structure. The first measurements of the jet mass of these filtered jets were performed using 35 pb$^{-1}$ of data collected in 2010 by the ATLAS experiment [25].

Footnote 2: In this paper, mass-drop filtering is often shortened to only filtering in figures and captions.

The mass-drop filtering procedure has two stages:

• Mass-drop and symmetry. The last stage of the C/A clustering is undone. The jet "splits" into two subjets, $j_1$ and $j_2$, ordered such that the mass of $j_1$ is larger: $m^{j_1} > m^{j_2}$. The mass-drop criterion requires that there be a significant difference between the original jet mass ($m^{\mathrm{jet}}$) and $m^{j_1}$ after the splitting:

$$m^{j_1}/m^{\mathrm{jet}} < \mu_{\mathrm{frac}}, \qquad (1.4)$$

where $\mu_{\mathrm{frac}}$ is a parameter of the algorithm. The splitting is also required to be relatively symmetric, which is approximated by the requirement that

$$\frac{\min[(p_{\mathrm{T}}^{j_1})^2, (p_{\mathrm{T}}^{j_2})^2]}{(m^{\mathrm{jet}})^2} \times \Delta R^2_{j_1,j_2} > y_{\mathrm{cut}}, \qquad (1.5)$$

where $\Delta R_{j_1,j_2}$ is a measure of the opening angle between $j_1$ and $j_2$, and $y_{\mathrm{cut}}$ defines the energy sharing between the two subjets in the original jet. For the analyses presented here, $y_{\mathrm{cut}}$ is set to 0.09, the optimal value for identifying two-body decays, obtained in previous studies [9]. To give a sense of the kinematic requirements that this places on a given decay, consider a hadronically decaying W boson with $p_{\mathrm{T}}^{W} \approx 200$ GeV. According to the approximation given by eq. (1.1), the average angular separation of the two daughter quarks is $\Delta R_{j_1,j_2} \sim 0.8$. The symmetry requirement determined by $y_{\mathrm{cut}}$ in eq. (1.5) thereby implies that the transverse momentum of the softer (in $p_{\mathrm{T}}$) of the two subjets is greater than approximately 30 GeV. Generally, this requirement entails a minimum relative $p_{\mathrm{T}}$ of the softer subjet of $p_{\mathrm{T}}^{\mathrm{subjet}}/p_{\mathrm{T}}^{\mathrm{jet}} > 0.15$, thus forcing both subjets to carry some significant fraction of the momentum of the original jet. This procedure is illustrated in figure 3(a). If the mass-drop and symmetry criteria are not satisfied, the jet is discarded.

• Filtering. The constituents of $j_1$ and $j_2$ are reclustered using the C/A algorithm with radius parameter $R_{\mathrm{filt}} = \min[0.3, \Delta R_{j_1,j_2}/2]$, where $R_{\mathrm{filt}} < \Delta R_{j_1,j_2}$. The jet is then filtered; all constituents outside the three hardest subjets are discarded. The choice of three allows one additional radiation from a two-body decay to be captured. In isolating $j_1$ and $j_2$ with the C/A algorithm, the angular scale of any potential massive particle decay is known. By dynamically reclustering the jet at an appropriate angular scale able to resolve that structure, the sensitivity to highly collimated decays is maximized. This is illustrated in figure 3(b).
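The two selection criteria and the filtering radius can be written as simple predicates, which also reproduces the 30 GeV estimate quoted for the W boson example. This is a minimal sketch: the function names are hypothetical, the W boson mass of 80.4 GeV is an assumed illustrative value, and the declustering itself (undoing the last C/A step) is taken as already done.

```python
import math


def passes_mass_drop(m_jet, m_j1, mu_frac):
    """Mass-drop criterion, eq. (1.4): m_j1 / m_jet < mu_frac."""
    return m_j1 / m_jet < mu_frac


def passes_symmetry(pt_j1, pt_j2, m_jet, dr_j1j2, y_cut=0.09):
    """Symmetry criterion, eq. (1.5)."""
    return min(pt_j1, pt_j2) ** 2 / m_jet**2 * dr_j1j2**2 > y_cut


def min_soft_pt(m_jet, dr_j1j2, y_cut=0.09):
    """Softer-subjet pT at which eq. (1.5) becomes an equality."""
    return math.sqrt(y_cut) * m_jet / dr_j1j2


def filter_radius(dr_j1j2):
    """Filtering radius R_filt = min[0.3, DeltaR / 2]."""
    return min(0.3, dr_j1j2 / 2.0)


# W boson example: pT ~ 200 GeV, opening angle from eq. (1.1).
M_W = 80.4  # GeV; assumed illustrative value
PT_W = 200.0
dr = 2.0 * M_W / PT_W
threshold = min_soft_pt(M_W, dr)
print("Delta R:", dr)
print("minimum softer-subjet pT [GeV]:", threshold)
print("as a fraction of the jet pT:", threshold / PT_W)
print("R_filt:", filter_radius(dr))
```

Because eq. (1.1) ties the opening angle to $m/p_{\mathrm{T}}$, the threshold scales with the jet $p_{\mathrm{T}}$, which is where the fixed momentum fraction of 0.15 comes from.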
In this paper, three values of the mass-drop parameter $\mu_{\mathrm{frac}}$ are studied, as summarized in table 1. The values chosen for $\mu_{\mathrm{frac}}$ are based on a previous study [9] which has shown that $\mu_{\mathrm{frac}} = 0.67$ is optimal in discriminating $H \to b\bar{b}$ from background. A subsequent study regarding the factorization properties of several groomed jet algorithms [52] found that smaller values of $\mu_{\mathrm{frac}}$ (0.20 and 0.33) are similarly effective at reducing backgrounds, and yet they remain factorizable within the soft collinear effective theory studied in that analysis.

Trimming: the trimming algorithm [12] takes advantage of the fact that contamination from pile-up, multiple parton interactions (MPI) and initial-state radiation (ISR) in the reconstructed jet is often much softer than the outgoing partons associated with the hard-scatter and their final-state radiation (FSR). The ratio of the $p_{\mathrm{T}}$ of the constituents to that of the jet is used as a selection criterion. Although there is some spatial overlap, removing the softer components from the final jet preferentially removes radiation from pile-up, MPI, and ISR while discarding only a small part of the hard-scatter decay products and FSR. Since the primary effect of pile-up in the detector is additional low-energy deposits in clusters of calorimeter cells, as opposed to additional energy being added to already existing clusters produced by particles originating from the hard scattering process, this allows a relatively simple jet energy offset correction for smaller radius jets (R = 0.4, 0.6) as a function of the number of primary reconstructed vertices [53].
The trimming procedure uses a $k_t$ algorithm to create subjets of size $R_{\mathrm{sub}}$ from the constituents of a jet. Any subjets with $p_{\mathrm{T}i}/p_{\mathrm{T}}^{\mathrm{jet}} < f_{\mathrm{cut}}$ are removed, where $p_{\mathrm{T}i}$ is the transverse momentum of the $i$th subjet, and $f_{\mathrm{cut}}$ is a parameter of the method, which is typically a few percent. The remaining constituents form the trimmed jet. This procedure is illustrated in figure 4. Low-mass jets ($m^{\mathrm{jet}} < 100$ GeV) from a light-quark or gluon lose typically 30-50% of their mass in the trimming procedure, while jets containing the decay products of a boosted object lose less of their mass, with most of the reduction due to the removal of pile-up or UE (see, for example, figures 29 and 32). The fraction removed increases with the number of pp interactions in the event.

Six configurations of trimmed jets are studied here, arising from combinations of $f_{\mathrm{cut}}$ and $R_{\mathrm{sub}}$, given in table 1. They are based on the optimized parameters in ref. [12] ($f_{\mathrm{cut}} = 0.03$, $R_{\mathrm{sub}} = 0.2$) and variations suggested by the authors of the algorithm. This set represents a wide range of phase space for trimming and is somewhat broader than considered in ref. [12].
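The selection step of trimming amounts to a filter on the subjet momentum fraction. The sketch below assumes the $k_t$ subjets of size $R_{\mathrm{sub}}$ have already been built and are passed in as four-momentum tuples; the helper names and the toy subjets are hypothetical.

```python
import math


def add(p, q):
    """Sum two four-momenta given as (px, py, pz, E) tuples."""
    return tuple(a + b for a, b in zip(p, q))


def pt(p):
    return math.hypot(p[0], p[1])


def mass(p):
    m2 = p[3] ** 2 - p[0] ** 2 - p[1] ** 2 - p[2] ** 2
    return math.sqrt(max(m2, 0.0))


def total(momenta):
    out = (0.0, 0.0, 0.0, 0.0)
    for p in momenta:
        out = add(out, p)
    return out


def trim(subjets, f_cut):
    """Drop subjets with pT_i / pT_jet < f_cut; return (trimmed jet, kept subjets).

    pT_jet is taken from the sum of all input subjets, i.e. the ungroomed jet.
    """
    jet_pt = pt(total(subjets))
    kept = [s for s in subjets if pt(s) / jet_pt >= f_cut]
    return total(kept), kept


# Toy input: two hard subjets and one soft subjet (all massless).
subjets = [
    (200.0, 0.0, 0.0, 200.0),
    (100.0, 0.0, 50.0, math.hypot(100.0, 50.0)),
    (5.0, 0.0, -20.0, math.hypot(5.0, 20.0)),
]
trimmed, kept = trim(subjets, f_cut=0.03)
print("kept subjets:", len(kept))
print("mass before:", mass(total(subjets)), "after:", mass(trimmed))
```

Even a soft subjet can shift the jet mass noticeably when it sits at a wide angle, which is why removing it improves the mass resolution.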
Pruning: the pruning algorithm [3,11] is similar to trimming in that it removes constituents with a small relative p T , but it additionally applies a veto on wide-angle radiation. The pruning procedure is invoked at each successive recombination step of the jet algorithm (either C/A or k t ). It is based on a decision at each step of the jet reconstruction whether or not to add the constituent being considered. As such, it does not require the reconstruction of subjets. For all studies performed for this paper, the k t algorithm is used in the pruning procedure. This results in definitions of the terms wide-angle or soft that are not directly related to the original jet but rather to the proto-jets formed in the process of rebuilding the pruned jet.
The procedure is as follows:
• The C/A or $k_t$ recombination jet algorithm is run on the constituents, which were found by any jet finding algorithm.
• At each recombination step of constituents $j_1$ and $j_2$ (where $p_{\mathrm{T}}^{j_1} > p_{\mathrm{T}}^{j_2}$), either $p_{\mathrm{T}}^{j_2}/p_{\mathrm{T}}^{j_1+j_2} > z_{\mathrm{cut}}$ or $\Delta R_{j_1,j_2} < R_{\mathrm{cut}} \times (2m^{\mathrm{jet}}/p_{\mathrm{T}}^{\mathrm{jet}})$ must be satisfied. Here, $z_{\mathrm{cut}}$ and $R_{\mathrm{cut}}$ are parameters of the algorithm which are studied in this paper.
• $j_2$ and $j_1$ are merged if one or both of the above criteria are met; otherwise, $j_2$ is discarded and the algorithm continues.
The pruning procedure is illustrated in figure 5. Six configurations, given in table 1, based on combinations of z cut and R cut are studied here. This set of parameters also represents a relatively wide range of possible configurations.
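The pruning steps listed above can be sketched as a toy $k_t$ reclustering with a veto at each merge. This is an illustration under stated assumptions, not the implementation used in the paper: the `P4` class and function names are hypothetical, beam distances are ignored so that all constituents end up in one jet, $m^{\mathrm{jet}}$ and $p_{\mathrm{T}}^{\mathrm{jet}}$ are taken from the sum of the input constituents, and the parameter values in the example are arbitrary.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class P4:
    """Four-momentum (px, py, pz, E) in GeV; hypothetical helper class."""
    px: float
    py: float
    pz: float
    e: float

    def __add__(self, other):
        return P4(self.px + other.px, self.py + other.py,
                  self.pz + other.pz, self.e + other.e)

    @property
    def pt(self):
        return math.hypot(self.px, self.py)

    @property
    def rapidity(self):
        return 0.5 * math.log((self.e + self.pz) / (self.e - self.pz))

    @property
    def phi(self):
        return math.atan2(self.py, self.px)

    @property
    def mass(self):
        m2 = self.e**2 - self.px**2 - self.py**2 - self.pz**2
        return math.sqrt(max(m2, 0.0))


def massless(pt, y, phi):
    return P4(pt * math.cos(phi), pt * math.sin(phi),
              pt * math.sinh(y), pt * math.cosh(y))


def delta_r(a, b):
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a.rapidity - b.rapidity, dphi)


def prune(constituents, z_cut, r_cut):
    """Recluster with kt ordering, vetoing soft wide-angle merges."""
    jet = constituents[0]
    for c in constituents[1:]:
        jet = jet + c
    max_dr = r_cut * 2.0 * jet.mass / jet.pt
    protojets = list(constituents)
    while len(protojets) > 1:
        # kt ordering: smallest min(pT_i, pT_j) * DeltaR_ij is combined first.
        pairs = [(min(a.pt, b.pt) * delta_r(a, b), i, j)
                 for i, a in enumerate(protojets)
                 for j, b in enumerate(protojets) if i < j]
        _, i, j = min(pairs)
        a, b = protojets[i], protojets[j]
        hard, soft = (a, b) if a.pt >= b.pt else (b, a)
        merged = a + b
        accept = soft.pt / merged.pt > z_cut or delta_r(a, b) < max_dr
        protojets = [p for k, p in enumerate(protojets) if k not in (i, j)]
        # If neither criterion holds, the softer branch is discarded.
        protojets.append(merged if accept else hard)
    return protojets[0]


toy = [massless(100.0, 0.0, 0.0), massless(100.0, 0.8, 0.0),
       massless(2.0, -0.9, 0.0)]
pruned = prune(toy, z_cut=0.1, r_cut=0.1)  # illustrative parameter values
print("pruned pT:", pruned.pt, "pruned mass:", pruned.mass)
```

In the toy input the soft constituent is both soft and at wide angle, so it fails both criteria and is dropped, while the two hard prongs are merged.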

HEPTopTagger
The HEPTopTagger algorithm [32,54] is designed to identify a top quark with a hadronically decaying W boson daughter over a large multi-jet background. The method uses the C/A jet algorithm and a variant of the mass-drop filtering technique described in section 1.2.3 in order to exploit information about the recombination history of the jet. This information is used to search for evidence within the jet of the presence of W decay products as well as an additional energy deposition -the b-quark -that are consistent with the W and top masses and the expected angular distribution of the final-state quarks. The HEPTopTagger algorithm is optimized for top-quark transverse momentum as low as 200 GeV and therefore uses a correspondingly large jet radius parameter. The algorithm proceeds as follows and is illustrated in figure 6.
Decomposition into substructure objects: The mass-drop criterion defined in eq. (1.4) is applied to a large-R C/A jet, where $j_1$ and $j_2$ are the two subjets from the last stage of clustering, with $m^{j_1} > m^{j_2}$. If the criterion is satisfied, the same prescription is followed to split both $j_1$ and $j_2$ further. The iterative splitting continues until the subjets either have masses $m_i$ less than a tunable parameter $m_{\mathrm{cut}}$, or represent individual constituents, such as calorimeter energy deposits, tracks, or generator-level particles (i.e. no clustering history); see section 3.1 for definitions. This procedure results in $N_i$ subjets. If at any stage $m^{j_1} > (m^{\mathrm{jet}} \times \mu_{\mathrm{frac}})$, the mass-drop criterion and subsequent iterative declustering are not applied to $j_2$. The values of $m_{\mathrm{cut}}$ and $R$ studied are summarized in table 2. $R$ values of 1.5 and 1.8, somewhat larger than used generally in mass-drop filtering, are chosen based on previous studies [32,55]. When the iterative process of declustering the jet is complete, there must exist at least three subjets ($N_i \geq 3$), otherwise the jet is discarded.
Filtering: all possible combinations of three subjets are formed, and each triplet is filtered one at a time. The constituents of the subjets in a given triplet are reclustered into N j new subjets using the C/A algorithm with a size parameter R filt = min[0.3, ∆R j 1 ,j 2 /2], where ∆R j 1 ,j 2 is the minimum separation between all possible pairs in the current triplet. It is therefore possible that N j > 3 after the reclustering step. All energy deposits not in the N j subjets are discarded.
Top mass window requirement: if the invariant mass of the four-momentum determined by summing the constituents of the N j subjets is not in the range 140 GeV ≤ m jet < 200 GeV, the triplet combination is ignored. If more than one triplet satisfies the criterion, only the one with mass closest to the top-quark mass, m t , is used. This triplet (which consists of N j ≥ 3 subjets) is thus identified as the top-candidate triplet.
Reclustering of subjets: from the N j subjets of the top-candidate triplet, N subjet leading-p T subjets are chosen, where N subjet is a parameter satisfying 3 ≤ N subjet ≤ N j . From this set of subjets, exactly three jets are built by re-applying the C/A algorithm to the constituents of the N subjet subjets, which are exclusively clustered using a distance parameter R jet listed in table 2. This latter step reflects the hypothesis that this is likely to be a top-jet candidate. These subjets are calibrated as described in section 3.5.
W boson mass requirements: relations listed in eqs. (A.1) of ref. [32] are defined using the total invariant mass of the three subjets (m 123 ) and the invariant mass m ij formed from combinations of two of the three subjets ordered in p T . These conditions include: Here, given in table 2), and the quantities m W and m t denote the W boson and top-quark masses, respectively. If at least one of the criteria in eqs. (A.1) of [32] is met, the four-momentum addition of the three subjets is considered a candidate top quark.
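The top mass window requirement described above (keep triplets with 140 GeV ≤ m < 200 GeV and, if several pass, the one closest to the top-quark mass) can be sketched as a small selection function. The function name is hypothetical, the triplet masses are assumed to have been computed from the filtered subjets beforehand, and the top-quark mass value of 172.5 GeV is an assumption made for illustration only.

```python
def select_top_candidate(triplet_masses, m_top=172.5, low=140.0, high=200.0):
    """Return the index of the triplet chosen as top candidate, or None.

    triplet_masses: invariant masses (GeV) of the filtered triplets
    m_top:          reference top-quark mass (assumed value)
    low, high:      mass window, inclusive below and exclusive above
    """
    in_window = [(abs(m - m_top), idx)
                 for idx, m in enumerate(triplet_masses)
                 if low <= m < high]
    if not in_window:
        return None
    # The smallest distance to m_top wins; ties resolve to the lower index.
    return min(in_window)[1]


masses = [130.0, 150.0, 175.0, 190.0, 205.0]
best = select_top_candidate(masses)
print("selected triplet index:", best)
```

A return value of `None` corresponds to the case in which every triplet combination is ignored and the jet yields no top candidate.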

Figure 6. The HEPTopTagger procedure.
(a) Every object encountered in the declustering process is considered a 'substructure object' if it is of sufficiently low mass or has no clustering history.
(b) The mass-drop criterion is applied iteratively, following the highest subjet-mass line through the clustering history, resulting in $N_i$ substructure objects.
(c) For every triplet-wise combination of the substructure objects found in (b), recluster the constituents into subjets and select the $N_{\mathrm{subjet}}$ leading-$p_{\mathrm{T}}$ subjets, with $3 \leq N_{\mathrm{subjet}} \leq N_i$ (here, $N_{\mathrm{subjet}} = 5$).
(d) Recluster the constituents of the $N_{\mathrm{subjet}}$ subjets into exactly three subjets to make the top candidate for this triplet-wise combination of substructure objects.
2 The ATLAS detector and data samples

The ATLAS detector
The ATLAS detector [56, 57] provides nearly full solid angle coverage around the collision point with an inner tracking system covering |η| < 2.5, electromagnetic and hadronic calorimeters covering |η| < 4.9, and a muon spectrometer covering |η| < 2.7. Of the multiple ATLAS subsystems, the most relevant to this analysis are the barrel and endcap calorimeters [58,59] and the trigger [60]. The calorimeter comprises multiple sub-detectors of various designs, spanning the pseudorapidity range up to |η| = 4.9. The measurements presented here are performed using data predominantly from the central calorimeters, comprising the liquid argon (LAr) barrel electromagnetic calorimeter (|η| < 1.475) and the tile hadronic calorimeter (|η| < 1.7). Three additional calorimeter subsystems are located in the higher-η regions of the detector: the LAr electromagnetic endcap calorimeter, the LAr hadronic endcap calorimeter, and the forward calorimeter with separate components for electromagnetic and hadronic showers.
Dedicated trigger and data acquisition systems are responsible for the online event selection, which is performed in three stages: Level 1, Level 2, and the Event Filter. The measurements presented in this paper rely primarily on the single-jet and multi-jet triggers implemented at the Event Filter level, which has access to the full detector granularity, and finds multi-jet events with high efficiency. The intermediate trigger levels provide coarser jet finding and sufficient rate reduction to satisfy the trigger and offline selection requirements.

Data and Monte Carlo samples
Data from the entire 2011 ATLAS data-taking period are used, corresponding to 4.7 ± 0.1 fb −1 of integrated luminosity [61]. All data are required to have met baseline quality criteria and were taken during periods in which almost all of the detector was fully functional. Data quality criteria reject events with significant contamination from detector noise or with issues in the read-out, and are based on assessments for each subdetector individually. Multiple proton-proton collisions, or pile-up, result in several reconstructed primary vertices per event. The inclusive jet sample that is used for many studies in this paper is selected from the data using a single high-p T jet trigger that requires the leading jet in the event to have p jet T > 350 GeV. This trigger threshold was used for the entire 2011 data-taking period and thus represents the full integrated luminosity with negligible inefficiency.
These data are compared to simulated inclusive jet events, which are dominated by light-quark or gluon jets whose properties are defined primarily by soft gluon emission. The events are generated by three Monte Carlo (MC) simulation programs: PYTHIA 6.425 [10], HERWIG++ [62], and POWHEG-BOX 1.0 [63][64][65] (patch 4) interfaced to PYTHIA 6.425 for the parton shower, hadronization, and underlying event (UE) models. Both PYTHIA and HERWIG++ use the modified-LO parton distribution function (PDF) set MRST LO* [66]. POWHEG+PYTHIA uses the CT10 NLO PDF [67] in the matrix element and the CTEQ6L1 PDF set [68] for the PYTHIA parton shower. In both cases, PYTHIA is used with the corresponding ATLAS AUET2B tune [69,70], and HERWIG++ uses the so-called UE7-2 tune [71], which is tuned to UE data from the LHC experiments. Comparing PYTHIA or HERWIG++ with POWHEG+PYTHIA provides an important test, at least at the matrix-element level, of leading-order (LO) (PYTHIA and HERWIG++) against next-to-leading-order (NLO) (POWHEG) calculations. Furthermore, PYTHIA and HERWIG++ offer distinct approaches to the modelling of the parton shower, hadronization, and the UE.
Samples of tt events are generated with MC@NLO v4.01 [72] using the CT10 NLO PDF, interfaced to HERWIG v6.520 [73] and JIMMY v4.31 [74]. Alternative samples for the study of systematic uncertainties are generated with POWHEG, with showering provided by either HERWIG or PYTHIA. Samples generated with the AcerMC v3.8 [75] package, using CTEQ6L1 PDFs, with showering provided by PYTHIA are also used. In these samples PYTHIA parameters have been tuned to increase or decrease the amount of initial- and final-state radiation. Single-top-quark events in the s-channel and Wt processes are also generated with MC@NLO using the CT10 NLO PDF set, with only leptonically decaying W bosons allowed in the final state. Single-top-quark events in the t-channel, where all W boson decay channels are produced, are generated with AcerMC using the CTEQ6L1 PDF set, and are showered using PYTHIA with the AUET2B tune.
Samples of W+jets and Z+jets events are produced with the ALPGEN v2.13 [76] generator, using CTEQ6L1 PDFs, interfaced to HERWIG for parton showering and hadronization. Samples of diboson production processes (WW, WZ and ZZ) are produced with the HERWIG generator.

After simulation of the parton shower and hadronization, as well as of the UE, events are passed through the full Geant4 [77] detector simulation [78]. Following this, the same trigger, event, data quality, jet, and track selection criteria are applied to the Monte Carlo simulation events as are applied to the data.
Boosted particles decaying to hadrons are used for direct comparisons of the performance of the various reconstruction and jet substructure techniques. For two-pronged decays, a sample of hadronically decaying Z bosons is generated using the HERWIG v6.510 [73] event generator interfaced with JIMMY v4.31 [74] for the UE. A sample of hadronically decaying W bosons produced using the same configuration as for the Z boson sample is also used for comparisons of the HEPTopTagger performance in section 5.3.3. In order to test the performance of techniques designed for three-pronged decays, tt events from a non-Standard-Model heavy gauge boson (Z′ with m Z′ = 1.6 TeV) are generated using the same PYTHIA 6.425 tune as above. This model provides a relatively narrow tt resonance and top quarks with p T ≈ 800 GeV.
Pile-up is simulated by overlaying additional soft pp collisions, or minimum bias events, which are generated with PYTHIA 6.425 using the ATLAS AUET2B tune [70] and the CTEQ6L1 PDF set. The minimum bias events are overlaid onto the hard scattering events according to the measured distribution of the average number µ of pp interactions. The proton bunches were organized in trains of 36 bunches with a 50 ns spacing between the bunches. Therefore, the simulation also contains effects from out-of-time pile-up, i.e. contributions from the collision of bunches neighbouring those where the events of interest occurred. Simulated events are reweighted such that the MC distribution of µ agrees with the data, as measured by the luminosity detectors in ATLAS [61].
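The reweighting of simulated events to the measured µ distribution can be illustrated with a minimal sketch. It assumes a simple histogram-ratio scheme with unit-width bins; the function names and the toy µ values are invented for illustration and are not taken from the ATLAS software.

```python
from collections import Counter


def pileup_weights(mu_data, mu_mc, bin_width=1.0):
    """Per-bin weights w = f_data(mu) / f_mc(mu) from normalised histograms."""

    def normalised_hist(values):
        counts = Counter(int(v // bin_width) for v in values)
        total = sum(counts.values())
        return {b: n / total for b, n in counts.items()}

    f_data = normalised_hist(mu_data)
    f_mc = normalised_hist(mu_mc)
    # Bins populated in MC but empty in data receive weight 0.
    return {b: f_data.get(b, 0.0) / f for b, f in f_mc.items()}


def event_weight(mu, weights, bin_width=1.0):
    """Weight for one simulated event with pile-up value mu."""
    return weights.get(int(mu // bin_width), 0.0)


# Toy inputs (hypothetical values, for illustration only).
mu_data = [5, 5, 6, 6, 6, 7, 7, 8]
mu_mc = [5, 6, 6, 7, 7, 7, 8, 8]
weights = pileup_weights(mu_data, mu_mc)
reweighted_total = sum(event_weight(mu, weights) for mu in mu_mc)
```

With equal-sized toy samples the sum of the weights equals the number of simulated events, so the reweighting changes the shape of the µ distribution but not the overall normalisation.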

Inputs to jet reconstruction
The inputs to jet reconstruction are either stable particles with a lifetime of at least 10 ps (excluding muons and neutrinos) in the case of MC generator-level jets (also referred to as particle jets), charged particle tracks in the case of so-called track-jets [53], or three-dimensional topological clusters (topo-clusters) in the case of fully reconstructed calorimeter-jets. Stable particles, such as pions or protons in the simulation, retain their respective masses when input to the jet reconstruction algorithm. Tracks are assigned the pion mass when used as input to the jet algorithm. Quality selections are applied in order to ensure that good quality tracks that originate from the reconstructed hard scattering vertex are used to build track-jets. The hard scattering vertex is selected as the primary vertex that has the largest Σ(p track T ) 2 in the event and that contains at least two tracks. The selection criteria are:
• transverse momentum: p track T > 0.5 GeV;
• transverse impact parameter: |d 0 | < 1.0 mm;
• longitudinal impact parameter: |z 0 | × sin(θ) < 1.0 mm;
• silicon detector hits on tracks: hits in pixel detector ≥ 1 and in the silicon strip detector ≥ 6;
where the impact parameters are computed with respect to the hard scattering vertex, and θ is the angle between the track and the beam.
In the reconstruction of calorimeter jets, calorimeter cells are clustered together using a three-dimensional topological clustering algorithm that includes noise suppression [79]. The resulting topo-clusters are considered as massless four-momenta, such that E = |p|. They are classified as either electromagnetic or hadronic based on their shape, depth and energy density. In the calibration procedure, corrections are applied to the energy in order to calibrate the clusters to the hadronic scale.
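The track-quality requirements and the choice of hard scattering vertex described above can be written compactly as follows. This is only a sketch: the `Track` container, its field names, and the representation of a vertex as a list of track p T values are assumptions made for illustration, not the ATLAS event data model.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    pt: float          # transverse momentum in GeV
    d0: float          # transverse impact parameter in mm
    z0: float          # longitudinal impact parameter in mm
    theta: float       # polar angle with respect to the beam, in rad
    n_pixel_hits: int
    n_sct_hits: int    # hits in the silicon strip detector


def passes_track_selection(t):
    """Apply the four track-quality criteria quoted in the text."""
    return (
        t.pt > 0.5
        and abs(t.d0) < 1.0
        and abs(t.z0 * math.sin(t.theta)) < 1.0
        and t.n_pixel_hits >= 1
        and t.n_sct_hits >= 6
    )


def hard_scatter_vertex(vertices):
    """Pick the vertex with the largest sum of track pT^2 among those with >= 2 tracks.

    Each vertex is represented here simply as a list of track pT values (GeV).
    Returns None if no vertex has at least two tracks.
    """
    candidates = [v for v in vertices if len(v) >= 2]
    return max(candidates, key=lambda pts: sum(pt ** 2 for pt in pts), default=None)
```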

Jet quality criteria and selection
All jets in the event reconstructed with the anti-k t algorithm with R = 0.4 and a measured p jet T > 20 GeV are required to satisfy the looser requirements discussed in detail in ref. [80]. These selections are designed to retain good quality jets while rejecting as large a fraction as possible of those from non-collision beam background and calorimeter noise. Jets are required to deposit at least 5% of their measured total energy in the electromagnetic (EM) calorimeter as well as not more than 99% of their energy in a single calorimeter layer.
To prevent contamination from detector noise, these jet quality criteria are extended by several requirements applied in a specific detector region. Any event with an anti-k t R = 0.4 jet with p jet T > 20 GeV that fails the above non-collision beam background or noise rejection requirements is removed from the analysis.

Jet calibration and systematic uncertainties
The precision and accuracy of energy measurements made by the calorimeter system are integral to every physics analysis, and the procedures to calibrate jets are described in ref. [53]. The baseline energy scale of the calorimeters derives from the calibration of the electronic signal arising from the energy deposited by electromagnetic showers measured in beam tests, known as the electromagnetic scale. The hadronic calorimeter has been calibrated with electrons, pions, and muons in beam tests and the energy scale has been validated using muons produced by cosmic rays with the detector in situ in the experimental hall [59]. The invariant mass of the Z boson in Z → ee events measured in situ in the same data sample studied here is used to adjust the calibration for the EM calorimeters.

Monte Carlo based calibration
The MC hadronic calibration scheme starts from the measured calorimeter energy at the electromagnetic (EM) energy scale [81][82][83][84][85][86][87][88][89], which correctly measures the energy deposited by electromagnetic showers. A local cluster weighting (LCW) calibration method classifies topological clusters along a continuous scale as being electromagnetic or hadronic, using shower shapes and energy densities. Energy corrections are applied to hadronic clusters based on this classification scheme, which is derived from single pion MC simulations and tested in situ using beam tests. These corrections account for the effects of non-compensation, signal losses due to noise suppression and out-of-cluster effects in building topo-clusters, and energy lost in non-instrumented regions of the calorimeters. The results shown here use LCW clusters as input to the jet algorithm.
The final jet energy scale (JES) calibration is derived as a correction relating the calorimeter's response to the true jet energy. The jet energy response is defined as the ratio of the reconstructed calorimeter jet energy (or transverse energy) to that of its matched truth particle jet energy. The inverse of this response ratio defines the multiplicative jet energy scale factor that is dependent on the jet energy and pseudorapidity. This JES factor is then applied to the full jet four-momentum. It can be applied to EM scale jets, with the resulting calibrated jets referred to as EM+JES, or to LCW calibrated jets, with the resulting jets referred to as LCW+JES jets. More details regarding the evaluation and validation of this approach for standard anti-k t R = 0.4, 0.6 jets can be found in ref. [53].
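As a minimal illustration of the relation between the jet energy response and the JES factor, the sketch below computes a mean response from matched (reconstructed, truth) energies in a single bin and applies its inverse to a four-momentum. The function names and toy energies are hypothetical; the actual calibration is derived in bins of jet energy and pseudorapidity as described in ref. [53].

```python
def mean_response(matched_energies):
    """Mean of E_reco / E_truth over matched jet pairs in one (E, eta) bin."""
    ratios = [e_reco / e_truth for e_reco, e_truth in matched_energies]
    return sum(ratios) / len(ratios)


def jes_factor(response):
    """The multiplicative JES factor is the inverse of the response."""
    return 1.0 / response


def apply_jes(four_momentum, factor):
    """Scale the full four-momentum (E, px, py, pz) by the JES factor."""
    return tuple(factor * component for component in four_momentum)


# Toy example: two matched jets with responses 0.80 and 0.90 (invented values).
response = mean_response([(80.0, 100.0), (90.0, 100.0)])
calibrated = apply_jes((85.0, 0.0, 0.0, 85.0), jes_factor(response))
```

Because every component is scaled by the same factor, the jet direction is unchanged while the energy, momentum and mass all scale together.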
The JES correction used here for large-R jets is derived from a PYTHIA MC sample including pile-up events. For standard jet algorithms, the dependence of the jet response on the number of primary vertices (N PV ) and the average number of interactions (µ) is removed by applying a pile-up offset correction to the EM or LCW scale before applying the JES correction [53]. No such explicit pile-up offset correction is applied to large-R jets or to jets with the various grooming algorithms applied.
Since one of the primary goals of the use of large-R and groomed jet algorithms is to reconstruct the masses of jets accurately and precisely, a last step is added to the calibration procedure of large-R jets wherein the mass of the jet is calibrated based on the MC simulation of dijet events. An explicit jet mass calibration is important when using the individual invariant jet mass in physics analyses since it is particularly susceptible to soft, wide-angle contributions that do not otherwise significantly impact the jet energy scale. The procedure measures the jet mass response for jets built from LCW clusters after the standard JES calibration. The mass response is determined from the mean of a Gaussian fit to the core of the distribution of the reconstructed jet mass divided by the corresponding generator-level jet mass. Figure 7 shows the jet mass response (m reco /m true ) for several values of jet energy as a function of η for anti-k t , R = 1.0 jets, before and after calibration to the true jet mass and without jet grooming. In each case, the jet energy itself has been calibrated by applying the JES correction. One can see from this figure that even very high-energy jets near the central part of the detector can have a mean mass scale (or JMS) differing by up to 20% from the particle level true jet mass. In particular, the reconstructed mass is, on average, greater than that of the particle-level jet due in part to noise and pile-up in the detectors. Furthermore, the finite resolution of the detector has a differential impact on the mass response as a function of η. Following the jet mass calibration, performed also as a function of η, a uniform mass response can be restored within 3% across the full energy and η range. Results for C/A jets with R = 1.2 are similar, but with an additional ∼ 1% non-closure due to the increased contamination from pile-up, as these jets have a larger distance parameter compared to the anti-k t , R = 1.0 jets. 
After applying grooming to the C/A jets, such as the mass-drop filtering procedure, the pile-up contamination is reduced and the jet mass response variation is reduced to less than 3%.
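The jet mass calibration step can be sketched as follows. The paper determines the mass response from a Gaussian fit to the core of the m reco /m true distribution; the sketch below substitutes an iteratively truncated mean as a simple stand-in for that fit, and the function names, truncation width and toy masses are all assumptions made for illustration.

```python
import statistics


def core_mean(values, n_sigma=1.5, max_iterations=3):
    """Stand-in for a Gaussian core fit: mean after iteratively dropping tails."""
    core = list(values)
    for _ in range(max_iterations):
        mu = statistics.fmean(core)
        sigma = statistics.pstdev(core)
        kept = [v for v in core if abs(v - mu) <= n_sigma * sigma]
        if len(kept) < 2 or len(kept) == len(core):
            break
        core = kept
    return statistics.fmean(core)


def mass_response(m_reco, m_true):
    """Core mean of m_reco / m_true for matched jets in one (E, eta) bin."""
    return core_mean([r / t for r, t in zip(m_reco, m_true)])


def calibrate_mass(m_reco_value, response):
    """Correct a reconstructed jet mass by the mass response of its bin."""
    return m_reco_value / response


# Toy bin with a 20% upward mass shift and one far outlier (invented values).
m_true = [100.0] * 5
m_reco = [118.0, 120.0, 122.0, 120.0, 300.0]
response = mass_response(m_reco, m_true)
```

In this toy bin the outlier is removed by the truncation, so the extracted response reflects the core of the distribution rather than its tail.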

Jet mass scale validation in inclusive jet events using track-jets
In order to validate the jet mass measurement made by the calorimeter, calorimeter-jets are compared to track-jets. Track-jets have a different set of systematic uncertainties and allow a reliable determination of the relative systematic uncertainties associated with the calorimeter-based measurement. Performance studies [90] have shown that there is excellent agreement between the measured positions of clusters and tracks in data, indicating no systematic misalignment between the calorimeter and the inner detector. The use of track-jets reduces or eliminates the impact of additional pp collisions by requiring the jet inputs (tracks) to come from the hard-scattering vertex. The inner detector and the calorimeter have largely uncorrelated instrumental systematic effects, and so a comparison of variables such as jet mass and energy between the two systems allows a separation of physics (correlated) and detector (uncorrelated) effects. It is therefore possible to validate the JES and JMS, and also to estimate directly the pile-up energy contribution to jets. This approach was used extensively in the measurement of the jet mass and substructure properties of jets in the 2010 data [25] where pile-up was significantly less important and the statistical reach of the measurement was smaller than with the full integrated luminosity of 4.7 fb −1 for the 2011 dataset.
The relative uncertainty is determined using the ratio of the transverse momentum of the calorimeter jet, p jet T , to that of the track-jet, p track jet T . The same procedure is repeated for the jet mass, m jet , by using the track-jet mass, m track jet . The ratios are defined as

$$ r^{p_{\rm T}}_{\text{track jet}} = \frac{p_{\rm T}^{\text{jet}}}{p_{\rm T}^{\text{track jet}}}, \qquad r^{m}_{\text{track jet}} = \frac{m^{\text{jet}}}{m^{\text{track jet}}}, $$

where the matching between calorimeter and track-jets is performed using a matching criterion of ∆R < 0.3. The mean values of these ratios are expected to be well described by the detector simulation if detector effects are well modelled. That is to say, even if some underlying physics process is unaccounted for in the simulation, as long as this process affects both the track-jet and calorimeter-jet p T or masses in a similar way, then the ratio of data to simulation should be relatively unaffected when averaged over many events.
Double ratios of r m track jet and r p T track jet are constructed in order to evaluate this agreement. These double ratios, R p T r track jet and R m r track jet , are defined as:

$$ R^{p_{\rm T}}_{r\,\text{track jet}} = \frac{r^{p_{\rm T},\,\text{data}}_{\text{track jet}}}{r^{p_{\rm T},\,\text{MC}}_{\text{track jet}}}, \qquad R^{m}_{r\,\text{track jet}} = \frac{r^{m,\,\text{data}}_{\text{track jet}}}{r^{m,\,\text{MC}}_{\text{track jet}}}. $$

The dependence of R p T r track jet and R m r track jet on p jet T and m jet provides a test of the deviation of simulation from data, thus allowing an estimate of the uncertainty associated with the Monte Carlo derived calibration. Figure 8 shows the distribution of r m track jet for two jet algorithms and for jets in the range 500 GeV ≤ p jet T < 600 GeV in the central calorimeter region, |η| < 0.8. Comparisons between MC simulation and the data are made using PYTHIA, HERWIG++, and POWHEG+PYTHIA, where the distributions are normalized to the number of events observed in the data. This p jet T range is chosen for illustrative purposes and because of its relevance to searches for boosted vector bosons and top quarks, as the decay products of both are expected to be fully merged into a large-R jet in this transverse momentum range. The peak near r m track jet ≈ 2 and the shape of the distribution are both generally well described by the simulation. The comparison is made primarily to test the overall scale, so that the important comparison is of the mean values of the distributions, which are quite well described.
The relative systematic uncertainty is first estimated for each MC generator sample as the weighted average absolute deviation of the double ratio, R m r track jet , from unity. Measurements of R m r track jet are performed in exclusive p jet T and η ranges. The statistical uncertainty is used as the weight in this case. The final relative uncertainty is then determined by the maximum of the weighted average deviation among the MC samples considered. Comparisons are made using PYTHIA, HERWIG++, and POWHEG+PYTHIA. Figure 9 presents the distributions of both r m track jet and the double ratio with respect to MC simulation, R m r track jet , for the same algorithms and grooming configurations as shown in figure 8. In the peak of the jet mass distribution, logarithmic soft terms dominate [8] and lower-p T particles constitute a large fraction of the calorimeter-jet mass. These particles are bent more by the magnetic field than higher-p T particles, or are not reconstructed as charged-particle tracks, and thus contribute more to the calorimeter-jet mass than the track-jet mass. At much lower calorimeter-jet masses, charged particles can be completely bent out of the jet acceptance, thus reducing the calorimeter-jet mass for a fixed track-jet mass. These effects result in the shape observed in the r m track jet distribution in this region. Higher-mass jets tend to be composed of multiple higher-p T particles that are less affected by the magnetic field and therefore contribute more similarly to the calorimeter-based and track-based mass reconstruction. This results in a flatter and fairly stable r m track jet ratio. This flat r m track jet distribution is present across the mass range for trimmed and mass-drop filtered (not shown) jet masses, as both these algorithms are designed to remove softer particles. 
Although there is a difference in the phase space of emissions probed at low and high mass, the mean calorimeter response relative to the tracker response is well modelled by each of the three MC simulations.
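The procedure for turning double ratios into a relative systematic uncertainty can be sketched as below. The text states that the statistical uncertainty is used as the weight; the sketch assumes inverse-variance weights (1/σ²), which is one common reading and is not confirmed by the text. The generator labels are those used in the paper, while the double-ratio values and uncertainties are invented for illustration.

```python
def weighted_mean_abs_deviation(double_ratios, stat_errors):
    """Weighted average of |R - 1| over bins, using assumed weights 1/sigma^2."""
    weights = [1.0 / err ** 2 for err in stat_errors]
    numerator = sum(w * abs(r - 1.0) for w, r in zip(weights, double_ratios))
    return numerator / sum(weights)


def relative_uncertainty(per_generator):
    """Final uncertainty: the maximum weighted deviation among the MC samples."""
    return max(
        weighted_mean_abs_deviation(ratios, errors)
        for ratios, errors in per_generator.values()
    )


# Toy double ratios R^m in two bins per generator (hypothetical numbers).
toy = {
    "PYTHIA": ([1.02, 0.97], [0.01, 0.01]),
    "HERWIG++": ([1.04, 1.02], [0.01, 0.02]),
    "POWHEG+PYTHIA": ([0.99, 1.01], [0.01, 0.01]),
}
uncertainty = relative_uncertainty(toy)
```

Taking the maximum over generators is conservative: the quoted uncertainty covers the worst-described simulation rather than the average one.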
The weighted average deviation of R m r track jet from unity ranges from approximately 2% to 4% for the set of jet algorithms and grooming configurations tested for jets in the range 500 GeV ≤ p jet T < 600 GeV and in the central calorimeter, |η| < 0.8. The results are fairly stable for the slightly less central η range 0.8 ≤ |η| < 1.2. Figure 10 presents the full set of jet mass scale systematic uncertainties for various jet algorithms estimated using the calorimeter-to-track-jet double ratios. The total relative uncertainty includes the 3% uncertainty on the precision of the jet mass scale calibration (see figure 7(b)) as well as the uncertainty on the track measurements themselves. The latter uncertainty takes into account the knowledge of tracking inefficiencies and their impact on the p jet T and m jet measurements using track-jets. Each of these two additional components is assumed to be uncorrelated and added in quadrature with the uncertainty determined solely from the calorimeter-to-track-jet double ratios. The uncertainties are smoothly interpolated between the multiple discrete η ranges in which they are estimated. Figure 10 shows the uncertainty evaluated at two points |η| = 0.0 (solid) and |η| = 1.0 (dashed) as a function of p jet T . The impact of the tracking efficiency systematic uncertainty on r m track jet is evaluated by randomly rejecting tracks used to construct track-jets according to the efficiency uncertainty. This is evaluated as a function of η and m jet for various p jet T ranges. Typically, this results in a 2-3% shift in the measured track-jet kinematics (both p T and mass) and thus a roughly 1% contribution to the resulting total uncertainty, since the tracking uncertainty is taken to be uncorrelated to that determined from the double ratios directly.
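The quadrature combination of the three components can be illustrated numerically. In the sketch below the 3% calibration term is taken from the text, while the double-ratio value of 3% and the tracking shift of 2.5% are example inputs chosen from the quoted ranges; the function name is hypothetical.

```python
import math


def total_jms_uncertainty(double_ratio_unc, calibration_unc=0.03, tracking_unc=0.025):
    """Combine uncorrelated relative uncertainties in quadrature."""
    return math.sqrt(double_ratio_unc ** 2 + calibration_unc ** 2 + tracking_unc ** 2)


# Example inputs within the ranges quoted in the text.
with_tracking = total_jms_uncertainty(0.03)
without_tracking = total_jms_uncertainty(0.03, tracking_unc=0.0)
tracking_contribution = with_tracking - without_tracking
```

For these inputs the total is about 4.9%, and including the 2.5% tracking term raises it by roughly 0.7 percentage points. This shows how a 2-3% shift can translate into only about a 1% contribution once the components are added in quadrature.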
The total systematic uncertainty on the jet mass scale is fairly stable near 4-5% for all jet algorithms up to p jet T ≈ 800 GeV. At low p jet T , in the range 200 GeV ≤ p jet T < 300 GeV, the average uncertainty for some jet algorithms rises to approximately 5-7%. The estimated uncertainty is similar for both the ungroomed and the trimmed or mass-drop filtered jets, except for trimmed anti-k t jets (see figure 10(b)) for which the uncertainty in the range 900 GeV ≤ p jet T < 1000 GeV and |η| = 1.0 is approximately 8%.

Jet mass scale validation using hadronic W decays in tt events
An alternative approach to validating the jet mass scale is to study a hadronically decaying particle with a known mass. The most accessible source of hadronically decaying massive particles is events containing top-quark pairs (tt). The tt process at the LHC has a relatively large cross-section and the final state contains two W bosons. About 15% of the time, one of the W bosons decays to a muon and neutrino, while the other decays to hadrons. This leads to an abundant source of events with a distinctive leptonic signature and a hadronically decaying known heavy particle. Signal muons are defined as having p T > 25 GeV and |η| < 2.5, as well as passing a number of quality criteria. Events are required to contain a signal muon and no additional muons or electrons. The missing transverse momentum (E miss T ) is calculated as the negative of the vector sum of the transverse momenta of all physics objects, at the appropriate energy scale, and the transverse momentum of any remaining topo-clusters not associated with physics objects in the event. Events are required to have E miss T > 25 GeV. In events passing this leptonic selection, a W candidate is constructed from the signal muon and E miss T . In order to reject multi-jet background, this candidate is required to have transverse mass (m T ) greater than 40 GeV (see footnote 3). Boosted hadronically decaying W boson candidates are defined as single large-R jets with p T > 200 GeV. The hadronic and leptonic W candidates are required to be separated by at least 1.2 radians in φ to minimize potential overlap between the decay products of the two W bosons. In order to further enhance the fraction of top-quark production processes, an anti-k t jet with R = 0.4, p T > 20 GeV and with ∆R > 1.0 to the hadronic W candidate is required. This additional jet is also required to be tagged as a b-jet by the MV1 algorithm at the 70% efficient working point [91].
Footnote 3: Transverse mass m T is defined as $\sqrt{E_{\rm T}^2 - p_{\rm T}^2}$ of the vector sum of the four-momentum of the signal muon and E miss T , assumed to be due to the neutrino.
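The leptonic part of this selection can be sketched in a few lines. The transverse mass follows the definition given in the text, with the additional assumption that the muon is treated as massless in the transverse plane; the function names and argument conventions are hypothetical.

```python
import math


def transverse_mass(mu_pt, mu_phi, met, met_phi):
    """m_T = sqrt(E_T^2 - p_T^2) of the muon + E_T^miss system (muon taken massless)."""
    e_t = mu_pt + met
    px = mu_pt * math.cos(mu_phi) + met * math.cos(met_phi)
    py = mu_pt * math.sin(mu_phi) + met * math.sin(met_phi)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(e_t ** 2 - (px ** 2 + py ** 2), 0.0))


def delta_phi(phi1, phi2):
    """Absolute azimuthal separation folded into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def passes_leptonic_selection(mu_pt, mu_eta, mu_phi, met, met_phi):
    """Muon pT and eta, E_T^miss and m_T requirements quoted in the text (GeV)."""
    return (
        mu_pt > 25.0
        and abs(mu_eta) < 2.5
        and met > 25.0
        and transverse_mass(mu_pt, mu_phi, met, met_phi) > 40.0
    )


def well_separated(phi_hadronic_w, phi_leptonic_w, min_dphi=1.2):
    """Hadronic and leptonic W candidates must be at least 1.2 rad apart in phi."""
    return delta_phi(phi_hadronic_w, phi_leptonic_w) >= min_dphi
```

For a massless muon this definition reduces to m T ² = 2 p T E miss T (1 − cos ∆φ), so a back-to-back muon and E miss T of 40 GeV each give m T = 80 GeV.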
The resulting mass distributions of hadronic W candidates for two jet algorithms can be seen in figure 11. A peak near the W mass is clearly observed for both the mass-drop filtering and trimming algorithms in data and simulated Monte Carlo events.
A fitting procedure is used to extract the features of the jet mass distribution. The peak produced by the hadronic decays of W bosons is modelled by a Voigtian function, which is the convolution of Gaussian and Lorentzian functions. In this function the width of the Lorentzian component is fixed to the world-average W boson width. The width of the Gaussian function is a measure of the mass resolution although this is not explored in this study. The background shape has no simple analytic form and is assumed to be modelled by a quadratic polynomial. In order to simplify the background modelling, the W+jets MC prediction and multi-jet prediction are both subtracted from the data and the resulting distributions are compared to the sum of the tt, single-top quark, and WW MC samples. The results of this fit for Cambridge-Aachen jets can be seen in figure 12.
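For reference, the fit model described above can be written out explicitly. The form below uses the conventional parametrization of the Voigt profile, with Γ the full width of the Lorentzian (fixed to the W boson width) and σ the Gaussian resolution; the normalisation N_W and the polynomial coefficients a_0, a_1, a_2 are notation introduced here for illustration and do not appear in the paper.

```latex
\begin{aligned}
V(m;\mu,\sigma,\Gamma) &= \int_{-\infty}^{\infty} G(m';\sigma)\, L(m-\mu-m';\Gamma)\,\mathrm{d}m', \\
G(x;\sigma) &= \frac{1}{\sigma\sqrt{2\pi}}\, \exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right), \qquad
L(x;\Gamma) = \frac{1}{\pi}\, \frac{\Gamma/2}{x^{2} + (\Gamma/2)^{2}}, \\
f(m) &= N_{W}\, V(m;\mu,\sigma,\Gamma) + a_{0} + a_{1} m + a_{2} m^{2}.
\end{aligned}
```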
After fitting, the parameter controlling the mean of the Voigtian function (µ) is a measure of the reconstructed mass of hadronically decaying W bosons. As shown in figure 12, this scale is observed to be µ data = 86.9 ± 0.8 GeV and µ MC = 87.4 ± 0.2 GeV. The departure from the world-average value of m W = 80.385 ± 0.015 GeV [92] is due to clustering and detector effects, as well as physics effects such as contributions from the UE and hadronization that are not completely removed by the grooming procedures. The ratio (µ data /µ MC ) is therefore the relevant figure of merit for any mismodelling of the reconstructed scale in Monte Carlo simulation. A relative scale difference of (−0.6 ± 1.0)% is found for C/A jets after mass-drop filtering, whereas the value is (0.5 ± 1.2)% for anti-k t jets after trimming. The uncertainties are statistical, as extracted from the fitting procedure.
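The relative scale difference can be cross-checked from the fitted means quoted for figure 12. The sketch below assumes uncorrelated statistical uncertainties and simple first-order error propagation, which may differ in detail from the treatment in the paper; the function name is hypothetical.

```python
import math


def relative_scale_difference(mu_data, err_data, mu_mc, err_mc):
    """Return (mu_data / mu_MC - 1) and its propagated statistical uncertainty."""
    ratio = mu_data / mu_mc
    ratio_err = ratio * math.sqrt((err_data / mu_data) ** 2 + (err_mc / mu_mc) ** 2)
    return ratio - 1.0, ratio_err


# Fitted means for C/A jets after mass-drop filtering, as quoted in the text (GeV).
difference, uncertainty = relative_scale_difference(86.9, 0.8, 87.4, 0.2)
```

This gives a difference of about −0.6% with an uncertainty slightly below 1%, in line with the quoted (−0.6 ± 1.0)% given the rounding of the inputs.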

It is also possible to repeat the fitting procedure after dividing the data and Monte Carlo simulation samples into bins of |η|. This allows a test of any potential |η| dependence in a range outside that covered by track-jet techniques. The result of these fits can be seen in figure 13. The results extracted in individual bins show no statistically significant deviation from the average. Furthermore, systematic uncertainties are evaluated for this technique. The Monte Carlo modelling is assessed by replacing the tt and single-top samples with alternative samples as described in section 2.2. The impact of mismodelling of the jet resolution is assessed by applying additional smearing. The effect of mismodelling of the p T scale is studied by shifting the scale in the Monte Carlo predictions. Other systematic uncertainties considered are: a possible bias in the fitting procedure; cross-section uncertainties; muon reconstruction uncertainties; energy, resolution and b-tagging uncertainties associated with the additional jet; E miss T reconstruction uncertainties. The most significant of these are the uncertainties associated with Monte Carlo modelling and jet resolution.

In situ validation of the subjet energy scale
The trimming and mass-drop filtering procedures both rely heavily on the energy scale of subjets in order to evaluate f cut or µ frac , respectively. A similar approach based on double ratios as described in section 3.3.2 is used in order to determine the uncertainty on the energy scale of these subjets. Tracks measured by the inner detector are utilized as an independent reference with which to compare calorimeter measurements. Each subjet measured by the calorimeter has a set of tracks associated with it. The following momentum ratio is defined using the calorimeter p T (p subjet T ) and the track p T (p track T ) for each subjet:

$$ r^{\text{subjet}}_{\text{trk}} = \frac{\sum p_{\rm T}^{\text{track}}}{p_{\rm T}^{\text{subjet}}}. $$

It is useful to identify the general features of the subjet structure of large-R jets in dijet events in order to guide the study of the kinematic properties of subjets. These jets are typically characterized by a highly energetic leading-p subjet T subjet located close to the parent jet axis as shown in figure 14. These leading subjets (with R sub = 0.3) carry a large fraction of the parent jet energy, and this fraction increases with the parent jet p T : p subjet,lead T /p jet T ≈ 0.71 for 100 GeV ≤ p jet T < 150 GeV and p subjet,lead T /p jet T ≈ 0.86 for 400 GeV ≤ p jet T < 500 GeV. The second leading-p subjet T subjet carries approximately 10% of the energy of the parent jet, for jets in the range 100 GeV ≤ p jet T < 500 GeV. The leading-p subjet T subjet is located on average at ∆R ≤ 0.07 from the axis of the parent jet, while less energetic subjets are more distant from the axis of the parent jet: ∆R ≥ 0.5. These observations are consistent with the increased radial collimation expected in jets produced from gluons and light quarks as the jet p T rises [93]. Therefore, the subjet structure of jets from dijet events can be characterized by looking only at the leading and sub-leading subjets. Given the crowded nature of the subjet topology within a jet, many standard approaches to studying the subjet energy scale begin to fail.
Associating individual tracks with calorimeter subjets may begin to suffer from the proximity of multiple such subjets to a given track, which can in turn impact the measurement of r subjet trk . Geometrical matching relies on assumptions that are not fulfilled in high-density environments. It assumes a cone-like shape and an area of πR 2 sub for all subjets, and that all tracks within this area belong to the subjet. This assumption generally works well in the case of an isolated anti-k t jet. Conversely, k t , C/A, and even anti-k t subjets in high-multiplicity environments often have very irregular boundaries and the question of which tracks to associate becomes more difficult to answer. Ghost-association [94,95] provides a much more appropriate matching of the tracks to the calorimeter subjets for this scenario. In this technique, tracks are treated as infinitesimally soft, low-p T particles by setting their p T to 1 eV. These tracks are then added to the list of inputs for jet finding. The low scale means the tracks do not affect the reconstruction of calorimeter-jets. However, after jet finding, it is possible to identify which tracks are clustered into which subjets. This technique shows a more stable dependence of the ratio r subjet trk on the angular separation between subjets. More generally, the same approach facilitates the measurement of the effective area of a jet, or the so-called ghost area. Instead of tracks, a uniform, fixed density (one per ∆y × ∆φ = 0.01 × 0.01) of infinitesimally soft particles is distributed within the event and is allowed to participate in the jet clustering algorithm. Instead of identifying tracks associated with the resulting jets, the number of such ghost particles present in the jet after reconstruction defines the effective area of that jet. Figure 15 shows the mean ratio r subjet trk as a function of the distance between the subjet and its closest neighbour subjet in the η − φ plane (∆R min ).
The numerator of r subjet trk corresponds to the total transverse momentum of tracks associated with the subjet using either a standard geometric association (figure 15(a)) or the ghost-association technique (figure 15(b)). For standard geometric track association, a rise in r subjet trk is observed for close-by subjets with ∆R min ≤ 2 × R sub . The impact of the track association scheme is more significant for the second leading subjet, which often has a very energetic subjet nearby (i.e. the leading-p subjet T one). For these cases geometrical association of tracks fails dramatically, and the ratio r subjet trk is a factor of two smaller at ∆R min = 0.6 than at ∆R min = 0.3, which is approximately the smallest separation between jets. The double ratio R subjet trk = ⟨r subjet trk ⟩ data / ⟨r subjet trk ⟩ MC provides an estimate of the calibration uncertainty. Any difference is well within 5% for the leading-p subjet T subjet and 20% for the second leading subjet, independent of the track matching scheme.
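The rise of r subjet trk for ∆R min ≤ 2 × R sub under geometric association can be understood from overlapping cones: when two subjets are closer than twice the cone radius, a track can lie within R sub of both and is then counted twice. The toy sketch below demonstrates this with r subjet trk taken as the summed track p T divided by the subjet p T . The subjet and track kinematics are invented for illustration, and ghost association itself is not implemented since it requires a full jet clustering.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane with phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def r_trk_geometric(subjet, tracks, r_sub=0.3):
    """Sum of track pT within a cone of radius r_sub, divided by the subjet pT.

    subjet and tracks are (pt, eta, phi) tuples with pt in GeV.
    """
    pt, eta, phi = subjet
    sum_pt = sum(t[0] for t in tracks if delta_r(eta, phi, t[1], t[2]) < r_sub)
    return sum_pt / pt


# Toy configuration: two subjets separated by 0.4 < 2 * R_sub (invented values).
leading = (200.0, 0.0, 0.0)
second = (30.0, 0.4, 0.0)
tracks = [
    (120.0, 0.0, 0.0),   # inside the leading cone only
    (10.0, 0.2, 0.0),    # inside both cones, so it is counted twice
    (8.0, 0.45, 0.0),    # inside the second cone only
]
r_leading = r_trk_geometric(leading, tracks)
r_second = r_trk_geometric(second, tracks)
```

In this toy case the shared 10 GeV track more than doubles the track p T attributed to the second subjet while changing the leading subjet's ratio only slightly, which mirrors the larger effect reported for the second leading subjet.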

Calibration of subjets
The HEPTopTagger [32] also relies on the energy scale of subjets to analyse the structure of R = 1.5 jets and to reconstruct a top-quark candidate four-momentum. This section describes a dedicated calibration procedure for C/A subjets with a radius parameter R = 0.2 − 0.5, which are used in the study of HEPTopTagger performance. Both the compatibility of the structure with hadronic top-quark decay and the dependence of the reconstructed four-momentum on the calibration are discussed here.
C/A jets (reconstructed as independent jets and not as constituents of parent jets, as above) are first calibrated using a simulation of the calorimeter response to jets by comparing the energy and pseudorapidity of a generator-level jet to that of a matched calorimeter jet. Calibration constants obtained from this procedure are then applied to the actual subjets reconstructed by the HEPTopTagger. For this reason, these C/A jets are referred to using the same subjet notation as in the previous section. Prior to calibration, the reconstructed jet energy is lower than the particle jet's energy and is corrected as a function of p T and in bins of η. For example, the correction for C/A jets with R = 0.4 and |η| < 0.1 is +9% at low p T and up to +2.5% for p T > 500 GeV. The corrected p T matches that of the particle jet to within 2% for all energies and pseudorapidities. For C/A R = 0.2 jets this closure test is 4% at p T = 20 GeV and better for higher p T .
Uncertainties on the jet calibration are determined from the quality of the modelling of the calorimeter-jet p T . The direct ratio p jet T (MC)/p jet T (data) is sensitive to mismodelling of jets at the hadron level. To reduce this effect, the calorimeter-jet p T is normalized to the p T of the tracks within the jet. This is done because the uncertainty of the track-jet p T tends to be small compared to the calorimeter-jet p T in the kinematic regime considered here. Tracks are matched to calorimeter-jets using ghost-track association. The jets are required to be within |η| < 2.1 to ensure coverage of the associated tracks by the tracking detector.
The average r subjet trk for a subset of the data characterized by an average number of interactions per bunch crossing in the range 4 < µ < 7 is shown in figure 16(a) as a function of p jet T for both data and simulation for C/A R = 0.4 jets with |η| < 0.8. The double ratio R trk = r subjet trk data / r subjet trk MC is shown in figure 16(b). Deviations of R trk from unity serve as an estimate of the uncertainty of the MC calibrated calorimeter-jet p T . The largest deviation from unity is seen at low p T and is 4% with a statistical uncertainty of 1%. The statistical uncertainty-weighted average double ratio is indicated by the horizontal line.
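The double ratio and its uncertainty-weighted average can be sketched in a few lines. This is only an illustration and not the ATLAS procedure: the function names are hypothetical, inverse-variance weighting is an assumption about how the statistical uncertainty-weighted average is formed, and the per-bin numbers are invented.

```python
import math

def double_ratio(r_data, r_mc):
    """Per-bin double ratio R_trk = r_trk(data) / r_trk(MC)."""
    return [d / m for d, m in zip(r_data, r_mc)]

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty (assumed weighting)."""
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)

# Invented per-pT-bin inputs, for illustration only.
r_data = [0.96, 1.00, 1.02]
r_mc = [1.00, 1.00, 1.00]
stat_unc = [0.01, 0.02, 0.02]

ratios = double_ratio(r_data, r_mc)
mean, err = weighted_mean(ratios, stat_unc)
largest_deviation = max(abs(r - 1.0) for r in ratios)
print(f"mean R_trk = {mean:.3f} +/- {err:.3f}, largest deviation = {largest_deviation:.2f}")
```

The deviation of the weighted mean from unity is the quantity that enters the uncertainty estimate discussed in the text.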
Similar results are obtained when varying the jet radius parameter between 0.2 and 0.5 and for higher pile-up conditions (evaluated using a subset of the data characterized by 13 < µ < 15). A jet energy uncertainty of 3.5% is assigned.
The imperfect knowledge of the material distribution in the tracking detector constitutes the dominating systematic uncertainty. It results in an additional uncertainty in R trk of ≈ 2% for |η| < 1.4 and ≈ 3% for 1.4 < |η| < 2.1, although it does not introduce a measurable shift. The jet p T systematic uncertainty is taken to be the absolute deviation of the central weighted-average R trk from unity, with the shifts introduced by the systematic variations added in quadrature. The uncertainty varies between 2.3% and 6.8%, depending on the jet p T , η, and the jet radius parameter. The uncertainty has been determined independently in samples with low and high pile-up µ-values and no significant difference has been found.
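One way to write the combination described above is shown below. The symbols are labels introduced here for illustration: $\langle R_{\mathrm{trk}} \rangle$ is the weighted-average double ratio and $\delta_i$ is the shift or uncertainty in $R_{\mathrm{trk}}$ associated with systematic variation $i$ (for example the tracker material description).

```latex
\sigma_{p_T} = \sqrt{ \left( 1 - \langle R_{\mathrm{trk}} \rangle \right)^2 + \sum_i \delta_i^2 }
```

As a purely illustrative example, a 1% deviation of $\langle R_{\mathrm{trk}} \rangle$ from unity combined with a 2% material term would give $\sqrt{0.01^2 + 0.02^2} \approx 2.2\%$.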
A sample of boosted top quarks is used to study the impact of heavy flavour and close-by jet topologies on the systematic uncertainties estimated above. The track-based validation is applied to a sample of events that contains 50% semileptonically decaying tt pairs, in which the top quark with the hadronically decaying W boson has p t T > 200 GeV. The remaining events are dominated by W +jets production. Figure 17 shows r subjet trk and R trk for C/A R = 0.4 jets in these events. The jet p T uncertainty in this sample varies between 2.4% and 5.7%.
This section elaborates on the impact of pile-up on the jet mass and other observables, and the extent to which trimming, mass-drop filtering, and pruning are able to minimize these effects. In particular, these measures of performance are used as some of the primary figures of merit in determining a subset of groomed jet algorithms on which to focus for physics analysis in ATLAS. Figure 18 shows the dependence of the mean uncalibrated jet mass, m jet , on the number of reconstructed primary vertices, N PV , for a variety of jet algorithms in the central region |η| < 0.8. The events used for these comparisons are obtained with the   . For these comparisons, only the final period of data collection from 2011 is used, which corresponds to approximately 1 fb −1 of integrated luminosity but is the period with the highest instantaneous luminosity recorded at √ s = 7 TeV, where µ = 12, higher than the average over the whole 2011 data-taking period. The lower range, 200 GeV ≤ p jet T < 300 GeV, represents the threshold for most hadronic boosted-object measurements and searches, whereas the range 600 GeV ≤ p jet T < 800 GeV is expected to contain top quarks for which the decay products are fully merged within an R = 1.0 jet nearly 100% of the time. In each figure, the full set of grooming algorithm parameter settings is included for comparison.
As noted in table 1, two values of the subjet radius, R sub , are used for trimming, three R cut factors for pruning are tested (using the k t algorithm in the pruning procedure in all cases), and three µ frac settings are evaluated using the filtering algorithm.
Several observations can be made from figure 18. Of the grooming configurations tested, trimming and filtering both significantly reduce the rise with pile-up of m jet seen for ungroomed jets, whereas pruning does not. For at least one of the configurations tested, trimming and filtering are both able to essentially eliminate this dependence. Furthermore, the trimming configurations tested provide a highly tunable set of parameters that allow a relatively continuous adjustment from small to large reduction of the pile-up dependence of the jet mass. The trimming configurations with R sub = 0.2, f cut = 0.03 and R sub = 0.3, f cut = 0.05 exhibit good stability for both small and large p jet T , with the f cut = 0.05 configuration exhibiting a slightly smaller impact from pile-up at high N PV for low p jet T (not shown). The other parameter settings either do not reduce the pile-up dependence at low p jet T (e.g. f cut = 0.01) or result in a downward slope of m jet as a function of pile-up at high p jet T (e.g. f cut = 0.05, R sub = 0.2). Pruning, on the other hand, exhibits the smallest impact on the pile-up dependence of the jet mass for these large-R jets. Only by increasing the z cut parameter from z cut = 0.05 to z cut = 0.10 can any reduction in the dependence of m jet on pile-up be observed. This is equivalent to reducing the low-p T contributions during the jet recombination; in the language of trimming, this is analogous to raising f cut . This change slightly reduces the magnitude of the variation of the mean jet mass as a function of N PV for low p jet T . The R cut parameter has very little impact on the performance, with nearly all of the differences observed being due to the change in z cut . This observation holds for both small and large p jet T . The mass-drop filtering algorithm can be made to affect m jet significantly solely via the mass-drop criterion, µ frac .
A drastic change in m jet is observed for all configurations of the jet filtering, with the strictest µ frac = 0.20 setting rejecting nearly 90% of the jets considered and resulting in a slightly negative slope in the mean jet mass versus N PV . Nevertheless, the other two settings of µ frac tested exhibit no significant variation as a function of the number of reconstructed vertices, and the optimum value of µ frac = 0.67 found previously seems to have the best stability. Studies from 2010 [25] show that this reduction in the sensitivity to pile-up is due primarily to the filtering step in the algorithm as opposed to the jet selection itself.
Figure 19 presents the pile-up dependence of the mean leading-p jet T jet mass, m jet 1 , in data compared to the three simulations. Here, only the range 600 GeV ≤ p jet T < 800 GeV for ungroomed and trimmed anti-k t jets is shown for brevity, but similar conclusions apply in all p jet T ranges. The comparison is made using the full 2011 dataset. PYTHIA, HERWIG++, and POWHEG+PYTHIA all model the data fairly accurately, with a slight 5%-10% discrepancy appearing in the predictions from PYTHIA and HERWIG++ for the trimmed jets. Most importantly, the impact of pile-up is very well modelled, with the slope of the dependence of m jet 1 on N PV in data agreeing within 3% with the POWHEG+PYTHIA prediction for both the ungroomed and trimmed jets.
Beyond simply providing a pile-up-independent average jet mass, the optimal grooming configurations render the full jet mass spectrum insensitive to high instantaneous luminosity. Figure 20 demonstrates this by comparing the jet mass spectrum for leading-p jet T ungroomed and trimmed anti-k t jets for various values of N PV . The comparison is performed both in an inclusive data sample and using the Z → tt MC sample, which produces a characteristic peak at the top-quark mass. The inclusive jet sample obtained from data shows that a nearly identical trimmed m jet spectrum is obtained regardless of the level of pile-up. The peak of the leading-p jet T jet mass distribution for events with N PV ≥ 12 is shifted comparatively more due to trimming: from m jet ≈ 125 GeV to m jet ≈ 45 GeV, compared to an initial peak position of m jet ≈ 90 GeV for events with 1 ≤ N PV ≤ 4. Nonetheless, the resulting trimmed jet mass spectra exhibit no dependence on N PV .
Figure 20. Jet mass spectra for four primary vertex multiplicity ranges for anti-k t jets with R = 1.0 in the range 600 GeV ≤ p jet T < 800 GeV. Both untrimmed (left) and trimmed (right) anti-k t jets are compared for the various N PV ranges in data (top) and for a Z → tt Monte Carlo sample (bottom).
Comparisons performed using the simulated Z → tt sample demonstrate the same performance of the trimming algorithm, but in the context of the reconstruction of highly boosted top quarks. Figures 20(c)-(d) indicate that the ability to render the full jet mass distribution independent of pile-up does not come at the cost of the mass resolution or scale. Prior to jet trimming, a variation in the peak position of the jet mass of nearly 15 GeV is observed between the lowest and the highest ranges of N PV studied. After jet trimming, the resulting mass spectra for the various N PV ranges are narrower and lie directly on top of one another, even in the case of a jet containing a highly boosted, and thus very collimated, top-quark decay. This observation, combined with that above, demonstrates that the trimming algorithm is working as expected by removing soft, wide-angle contributions to the calculation of the jet mass while retaining the relevant hard substructure of the jet.
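The effect of the trimming requirement on the jet mass can be illustrated with a minimal sketch. It assumes the subjets of radius R sub have already been reconstructed (the k t reclustering step is not reproduced), keeps subjets whose p T is at least a fraction f cut of the ungroomed jet p T , and uses invented subjet kinematics. All function names are hypothetical.

```python
import math

def four_vector(pt, eta, phi, m=0.0):
    """Return (E, px, py, pz) for a subjet given pT, eta, phi and mass."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px**2 + py**2 + pz**2 + m**2), px, py, pz)

def total(vectors):
    """Component-wise four-momentum sum."""
    return tuple(sum(v[i] for v in vectors) for i in range(4))

def invariant_mass(vectors):
    """Invariant mass of the summed four-momentum."""
    e, px, py, pz = total(vectors)
    return math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))

def transverse_momentum(vector):
    return math.hypot(vector[1], vector[2])

def trim(subjets, f_cut):
    """Keep subjets with pT_i / pT_jet >= f_cut, where pT_jet is the ungroomed jet pT."""
    pt_jet = transverse_momentum(total(subjets))
    return [s for s in subjets if transverse_momentum(s) >= f_cut * pt_jet]

# Invented subjets: two hard prongs plus two soft, wide-angle deposits.
subjets = [
    four_vector(400.0, 0.0, 0.0),
    four_vector(250.0, 0.3, 0.2),
    four_vector(10.0, -0.7, 0.6),
    four_vector(8.0, 0.6, -0.7),
]
trimmed = trim(subjets, f_cut=0.05)
print(len(trimmed), invariant_mass(subjets), invariant_mass(trimmed))
```

In this toy configuration the two soft deposits fall below the f cut threshold and are removed, so the trimmed mass is set by the two hard prongs alone.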
Finally, although not shown explicitly, the mass-drop filtered jet mass is also stable with respect to pile-up.
The track-jet approach used to evaluate the jet mass uncertainty (see section 3.3.2) is also used to understand the effects of pile-up. It is observed that r m track jet is nearly equal for the various trimming configurations in the case of little or no in-time pile-up (i.e. N PV ≈ 1) whereas filtering shows a significant, although small, difference between the configurations using µ frac = 0.67, 0.33, 0.20. This shows that the filtering method does affect the magnitude of m jet and m track jet slightly differently, resulting in a relative drop of approximately 12% in r m track jet after filtering. These distributions are nonetheless well modelled by the simulation, resulting in double ratios of data to simulation very close to one. The trimming configuration with R sub = 0.3, f cut = 0.01 has almost no impact on the dependence of r m track jet on pile-up.

Figures 21 and 22 show the variation with pile-up observed in data for the splitting scales and N -subjettiness observables for jets in the range 600 GeV ≤ p jet T < 800 GeV. In this case, the focus is on jet trimming since the mass-drop filtering algorithm makes a predefined choice to search for properties of a jet characteristic of a two-body decay. The constraints placed on subjet multiplicity by the filtering procedure are not appropriate for calculating generic jet shapes given the strict substructure requirements they place on a jet. Furthermore, pruning of jets with R = 1.0 does not seem to mitigate the effects of pile-up. The trimming configurations with R sub = 0.3 and f cut = 0.03, 0.05 yield the most stable jet substructure properties with the smallest deviation in their observed mean values at low N PV . This conclusion holds for all other jet p jet T ranges as well, with larger differences between f cut = 0.03, 0.05 appearing at low p jet T . Figure 23 presents a comparison of data and Monte Carlo simulation for √ d 12 and τ 32 for ungroomed and trimmed (R sub = 0.3, f cut = 0.05) anti-k t jets with R = 1.0. The slope of these observables as a function of N PV is well modelled by the simulation.

Impact of pile-up on signal and background in simulation
In addition to the comparisons between data and simulation, and between the various grooming configurations, a comparison of how grooming impacts signal-like events versus background-like events in searches for resonances decaying to boosted jets is crucial. Figure 24 shows the variation of the average leading-p jet T jet mass, m jet 1 , with N PV for events with 600 GeV ≤ p jet T < 800 GeV for ungroomed and trimmed anti-k t , R = 1.0 jets, for both the Z → tt sample and the POWHEG+PYTHIA dijet sample.     hadronic boosted top-quark decays contained in a single jet. The mass reconstruction in this case proceeds as usual (four-momentum recombination) and the mass distribution is highly peaked near the top-quark mass of approximately 175 GeV. Jets in this peak but without grooming exhibit a slope of roughly d m jet 1 /dN PV ≈ 2.15 GeV/N PV , or about 30% smaller than in the inclusive jet sample. In the case of trimmed jets, the slopes as a function of N PV for both signal-like jets and jets in dijet events are consistent with zero. Most importantly, the average separation between the mean jet mass for signal-like jets in the Z sample and those in the POWHEG+PYTHIA dijet sample increases by nearly 50% after trimming and remains stable across the full range of N PV . This allows for much better discrimination between the two processes. The separation shown here is significant since the widths of the peaks of each of the distributions are also simultaneously narrowed by the grooming algorithm, as shown in figure 20. This differential impact of trimming is again due to the design of the algorithm: soft, wide-angle contributions to the jet mass that are ubiquitous in jets produced from light quarks and gluons are suppressed whereas the hard components present in a jet with true substructure (as in the case of the top-quark jets here) are preserved.
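The slopes quoted above come from the dependence of the mean jet mass on N PV . A minimal sketch of extracting such a slope is given below, assuming an unweighted least-squares straight-line fit (the fit actually used is not specified here) and invented data points chosen only to resemble the behaviour described.

```python
def fit_line(x, y):
    """Unweighted least-squares straight line: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented mean jet masses (GeV) in bins of N_PV, for illustration only.
n_pv = [2, 4, 6, 8, 10, 12]
mean_mass_ungroomed = [164.0, 168.5, 172.5, 177.0, 181.5, 185.5]
mean_mass_trimmed = [171.0, 171.2, 170.9, 171.1, 170.8, 171.0]

slope_ungroomed, _ = fit_line(n_pv, mean_mass_ungroomed)
slope_trimmed, _ = fit_line(n_pv, mean_mass_trimmed)
print(f"ungroomed: {slope_ungroomed:.2f} GeV/N_PV, trimmed: {slope_trimmed:.2f} GeV/N_PV")
```

A weighted fit using the statistical uncertainty of each mean would be the natural refinement when the bins have very different populations.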

Jet substructure and grooming with boosted objects in data and simulation
Comparisons between jets containing signal-like boosted objects and a light-quark or gluon jet background are presented here. Boosted objects are divided into two categories depending on the event topology: two-pronged, such as hadronically decaying W or Z bosons, and three-pronged, such as the top quark decaying into a b-jet and a hadronically decaying W boson. Performance measures are shown for both simulated samples of Z → qq and top quarks (from Z → tt), as well as for inclusive jet data and events enriched in boosted top quark pairs. In addition to the event and object selection listed in section 2.2, the large-R ungroomed leading-p jet T jet axis is required to be within |η| < 2.0. For the signal distributions, a ∆R < 1.5 match between the four-momentum of the hadronically decaying boosted object in the truth record and the reconstructed ungroomed leading-p jet T jet is made to minimize the contamination from light-quark or gluon jets (or top quarks with a leptonically decaying W boson in the Z sample).

Jet mass resolution for background
The fractional jet mass resolution is defined as the width of a Gaussian fit to the central part of the distribution that is generated by taking the difference between the generator-level jet mass and the reconstructed jet mass, divided by the same generator-level jet mass. Here, the generator-level jet is the simulated particle shower that has been groomed according to the same grooming algorithms used after jet reconstruction. Large-R generator-level and reconstructed jets before grooming are matched if they are within ∆R < 0.7. The matching is performed only once and comparisons between generator-level and reconstructed jets after grooming are made using the groomed versions of the matched ungroomed jets. The mass-drop filtering method is not applied to anti-k t jets, as discussed in section 1.2. Figure 25 shows the fractional mass resolution for leading-p jet T jets in the POWHEG+PYTHIA dijet sample with the same pile-up conditions as were observed in the data. Abbreviated versions of the groomed algorithm names used to label the figures are listed in table 3. In general, the groomed jets have better resolution than the ungroomed large-R jets, with improvements of up to ∼10% (absolute) in some cases. The trimmed and pruned jet resolution improves with increasing p T , where the calibrated jets gain ∼3 − 5% over the range 300 GeV ≤ p jet T < 800 GeV. The pruning algorithm, especially with C/A jets, produces larger tails in the resolution distribution compared to the trimmed algorithm, worsening the overall fractional resolution in comparison. The resolution is fairly stable for the mass-drop filtering algorithm over a large range of p T . It is important to note that the efficiency is considerably lower for jets resulting from this algorithm compared with jets produced in other grooming procedures (∼30%) due to the strict mass-drop requirement, which is often not met for jets without boosted object substructure.
Table 3. Labels used in figures to represent the various configurations of the grooming algorithms. † Groomed jets have been calibrated for both anti-k t and C/A jets. ‡ Groomed jets have been calibrated for anti-k t only. § Groomed jets have been calibrated for C/A only.
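The fractional mass resolution defined in this section can be sketched as follows. The sketch is not the fit used in the paper: in place of a Gaussian fit to the central part of the distribution it uses half the 16%-84% interquantile range, which is close to the Gaussian width for a Gaussian core and is less sensitive to tails. The function names and the toy jets are assumptions.

```python
import random
import statistics

def fractional_residuals(m_truth, m_reco):
    """(m_truth - m_reco) / m_truth for matched generator-level and reconstructed jets."""
    return [(t - r) / t for t, r in zip(m_truth, m_reco)]

def core_width(residuals):
    """Half the 16%-84% interquantile range (stand-in for a Gaussian core fit)."""
    cuts = statistics.quantiles(residuals, n=100)  # 99 percentile cut points
    return 0.5 * (cuts[83] - cuts[15])

# Toy sample: 100 GeV generator-level jets smeared by 10% (invented numbers).
random.seed(1)
m_truth = [100.0] * 5000
m_reco = [t * (1.0 - random.gauss(0.0, 0.10)) for t in m_truth]

resolution = core_width(fractional_residuals(m_truth, m_reco))
print(f"fractional mass resolution ~ {resolution:.3f}")
```

Comparing this width before and after grooming, with the generator-level jet groomed in the same way, corresponds to the comparison made in figure 25.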

A summary of the fractional mass resolution for jets before and after grooming in the presence of various pile-up conditions is shown in figure 26. Trimming in both anti-k t and C/A jets reduces the dependence of the jet mass on pile-up (spread in the points) compared to the ungroomed jet, as does the mass-drop filtering procedure in the case of C/A jets, while pruning has little impact. In all cases, no pile-up subtraction is applied to the ungroomed jet kinematics. In particular, the trimming parameters f cut = 0.03 and 0.05 slightly outperform the looser f cut = 0.01 setting in events with a mean number of

Jet mass resolution for simulated signal events
Figures 27 and 28 show the fractional mass resolution for the two-pronged and three-pronged cases, respectively. The mass-drop filtering algorithm is shown only for the simulated two-pronged signal events with C/A jets. In the two-pronged case, as for the case of jets in the inclusive jet events shown in figure 25, the C/A mass-drop filtering algorithm performs the best, but with a signal reconstruction efficiency of ∼ 45% in Z → qq events (for µ frac = 0.67). In both the two-pronged and three-pronged configurations, the trimmed jets have better fractional mass resolution (∼ 5 − 10%) than the pruned jets, especially for those jets with grooming applied after the C/A algorithm. The trimmed jet mass resolution also remains fairly stable across a large p jet T range, with equivalent performance for anti-k t and C/A jets.

Signal and background comparisons with and without grooming
Leading-p jet T jet distributions of mass, splitting scales and N -subjettiness are compared for jets in simulated signal and background events in the range 600 GeV ≤ p jet T < 800 GeV. In figures 32-35, showing comparisons for the three-pronged decay case, better discrimination between signal and background is obtained after grooming. In these figures, the ungroomed distributions are normalized to unit area, while the groomed distributions have the efficiency with respect to the ungroomed large-R jets folded in for comparison. This is especially conspicuous in the C/A jets with mass-drop filtering applied as mentioned previously.
The mass resolution of the simulated Z → qq signal events shown in figure 29 dramatically improves after trimming or mass-drop filtering for anti-k t jets with R = 1.0 and   C/A jets with R = 1.2, respectively. Mass-drop filtering has an efficiency of approximately 55% and therefore fewer jets remain in this figure. After trimming or mass-drop filtering, the mass peak corresponding to the Z boson is clearly seen at the correct mass. Note that the dijet background is pushed much lower in mass after grooming as was demonstrated in figure 20, while the constituents of signal jets have higher p T and survive the grooming procedure, thus improving discrimination between signal and background. The small excess of signal events below 50 GeV is the result of one of the two quarks from the decay of the Z boson being removed by the jet grooming, thus leaving only one quark reconstructed as the jet and making it indistinguishable from the background. Figure 30 shows the splitting scale √ d 12 in Z → qq events. The signal exhibits a splitting scale roughly equal to half the mass of the jet, whereas the splitting scale distribution for jets produced in dijet events peaks at smaller values of √ d 12 and falls more steeply. This effect is enhanced after grooming, especially in the case of C/A jets after mass-drop filtering. In figure 31, the N -subjettiness variable τ 21 is observed to have improved discrimination between signal and background with anti-k t trimmed jets compared to C/A mass-drop filtered jets, where the discrimination is worsened after applying the mass-drop filtering criteria. The filtering step is explicitly reconstructing a fixed number of final subjets (three, in this case), thereby shaping the background and worsening the resulting separation.
The three-pronged hadronic top-jet mass distributions from Z → tt events are shown in figure 32, where the signal peak is relatively unshifted between groomed and ungroomed jets, especially with anti-k t jets. Again the mass resolution for the signal improves after grooming, where the W -mass peak can also be seen after trimming is applied. Enhancement of the W -mass peak is seen especially in jets with lower p jet T , as the jet from the b-quark decay falls outside the radius of the large-R jet. Figures 33 and 34 show the variables √ d 12 and √ d 23 , respectively, for Z → tt events compared to jets produced in dijet events. As in the two-pronged case, signal discrimination with the splitting scales is enhanced after jet trimming.
One of the primary applications of N -subjettiness is as a discriminating variable in searches for highly boosted top quarks [18]. A common method of comparing the performance of such discriminating variables or tagging algorithms is to compare the rate at which light-quark or gluon jets are selected (the mis-tag rate) to the efficiency for retaining jets containing the hadronic particle decay of interest [13,14]. This comparison is performed for both ungroomed and trimmed jets in order to assess the impact of grooming on the discrimination power of this observable. Figure 35 shows the τ 32 distribution before and after trimming. Here, trimming of anti-k t and C/A jets results in similar discrimination between signal and background. In order to understand the utility of the τ 32 selection criterion and the potential impact of jet grooming, trimmed anti-k t , R = 1.0 jets are compared to their ungroomed counterparts in a boosted top sample for two jet momentum ranges. The signal mass range is defined as that which contains a large fraction of the boosted top signal. For ungroomed jets this fraction is set to 90%, and the mass range that satisfies this requirement is 100 GeV ≤ m jet < 250 GeV. A slightly lower signal fraction of 80% for the same mass range is required for groomed jets; this is motivated by the tendency for trimmed jets to populate an additional small peak around the W mass, as shown in figure 32.  The mis-tag rate is defined as the fraction of the POWHEG dijet sample that remains in the mass window after a simple selection based on τ 32 . The signal top jet efficiency is defined as the fraction of top jets selected in the Z sample with the same τ 32 selection.  Figure 36 shows the performance of the N -subjettiness tagger for jets with 600 GeV ≤ p jet T < 800 GeV and 800 GeV ≤ p jet T < 1000 GeV. 
In both cases, for a fixed top-jet efficiency, the reduction in high-invariant-mass jets due to trimming results in a relative reduction of several percent in the mis-tag rate. Moreover, in the case of very high p jet T , as in figure 36(b), the slightly more aggressive trimming configuration results in a slight performance gain as well.
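The comparison of mis-tag rate and top-jet efficiency can be sketched as a scan over the τ 32 requirement. The sketch makes several assumptions that are not taken from the text: each jet is represented by a (mass, τ 32 ) pair, a jet is selected if it lies in the 100 GeV ≤ m jet < 250 GeV window and has τ 32 below the threshold, both rates are computed relative to all jets in the respective sample, and the toy jets are invented.

```python
def selection_fraction(jets, tau32_max, mass_window=(100.0, 250.0)):
    """Fraction of jets inside the mass window that also satisfy tau32 < tau32_max."""
    lo, hi = mass_window
    passed = sum(1 for m, tau32 in jets if lo <= m < hi and tau32 < tau32_max)
    return passed / len(jets)

def scan(signal, background, thresholds):
    """Return (threshold, top-jet efficiency, mis-tag rate) for each tau32 threshold."""
    return [
        (t, selection_fraction(signal, t), selection_fraction(background, t))
        for t in thresholds
    ]

# Invented (mass in GeV, tau32) pairs, for illustration only.
signal = [(172.0, 0.4), (180.0, 0.5), (85.0, 0.3), (165.0, 0.8)]
background = [(120.0, 0.7), (40.0, 0.6), (150.0, 0.5), (300.0, 0.9)]

for threshold, efficiency, mistag in scan(signal, background, [0.45, 0.6, 0.9]):
    print(threshold, efficiency, mistag)
```

Plotting the mis-tag rate against the efficiency for each threshold gives a performance curve of the kind shown in figure 36.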

Inclusive jet data compared to simulation with and without grooming
Previous studies conducted by ATLAS [25] and CMS [26] suggest that even complex jet substructure observables are fairly well modelled by the MC simulations used by the LHC experiments. This section reviews the description provided by PYTHIA, HERWIG++, and POWHEG+PYTHIA of the jet grooming techniques introduced above, and of the substructure of the ungroomed and groomed jets themselves. Figure 37 presents a comparison of the jet invariant mass for ungroomed, trimmed, and filtered jets in the range 600 GeV ≤ p jet T < 800 GeV and in the central calorimeter, |η| < 0.8. Similar performance is observed in all other p T regions in the range p T > 300 GeV, and |η| < 2.1. The description of both the ungroomed and trimmed anti-k t jets with R = 1.0 provided by PYTHIA is poor for large masses. The descriptions provided by HERWIG++ as well as for the NLO generator POWHEG+PYTHIA are more accurate. PYTHIA tends to underestimate the fraction of high-mass large-R anti-k t jets, whereas HERWIG++ and POWHEG+PYTHIA are accurate to within a few percent, even for very massive jets. The ungroomed anti-k t , R = 1.0 jets are poorly described by all three MC simulations at low mass; this could be due to non-perturbative and detector effects which increase the jet mass. This generally soft contribution is removed by grooming.
A similarly poor description of the low-mass region is observed for C/A jets with R = 1.2. In this case however, PYTHIA, in addition to both HERWIG++ and POWHEG+PYTHIA, provides a fairly good description of the high-mass regime of the jet mass spectrum. This suggests that there is a slight angular scale dependence, and the slightly smaller radius used for the large-R anti-k t jets in these studies could play a role in the observed discrepancy with PYTHIA. Figure 37 also shows that the shape of the jet mass distribution is significantly affected by the mass-drop filtering technique. This change is well described by all of the MC simulations, although the accuracy of the HERWIG++ and POWHEG+PYTHIA predictions is again observed to be slightly better. Figure 38 presents an overview of the shape of the jet mass spectrum for several configurations of the jet trimming algorithm for anti-k t jets and for C/A jets with mass-drop filtering applied. These spectra are measured using approximately 1 fb −1 of data from the last data-taking period of 2011 where µ = 12, higher than the average over the whole 2011 data-taking period. The significant spectral shift and shape difference compared to the original jet is apparent for both grooming algorithms shown (and also for pruning, which is not shown here). Significant variation is also observed among the configurations tested, with the large f cut , small R sub setting for trimming and the small µ frac setting for mass-drop filtering exhibiting the most dramatic changes. For jet masses in the range of 50 GeV ≤ m jet < 300 GeV, which is expected to be the most relevant in searches for new physics, the trimming configurations exhibit efficiencies in the range of 30%-70%, defined as the ratio of the yield after grooming to that prior to grooming. In particular, the trimming configuration with f cut = 0.05 and R sub = 0.3 yields an approximate 47% efficiency in this mass range.
Mass-drop filtering provides a more stringent selection, yielding efficiencies in the same 50 GeV ≤ m jet < 300 GeV mass range of 20%, 12%, and 3% for µ frac = 0.67, 0.33, 0.20, respectively.
The significant change observed in the jet mass distribution is due primarily to a reduction in the effective area of each jet (see section 3.4 for a detailed description of the jet area). Soft and wide-angle jet constituents are removed from the jet, thereby reducing the overall catchment area. This has the desirable effect of also reducing the impact of pile-up on the jet properties. Figure 39 shows the effect of grooming on the average jet area as a function of the jet mass for both the ungroomed and trimmed anti-k t , R = 1.0 jets. Prior to jet trimming, the anti-k t , R = 1.0 area is very close to π (i.e. πR 2 , with R = 1.0). A small rise in the ungroomed jet area is observed for jets with very large mass, characterized by small additional clusters near the edge of the jet. For trimmed jets at low mass, the average jet area is reduced by a factor of 3-5 and continuously rises as a function of the jet mass to a maximum of approximately half of the original jet area for high-mass ungroomed jets. These features are very well described by the MC simulations across the entire spectrum of jet mass, both before and after trimming. Similar observations are made with respect to lower and higher p jet T bins, as well as for pruning and filtering. As discussed in section 1.2, the splitting scales subjets, ordered by p subjet T . These observables are therefore dominated by contributions originating from energetic partons, either from the parton shower or from massive particle decay. It is therefore not surprising that the jet grooming, which removes low-p T components of the jet, does not significantly affect √ d 12 , shown in figure 40. It is also not surprising that POWHEG+PYTHIA describes this variable better than PYTHIA, especially at large values. Interestingly, HERWIG++ also models this substructure characteristic well, despite providing only a LO description of the hard process.
In the case of events selected only for the presence of a single high-p T jet, a second hard splitting is not highly probable, as also evidenced by the spectrum of √ d 23 , shown in figure 41, falling more steeply than √ d 12 . This demonstrates that the trimming tends to affect this splitting scale slightly more when a third -p subjet T subjet within the parent jet is modified during the trimming procedure.
N -subjettiness helps to discriminate between jets that have well-formed substructure and those that do not. Figures 42 and 43 demonstrate that the MC simulations model the distributions of τ 21 and τ 32 observed in the data within about 20%. The jets in this case are selected to have 600 GeV ≤ p jet T < 800 GeV. Both PYTHIA and POWHEG+PYTHIA exhibit a shift in the τ 21 distribution towards larger values of τ 21 with respect to the data for both the ungroomed and trimmed jets. HERWIG++, however, provides a much better description of τ 21 for ungroomed jets, and shows a slight shift towards smaller values of τ 21 compared to the data. The fraction of events with a small value of τ ij is predicted to be slightly smaller than in data, while it is the opposite for high values of τ ij . In addition, the distributions of both τ 21 and τ 32 are broadened for trimmed jets (figures 42(b) and 43(b)) and shifted slightly towards smaller values of τ ij and into the region expected to be populated by boosted hadronic particle decays. These observations suggest that the use of shape observables for ungroomed jets, and of jet mass and substructure observables (like √ d 12 ) may lead to better discrimination between signal and background. For τ 32 , the MC distributions slightly underestimate the fraction of jets with 0.3 < τ 32 < 0.7, which is the signal range for boosted top-quark candidates. Since these observables are intended to be used as discriminants between boosted object signal events and the inclusive jet background, such differences are important for the resulting estimation of signal efficiency compared to background rejection. However, the variations observed in the distribution of each observable translate into much smaller differences in efficiency and rejection.
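As a reminder of what the ratios τ 21 and τ 32 measure, the form of N -subjettiness commonly used in the literature is reproduced below. It is quoted for orientation only; section 1.2 remains the authoritative definition for this paper. The sum runs over the jet constituents $k$, $\delta R_{ik}$ is the distance between constituent $k$ and the axis of subjet $i$, and $R$ is the jet radius parameter.

```latex
\tau_N = \frac{1}{d_0} \sum_k p_{T,k} \, \min\left( \delta R_{1k}, \delta R_{2k}, \ldots, \delta R_{Nk} \right),
\qquad d_0 = \sum_k p_{T,k} \, R,
\qquad \tau_{21} = \frac{\tau_2}{\tau_1}, \quad \tau_{32} = \frac{\tau_3}{\tau_2}
```

A small value of $\tau_{N,N-1}$ indicates that the jet is better described by $N$ subjets than by $N-1$, which is why boosted hadronic decays are expected to populate the low-$\tau_{ij}$ region.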
The modelling of the background with respect to massive boosted objects can be tested by evaluating, for example, the evolution of the mean of substructure variables as a function of jet mass. This is shown for τ 32 in figure 44 for the same 600 GeV ≤ p jet T < 800 GeV range as used above. Three important observations can be made. Values of τ 32 are slightly lower in data than those predicted by the MC simulations, and the trimmed values are lower compared to ungroomed jets. Furthermore, τ 32 is a slowly varying function of the jet mass for both the ungroomed and trimmed jets. This variation is slightly reduced for trimmed jets.
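The mean of a substructure variable as a function of jet mass amounts to a profile: jets are binned in mass and the variable is averaged within each bin. A minimal sketch follows, assuming per-jet lists of mass and τ 32 are already available; the function name, the bin edges and the toy values are illustrative and are not taken from the analysis.

```python
def profile_mean(masses, values, bin_edges):
    """Return the mean of `values` in each [lo, hi) mass bin.

    Empty bins yield None. Jets outside the binning range are ignored.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for mass, value in zip(masses, values):
        for i in range(n_bins):
            if bin_edges[i] <= mass < bin_edges[i + 1]:
                sums[i] += value
                counts[i] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy inputs (hypothetical): jet masses in GeV and the matching tau32 values.
toy_masses = [10.0, 20.0, 60.0, 70.0, 150.0]
toy_tau32 = [0.8, 0.6, 0.7, 0.5, 0.9]
toy_profile = profile_mean(toy_masses, toy_tau32, [0.0, 50.0, 100.0, 200.0])
```

Applying the same function separately to data and to each simulated sample, for ungroomed and trimmed jets, gives curves that can be compared bin by bin.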

Semi-leptonic tt selection
A selection of tt → (Wb)(Wb) → (µνb)(qqb) events is used to demonstrate in data the effect of grooming on large-R jets with substructure. The semileptonic tt decay mode in which one W boson decays into a neutrino and a muon is chosen in order to tag the tt event and reduce the overwhelming multi-jet background so that the top-quark signal is visible. This provides a relatively pure sample of top quarks and is also very close to the selection used in searches for resonances that decay to pairs of boosted top quarks [96,97]. The following event-level and physics object selection criteria are applied to data and simulation:
• Event-level trigger and data quality selection: the standard data quality and vertex requirements described in section 2.2 are applied. Events are selected if they satisfy the single-muon Event Filter trigger with muon p T > 18 GeV.
• Event-level jet selection: events are required to have at least four anti-k t jets with R = 0.4 having p jet T > 25 GeV and jet vertex fraction |JVF| > 0.75. The jet vertex fraction is a discriminant that contains information regarding the probability that a jet originated from the selected primary vertex in an event [98].
• Lepton selection: muons must be reconstructed in both the inner detector and the muon spectrometer and have p T > 20 GeV and |η| < 2.5. To be well isolated, the muon must be separated by more than ∆R = 0.4 from any R = 0.4 jet with p T > 25 GeV and jet vertex fraction |JVF| > 0.75. Events with one or more electrons passing standard criteria as described in ref. [97] are rejected.
• Event-level neutrino and leptonic W -decay requirement: to tag events with a leptonically decaying W boson from a top-quark decay, events are required to have missing transverse momentum $E_T^{miss} > 20$ GeV. Additionally, the scalar sum of $E_T^{miss}$ and the transverse mass of the leptonic W boson candidate must satisfy $E_T^{miss} + m_T^W > 60$ GeV, where $m_T^W = \sqrt{2 p_T E_T^{miss} (1 - \cos\Delta\phi)}$ is calculated from the muon $p_T$ and $E_T^{miss}$ in the event, and $\Delta\phi$ is the azimuthal angle between the charged lepton and the $E_T^{miss}$, which is assumed to be due to the neutrino.
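The kinematic part of the selection listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code: the dictionary layout for muons and jets and all function names are hypothetical, and the trigger, data-quality, muon-reconstruction and electron-veto requirements are assumed to have been applied upstream. All momenta are in GeV.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def transverse_mass(muon_pt, muon_phi, met, met_phi):
    """m_T^W = sqrt(2 pT ETmiss (1 - cos(delta phi)))."""
    dphi = delta_phi(muon_phi, met_phi)
    return math.sqrt(2.0 * muon_pt * met * (1.0 - math.cos(dphi)))


def passes_selection(muon, jets, met, met_phi):
    """Apply the jet, muon, ETmiss and ETmiss + mT cuts quoted in the text.

    muon: dict with keys pt, eta, phi (hypothetical layout).
    jets: list of dicts with keys pt, eta, phi, jvf for R = 0.4 jets.
    """
    good_jets = [j for j in jets if j["pt"] > 25.0 and abs(j["jvf"]) > 0.75]
    if len(good_jets) < 4:
        return False
    if not (muon["pt"] > 20.0 and abs(muon["eta"]) < 2.5):
        return False
    # Isolation: the muon must be farther than 0.4 from every selected jet.
    for jet in good_jets:
        if delta_r(muon["eta"], muon["phi"], jet["eta"], jet["phi"]) <= 0.4:
            return False
    if met <= 20.0:
        return False
    m_t = transverse_mass(muon["pt"], muon["phi"], met, met_phi)
    return met + m_t > 60.0
```

For a muon and missing transverse momentum that are back to back in azimuth, each with 40 GeV, the transverse mass evaluates to 80 GeV, so the combined requirement of 60 GeV is passed comfortably.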
After these selection requirements, the W +jets process constitutes the largest background, with smaller contributions from Z+jets and single-top-quark processes. Figure 45 shows the leading-p jet T jet mass of anti-k t jets with R = 1.0 having p T > 350 GeV before and after trimming (f cut = 0.05, R sub = 0.3) after the above selection criteria are applied. The data and simulation agree within statistical uncertainty. The W → µν events produced in association with jets form the largest background. Since large-R jets in W +jets events are formed from one or more random light-quark or gluon jets, trimming causes the mass spectrum to fall more steeply and the peak of the distribution to lie at smaller masses, similar to the multi-jet background in figure 38(a). However, trimming does not alter the signal mass spectrum drastically, and any signal loss near the top-mass peak is due to events in which the top quark is not boosted enough to have all three hadronic decay products fall within R = 1.0.
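The effect of the f cut requirement on the jet mass can be illustrated with a simplified sketch of the trimming step. It assumes the subjets of radius R sub have already been built (the k t reclustering itself is not shown), represents each subjet as a hypothetical (pT, eta, phi, mass) tuple in GeV, and discards subjets carrying less than a fraction f cut of the parent jet p T. The function names are illustrative.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pT, eta, phi, mass) to (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def summed(subjets):
    """Four-vector sum of a list of (pT, eta, phi, mass) subjets."""
    total = [0.0, 0.0, 0.0, 0.0]
    for subjet in subjets:
        for i, component in enumerate(four_vector(*subjet)):
            total[i] += component
    return tuple(total)


def jet_pt(subjets):
    _, px, py, _ = summed(subjets)
    return math.hypot(px, py)


def jet_mass(subjets):
    energy, px, py, pz = summed(subjets)
    return math.sqrt(max(energy * energy - px * px - py * py - pz * pz, 0.0))


def trim(subjets, f_cut=0.05):
    """Keep subjets with pT_i / pT_jet >= f_cut (reclustering assumed done)."""
    parent_pt = jet_pt(subjets)
    return [sj for sj in subjets if sj[0] / parent_pt >= f_cut]


# Toy jet (hypothetical): two hard prongs plus one soft wide-angle subjet.
toy_subjets = [(200.0, 0.0, 0.4, 0.0), (200.0, 0.0, -0.4, 0.0), (5.0, 0.0, 0.9, 0.0)]
toy_trimmed = trim(toy_subjets)
```

In this toy jet the soft 5 GeV subjet falls below 5% of the parent p T and is removed, so the trimmed mass is lower than the ungroomed one while the two hard prongs are untouched.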

Performance of trimming in tt events
In order to look at events with a reduced W +jets background, a b-tagging requirement on at least one anti-k t jet in the event with R = 0.4 is applied in addition to the selection criteria described in section 5.3.1. Figure 46 shows the effect of trimming on the mass of the leading-p jet T anti-k t jet with R = 1.0 in a sample of nearly pure tt events. Trimming clearly enhances the mass discrimination compared to the ungroomed case, with a peak at low mass corresponding to large-R jets containing one quark or gluon (probably from a fully leptonic tt event) and a peak around the top mass, where all three top decay products, the b-jet and hadronic W -decay daughters, fall inside the large-R jet radius. Figures 47 and 48 show the distributions of √ d 12 and √ d 23 , respectively, before and after trimming. Again, the top-quark distribution remains relatively unaffected by trimming, while the W +jets background is pushed to lower values; however, the effect is smaller for these variables than for the mass. Also shown are the distributions with the simulated background subtracted. Comparing the shape before and after grooming with the signal distributions shown in figures 33 and 34 for very high-p T top quarks, it is apparent that not all top quarks in this data sample were sufficiently boosted to have their decay products fall within a single R = 1.0 jet. Figure 49 shows the signal-enriched distributions of τ 32 before and after trimming. Here, there is less discrimination between the W +jets background and the top-quark signal. This is due to the fact that multiple quark and gluon jets in W events can be reconstructed as one R = 1.0 anti-k t jet and mimic the subjettiness signature of the large-R jet containing the hadronic decay products of the top quark. For shape comparisons with figure 35, the background-subtracted plots are also shown.

Application of the HEPTopTagger in tt events
Basic kinematic distributions for large-R jets before and after applying the HEPTopTagger algorithm, but without any b-tagging requirement, are shown in this section. Top-quark candidates in data and simulation are compared after selecting events according to the criteria listed in section 5.3.1. Figure 50 shows the jet mass distributions of C/A jets with R = 1.5 and R = 1.8 before applying the HEPTopTagger, for jets having p T > 200 GeV. The sample consists of approximately 50% tt-pair events, with other contributions coming mainly from W +jets and Z+jets events. A larger contribution from multi-jet events compared to that observed in section 5.3.2 is expected, due to the larger jet radius and lower p T threshold. The large-R jet mass is generally well described by the simulation. Figure 51 shows the top-candidate mass distribution after applying the HEPTopTagger with four different filtering and large-R jet configurations (see table 2 for details). For all settings, the top-mass peak shape is generally well described by MC simulation and a relatively pure tt selection is obtained for top-quark candidate masses above 120 GeV.
The good agreement between data and simulation both before and after applying the HEPTopTagger shows that the exploited substructure is modelled well, even for jets with a very large radius. Figure 52 shows the variables used in the requirements imposed by the HEPTopTagger on the subjet mass ratios, defined in eq. (1.6) and eq. (1.7). For example in figure 52(b), if the sub-leading p T and the sub-sub-leading p T subjets are the decay products of a W boson
The efficiency of the HEPTopTagger is measured as a function of the transverse momentum of the generated top quark, and is the product of the large-R jet finding efficiency and the efficiency to tag the jet correctly: ε(total) = ε(large-R jet) · ε(tag). Figure 53 shows ε(total) for four different filtering configurations of the HEPTopTagger as a function of the generator-level true top-quark p T for the tt MC sample. The efficiency for the default settings is 20% at 250 GeV and reaches a plateau of 40% at 500 GeV. Below 400 GeV the efficiency can be improved by 5% by using a larger radius parameter of R = 1.8. The maximum efficiency for the tight filtering settings is 30%.
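The factorisation ε(total) = ε(large-R jet) · ε(tag) can be made concrete with simple counting in one bin of generated top-quark p T. The sketch below assumes three nested counts: generated top quarks, those with a matched large-R jet, and those whose jet is also tagged. The function names and the numbers are hypothetical and are not results from the paper.

```python
def efficiency(n_pass, n_total):
    """Fraction of objects passing a requirement."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_pass / n_total


def tagger_efficiencies(n_tops, n_with_large_r_jet, n_tagged):
    """Return (eff_jet, eff_tag, eff_total) for one top-quark pT bin.

    The counts are assumed to be nested: tagged jets are a subset of the
    matched large-R jets, which are a subset of the generated top quarks.
    """
    eff_jet = efficiency(n_with_large_r_jet, n_tops)
    eff_tag = efficiency(n_tagged, n_with_large_r_jet)
    return eff_jet, eff_tag, eff_jet * eff_tag


# Hypothetical counts for a single pT bin.
toy_eff_jet, toy_eff_tag, toy_eff_total = tagger_efficiencies(1000, 800, 320)
```

Because the counts are nested, the product reduces to the number of tagged jets divided by the number of generated top quarks, which gives a direct cross-check of the factorised form.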
The fake efficiency, shown in figure 54(a), is defined in exactly the same way but is evaluated using the PYTHIA inclusive jet sample. The p T of the leading-p jet T anti-k t jet with R = 0.4 has been chosen to compute the efficiency as it provides a measure for the energy available in the event and is easily comparable between different tagging approaches. The fake efficiency shows a sharp turn-on around 200 GeV, with efficiencies under 0.5% below the turn-on and a plateau of 4% (2.5%) for the default and loose (tight) filtering settings. Figure 54(b) shows the fake tagging efficiency as a function of the top-jet p T in a multi-jet background sample and for events with a hadronically decaying W boson. The fake efficiency rises sharply at 300 GeV and reaches a plateau of 2.5% at a large-R jet p T of 400 GeV.
The efficiency of the HEPTopTagger to select jets from boosted top quarks can be increased by varying the filtering parameters. Since this also increases the fake efficiency, the optimal working-point depends on the analysis in question.

Conclusions
It has been demonstrated experimentally in this paper that jet grooming algorithms can improve the identification of Lorentz-boosted physics objects that decay to jets, as well as increase sensitivity to several new physics processes. The performance of large-R jets is improved overall, and the dependence on pile-up and the underlying event is reduced.
Jet mass calibrations have been derived in simulation for various large-R jet algorithms and subjets, as well as for jets with grooming applied. These have been validated for boosted W bosons in tt events and using calorimeter-jet versus track-jet double ratios. Uncertainties on the jet energy scale and the jet mass scale have been provided over a wide range of large-R jet momentum.
The mass distributions observed in data for large-R jets in the inclusive jet sample, before and after grooming, are well reproduced by the ATLAS simulation, especially using the POWHEG NLO generator. The substructure variables presented here also show good agreement between data and simulation, typically within 5% for key observables for modern NLO plus parton shower Monte Carlo programs such as POWHEG+PYTHIA, as well as for the LO MC program HERWIG++.
The parameters of trimmed, pruned and mass-drop filtered jet algorithms have been optimized for searches and precision Standard Model measurements using various performance measures. Among the configurations tested here, the trimming algorithm exhibits better performance than the pruning algorithm, with superior mass resolution and reduced dependence on pile-up. In particular, the anti-k t algorithm with R = 1.0 and trimming parameters f cut = 0.05 and R sub = 0.3 is recommended for boosted top physics analyses, where a minimum p T requirement of 200 GeV is typical. It is important to note, though, that only the k t -pruning for large-R (R = 1.0) jets has been tested in this work, and that future studies should expand the comparisons to include the C/A-pruning as well. Additionally, C/A jets with R = 1.2 using the mass-drop filtering parameter µ frac = 0.67 are recommended for boosted two-pronged analyses such as H → bb or searches using W → qq.
The benefit of using these grooming algorithms along with substructure variables has been demonstrated in top-tagging studies, where the efficiency of finding a boosted top quark for a given background jet mis-tag rate is greatly increased after grooming is applied due to the improved mass resolution. Grooming has been shown to leave the boosted signal mass peak relatively unaffected while systematically shifting the light-quark and gluon jet background lower in mass, thus increasing the discrimination of signal from background.
The HEPTopTagger has been demonstrated to be a robust and versatile tool to reconstruct hadronically-decaying top quarks in the presence of the underlying event and pile-up, using jet grooming and substructure techniques. A comparison to data shows that the algorithm is well modelled by the simulations. The HEPTopTagger performance (efficiency, rejection, mass resolution) can be optimized for a given analysis by varying the algorithm parameters.
From the studies presented here, groomed jets and substructure variables are ready to be used in further ATLAS physics analyses. These techniques will become extremely beneficial tools in upcoming searches for boosted physics objects in supersymmetric and exotic models, measurements of boosted Higgs topologies, and in detailed precision Standard Model measurements of QCD and electroweak processes with jets and boosted hadronic objects.
Open Access. This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.