Power Counting to Better Jet Observables

Optimized jet substructure observables for identifying boosted topologies will play an essential role in maximizing the physics reach of the Large Hadron Collider. Ideally, the design of discriminating variables would be informed by analytic calculations in perturbative QCD. Unfortunately, explicit calculations are often not feasible due to the complexity of the observables used for discrimination, and so many validation studies rely heavily, and solely, on Monte Carlo. In this paper we show how methods based on the parametric power counting of the dynamics of QCD, familiar from effective theory analyses, can be used to design, understand, and make robust predictions for the behavior of jet substructure variables. As a concrete example, we apply power counting for discriminating boosted Z bosons from massive QCD jets using observables formed from the n-point energy correlation functions. We show that power counting alone gives a definite prediction for the observable that optimally separates the background-rich from the signal-rich regions of phase space. Power counting can also be used to understand effects of phase space cuts and the effect of contamination from pile-up, which we discuss. As these arguments rely only on the parametric scaling of QCD, the predictions from power counting must be reproduced by any Monte Carlo, which we verify using Pythia8 and Herwig++. We also use the example of quark versus gluon discrimination to demonstrate the limits of the power counting technique.


Introduction
Over the past several years there has been an explosion in the number of jet observables and techniques developed for discrimination and grooming [1][2][3]. Several of these are used by the ATLAS and CMS experiments, and their performance has been validated directly on data  and employed in new physics searches in highly boosted regimes [26][27][28][29][30][31][32][33][34]. Analyses using jets will become increasingly important at the higher energies and luminosities of Run 2 of the LHC.
While the proliferation of jet observables is exciting for the field, the vast majority of proposed observables and procedures have been analyzed exclusively in Monte Carlo simulation. Monte Carlos are vital for making predictions at the LHC, but should not be a substitute for an analytical understanding, where possible. Because Monte Carlos rely on tuning the description of non-perturbative physics to data, this can obscure what the robust perturbative QCD predictions are and hide direct insight into the dependence of the distributions on the parameters of the observable. This is especially confusing when different Monte Carlo programs produce different results.

JHEP12(2014)009
Perturbative predictions of distributions have traditionally been constrained to only the simplest observables, such as the jet mass [35][36][37][38][39], but to high accuracy. Such high-order calculations are important for reducing the systematic theoretical uncertainties. More recently, resummation has been applied to some simple jet substructure variables [40][41][42][43], and an understanding of some of the subtleties of resummation for ratio observables, as often used in jet substructure, has been developed [44][45][46]. Even the simplest calculations have suggested new, improved techniques, like the modified Mass Drop Tagger [41,42], or uncovered unexpected structures in perturbative QCD, like Sudakov Safety [44,46,47]. For more complex observables, however, an analytic calculation may be essentially impossible, and we must rely on Monte Carlo simulations. Because of the wide variety of jet observables, some of which can be calculated analytically and some of which cannot, it is necessary to find an organizing principle that can be used to identify the robust predictions of QCD, without requiring a complete calculation to a given perturbative accuracy.
In this paper we show how power counting methods can be used to design and understand the behavior of jet substructure variables. With minimal computational effort, power counting accurately captures the parametric predictions of perturbative QCD. The dynamics of a QCD jet are dominated by soft and collinear emissions and so by identifying the parametric scaling of soft and collinear contributions to a jet observable, we are able to make concrete and justified statements about the performance of jet substructure variables. Formal parametric scaling, or power counting, is widely used in the formalism of soft-collinear effective theory (SCET) [48][49][50][51], an effective field theory of QCD in the soft and collinear limits. However, in this paper, we will not rely on any results from SCET so as to make the discussion widely accessible. Similar techniques were employed in ref. [52], but with the goal of determining which jet observables are calculable.
As a concrete application of the soft and collinear power counting method, we will focus on observables formed from the generalized n-point energy correlation functions $e_n^{(\beta)}$ [53], relevant for discriminating massive QCD jets from boosted, heavy objects. Measuring multiple energy correlation functions on a jet defines a multi-dimensional phase space populated by signal and background jets. By appropriately power counting the dominant regions of phase space, we are able to identify the signal- and background-rich regions and determine powerful observables for discrimination. In addition, from power counting arguments alone, we are able to predict the effect of pile-up contamination on the different regions of phase space. We apply power counting to the following:
• Boosted Z Bosons vs. QCD. The two- and three-point energy correlation functions, $e_2^{(\beta)}$ and $e_3^{(\beta)}$, have been shown to be among the most powerful observables for identifying the hadronic decays of boosted Z bosons [53]. We discuss the phase space defined by $e_2^{(\beta)}$ and $e_3^{(\beta)}$, and determine which regions are populated by signal and background jets. Using this understanding of the phase space, we propose a powerful discriminating variable to identify boosted two-prong jets, given by
\begin{equation}
D_2^{(\beta)} = \frac{e_3^{(\beta)}}{\left(e_2^{(\beta)}\right)^3} . \tag{1.1}
\end{equation}

This should be contrasted with the variable $C_2^{(\beta)} = e_3^{(\beta)}/(e_2^{(\beta)})^2$ originally proposed in ref. [53]. We also show that power counting can be used to understand the impact of pile-up radiation on the different regions of phase space, and in turn to understand the susceptibility of signal and background distributions to pile-up.
• Quarks vs. Gluons. Quark versus gluon jet discrimination is somewhat of a non-example for the application of power counting because there is nothing parametrically distinct between quark and gluon jets. However, this will illustrate why quark versus gluon discrimination is such a hard problem, and why different Monte Carlos can have wildly different predictions [43].
The outline of this paper is as follows. In section 2 we will precisely define what we mean by "collinear" and "soft" modes of QCD and introduce the observables used throughout this paper. While we will mostly focus on the energy correlation functions, we will also discuss the N-subjettiness observables [54,55] as a point of reference. In section 3, we apply power counting to the study of Z versus QCD discrimination using the two- and three-point energy correlation functions $e_2^{(\beta)}$ and $e_3^{(\beta)}$. We argue that the single most powerful observable for discrimination is $e_3^{(\beta)}/(e_2^{(\beta)})^3$. Power counting is used to understand how the addition of pile-up radiation affects the distributions of this variable, and to show that they are more robust to pile-up than those of previously proposed variables formed from the energy correlation functions.¹ We verify that these predictions are borne out in Monte Carlo. In section 4, we attempt to apply power counting to quark versus gluon jet discrimination. Naïvely, this should be the simplest case; however, power counting arguments are not applicable because all qualities of quarks and gluons differ only by order-1 numbers. Finally, we conclude in section 5 by re-emphasizing that power counting is a useful predictive tool for jet observables that are too complicated for direct analytic calculations, and suggest some problems for which it may prove fruitful.

Observables
Throughout this paper, our analyses will be focused around the (normalized) n-point energy correlation functions $e_n^{(\beta)}$.² The two- and three-point energy correlation functions are defined as
\begin{equation}
e_2^{(\beta)} = \frac{1}{p_{TJ}^2} \sum_{1 \le i < j \le n_J} p_{Ti}\, p_{Tj}\, R_{ij}^{\beta} , \qquad
e_3^{(\beta)} = \frac{1}{p_{TJ}^3} \sum_{1 \le i < j < k \le n_J} p_{Ti}\, p_{Tj}\, p_{Tk}\, R_{ij}^{\beta} R_{ik}^{\beta} R_{jk}^{\beta} ,
\end{equation}
where $p_{TJ}$ is the transverse momentum of the jet with respect to the beam, $p_{Ti}$ is the transverse momentum of particle i, and $n_J$ is the number of particles in the jet. The boost-invariant angle $R_{ij}^2 = (\phi_i - \phi_j)^2 + (y_i - y_j)^2$ is the Euclidean distance in the azimuth-rapidity plane and, for infrared and collinear (IRC) safety, the angular exponent $\beta > 0$. In this paper we will only study up through $e_3^{(\beta)}$, but higher-point energy correlation functions are defined as the natural generalization. We will often omit the explicit dependence on β, denoting the n-point energy correlation function simply as $e_n$.
¹ The CMS study of ref. [18] found that the observable $C_2^{(\beta)} = e_3^{(\beta)}/(e_2^{(\beta)})^2$ suggested in ref. [53] for boosted Z identification is very sensitive to pile-up contamination.
² The notation $e_n^{(\beta)}$ differs from the original notation ECF(n, β) presented in ref. [53], where the energy correlation functions were defined, but we hope that the notation used here is more compact. Specifically, the relationship is
\begin{equation}
e_n^{(\beta)} = \frac{\mathrm{ECF}(n,\beta)}{\left(\mathrm{ECF}(1,\beta)\right)^n} . \tag{2.1}
\end{equation}
The energy correlation functions have many nice properties that make them ideal candidates for defining a basis of jet observables. First, the energy correlation functions are defined such that $e_n^{(\beta)} \to 0$ in any of the soft or collinear limits of a configuration of n particles. Second, because all angles in the energy correlation functions are measured between pairs of particles, $e_n^{(\beta)}$ is insensitive to recoil, or "recoil-free" [53,56,57,58,59]. This means that it is not sensitive to the angular displacement of the hardest particle (or jet core) from the jet momentum axis due to soft, wide-angle radiation in the jet. The effects of recoil decrease the sensitivity of an observable to the structure of radiation about the hard core of the jet, making it less efficient for discrimination purposes.
Depending on the application, different energy correlation functions are useful as discriminating observables. As discussed in ref. [53], the two-point energy correlation function is sensitive to radiation about a single hard core, and so is useful for quark versus gluon discrimination. Similarly, the three- and four-point energy correlation functions are useful for 2- or 3-prong jet identification, respectively, corresponding to boosted electroweak bosons (W/Z/H) or hadronically decaying top quarks. By measuring appropriate energy correlation functions we define a phase space, populated by signal and background jets.
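As a minimal sketch of the definitions above, the following Python computes the normalized n-point energy correlation functions from a list of jet constituents. The input format (tuples of transverse momentum, rapidity, azimuth), the function names, and the choice to normalize by the scalar $p_T$ sum of the constituents, which is ECF(1, β), are assumptions made for illustration; for real analyses the EnergyCorrelator FastJet contrib cited in this section should be preferred.

```python
import math
from itertools import combinations

def delta_r(a, b):
    """Distance R_ij in the rapidity-azimuth plane for particles (pt, y, phi)."""
    dphi = abs(a[2] - b[2]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi  # wrap the azimuthal difference into [0, pi]
    return math.hypot(a[1] - b[1], dphi)

def ecf_normalized(particles, n, beta=1.0):
    """Normalized n-point energy correlation function e_n^(beta).

    particles: list of (pt, y, phi) tuples (hypothetical input format).
    The jet pT is approximated by the scalar sum of constituent pT.
    """
    pt_jet = sum(p[0] for p in particles)
    total = 0.0
    for combo in combinations(particles, n):
        weight = math.prod(p[0] for p in combo)
        angles = math.prod(delta_r(a, b) ** beta
                           for a, b in combinations(combo, 2))
        total += weight * angles
    return total / pt_jet ** n

# Toy three-particle jet with pairwise distances 0.3, 0.4 and 0.5
jet = [(50.0, 0.0, 0.0), (30.0, 0.3, 0.0), (20.0, 0.0, 0.4)]
print(ecf_normalized(jet, 2), ecf_normalized(jet, 3))
```

The sum over all n-particle subsets makes the cost grow like $n_J^n$, which is acceptable for a sketch but is one reason to use an optimized implementation in practice.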
As a point of reference, we will also study the N-subjettiness observables and compare the structure of their phase space with that of the energy correlation functions. The (normalized) N-subjettiness observable $\tau_N^{(\beta)}$ is defined as
\begin{equation}
\tau_N^{(\beta)} = \frac{1}{p_{TJ}} \sum_{1 \le i \le n_J} p_{Ti} \min\left\{ R_{i1}^{\beta}, \ldots, R_{iN}^{\beta} \right\} .
\end{equation}
The angle $R_{iK}$ is measured between particle i and subjet axis K in the jet. Thus, N-subjettiness partitions a jet into N subjet regions and measures the $p_T$-weighted angular distribution of each particle with respect to its subjet axis. There are several different choices for how to define the subjet axes; here, we will define the subjet axes by the exclusive $k_T$ jet algorithm [60] with the winner-take-all (WTA) recombination scheme [59,61,62]. In contrast to the traditional E-scheme recombination [63], which defines the (sub)jet axis to coincide with the net momentum direction, the WTA recombination scheme produces (sub)jet axes that are recoil-free and nearly identical to the β = 1 minimized axes.³ With this definition, the observables $e_2^{(\beta)}$ and $\tau_1^{(\beta)}$ are identical through NLL accuracy for all β > 0 [59].
Since N-subjettiness directly identifies N subjet directions in a jet, it is a powerful variable for N-prong jet discrimination. In particular, the N-subjettiness ratios $\tau_{2,1}^{(\beta)} \equiv \tau_2^{(\beta)}/\tau_1^{(\beta)}$ and $\tau_{3,2}^{(\beta)} \equiv \tau_3^{(\beta)}/\tau_2^{(\beta)}$, relevant for boosted W/Z/H and top quark identification, respectively, are widely used in jet studies at the ATLAS and CMS experiments. Numerical implementations of the energy correlation functions and N-subjettiness are available in the EnergyCorrelator and Nsubjettiness FastJet contribs [66,67].
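The N-subjettiness definition can be illustrated with a short Python sketch. It assumes that the subjet axes are supplied by the caller as (rapidity, azimuth) pairs; the exclusive $k_T$ clustering with WTA recombination used in the text to find the axes is not implemented here, and the function name, input format, and normalization by the scalar $p_T$ sum are illustrative assumptions.

```python
import math

def tau_n(particles, axes, beta=1.0):
    """N-subjettiness tau_N^(beta) for N = len(axes) given subjet axes.

    particles: list of (pt, y, phi); axes: list of (y, phi).
    Both formats are hypothetical and chosen only for this sketch.
    """
    def dist(y1, phi1, y2, phi2):
        dphi = abs(phi1 - phi2) % (2.0 * math.pi)
        dphi = min(dphi, 2.0 * math.pi - dphi)
        return math.hypot(y1 - y2, dphi)

    pt_jet = sum(pt for pt, _, _ in particles)
    total = 0.0
    for pt, y, phi in particles:
        # each particle is assigned to its nearest axis
        total += pt * min(dist(y, phi, ay, aphi) ** beta for ay, aphi in axes)
    return total / pt_jet

# Toy jet with two hard particles; axes placed on the hard particles by hand
jet = [(60.0, 0.0, 0.0), (40.0, 0.5, 0.0)]
tau1 = tau_n(jet, [(0.0, 0.0)])
tau2 = tau_n(jet, [(0.0, 0.0), (0.5, 0.0)])
print(tau1, tau2)
```

Placing the axes on the hardest particles mimics the recoil-free behaviour of WTA axes in this toy case, but it is not a substitute for the actual axis-finding procedure.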

Soft and collinear modes of QCD
At high energies, QCD is approximately a weakly-coupled conformal gauge theory and so jets are dominated by soft and collinear radiation. Because it is approximately conformal, there is no intrinsic energy or angular scale associated with this radiation. To introduce a scale, and so to determine the dominant soft and collinear emissions, we must break the conformal invariance by making a measurement on the jet. The scale of the soft and collinear emissions is set by the measured value of the observable.⁴ This observation can be exploited to make precise statements about the energy and angular structure of a jet, depending on the value of observables measured on that jet. This reasoning is often implicitly understood in the jet community and literature, and is formalized in SCET. Nevertheless, these precise power-counting arguments are not widely used outside of SCET, and so we hope that the applications in this paper illustrate their effectiveness and relative simplicity.
We begin by defining a soft emission, s, as one for which
\begin{equation}
z_s \equiv \frac{p_{Ts}}{p_{TJ}} \ll 1 , \qquad R_{sj} \sim 1 ,
\end{equation}
where j is any other particle in the jet and $R_{sj} \sim 1$ means that $R_{sj}$ is not associated with any parametric scaling. Similarly, a collinear emission, c, is defined as having a $p_T$ fraction
\begin{equation}
z_c \equiv \frac{p_{Tc}}{p_{TJ}} \sim 1 ,
\end{equation}
but with an angle to other particles which depends on whether they are also collinear or soft:
\begin{equation}
R_{cc} \ll 1 , \qquad R_{cs} \sim 1 .
\end{equation}
³ The β = 1 minimized axes are also referred to as "broadening axes" [55,59] as they correspond to axes that minimize the value of broadening [56,64,65].
⁴ It is important to note that since QCD is not a conformal field theory, we can only use the power counting presented here to study the phase space defined by a set of IRC safe observables. If we considered IRC unsafe observables, then generically, we would need to power count contributions from non-perturbative physics such as hadronization.

Here, $R_{cc}$ is the angle between two collinear particles, while $R_{cs}$ is the angle between a soft particle and a collinear particle. The precise scalings of $R_{cc}$ and $z_s$ will depend on the observable in question, as will be explained shortly. Soft emissions also implicitly include radiation that is simultaneously both soft and collinear.
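To introduce these ideas concretely, we use the example of the two-point energy correlation function:
\begin{equation}
e_2^{(\beta)} = \frac{1}{p_{TJ}^2} \sum_{1 \le i < j \le n_J} p_{Ti}\, p_{Tj}\, R_{ij}^{\beta} .
\end{equation}
Consider performing a measurement of $e_2^{(\beta)}$ on a jet and finding $e_2^{(\beta)} \ll 1$, so that the jet is dominated by soft and collinear emissions. In this limit, $e_2^{(\beta)}$ can be expressed as
\begin{equation}
e_2^{(\beta)} \sim \frac{1}{p_{TJ}^2} \sum_{s} p_{Ts}\, p_{Ts'}\, R_{ss'}^{\beta} + \frac{1}{p_{TJ}^2} \sum_{s,c} p_{Ts}\, p_{Tc}\, R_{cs}^{\beta} + \frac{1}{p_{TJ}^2} \sum_{c} p_{Tc}\, p_{Tc'}\, R_{cc'}^{\beta} ,
\end{equation}
where we have separated the contributions to $e_2^{(\beta)}$ into soft-soft, soft-collinear, and collinear-collinear correlations. To determine the dominant contributions to $e_2^{(\beta)}$, we will throw away those contributions that are parametrically smaller, according to our definitions of soft and collinear above. First, $p_{Ts} \ll p_{Tc}$, and so we can ignore the first term to leading power. Because $R_{cs} \sim 1$, we set $R_{cs} = 1$ in the second term. Also, note that $p_{Tc} \sim p_{TJ}$ and so we can replace the instances of $p_{Tc}$ with $p_{TJ}$ in the second and third terms. Making these replacements, we find
\begin{equation}
e_2^{(\beta)} \sim \frac{1}{p_{TJ}} \sum_{s} p_{Ts} + \sum_{c} R_{cc}^{\beta} \sim \sum_{s} z_s + \sum_{c} R_{cc}^{\beta} , \tag{2.9}
\end{equation}
where we have ignored any corrections arising at higher power in the soft and collinear emissions' energies and angles. We wish to emphasize with the explicit summation symbols that we have not restricted to a single soft or collinear emission, but consider an arbitrary number of emissions. Furthermore, we do not assume a strongly ordered limit, but instead explore the complete phase space arising from soft and collinear emissions, including regions where such ordering is explicitly broken. Eq. (2.9) demonstrates the dominant structure of a jet on which we have measured $e_2^{(\beta)}$. Contributions from soft and collinear emissions do not mix to this accuracy; that is, they factorize from one another. Also, because there is no measurement to distinguish the soft and collinear contributions to $e_2^{(\beta)}$, we then have that
\begin{equation}
e_2^{(\beta)} \sim z_s \sim R_{cc}^{\beta} .
\end{equation}
That is, the measured value of $e_2^{(\beta)}$ sets the $p_T$ of the soft particles and the splitting angle of the collinear particles, and therefore defines the structure of the jet.
⁵ In SCET, $R_{cc}$ and $z_s$ are often immediately assigned a related scaling. While this is true for this example, it is not in general true in the case of multiple measurements, and we wish to emphasize in this section how the measurement sets both scalings.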
In eq. (2.9), we have explicitly written a summation over the particles with soft and collinear scalings. To determine the scalings of the different contributions, it is clearly sufficient to consider the scaling of an individual term in each sum. In the remainder of this paper we will drop the explicit summation for notational simplicity.
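The leading-power reduction of $e_2^{(\beta)}$ can be checked numerically on a toy jet. The sketch below is illustrative only: the particle lists, the by-hand labelling of particles as soft or collinear, and the function names are assumptions, and since power counting fixes scalings rather than order-1 coefficients, agreement is expected only up to an order-1 factor.

```python
import math
from itertools import combinations

def dist(a, b):
    """Distance between particles (pt, y, phi); no azimuthal wrapping (toy jet)."""
    return math.hypot(a[1] - b[1], a[2] - b[2])

def e2_pieces(soft, collinear, beta=1.0):
    """Soft-soft, soft-collinear and collinear-collinear parts of e_2^(beta)."""
    pt_jet = sum(p[0] for p in soft + collinear)
    def pair(a, b):
        return a[0] * b[0] * dist(a, b) ** beta / pt_jet ** 2
    ss = sum(pair(a, b) for a, b in combinations(soft, 2))
    sc = sum(pair(s, c) for s in soft for c in collinear)
    cc = sum(pair(a, b) for a, b in combinations(collinear, 2))
    return ss, sc, cc

def leading_power_e2(soft, collinear, beta=1.0):
    """Leading-power estimate: sum of z_s plus sum of R_cc^beta."""
    pt_jet = sum(p[0] for p in soft + collinear)
    z_sum = sum(p[0] / pt_jet for p in soft)
    r_sum = sum(dist(a, b) ** beta for a, b in combinations(collinear, 2))
    return z_sum + r_sum

# Hypothetical toy jet: two energetic collinear particles, two soft wide-angle ones
collinear = [(500.0, 0.0, 0.0), (480.0, 0.02, 0.0)]
soft = [(10.0, 0.8, 0.5), (10.0, -0.6, -0.7)]
ss, sc, cc = e2_pieces(soft, collinear)
print("exact e2:", ss + sc + cc)
print("soft-soft fraction:", ss / (ss + sc + cc))
print("leading-power estimate:", leading_power_e2(soft, collinear))
```

In this example the soft-soft term is a sub-percent correction, consistent with dropping it at leading power, while the exact and estimated values differ by an order-1 factor, as expected.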
Scaling arguments similar to the power counting approach discussed here are often used in other approaches to QCD resummation to identify the relevant soft and collinear scales, and could also be used to analyze observables. For example, in the method of regions [68,69], the regions of integration over QCD matrix elements which contribute dominantly to a given observable are determined, and an expansion about each of these regions is performed. These regions of integration, and the scaling of the momenta in these regions, correspond to the modes of the effective theory determined through the power counting approach.⁶ Similarly, in the CAESAR approach to resummation [58], implemented in an automated computer program, the first step of the program is the identification of the relevant soft and collinear scales. This is performed by expanding a given observable in the soft and collinear limits, and considering the region of integration for a single emission. This procedure is similar to that used in the case of $e_2^{(\beta)}$ just discussed, and would identify the same dominant contributions and scalings. Using the knowledge of the behavior of the QCD splitting functions, CAESAR then performs a resummed calculation of the observable. However, the CAESAR computer program is currently restricted to observables for which the relevant scales, and hence the logarithmic structure, are determined by a single emission, and further, to single differential distributions.
When considering observables relevant for jet substructure, one is interested in variables such as $e_n^{(\beta)}$ with $n > 2$, whose behavior is not determined by the single emission phase space. For such observables, the single-emission analysis is not sufficient and the explicit analysis of QCD matrix elements to determine the dominant regions of integration which contribute becomes quite complicated. For these cases, we find the power counting approach of the effective field theory paradigm to be a particularly convenient organizing principle. Using the knowledge that on-shell soft and collinear modes dominate, a consistent power counting can be used to determine the relevant scalings of these modes in terms of the measured observables, which is reduced to a simple algebraic exercise. Although the evaluation of QCD matrix elements in these scaling limits is of course required for a complete calculation, it is not required to determine the power counting, and we will see that power counting alone will often be sufficient for constructing discriminating observables for jet substructure studies.

Throughout the rest of this paper, we will employ these power-counting arguments to determine the dominant structure of jets on which multiple measurements have been made, for example $e_2^{(\beta)}$ and $e_3^{(\beta)}$. In this case, the phase space that results is much more complicated than the example of $e_2^{(\beta)}$ discussed above, but importantly, appropriate power counting of the contributions from soft and collinear emissions will organize the phase space into well-defined regions automatically.

Power counting boosted Z boson vs. QCD discrimination
As a detailed example of the usefulness of power counting, we consider the problem of discriminating hadronically-decaying, boosted Z bosons from massive QCD jets. Because Z boson decays have a 2-prong structure, we will measure the two- and three-point energy correlation functions, $e_2$ and $e_3$, on the jets, defining a two-dimensional phase space. We will find that there are two distinct regions of this phase space corresponding to jets with one or two hard prongs. QCD jets exist dominantly in the former region while boosted Z bosons exist dominantly in the latter. Power counting these phase space regions will allow us to determine the boundaries of the regions and to define observables that separate the signal and background regions most efficiently.
Because it is a non-trivial yet still tractable application, we will present a detailed analysis of the phase space regions for boosted Z identification. This will require several pieces. First, we will study the full phase space of perturbative jets defined by $e_2$ and $e_3$ and identify signal and background regions via power counting. This will lead us to define a discriminating variable, $D_2^{(\beta)}$. Second, any realistic application of a boosted Z tagger includes a cut on the jet mass in a window around $m_Z$, and the effect of the mass cut on the discrimination power can also be understood by a power counting analysis of the phase space. Third, at the high luminosities of the LHC, contamination from pile-up is important and can substantially modify distributions for jet substructure variables. By appropriate power counting of the pile-up radiation, we can understand the effect of pile-up on the perturbative phase space and determine how susceptible the distributions of different discrimination variables are to pile-up contamination. As a reference, throughout this section we will contrast the energy correlation functions to the N-subjettiness observables $\tau_1^{(\beta)}$ and $\tau_2^{(\beta)}$ [54,55]. A full effective theory analysis and analytic calculation of $D_2^{(\beta)}$ will be presented in ref. [70].

Perturbative radiation phase space
We begin by studying the ($e_2$, $e_3$) phase space arising from perturbative radiation from the jet. The measurement of $e_2$ and $e_3$ on a jet can resolve at most two hard subjets. The phase space for the variables $e_2$ and $e_3$ is therefore composed of jets which are unresolved by the measurement, dominantly from the QCD background and shown schematically in figure 1a, and jets with a resolved 2-prong structure, as from boosted Z decays, shown schematically in figure 1b. We will find that the resolved and unresolved jets live in parametrically different regions of the phase space, and the boundary between the two regions can be understood from a power counting analysis.

Figure 1: a) 1-prong jet, dominated by collinear (blue) and soft (green) radiation. The angular size of the collinear radiation is $R_{cc}$ and the $p_T$ fraction of the soft radiation is $z_s$. b) 2-prong jet resolved into two subjets, dominated by collinear (blue), soft (green), and collinear-soft (orange) radiation emitted from the dipole formed by the two subjets. The subjets are separated by an angle $R_{12}$ and the $p_T$ fraction of the collinear-soft radiation is $z_{cs}$.
Table 1: Scaling of the contributions of 1-prong jets to $e_2^{(\beta)}$ and $e_3^{(\beta)}$.
Consider first the measurement of $e_2$ and $e_3$ on a jet with a single hard core of radiation, as in figure 1a, which is dominated by soft radiation with characteristic $p_T$ fraction $z_s \ll 1$, and collinear radiation with a characteristic angular size $R_{cc} \ll 1$. All other scales are order-1 numbers that we will assume are equal to 1 without further discussion. With these assumptions, we are able to determine the scaling of the contributions to $e_2$ and $e_3$, given in table 1 for contributions from three collinear particles (CCC), two collinear and one soft particle (CCS), one collinear and two soft particles (CSS), and three soft particles (SSS).⁷ Dropping those contributions that are manifestly power-suppressed, the two- and three-point energy correlation functions measured on 1-prong jets scale as
\begin{equation}
e_2 \sim R_{cc}^{\beta} + z_s , \qquad e_3 \sim R_{cc}^{3\beta} + z_s^2 + R_{cc}^{\beta} z_s .
\end{equation}
⁷ The contributions in table 1 are from any subset of three particles in the jet. We do not single out an initial parton from which the others arise as in a showering picture.

To go further, we must determine the relative size of $z_s$ and $R_{cc}^{\beta}$. There are two possibilities, depending on the region of phase space identified by the measurement: either $z_s$ makes a dominant contribution to $e_2$, or its contribution is power suppressed with respect to $R_{cc}^{\beta}$. In the case that $z_s$ contributes to $e_2$, this immediately implies that $e_3 \sim (e_2)^2$, regardless of the precise scaling of $R_{cc}^{\beta}$.⁸ If instead $z_s$ gives a subleading contribution compared to $R_{cc}^{\beta}$ in $e_2$, then $e_3 \sim (e_2)^3$.⁹ Therefore, from this simple analysis, we have shown that 1-prong jets populate the region of phase space defined by $(e_2)^3 \lesssim e_3 \lesssim (e_2)^2$. Fascinatingly, this also implies that the relative values of $e_2$ and $e_3$ provide a direct probe of the ordering of emissions inside the jet, so that conditions on the measured values of $e_2$ and $e_3$ are observable proxies for the ordering of emissions. The scaling of $R_{cc}$ and $z_s$ on each boundary of the phase space can then easily be determined, but will not be important for our discussion. This analysis shows that 1-prong jets fill out a non-trivial region in the ($e_2$, $e_3$) phase space, and of particular interest for the design of discriminating observables is the fact that this region of phase space has a lower boundary. This region is shown in blue in figure 2.
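The bookkeeping behind the 1-prong scalings can be automated in a few lines. The sketch below rests on an assumed parametrization that is not part of the text: $z_s \sim \lambda^p$ and $R_{cc}^{\beta} \sim \lambda^q$ for a small parameter $\lambda$, so that each contribution is characterized by its exponent and the dominant contribution is the one with the smallest exponent. Function names are illustrative.

```python
from itertools import combinations_with_replacement

def contribution_exponent(config, p, q):
    """Power of lambda carried by one configuration such as "CCS".

    Assumed bookkeeping: z_s ~ lambda**p and R_cc**beta ~ lambda**q with
    lambda << 1. Every soft particle costs one factor of z_s, every
    collinear-collinear pair costs one factor of R_cc**beta, and all
    other angles and momentum fractions count as order 1.
    """
    n_soft = config.count("S")
    n_coll = config.count("C")
    n_cc_pairs = n_coll * (n_coll - 1) // 2
    return n_soft * p + n_cc_pairs * q

def leading_exponent(n, p, q):
    """Smallest exponent (largest contribution) among n-particle configurations."""
    configs = ("".join(c) for c in combinations_with_replacement("CS", n))
    return min(contribution_exponent(c, p, q) for c in configs)

for p, q in [(1, 2), (3, 1), (1.5, 1)]:
    a2 = leading_exponent(2, p, q)  # e_2 ~ lambda**a2
    a3 = leading_exponent(3, p, q)  # e_3 ~ lambda**a3
    # (e_2)^3 <~ e_3 <~ (e_2)^2 translates to 2*a2 <= a3 <= 3*a2
    print(p, q, a2, a3, 2 * a2 <= a3 <= 3 * a2)
```

Scanning over p and q in this way reproduces the statement that 1-prong jets lie between the two boundaries for any relative scaling of the soft and collinear modes.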

Table 2 (columns: modes, $e_3^{(\beta)}$): Scaling of the contributions to $e_3^{(\beta)}$ from global soft (S), collinear (C), and collinear-soft ($C_s$) radiation correlated with the two hard subjets (denoted by $C_1$ and $C_2$) in 2-prong jets, for the different possible configurations.
To understand the region of phase space for $e_3 \ll (e_2)^3$ we must consider the case in which the measurement of $e_2$ and $e_3$ resolves two subjets within the jet.
The setup for the power counting of 2-prong jets is illustrated in figure 1b. We consider a jet with two subjets, each of which carries an O(1) fraction of the jet $p_T$ and which are separated by an angle $R_{12} \ll 1$. Each of the subjets has collinear emissions at a characteristic angle $R_{cc} \ll R_{12}$. Because $R_{12} \ll 1$, there is in general global soft radiation at large angles with respect to the subjets with characteristic $p_T$ fraction $z_s \ll 1$. For color-singlet jets, like boosted Z bosons, this global soft radiation contribution comes purely from initial state radiation (ISR).¹⁰ Finally, there is radiation from the dipole formed from the two subjets (called "collinear-soft" radiation), with characteristic angle $R_{12}$ from the subjets, and with $p_T$ fraction $z_{cs}$. The effective theory of this phase space region for the observable N-jettiness [71] was studied in ref. [72].
We now consider the power counting of $e_2$ and $e_3$ for 2-prong jets. While $e_2$ is dominated by the hard splitting, for $e_3$ it is clear that the leading contributions must arise from correlations between the two hard subjets with either the global soft, collinear or collinear-soft modes. The scaling of these different contributions to $e_3^{(\beta)}$ is given in table 2, from which we find that the scaling of the two- and three-point energy correlation functions for 2-pronged jets is
\begin{equation}
e_2 \sim R_{12}^{\beta} , \qquad e_3 \sim R_{12}^{\beta} z_s + R_{12}^{2\beta} R_{cc}^{\beta} + R_{12}^{3\beta} z_{cs} .
\end{equation}
There is no measurement performed to distinguish the three contributions to $e_3^{(\beta)}$ and so we must assume that they all scale equally.
This result is sufficient to set the relative scaling of $e_2^{(\beta)}$ and $e_3^{(\beta)}$. As we assume that the jet only has two hard subjets, we have that $z_{cs} \ll 1$ and so
\begin{equation}
\frac{e_3}{(e_2)^3} \sim z_{cs} \ll 1 ,
\end{equation}
which defines the 2-prong jet region of phase space as that for which $e_3 \ll (e_2)^3$. With this identification, note the scaling of the various modes:
\begin{equation}
R_{12}^{\beta} \sim e_2 , \qquad R_{cc}^{\beta} \sim \frac{e_3}{(e_2)^2} , \qquad z_s \sim \frac{e_3}{e_2} , \qquad z_{cs} \sim \frac{e_3}{(e_2)^3} .
\end{equation}
While not important for our goals here, the fact that the energy correlation functions parametrically separate the scaling of the modes that contribute to the observables is vital for an effective theory analysis and calculability [70]. Note that because $e_2$ is first non-zero at a lower order in perturbation theory than $e_3$, $e_3$ can be zero while $e_2$ is non-zero. Therefore, this 2-prong region of phase space extends down to the kinematic limit of $e_3 = 0$, as shown in red in figure 2. This power counting analysis, although very simple in nature, provides a powerful picture of the phase space defined by the measurement of $e_2$ and $e_3$, which is shown in figure 2. The 1- and 2-prong jets are defined to populate the phase space regions where
\begin{equation}
\text{1-prong jet:} \quad (e_2)^3 \lesssim e_3 \lesssim (e_2)^2 , \qquad \text{2-prong jet:} \quad 0 < e_3 \ll (e_2)^3 .
\end{equation}
Background QCD jets dominantly populate the 1-prong region of phase space, while signal boosted Z decays dominantly populate the 2-prong region of phase space. This has important consequences for the optimal discrimination observable.
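The algebra that fixes the 2-prong mode scalings is simple enough to write down explicitly. The following sketch assumes that $e_2 \sim R_{12}^{\beta}$ and that the soft, collinear, and collinear-soft contributions to $e_3$ each scale like $e_3$ itself, as argued above; all order-1 coefficients are set to 1, and the function and key names are hypothetical.

```python
import math

def two_prong_mode_scalings(e2, e3):
    """Parametric scalings of the 2-prong modes in terms of measured e2, e3.

    Obtained by solving  e2 ~ R12^beta  and
      e3 ~ R12^beta * z_s ~ R12^(2 beta) * Rcc^beta ~ R12^(3 beta) * z_cs
    with all order-1 coefficients set to 1 (an assumption of the sketch).
    """
    return {
        "R12_beta": e2,            # opening angle of the hard splitting
        "Rcc_beta": e3 / e2 ** 2,  # angular size of collinear radiation
        "z_s": e3 / e2,            # pT fraction of global soft radiation
        "z_cs": e3 / e2 ** 3,      # pT fraction of collinear-soft radiation
    }

modes = two_prong_mode_scalings(e2=0.1, e3=1e-5)
print(modes)

# Each of the three contributions to e3 reproduces the input value
r12 = modes["R12_beta"]
for term in (r12 * modes["z_s"],
             r12 ** 2 * modes["Rcc_beta"],
             r12 ** 3 * modes["z_cs"]):
    print(math.isclose(term, 1e-5, rel_tol=1e-9))
```

For inputs deep in the 2-prong region, the output respects the assumed hierarchies $R_{cc} \ll R_{12}$ and $z_{cs} \ll 1$.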
An interesting observation about the boundary between 1- and 2-prong jets, defined by $e_3 \sim (e_2)^3$, is that it is approximately invariant to boosts along the jet direction. For a narrow jet, a boost along the jet direction by an amount γ scales transverse momenta and angles as
\begin{equation}
p_T \to \gamma\, p_T , \qquad R \to \gamma^{-1} R .
\end{equation}
Therefore, under a boost, $e_2^{(\beta)} \to \gamma^{-\beta} e_2^{(\beta)}$ and $e_3^{(\beta)} \to \gamma^{-3\beta} e_3^{(\beta)}$. Thus, the boundary between 1- and 2-prong jets, where $e_3 \sim (e_2)^3$, is invariant to boosts along the jet direction. That is, under boosts, a jet will move along a contour of constant $e_3/(e_2)^3$ in the ($e_2$, $e_3$) plane. The analysis presented in this section is also the initial step in establishing rigorous factorization theorems in the different regions of phase space, allowing for analytic resummation of the double differential cross section of $e_2$ and $e_3$ [45,70].
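The boost argument can be checked numerically with a toy model. The sketch below does not perform a true Lorentz boost; it simply applies the narrow-jet scaling rule described in the text (transverse momenta multiplied by γ, angles about the jet axis divided by γ) to a hypothetical jet whose axis sits at the origin of the rapidity-azimuth plane. All names and inputs are illustrative assumptions.

```python
import math
from itertools import combinations

def ecf(particles, n, beta):
    """Normalized n-point energy correlation function for (pt, y, phi) tuples."""
    pt_jet = sum(p[0] for p in particles)
    total = 0.0
    for combo in combinations(particles, n):
        term = math.prod(p[0] for p in combo)
        for a, b in combinations(combo, 2):
            term *= math.hypot(a[1] - b[1], a[2] - b[2]) ** beta
        total += term
    return total / pt_jet ** n

def boost_along_jet(particles, gamma):
    """Narrow-jet scaling rule: pT -> gamma * pT, angles -> angles / gamma.

    Assumes the jet axis is at (y, phi) = (0, 0); this is a toy stand-in
    for a boost, not a Lorentz transformation of four-vectors.
    """
    return [(gamma * pt, y / gamma, phi / gamma) for pt, y, phi in particles]

beta, gamma = 1.0, 2.0
jet = [(400.0, 0.0, 0.0), (300.0, 0.3, 0.1), (20.0, -0.2, 0.4)]
boosted = boost_along_jet(jet, gamma)

e2, e3 = ecf(jet, 2, beta), ecf(jet, 3, beta)
e2_b, e3_b = ecf(boosted, 2, beta), ecf(boosted, 3, beta)
print("e2 ratio:", e2_b / e2, "e3 ratio:", e3_b / e3)
print("e3/e2^3 before and after:", e3 / e2 ** 3, e3_b / e2_b ** 3)
```

Because the normalization by the jet $p_T$ cancels the momentum rescaling, only the angular factors change, which is why the combination $e_3/(e_2)^3$ is unchanged while $e_3/(e_2)^2$ is not.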

Optimal discrimination observables
The fact that the signal and background regions of phase space are parametrically separated implies that from power counting alone, we can determine the optimal observable for separating signal from background. Because the boundary between the background-rich and signal-rich regions is $e_3 \sim (e_2)^3$, this suggests that the optimal observable for discriminating boosted Z bosons from QCD jets is¹¹
\begin{equation}
D_2^{(\beta)} = \frac{e_3^{(\beta)}}{\left(e_2^{(\beta)}\right)^3} .
\end{equation}
Signal jets will be characterized by a small value of $D_2^{(\beta)}$, while background jets will predominantly have large $D_2^{(\beta)}$. This should be contrasted with the observable
\begin{equation}
C_2^{(\beta)} = \frac{e_3^{(\beta)}}{\left(e_2^{(\beta)}\right)^2} , \tag{3.10}
\end{equation}
whose contours of constant value pass through both the 1-prong and 2-prong regions of phase space, as shown in figure 3. Therefore, from the power counting perspective, we would expect that $C_2^{(\beta)}$ is a poor boosted Z boson discriminating observable. Nevertheless, ref. [53] found that with a tight jet mass cut, and in the absence of pile-up, $C_2^{(\beta)}$ is a powerful boosted Z discriminant. A mass cut constrains the phase space significantly, which we will discuss in detail in section 3.1.3, allowing us to understand the result of ref. [53]. Pile-up will be addressed in section 3.2.
¹¹ We thank Jesse Thaler for suggesting the notation "D" for these observables. Unlike $C_2^{(\beta)}$, whose name was motivated by its relation to the classic $e^+ e^-$ event shape parameter C, D is not related to the D parameter.
Table 3: Scaling of the contributions of 1-prong jets to $\tau_1^{(\beta)}$ and $\tau_2^{(\beta)}$.

It is important to recall that while $e_2^{(\beta)}$ and $e_3^{(\beta)}$ are IRC safe observables, so that their phase space can be analyzed with power counting techniques, ratios of IRC safe observables are not in general IRC safe [44,46,73]. The observables $C_2^{(\beta)}$ and $D_2^{(\beta)}$ are, however, Sudakov safe [44,46], and therefore can be reliably studied with Monte Carlo simulation without applying any form of additional cut, such as a jet mass cut, on the phase space.
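To make the two ratio observables concrete, the sketch below evaluates $D_2^{(\beta)} = e_3/(e_2)^3$ and $C_2^{(\beta)} = e_3/(e_2)^2$ on two hand-built toy jets: one with two hard prongs plus a soft emission, and one with a single hard core plus soft wide-angle radiation. The jets, function names, and the use of the scalar $p_T$ sum for normalization are assumptions chosen for illustration, not simulated events.

```python
import math
from itertools import combinations

def ecf(particles, n, beta=1.0):
    """Normalized n-point energy correlation function for (pt, y, phi) tuples."""
    pt_jet = sum(p[0] for p in particles)
    total = 0.0
    for combo in combinations(particles, n):
        term = math.prod(p[0] for p in combo)
        for a, b in combinations(combo, 2):
            term *= math.hypot(a[1] - b[1], a[2] - b[2]) ** beta
        total += term
    return total / pt_jet ** n

def d2(particles, beta=1.0):
    """D_2 = e_3 / (e_2)^3."""
    return ecf(particles, 3, beta) / ecf(particles, 2, beta) ** 3

def c2(particles, beta=1.0):
    """C_2 = e_3 / (e_2)^2."""
    return ecf(particles, 3, beta) / ecf(particles, 2, beta) ** 2

# Hypothetical toy jets (pt in GeV, then rapidity and azimuth)
two_prong = [(500.0, 0.0, 0.0), (500.0, 0.4, 0.0), (5.0, 0.2, 0.3)]
one_prong = [(1000.0, 0.0, 0.0), (5.0, 0.6, 0.5), (5.0, -0.5, 0.6)]

for name, jet in [("2-prong", two_prong), ("1-prong", one_prong)]:
    print(name, "D2 =", d2(jet), "C2 =", c2(jet))
```

In this toy example the 2-prong jet has $D_2$ well below 1 and the 1-prong jet has $D_2$ well above 1, in line with the parametric separation described in the text; the numerical values themselves carry no physical significance.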

Contrasting with N-subjettiness
At this point, it is interesting to apply the power counting analysis to other observables for boosted Z discrimination and see what conclusions can be made. For concreteness, we will contrast the energy correlation functions with the N-subjettiness observables $\tau_1^{(\beta)}$ and $\tau_2^{(\beta)}$. As with the energy correlation functions, we will consider $\tau_1^{(\beta)}$ and $\tau_2^{(\beta)}$ as measured on 1-prong and 2-prong jets and determine the regions of phase space that background and signal jets populate. This can then be used to determine the optimal observable for boosted Z discrimination from the N-subjettiness observables. We use the same notation for the scalings of the modes as in section 3.1.
Starting with 1-prong jets, and repeating the analysis of section 3.1, we find the dominant contributions to $\tau_1^{(\beta)}$ and $\tau_2^{(\beta)}$ listed in table 3. Both observables are dominated either by $z_s$ or by $R_{cc}^{\beta}$. In this configuration, the two subjet axes can either lie on the two collinear particles or one axis can be on a collinear particle and the other on a soft particle. Importantly, the measurement cannot distinguish these two possibilities and therefore cannot determine if the second axis in the 1-prong jet is at a small or large angle with respect to the first.¹² With either configuration,
\begin{equation}
\tau_1^{(\beta)} \sim R_{cc}^{\beta} + z_s , \qquad \tau_2^{(\beta)} \sim R_{cc}^{\beta} + z_s .
\end{equation}
That is, for 1-prong jets, $\tau_1^{(\beta)} \sim \tau_2^{(\beta)}$. For 2-prong jets, $\tau_1^{(\beta)}$ is dominated by the hard splitting, as was the case with the two-point energy correlation function, hence $\tau_1^{(\beta)} \sim R_{12}^{\beta}$. For $\tau_2^{(\beta)}$, the two axes lie along the two hard prongs, so, just like with the three-point energy correlation function,
\begin{equation}
\tau_2^{(\beta)} \sim R_{cc}^{\beta} + z_s + R_{12}^{\beta} z_{cs} .
\end{equation}
Demanding that the jet only has two hard prongs implies that $\tau_2^{(\beta)} \ll \tau_1^{(\beta)}$, but no other conclusions can be made from power counting alone. Unlike the well-defined division of phase space by the energy correlation functions, N-subjettiness has a much weaker division of
\begin{equation}
\text{1-prong jet:} \quad \tau_2^{(\beta)} \sim \tau_1^{(\beta)} , \qquad \text{2-prong jet:} \quad \tau_2^{(\beta)} \ll \tau_1^{(\beta)} .
\end{equation}
This does suggest, however, that the optimal discrimination variable using N-subjettiness is $\tau_{2,1}^{(\beta)} = \tau_2^{(\beta)}/\tau_1^{(\beta)}$, which is what is widely used experimentally. Nevertheless, the weaker phase space separation of N-subjettiness compared with that for the energy correlation functions would naïvely imply that $e_2$ and $e_3$ provide better discrimination than $\tau_1$ and $\tau_2$.
Table 4: Scaling of the contributions to $\tau_2^{(\beta)}$ from global soft (S), collinear (C), and collinear-soft ($C_s$) radiation correlated with the two hard subjets (denoted by $C_1$ and $C_2$) in 2-prong jets.
Figure 4: The ($e_2$, $e_3$) phase space in the presence of a mass cut. Contours of constant C …

Effect of a mass cut
In an experimental application of $D_2^{(\beta)}$ to boosted Z discrimination, a mass cut is performed on the jet around the mass of the Z boson. In addition to removing a large fraction of the background, this cut also guarantees that the identified jets are actually generated from boosted Z decays. Fully understanding the effect of the mass cut requires analyzing the three-dimensional phase space of the mass, $e_2$, and $e_3$. While complete, this full analysis would distract from the physics points that we wish to make in this section, and the impact of the mass cut can be understood without performing it. For β = 2, the two-point energy correlation function is simply related to the jet mass $m$ at fixed jet $p_T$:
$$e_2^{(2)} \simeq \frac{m^2}{p_T^2},$$
for central jets, assuming that $m \ll p_T$ and up to overall factors of order 1. Therefore, a cut on the jet mass is a cut on $e_2^{(2)}$. In this section, we will begin by discussing the simpler case of β = 2, and then proceed to comment on the effect of a mass cut for general β.
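As a rough numerical illustration of the β = 2 relation between $e_2$ and the jet mass, the sketch below compares the two quantities for a jet made of two massless particles. It assumes the standard definition $e_2^{(\beta)} = \sum_{i<j} p_{Ti} p_{Tj} R_{ij}^{\beta} / p_{TJ}^2$ and approximates the jet $p_T$ by the scalar sum of the constituent $p_T$; the function names and the kinematic values are hypothetical and chosen only for illustration.

```python
import math

def two_particle_mass_sq(pt1, pt2, dy, dphi):
    """Exact invariant mass squared of two massless particles."""
    return 2.0 * pt1 * pt2 * (math.cosh(dy) - math.cos(dphi))

def e2_beta2(pt1, pt2, dy, dphi):
    """Two-point energy correlation function with beta = 2 for two particles."""
    r_sq = dy ** 2 + dphi ** 2      # squared rapidity-azimuth separation
    pt_jet = pt1 + pt2              # assumption: scalar-sum approximation to the jet pT
    return pt1 * pt2 * r_sq / pt_jet ** 2

# Hypothetical kinematics (GeV and radians), not taken from the text.
pt1, pt2, dy, dphi = 300.0, 200.0, 0.2, 0.3
m_sq = two_particle_mass_sq(pt1, pt2, dy, dphi)
ratio = e2_beta2(pt1, pt2, dy, dphi) / (m_sq / (pt1 + pt2) ** 2)
print(f"e2 / (m^2 / pT^2) = {ratio:.4f}")
```

For small opening angles the ratio approaches 1, which is the sense in which a mass window at fixed $p_T$ acts as a window on $e_2^{(2)}$.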
The phase space in the $(e_2, e_3)$ plane with the jet mass constrained to a window, and for some finite range of jet $p_T$, is shown schematically in figure 4. Jets of a given mass can have that mass generated either by substantial soft radiation (for 1-prong jets) or by a hard splitting in the jet (a 2-prong jet), and so we want a discrimination observable that separates these two regions cleanly. The boundary between the 1-prong and 2-prong jet regions is still defined by $e_3^{(2)} \sim (e_2^{(2)})^3$, and so we expect $D_2^{(2)}$ to be the most powerful discriminant. However, by making a mass cut, the region of phase space at small masses, dominated by 1-prong jets, is removed. Therefore, the fact that contours of the observable $C_2^{(2)}$ mix both 1- and 2-prong jets is much less of an issue. Except at very high signal efficiencies, when one is sensitive to the functional form of the boundary between the signal and background regions, the discrimination performance of $C_2^{(2)}$ and $D_2^{(2)}$ should be comparable. However, $D_2^{(2)}$ has the advantage that its discrimination power does not suffer from significant dependence on the value of the lower mass cut.
12 For this reason, soft and collinear contributions to $\tau$ [...]
While a lower mass cut is important for removing 1-prong background jets, an upper mass cut is also necessary for powerful discrimination. The mass distribution of QCD jets has a long tail extending to masses of order the $p_T$ of the jet. For these jets, the mass is generated by an honest hard splitting, and so these background jets look exactly like the signal from their substructure. While the cross section for these high mass QCD jets is suppressed by $\alpha_s$, they can still be a significant background and therefore should be removed.
Let's now consider the general β case. We will first consider the effect of a mass cut in the 2-prong region of the $(e_2^{(\beta)}, e_3^{(\beta)})$ phase space [...] which defines the upper boundary of the 1-prong phase space. The phase space boundaries for jets on which two two-point energy correlation functions with different angular exponents (or recoil-free angularities [59]) are measured are discussed in detail in ref. [45]. Depending on whether β is less than or greater than 2, the mass cut manifests itself differently. For β < 2, note that from eqs. [...] can pass through the background region of phase space and significantly reduce the discrimination power. Again, because it respects the parametric scaling of the phase space boundaries, we expect the discrimination power of $D_2^{(\beta)}$ to be more robust as β decreases from 2. However, the precise discrimination power depends on understanding the O(1) region around the 1-prong and 2-prong jet boundary as β moves away from 2. This observation also explains why ref. [53] found that the optimal choice for boosted Z boson discrimination using $C_2^{(\beta)}$ with a tight mass cut was $\beta \simeq 2$.

Summary of power counting predictions
Here, we summarize the main predictions from our power counting analysis of boosted Z discrimination, before a Monte Carlo study in section 3.1.5. We have:
• The parametric scaling of the boundary between 1-prong and 2-prong jets in the $(e_2, e_3)$ phase space is $e_3 \sim (e_2)^3$. Therefore, $D_2^{(\beta)}$ should be a more powerful discrimination observable than $C_2^{(\beta)}$.
• When a mass cut is imposed on the jet, $C_2^{(\beta)}$ and $D_2^{(\beta)}$ should have comparable discrimination power for $\beta \simeq 2$.
• The discrimination power of $C_2^{(\beta)}$ should decrease substantially as β decreases from 2 when there is a mass cut on the jets. By contrast, the discrimination power of $D_2^{(\beta)}$ should be more robust as β decreases from 2.
• The power counting predictions stated above should be robust to Monte Carlo tuning and reproduced by any Monte Carlo simulation, e.g. Herwig++ or Pythia 8, since they are determined by parametric scaling of QCD dynamics.
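To make the first prediction concrete, the following minimal sketch evaluates $e_2$, $e_3$, $C_2 = e_3/(e_2)^2$ and $D_2 = e_3/(e_2)^3$ on two hand-built three-particle jets. It assumes the standard definitions of the energy correlation functions, normalizes by the scalar-sum $p_T$ of the constituents (an approximation to the jet $p_T$), and uses hypothetical toy configurations that are not taken from the Monte Carlo samples discussed here.

```python
import math
from itertools import combinations

def delta_r(a, b):
    """Rapidity-azimuth distance between particles given as (pt, y, phi)."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a[1] - b[1], dphi)

def energy_correlators(particles, beta):
    """Return (e2, e3), normalized by the scalar-sum pT (an assumption)."""
    pt_jet = sum(p[0] for p in particles)
    e2 = sum(a[0] * b[0] * delta_r(a, b) ** beta
             for a, b in combinations(particles, 2)) / pt_jet ** 2
    e3 = sum(a[0] * b[0] * c[0]
             * (delta_r(a, b) * delta_r(a, c) * delta_r(b, c)) ** beta
             for a, b, c in combinations(particles, 3)) / pt_jet ** 3
    return e2, e3

def c2_and_d2(particles, beta):
    """Ratio observables C2 = e3 / e2^2 and D2 = e3 / e2^3."""
    e2, e3 = energy_correlators(particles, beta)
    return e3 / e2 ** 2, e3 / e2 ** 3

# Toy jets (pt fraction, y, phi); purely illustrative configurations.
one_prong = [(0.90, 0.0, 0.0), (0.05, 0.02, 0.0), (0.05, 0.6, 0.0)]
two_prong = [(0.50, 0.0, 0.0), (0.45, 0.4, 0.0), (0.05, 0.02, 0.0)]

c2_one, d2_one = c2_and_d2(one_prong, beta=1.0)
c2_two, d2_two = c2_and_d2(two_prong, beta=1.0)
print(f"1-prong toy: C2 = {c2_one:.4f}, D2 = {d2_one:.4f}")
print(f"2-prong toy: C2 = {c2_two:.4f}, D2 = {d2_two:.4f}")
```

In these toy configurations the 2-prong jet has the smaller value of $D_2$, consistent with a boundary that scales as $e_3 \sim (e_2)^3$. A single pair of jets cannot, of course, say anything about discrimination power.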

Monte Carlo analysis
To test these predictions, we will study the different ratio observables formed from $e_2^{(\beta)}$ and $e_3^{(\beta)}$ in Monte Carlo simulation. We generated background QCD jets from pp → Zj events, with the Z decaying leptonically, and boosted Z decays from pp → ZZ events, with one Z decaying leptonically and the other to quarks. Events were generated with MadGraph5 2.1.2 [74] at the 8 TeV LHC, and showered with either Pythia 8.183 [75,76] or Herwig++ 2.6.3 [77][78][79][80], to test the robustness of our predictions to the details of the Monte Carlo generator. Anti-$k_T$ [60] jets with radius R = 1.0 and $p_T > 400$ GeV were clustered in FastJet 3.0.3 [66] using the Winner Take All (WTA) recombination scheme [46,59]. The energy correlation functions and N-subjettiness ratio observables were calculated using the EnergyCorrelator and Nsubjettiness FastJet contribs [66,67].
We first compare the discrimination power of $C_2^{(\beta)}$ and $D_2^{(\beta)}$ with no lower mass cut on the jets for several values of the angular exponent. We require that $m_J < 100$ GeV, which removes a significant fraction of QCD jets that have honest 2-prong structure. Therefore, we are testing the power of $C_2^{(\beta)}$ and $D_2^{(\beta)}$ to discriminate between 1-prong and 2-prong jets. In figure 6, we show the raw distributions of $C_2^{(\beta)}$ [...]. This is exactly as predicted by the power counting, because $C_2^{(\beta)}$ mixes the signal and background regions of phase space, an effect that is magnified at smaller β. The discrimination power is quantified in figure 8, where we show the signal vs. background efficiency curves (ROC curves) for the three choices of β for $C_2^{(\beta)}$ [...]
In the presence of a narrow mass cut window, the power counting analysis of section 3.1.3 predicted that for β near 2, the discrimination power of $C_2^{(\beta)}$ [...] at 90% signal efficiency, as a function of the lower mass cut on the jets. 13 When the lower mass cut is near zero, $D_2^{(1.7)}$ is significantly more efficient at rejecting QCD background than is $C_2^{(1.7)}$, as observed earlier. As the lower mass cut increases, however, the difference in discrimination power between the two observables decreases in both the Pythia 8 and Herwig++ Monte Carlos. This dependence on the lower mass cut shows that $D_2^{(\beta)}$ [...] does not. The light QCD jets that are added as the mass cut is lowered should be rejected by a variable that partitions the phase space into regions of 1-prong and 2-prong jets, increasing the observed rejection efficiency. This is true for $D_2^{(\beta)}$; however, exactly the opposite is true for $C_2^{(\beta)}$.
We now study in more detail the case in which we have constrained the jet mass to lie in the tight mass cut window of $80 < m_J < 100$ GeV. Over the whole signal efficiency range, $C_2^{(\beta)}$ [...] Monte Carlo generator. This should be contrasted with the actual numerical value of the QCD rejection, which depends on the generator. For $\beta \simeq 2$ with a tight mass cut window of $80 < m_J < 100$ GeV, any discriminating variable of the form $e$ [...] to a narrow window, and all discrimination power comes from $e_3^{(2)}$ alone. This demonstrates why ref. [53] observed near-optimal discrimination power using $C_2^{(\beta)}$ [...] should be much more robust than $C_2^{(\beta)}$ as β decreases from 2, with a mass cut on the jets. This behavior is reproduced in Monte Carlo, as shown in figure 10, where we have plotted the QCD rejection efficiency at 50% signal efficiency as a function of β. We have also included in these plots the N-subjettiness ratio $\tau_{2,1}^{(\beta)}$ for comparison. As β → 0, the discrimination power of both $C_2^{(\beta)}$ [...] Nevertheless, note that as β → 0, $\tau_{2,1}^{(\beta)}$, while not the optimal discrimination observable, has even more robust discrimination power than $D_2^{(\beta)}$ [...]
Thus, we see that all the power counting predictions are realized in both Monte Carlo simulations, demonstrating that parametric scalings are indeed determining the behavior of the substructure observables. We emphasize that the level of agreement between the Monte Carlo generators for the power counting predictions is quite remarkable, given that numerical values for rejection or acceptance efficiencies, for example, do not agree particularly well between the generators. Power counting has allowed us to identify the robust predictions of perturbative QCD.
[Figure caption fragment: ... performing slightly better at high signal efficiencies (middle), behavior which is reproduced by both Monte Carlo generators. The QCD rejection rate at 50% signal efficiency as a function of β is shown at bottom for $C$ ...]
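The working points quoted above (background rejection at a fixed signal efficiency) can be read off from per-jet values of an observable along the following lines. This is a minimal sketch, not the procedure used to produce the figures: it assumes that signal sits at small values of the observable, as for $D_2^{(\beta)}$, the function name is hypothetical, and the input lists are invented numbers.

```python
def background_efficiency(signal, background, signal_eff):
    """Fraction of background passing the cut x <= x_cut that keeps
    roughly `signal_eff` of the signal (signal assumed at small x)."""
    ordered = sorted(signal)
    n_keep = max(1, int(round(signal_eff * len(ordered))))
    x_cut = ordered[n_keep - 1]          # largest signal value still accepted
    passed = sum(1 for x in background if x <= x_cut)
    return passed / len(background)

# Invented observable values for illustration only.
signal_values = [0.1, 0.2, 0.3, 0.4]
background_values = [0.25, 0.5, 0.6, 0.7]

eff_b_half = background_efficiency(signal_values, background_values, 0.5)
eff_b_full = background_efficiency(signal_values, background_values, 1.0)
print(eff_b_half, eff_b_full)
```

Whether the rejection is then quoted as $1 - \epsilon_B$ or $1/\epsilon_B$ is a convention that this sketch leaves open. Scanning `signal_eff` from 0 to 1 traces out a ROC curve.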

Including pile-up
The power counting analysis of the previous section included only perturbative radiation. At a high luminosity hadron collider such as the LHC, the effect of multiple proton collisions per bunch crossing, referred to as pile-up, is also important. Pile-up radiation is uncorrelated with the hard scattering event, and as such, has an energy scale that is independent of the hard parton collision energy. Thus, pile-up can produce a significant amount of contaminating radiation in the event and substantially change jet $p_T$s, masses, or observables from their perturbative values. An important problem in jet substructure is both to define observables that are less sensitive to the effects of pile-up, and to remove or "groom", to the greatest extent possible, radiation in a jet or event that most likely is from pile-up. Several methods for jet grooming and pile-up subtraction have been presented [47,73,[81][82][83][84][85][86][87][88], and are used by the experiments [6,[16][17][18]21], but we will not consider them here. 14 Instead, we will demonstrate that power counting can be used to understand the effect of pile-up radiation on the $(e_2, e_3)$ phase space, and therefore on signal and background distributions for observables formed from the energy correlation functions. We envision that similar techniques could be used to develop jet substructure variables with improved resilience to pile-up, but in this paper we will restrict ourselves to an understanding of the behavior of $C_2^{(\beta)}$ and $D_2^{(\beta)}$.
To incorporate pile-up radiation into the power counting analysis, we must make some simplifying assumptions. Because pile-up is independent of the hard scattering event, we will assume that pile-up radiation is uniformly distributed over the jet area. 16 This assumption essentially defines pile-up as another soft mode in the jet, with all angles associated with pile-up scaling as O(1). We will denote the $p_T$ fraction of pile-up radiation in the jet as
$$z_{pu} \equiv \frac{p_{T\,pu}}{p_{TJ}}. \tag{3.23}$$
No assumption of the relative size of the perturbative soft radiation energy fraction $z_s$ with respect to $z_{pu}$ is made at this point, and indeed the impact of pile-up on the phase space will depend on this relation. Assuming only that the pile-up $p_T$ fraction satisfies $z_{pu} \ll 1$, the two- and three-point correlation functions for 1-prong jets have the scaling [...]
14 The effects of jet grooming techniques can be understood using power counting techniques, and have been considered in [52].
15 While the following analysis is quite general, it is restricted to recoil-free observables defined with a recoil-free jet algorithm, as used in this paper. In the case of a recoil-sensitive observable, there is a nonlinear response to pile-up due to the displacement of soft and collinear modes with respect to the jet axis. In this case the power counting analysis described here does not apply directly, and a more thorough analysis is required.
16 This model of pile-up would be removed by area subtraction [73]. However, this would also remove perturbative soft radiation depending on the region of phase space. This could be studied in detail using power counting.

For 2-prong jets, the correlation functions have the scaling [...] From these scalings, we will be able to understand how pile-up radiation impacts jets in different regions of phase space, and hence the distributions in $C_2^{(\beta)}$ and $D_2^{(\beta)}$. Note that $z_{pu}$ is a fixed quantity measuring the fraction of pile-up radiation in the jet, and unlike the scalings for the soft, collinear and collinear-soft modes, its scaling is constant throughout the phase space. To understand the impact of the pile-up radiation on different regions of phase space, we will therefore need to understand how the values of $e$ [...]
We begin the study of the phase space at small $e_2^{(\beta)}$. In the limit when $z_{pu} \gg z_s$, pile-up dominates the structure of the jet. In this limit, both 1-prong and 2-prong jets are forced into the region of phase space where $e_3^{(\beta)} \sim (e_2^{(\beta)})^2$. Note however that the scaling of the upper boundary of the phase space is robust. We must assume, as we will in what follows, that the value of $z_{pu}$ is such that this region does not extend far into the phase space, or else the energy correlation functions cannot be used to discriminate 1- and 2-prong jets, as their structure is completely dominated by pile-up radiation.
Moving to slightly larger values of $e$ [...] For $z_{pu} \sim z_s$, the addition of pile-up radiation therefore pushes all 1-prong jets towards the boundary $e_3^{(\beta)} \sim (e_2^{(\beta)})^2$. This behavior is illustrated in figure 11, and will imply a very different behavior for the distributions of $C_2^{(\beta)}$ [...]
We will now use this understanding of the effect of pile-up radiation on different regions of the $(e_2, e_3)$ phase space to understand its impact on the distributions for the observables $C_2^{(\beta)}$ [...] accumulates about this value, with a minimal change in the mean, but a significant decrease in the width of the distribution. This behavior is relatively distinct from that of most event shapes under pile-up, which tend to have a shift of the mean as pile-up is increased. The reason that this behavior is pronounced with $D_2^{(\beta)}$ [...]
We can also understand the behavior of the signal distribution under the addition of soft pile-up radiation. As was discussed, and is shown schematically in figure 11, signal jets at larger $e_2$ [...], the primary effect of the addition of pile-up radiation will be to shift the mean of the distribution to larger values, with a limited modification to its shape. Furthermore, due to the cubic contours for $D_2^{(\beta)}$, the shift of the mean of the distribution will be smaller for $D_2^{(\beta)}$ [...]
Note that a similar analysis can be straightforwardly applied to the N-subjettiness observables $\tau$ [...] shifts as [...] This corresponds to a vertical movement in the $\tau$ [...]
• [...] in the limit that uniform pile-up dominates.
• For background $C_2^{(\beta)}$ distributions the primary effect of the addition of pile-up radiation is a shift of the peak to larger values by an O(1) amount proportional to the pile-up. The distribution also becomes compressed, and accumulates around $C_2^{(\beta)} = 1/2$ in the limit that uniform pile-up dominates.
• For signal, the primary effect of the addition of pile-up radiation is to translate the mean of the distribution. The displacement of the mean is expected to be smaller for $D_2^{(\beta)}$ [...]
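The accumulation of $C_2^{(\beta)}$ near 1/2 can be illustrated with a toy jet: one hard core carrying a $p_T$ fraction $1 - z_{pu}$, plus many equal-$p_T$ soft particles spread uniformly over a disk of unit radius. The sketch below is only a caricature of uniform pile-up under these stated assumptions; the function name, the particle model and all numerical values are hypothetical.

```python
import math
import random
from itertools import combinations

def c2_with_uniform_pileup(z_pu, n_soft, beta, seed=1):
    """C2 = e3 / e2^2 for a hard core plus uniformly distributed soft particles.

    Particles are (pt fraction, x, y) in a flat plane; the pt fractions
    sum to 1, so no further normalization is needed.
    """
    rng = random.Random(seed)
    particles = [(1.0 - z_pu, 0.0, 0.0)]            # hard core on the jet axis
    for _ in range(n_soft):
        r = math.sqrt(rng.random())                 # uniform in area within R = 1
        phi = 2.0 * math.pi * rng.random()
        particles.append((z_pu / n_soft, r * math.cos(phi), r * math.sin(phi)))

    def dist(a, b):
        return math.hypot(a[1] - b[1], a[2] - b[2])

    e2 = sum(a[0] * b[0] * dist(a, b) ** beta
             for a, b in combinations(particles, 2))
    e3 = sum(a[0] * b[0] * c[0]
             * (dist(a, b) * dist(a, c) * dist(b, c)) ** beta
             for a, b, c in combinations(particles, 3))
    return e3 / e2 ** 2

c2_small_beta = c2_with_uniform_pileup(z_pu=0.05, n_soft=30, beta=0.05)
print(f"C2 at small beta: {c2_small_beta:.3f}")
```

At small β all angular factors are close to 1, so $e_2 \approx \sum_i z_i$ and $e_3 \approx \sum_{i<j} z_i z_j \approx (\sum_i z_i)^2/2$ for the soft particles, which is where the value 1/2 comes from in this toy. For larger β the angular factors enter as order-one numbers.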

Monte Carlo analysis
We now study these predictions in Monte Carlo using the Pythia 8 event samples described in section 3.1.5. 17 Pile-up was simulated by adding $N_{PV}$ minimum bias events at the 8 TeV LHC, generated with Pythia 8, to the pp → Zj and pp → ZZ samples. To demonstrate the resilience of the distributions to pile-up, we wish to add pile-up radiation to a set of jets with well-defined perturbative properties. To do this, we cluster jets with the WTA recombination scheme [46,59] and require that the mass of the jets in the absence of pile-up is $m_J < 100$ GeV. It was shown in ref. [46] that the jet axis found by the WTA recombination scheme is robust to pile-up and so, when pile-up is included, the perturbative content of the jets will be unaffected. This procedure, although clearly not related to an experimental analysis, provides a measure of the sensitivity of the distributions to soft pile-up radiation. It is similar to the procedure used in ref. [73] to assess the impact of pile-up and pile-up subtraction techniques on a variety of different jet shapes. Using this sample, we can assess the degree to which the power counting predictions of section 3.2 are realized in the Monte Carlo simulation.
We begin by considering the effect of pile-up on the background distributions. In figure 12, we plot background distributions for $C_2^{(\beta)}$ [...] is particularly pronounced at small β. At larger β, the $D_2^{(\beta)}$ distribution exhibits a jump at small amounts of pile-up, and then remains stable as pile-up increases. Unfortunately, we have not been able to understand this behavior completely from power counting. Nevertheless, the improved stability of the distributions of $D_2^{(\beta)}$ [...]
For a more quantitative study of the stability of the distributions to pile-up, we define [...] where $X$ [...] normalized by the width of the distribution, which is important since the observables $C_2^{(\beta)}$ [...] is not a shift of the mean, and so the change of the distribution is not accurately captured by the measure of eq. (3.39), the deviation of the mean is a commonly studied measure of an observable's susceptibility to pile-up. In figure 15 we plot $\delta(N_{PV})$ for the variables $C_2^{(\beta)}$ [...]
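A measure of this kind, the shift of the mean of an observable under pile-up in units of the width of its distribution, can be sketched as follows. The precise definition of eq. (3.39) is not reproduced here, so both the function name and the choice of normalizing by the width of the zero-pile-up distribution are assumptions made for illustration; the sample values are invented.

```python
import statistics

def normalized_mean_shift(x_no_pileup, x_with_pileup):
    """Shift of the mean under pile-up, in units of the no-pile-up width.

    Assumption: the width is taken as the population standard deviation of
    the distribution without pile-up; the paper's normalization may differ.
    """
    shift = statistics.fmean(x_with_pileup) - statistics.fmean(x_no_pileup)
    width = statistics.pstdev(x_no_pileup)
    return shift / width

# Invented observable values for illustration only.
x_npv0 = [1.0, 2.0, 3.0]
x_npv20 = [2.0, 3.0, 4.0]
delta = normalized_mean_shift(x_npv0, x_npv20)
print(f"normalized mean shift = {delta:.3f}")
```

By construction this number is insensitive to a distribution that narrows without moving its mean, which is the limitation noted above for the background.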

Power counting quark vs. gluon discrimination
Unlike the case of boosted Z bosons vs. massive QCD jets, applying a power counting analysis to quark vs. gluon jet discrimination demonstrates the limitations of the technique. Both quark and gluon jets dominantly have only a single hard core, and so the natural discrimination observables are the two-point energy correlation functions, $e_2^{(\beta)}$ [...] With power counting alone, this is as far as our analysis can go: $e_2^{(\beta)}$ does not parametrically separate quark and gluon jets from one another.
18 To next-to-leading logarithmic accuracy, $e_2^{(\beta)}$ are identical to the recoil-free angularities [59].

This result is not surprising, however, because there are no qualities of quark and gluon jets that are parametrically different. Indeed, [...] where $C_A$ and $C_F$ are the color factors for gluons and quarks, $N_C$ is the number of colors, and $n_f$ is the number of active fermions. Predicting the best observable for quark vs. gluon discrimination requires a detailed analysis of the effects of these order-1 parameters, which has been studied in several papers [43,53,89,90]. However, with the additional input of the form of the splitting functions for quarks and gluons, we can predict that the discrimination power of $e_2^{(\beta)}$ improves as β decreases, because smaller β emphasizes the collinear region of phase space over soft emissions. Collinear emissions are sensitive to the spin of the parton in addition to the total color of the jet, and thus are more distinct between quark and gluon jets. This prediction is borne out by explicit calculation to next-to-leading logarithmic accuracy [53]. The improved discrimination power from simultaneous measurement of the recoil-free angularities for two different powers of the angular exponent was calculated analytically in [43].
Nevertheless, this suggests that power counting does make a definite prediction of quark vs. gluon discrimination performance. Because all the physics of quark vs. gluon jet discrimination is controlled by order-1 numbers, the predicted discrimination should be sensitive to the tuning of order-1 numbers in a Monte Carlo. It has been observed that Pythia 8 and Herwig++ give wildly different predictions for quark vs. gluon discrimination power [19,23,43,53], and presumably the difference is dominated by the tuning of the Monte Carlos. However, isolating pure samples of quark and gluon jets is challenging experimentally [19,20,23,91], and most of the subtle differences between quarks and gluons only appear at an order formally beyond the accuracy of a Monte Carlo. Solving this issue will therefore require significant effort from experimentalists, Monte Carlo authors, and theorists to properly define quark and gluon jets, to identify the dominant physics, and to isolate pure samples for tuning.
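To illustrate how order-1 numbers alone can fix the discrimination power, the sketch below evaluates the SU($N_C$) color factors and then applies the leading-logarithmic "Casimir scaling" relation, in which the gluon efficiency of a cut is the quark efficiency raised to the power $C_A/C_F$. That relation is a standard leading-logarithmic approximation brought in here as an assumption; it is not derived in the text above, and the function names are hypothetical.

```python
from fractions import Fraction

def color_factors(n_c=3):
    """Quadratic Casimirs of SU(n_c): C_F (fundamental) and C_A (adjoint)."""
    c_f = Fraction(n_c ** 2 - 1, 2 * n_c)
    c_a = Fraction(n_c)
    return c_f, c_a

def gluon_efficiency_casimir(quark_eff, n_c=3):
    """Gluon efficiency at a given quark efficiency, assuming Casimir scaling
    of the cumulative distributions (leading-logarithmic approximation)."""
    c_f, c_a = color_factors(n_c)
    return quark_eff ** float(c_a / c_f)

c_f, c_a = color_factors()
print(f"C_F = {c_f}, C_A = {c_a}, C_A / C_F = {c_a / c_f}")
print(f"gluon efficiency at 50% quark efficiency: {gluon_efficiency_casimir(0.5):.3f}")
```

The exponent is a pure number of order 1, so nothing in this estimate is parametrically enhanced, in line with the argument above that the result is sensitive to how such numbers are modeled and tuned.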

Conclusions
In this paper we have demonstrated that power counting techniques can be a powerful guiding principle when constructing observables for jet substructure and for understanding their behavior. Since power counting captures the parametric physics of the underlying theory, its predictions should be robust to Monte Carlo tunings. Using the simple example of discriminating boosted Z bosons from QCD jets with the energy correlation functions, we showed that a power counting analysis identified $D_2^{(\beta)}$ [...] $(e_2, e_3)$ phase space dominated by 1- and 2-prong jets. The distinction between 1- and 2-prong jets is invariant to boosts along the jet direction.

To verify the power counting predictions, we performed a Monte Carlo analysis comparing $D_2^{(\beta)}$ with a previously proposed observable, $C_2^{(\beta)}$, also formed from the energy correlation functions. We showed that $D_2^{(\beta)}$ is a superior observable for discrimination because $C_2^{(\beta)}$ inextricably mixes signal-rich and background-rich regions of phase space. All power counting predictions were confirmed by both Herwig++ and Pythia 8, showing that the dominant behavior of the observables is governed by parametric scalings and not by O(1) numbers. This was contrasted with the case of quark vs. gluon discrimination, for which no parametric differences exist, leading to large discrepancies when simulating quark vs. gluon discrimination with different Monte Carlo generators.
We also demonstrated that power counting can be used to understand the impact of pile-up on different regions of the phase space, and hence on the distributions of discriminating variables. The distributions for $D_2^{(\beta)}$ [...]
We anticipate many directions to which the power counting approach could be applied. We have restricted ourselves in this paper to a study of observables formed from ratios of energy correlation functions with the same angular exponent. A natural generalization is to ratios of energy correlation functions with different angular exponents, where the optimal observable is given by $D_2^{(\alpha,\beta)} = e_3^{(\beta)} / (e_2^{(\alpha)})^{3\beta/\alpha}$. Such variables could be useful when considering pile-up in the presence of mass cuts, which are required experimentally.
In the presence of a mass cut, an angular exponent of e 2 near 2 provides a simple restriction on the phase space, while lowering the angular exponent of e 3 reduces the effect of soft wide angle radiation. Along these lines, the impact of grooming techniques on the phase space is also simple to understand by power counting [52], and could be used to motivate the design of variables with desirable behavior under grooming.
As another example of considerable interest, the power counting analysis can be extended to the study of top quark discrimination variables by considering the phase space for 1-, 2- and 3-prong jets defined by the two-, three- and four-point energy correlation functions. While a complete analytic calculation for this case is not feasible, a power counting analysis is, and can be used to predict discriminating observables with considerably improved performance compared to those originally proposed in [53]. In the case of a three-dimensional phase space, a cut on the jet mass only reduces the phase space to a two-dimensional subspace, so that the functional form of the observable remains important. This will be studied further in future work.
Our observation that boost-invariant combinations of the energy correlation functions are the most powerful discriminants can also be exploited for discrimination: we can use boost invariance as a guide for defining the best observables. Together with power counting, this gives a simple but powerful analytic handle to understand and design jet substructure observables.