Analytic Boosted Boson Discrimination

Observables which discriminate boosted topologies from massive QCD jets are of great importance for the success of the jet substructure program at the Large Hadron Collider. Such observables, while both widely and successfully used, have been studied almost exclusively with Monte Carlo simulations. In this paper we present the first all-orders factorization theorem for a two-prong discriminant based on a jet shape variable, $D_2$, valid for both signal and background jets. Our factorization theorem simultaneously describes the production of both collinear and soft subjets, and we introduce a novel zero-bin procedure to correctly describe the transition region between these limits. By proving an all orders factorization theorem, we enable a systematically improvable description, and allow for precision comparisons between data, Monte Carlo, and first principles QCD calculations for jet substructure observables. Using our factorization theorem, we present numerical results for the discrimination of a boosted $Z$ boson from massive QCD background jets. We compare our results with Monte Carlo predictions which allows for a detailed understanding of the extent to which these generators accurately describe the formation of two-prong QCD jets, and informs their usage in substructure analyses. Our calculation also provides considerable insight into the discrimination power and calculability of jet substructure observables in general.


JHEP05(2016)117
Contents The last several years has seen a surge of interest in the field of jet substructure [1][2][3][4], both as an essential tool for extending new physics searches at the Large Hadron Collider (LHC) into the TeV energy regime, and as an important playground for improving our understanding of high energy QCD, both perturbative and non-perturbative. Of particular phenomenological interest are substructure observables that are sensitive to hard subjets within a jet. In the highly boosted regime, the hadronic decay products of electroweakscale particles can become collimated and each appear as a jet in the detector. Unlike typical massive QCD jets, however, these boosted electroweak jets exhibit a multi-prong substructure that can be identified by the measurement of appropriate observables. Many such observables have been proposed and studied on LHC simulation or data  or used in new physics searches [28][29][30][31][32][33][34][35][36][37][38][39][40].
The vast majority of proposed jet substructure observables, however, have been analyzed exclusively within Monte Carlo simulation. While Monte Carlos play an essential role in the simulation of realistic hadron collision events, they can often obscure the underlying physics that governs the behavior of a particular observable. Additionally, it is challenging to disentangle perturbative physics from the tuning of non-perturbative physics so as to understand how to systematically improve the accuracy of the Monte Carlo. Recently, there has been an increasing number of analytical studies of jet substructure observables, including the calculation of the signal distribution for N -subjettiness to next-to-next-tonext-to-leading-log order [41], a fixed-order prediction for planar flow [42], calculations of groomed jet masses [43][44][45][46] and the jet profile/ shape [47][48][49][50][51][52][53] for both signal and background jets, an analytic understanding of jet charge [54,55], predictions for fractional jet multiplicity [56], and calculations of the associated subjet rate [57]. Especially in the case of the groomed jet observables, analytic predictions informed the construction of more performant and easier to calculate observables. With the recent start of Run 2 of the LHC, where the phase space for high energy jets only grows, it will be increasingly important to have analytical calculations to guide experimental understanding of jet dynamics.
It is well known that the measurement of observables on a jet can introduce ratios of hierarchical scales appearing in logarithms at every order in the perturbative expansion. Accurate predictions over all of phase space require resummation of these large logarithms to all orders in perturbation theory. While this resummation is well understood for simple observables such as the jet mass [58][59][60][61][62], where it has been performed to high accuracy, a similar level of analytic understanding has not yet been achieved for more complicated jet substructure observables. Jet substructure observables are typically sensitive to a multitude of scales, corresponding to characteristic features of the jet, resulting in a much more subtle procedure for resummation.
A ubiquitous feature of some of the most powerful observables used for identification of jet substructure is that they are formed from the ratio of infrared and collinear (IRC) safe observables. Examples of such observables include ratios of N -subjettiness variables [63,64], ratios of energy correlation functions [65][66][67], or planar flow [68]. In general, ratios of IRC safe observables are not themselves IRC safe [69] and cannot be calculated -1 -

JHEP05(2016)117
to any fixed order in perturbative QCD. Nevertheless, it has been shown that these ratio observables are calculable in resummed perturbation theory and are therefore referred to as Sudakov safe [70][71][72][73]. Distributions of Sudakov safe observables can be calculated by appropriately marginalizing resummed multi-differential cross sections of IRC safe observables. An understanding of the factorization properties of multi-differential jet cross sections has been presented in refs. [74][75][76] by identifying distinct factorization theorems in parametrically separated phase space regions defined by the measurements performed on the jet. Combining this understanding of multi-differential factorization with the required effective field theories, all ingredients are now available for analytic resummation and systematically improvable predictions.
As an explicit example, observables that resolve two-prong substructure are sensitive to both the scales characterizing the subjets as well as to the scales characterizing the full jet. A study of the resummation necessary for describing jets with a two-prong substructure was initiated in ref. [77] which considered the region of phase space with two collinear subjets of comparable energy, and introduced an effective field theory description capturing all relevant scales of the problem. Recently, an effective field theory description for the region of two-prong jet phase space with a hard core and a soft, wide angle subjet was developed in ref. [76], where it was applied to the resummation of non-global logarithms [78]. Combined, the collinear subjet and soft subjet factorization theorems allow for a complete description of the dominant dynamics of jets with two-prong substructure.
In this paper we will study the factorization and resummation of the jet substructure observable D 2 [66], a ratio-type observable formed from the energy correlation functions. We will give a detailed effective theory analysis using the language of soft-collinear effective theory (SCET) [79][80][81][82] in all regions of phase space required for the description of a one or two-prong jet, and will prove all-orders leading-power factorization theorems in each region. We will then use these factorization theorems to calculate the D 2 distribution for jets initiated by boosted hadronic decays of electroweak bosons or from light QCD partons and compare to Monte Carlo simulation. These calculations will also allow us to make first-principles predictions for the efficiency of the observable D 2 to discriminate boosted electroweak signal jets from QCD background jets.
Our factorized description is valid to all orders in α s , expressing the cross section as a product of field theoretic matrix elements, each of which is calculable order by order in perturbation theory, allowing for a systematically improvable description of the D 2 observable. Furthermore, the factorization theorem enables a clean separation of perturbative and non-perturbative physics, allowing for non-perturbative contributions to the observable to be included in the analytic calculation through the use of shape functions [83,84]. In this paper we work to next-to-leading logarithmic (NLL) accuracy to demonstrate all aspects of the required factorization theorems necessary for precision jet substructure predictions. We will see that even at this first non-trivial order, we gain insight into qualitative and quantitative features of the D 2 distribution. While we will give an extensive discussion of our numerical results and comparisons with a variety of Monte Carlo programs in this paper, in figure 1 we compare our analytic predictions for the D 2 observable, including non-perturbative effects, for hadronically-decaying boosted Z bosons and QCD jets in e + e − collisions with the distributions predicted by the Vincia [85][86][87][88][89][90] Monte Carlo program at hadron level. Excellent agreement between analytic and Monte Carlo predictions is observed, demonstrating a quantitative understanding of boosted jet observables from first principles.

Overview of the paper
While there exists a large number of two-prong discriminants in the jet substructure literature, any of which would be interesting to understand analytically, we will use calculability and factorizability as guides for constructing the observable to study in this paper. This procedure will ultimately lead us to the observable D 2 and will demonstrate that D 2 has particularly nice factorization and calculability properties. This approach will proceed in the following steps: 1. Identify the relevant subjet configurations for the description of a two-prong discriminant.
2. Isolate each of these relevant regions by the measurement of a collection of IRC safe observables.
3. Study the phase space defined by this collection of IRC safe observables, and prove all-orders factorization theorems in each parametrically-defined region of phase space. 4. Identify a two-prong discriminant formed from the collection of IRC safe observables which respects the parametric factorization theorems of the phase space. A detailed analysis of each of these steps will be the subject of this paper. Here, we provide a brief summary so that the logic of the approach is clear, and so that the reader can skip technical details in the different sections without missing the general idea of the approach.
The complete description of an observable capable of discriminating one-from twoprong substructure requires the factorized description of the following three relevant subjet configurations, shown schematically in figure 2: • Soft haze: figure 2a shows a jet in what we will refer to as the soft haze region of phase space. In the soft haze region there is no resolved subjet, only a single hard core with soft wide angle emissions. This region of phase space typically contains emissions beyond the strongly ordered limit, but is the dominant background region for QCD jets, for which a hard splitting is α s suppressed.
• Collinear subjets: figure 2b shows a jet with two hard, collinear subjets. Both subjets carry approximately half of the total energy of the jet, and have a small opening angle. This region of phase space, and its corresponding effective field theory description, has been studied in ref. [77].
• Soft subjet: figure 2c shows the soft subjet region of phase space which consists of jets with two subjets with hierarchical energies separated by an angle comparable to the jet radius R. The soft subjet probes the boundary of the jet and we take R ∼ 1. An effective field theory description for this region of phase space was presented in ref. [76].
As a basis of IRC safe observables for isolating these three subjet configurations, we use the energy correlation functions [65], which we define in section 2.1. In particular, we will show that the measurement of three energy correlation functions, two 2-point energy  3 , allows for parametric separation of the different subjet configurations. While we will focus on the particular case of observables formed from the energy correlation functions, we believe that this approach is more general and could be applied to other IRC safe observable bases.
With the energy correlation functions as our basis, we study the multi-differential phase space defined by the simultaneous measurement of these observables on a jet in section 2. Using the power counting technique of refs. [66,67], we show that the angular exponents of the energy correlation functions, α and β, can be chosen such that the different subjet configurations occupy parametrically separated regions of this phase space, and extend to all boundaries of the phase space. This parametric separation allows for each region to be separately described by its own effective field theory. The required effective field theories are described in section 3, and are formulated in the language of SCET. The formulation in SCET allows us to prove all-orders factorization theorems valid at leadingpower in each of the phase space regions, and to resum logarithms to arbitrary accuracy using renormalization group techniques.
Having understood in detail both the structure of the phase space defined by the IRC safe measurements as well as the factorization theorems defined in each region, we will show in section 4.1 that this leads unambiguously to the definition of a two-prong discriminant observable which is amenable to factorization. This observable will be a generalized form of D 2 [66] which will depend on both angular exponents α and β. Calculating the distribution of D 2 is accomplished by appropriate marginalization of the multi-differential cross section. Depending on the phase space cuts that have been made, D 2 may or may not be IRC safe itself, and so the marginalization will in general only be defined within resummed perturbation theory.
The outline of this paper is as follows. In section 2 we define the energy correlation functions used in this paper and describe how the different subjet configurations shown schematically in figure 2 can be isolated by demanding parametric relations between the measured values of these observables. In section 3 we discuss the effective field theory descriptions in the different phase space regions, and present the factorization theorems that describe their dynamics. Although some of the relevant effective field theories have been presented elsewhere, we attempt to keep the discussion self-contained by providing a brief review of their most salient features. All field theoretic definitions of the functions appearing in the factorization theorems, as well as their calculations to one-loop accuracy, are provided in appendices.
In section 4 we show how the detailed understanding of the multi-differential phase space leads to the definition of D 2 as a powerful one-versus two-prong jet discriminant. In section 4.2 we emphasize that without a mass cut, D 2 is not IRC safe but is Sudakov safe and whose all-orders distribution exhibits paradoxical dependence on α s . In section 4.3 we study the fixed-order distribution of D 2 in the presence of a mass cut to understand its behavior in singular limits. In section 4.4 we discuss how the different effective field theories can be consistently merged to give a factorized description of the D 2 observable, and introduce a novel zero-bin procedure to implement this merging.

JHEP05(2016)117
In section 5 we present numerical results for both signal and background distributions for D 2 as measured in e + e − collisions and compare our analytic calculation with several Monte Carlo generators. We emphasize many features of the calculation which provide considerable insight into two-prong discrimination, and the ability of current Monte Carlo generators to accurately describe substructure observables. In section 6 we discuss numerical results for the D 2 observable at e + e − collisions at the Z pole at LEP, and demonstrate that being sensitive to correlations between three emissions, the D 2 observable can be used as a more differential probe of the perturbative shower for tuning Monte Carlo generators. In section 7 we discuss how to extend our calculations to pp collisions at the LHC. We conclude in section 8, and discuss future directions for further improving the analytic understanding of jet substructure.

Characterizing a two-prong jet
In this section, we develop the framework necessary to construct the all-orders factorization theorems for analytic two-prong discrimination predictions. We begin in section 2.1 by defining the energy correlation functions, which we will use to isolate the three subjet configurations discussed in the introduction. Using the power counting analysis of refs. [66,67], we study the phase space defined by measuring the energy correlation functions in section 2.2. Throughout this paper, our proxy for a two-prong jet will be a boosted, hadronically decaying Z boson, but our analysis holds for W or H bosons, as well.

Observable definitions
To distinguish the three different subjet configurations of figure 2 with IRC safe measurements, observables which are sensitive to both one-and two-prong structure are required. Although many possible observable bases exist, in this paper we will use the energy correlation functions [65,66], as we will find that they provide a convenient basis.
The n-point energy correlation function is an IRC safe observable that is sensitive to n-prong structure in a jet. For studying the two-prong structure of a jet, we will need the 2-and 3-point energy correlation functions, which we define for e + e − collisions as [65] Here J denotes the jet, E i and p i are the energy and four momentum of particle i in the jet and α is an angular exponent that is required to be greater than 0 for IRC safety. The

JHEP05(2016)117
4-point and higher energy correlation functions are defined as the natural generalizations of eq. (2.1), although we will not use them in this paper. While we will mostly focus on the case of an e + e − collider, the energy correlation functions have natural generalizations to hadron colliders, by replacing E by p T and using hadron collider coordinates, η and φ. This definition is given explicitly in eq. (7.1). At central rapidity, this modification does not change the behavior of the observables, or any of the conclusions presented in the next sections. Of course, the hadron collider environment has other effects not present in an e + e − collider, like initial state radiation or underlying event, that will affect the energy correlation functions. A brief discussion of the behavior of the energy correlation functions in pp colliders will be given in section 7. Numerical implementations of the energy correlation functions for both e + e − and hadron colliders are available in the EnergyCorrelator FastJet contrib [93,94].

Power counting the
With a basis of IRC safe observables identified, we now demonstrate that the measurement of multiple energy correlation functions parametrically separates the three different subjet configurations identified in figure 2. In particular, the simultaneous measurement of e 3 is sufficient for this purpose, and we will study in detail the phase space defined by their measurement. From this analysis, we will be able to determine for which values of the angular exponents α and β the three subjet configurations are parametrically separated within this phase space. The power counting parameters that define "parametric" will be set by the observables themselves, as is typical in effective field theory.
We begin by considering how the energy correlation functions can be used to separate one-and two-prong jets. This has been previously discussed in ref. [66] by measuring e with α and β in general different. A minimal constraint on the angular exponents, both for calculability and discrimination power, is that the soft haze and collinear subjets configurations are parametrically separated by the measurements. A power counting analysis of the soft subjet region yields no new constraints beyond those from the soft haze or collinear subjets.
The setup for the power counting analysis of the soft haze and collinear subjets configurations is shown in figure 3, where all relevant modes are indicated. The one-prong jet illustrated in figure 3a is described by soft modes with energy fraction z s emitted at O(1) angles, and collinear modes with characteristic angular size θ cc with O(1) energy fraction. The collinear subjets configuration illustrated in figure 3b consists of two subjets, each of which carry an O(1) fraction of the jet's energy and are separated by an angle θ 12 1. Each of the subjets has collinear emissions at a characteristic angle θ cc θ 12 , and global soft radiation at large angles with respect to the subjets, with characteristic energy fraction z s 1. In the case of two collinear subjets arising from the decay of a color singlet particle, the long wavelength global soft radiation is not present due to color coherence, but the power counting arguments of this section remain otherwise unchanged. Finally, there is radiation from the dipole formed from the two subjets (called "collinear-soft" radiation), with characteristic angle θ 12 from the subjets, and with energy fraction z cs . The  Figure 3. a) Schematic of a one-prong soft haze jet, dominated by collinear (blue) and soft (green) radiation. The angular size of the collinear radiation is θ cc and the energy fraction of the soft radiation is z s . b) Schematic of a jet resolved into two collinear subjets, dominated by collinear (blue), soft (green), and collinear-soft (orange) radiation emitted from the dipole formed by the two subjets. The subjets are separated by an angle θ 12 and the energy fraction of the collinear-soft radiation is z cs . effective theory of this phase space region for the observable N -jettiness [95] was studied in ref. [77]. 2 We are now able to determine the parametric form of the dominant contributions to the observables e From these parametrics, it is straightforward to show that one-prong jets live in a region of the e (β) 2 , e (α) 3 phase space bounded from above and below, whose precise scaling depends on the relative size of the angular exponents α and β. The scaling of upper and lower boundaries of the one-prong region of phase space for all α and β are listed in table 1. For α = β, as studied in ref. [66], one-prong jets live in the region defined by e 2 It is of historical interest to note that the generalization of two-prong event shapes, such as thrust, to event shapes for characterizing three jet structure was considered early on, for example with the introduction of the triplicity event shape [96]. However, it was not until more recently, with the growth of the jet substructure field at the LHC, that significant theoretical study was given to such observables. 3 It is important to understand that this relationship is valid to an arbitrary number of emissions. When performing the power counting, a summation over all the particles with soft and collinear scalings in the jet must be considered. However, to determine the scalings of the observable, it is sufficient to consider the scaling of the different types of individual terms in the sum. For example, the three terms contributing to the expression for e (α) 3 arise from correlations between subsets of three collinear particles, one collinear particle and two soft particles, and two collinear particles and a soft particle, respectively. Contributions from other combinations of particles are power suppressed. Because of this simplification, in this paper we will never write explicit summations when discussing the scaling of observables. . For optimal discrimination, the one-and two-prong regions of this phase space should not overlap. Since they are physically distinct, a proper division of the phase space will allow distinct factorizations, simplifying calculations. Comparing the boundaries of the one-prong region listed in table 1 with the upper boundary of the two-prong region from eq. (2.4), we find that the one-and two-prong jets do not overlap with the following restriction on the angular exponents α and β: (2.5) Note that when α = β this is satisfied, consistent with the analysis of ref. [66]. Because these power counting arguments rely exclusively on the parametric behavior of QCD in the soft and collinear limits, they must be reproduced by any Monte Carlo simulation, regardless of its shower and hadronization models. To illustrate the robust boundary between the one-and two-prong regions of phase space predicted in eq. (2.5), in figure 4, we plot the distribution in the e plane of jets initiated by light QCD partons and those from boosted hadronic decays of Z bosons as generated in e + e − collisions in Pythia [97,98]. Details of the Monte Carlo generation are presented in section 5. QCD jets are dominantly one-pronged, while jets from Z decays are dominantly two-pronged. We have chosen to use angular exponents α = β = 1 for this plot, as the small value 3 plane, for QCD quark jets (left) and boosted Z → qq jets (right). The parametric scalings predicted by the power counting analysis are shown as dashed lines, and the one-and two-prong regions of phase space are labelled, and extend between the parametric boundaries. Note the upper boundary is constrained to have a maximal value of of the angular exponent allows the structure of the phase space to be seen in a nonlogarithmic binning. The predicted behavior persists for all values of α and β consistent with eq. (2.5), while the choice made here is simply for illustrative aesthetics. On these plots, we have added dashed lines corresponding to the predicted one-and two-prong phase space boundaries to guide the eye. The one-prong QCD jets and the two-prong boosted Z jets indeed dominantly live in their respective phase space regions as predicted by power counting.
The measurement of e alone is sufficient to separate one-and two prong jets. However, the two-prong jets can exhibit either collinear subjets or a soft, wide angle subjet. To separate the collinear and soft subjet two-prong jets, we make an additional IRC safe measurement on the full jet. Following ref. [76], in addition to e For α = β and e (β) 2 1, these two regions are parametrically separated. Equivalently, in the two-prong region of phase space the measurement of both e (α) 2 and e (β) 2 can be used to give IRC safe definitions to the subjet energy fraction and splitting angle, allowing the soft subjet and collinear subjets to be distinguished. In figure 5 we summarize and illustrate the measurements that we make on the jet and the parametric relations between the measured values of the energy correlation functions that define the three phase space regions. The phase space plots of figures 5b and 5c were also presented in ref. [76].

Jet mass cuts
In addition to discriminating QCD jets from boosted Z bosons by their number of resolved prongs, we must also impose a mass cut on the jet to ensure that the jet is compatible with a Z decay. To include a mass cut in our analysis, for general angular exponents α and β, we would need to measure four observables on the jet: e where m J is the mass of the jet. Therefore, choosing β = 2 we can trivially impose a mass cut within the framework developed here. Throughout the rest of this paper, we will set β = 2 for this reason. Importantly, from Monte Carlo studies it has been shown that -11 -

JHEP05(2016)117
β ∼ 2 provides optimal discrimination power [65,66], so this restriction does not limit the phenomenological relevance of our results. Substituting the value β = 2 into the power counting condition of eq. (2.5), we find that the one-and two-prong regions of phase space are separated if To achieve a parametric separation of the one-and two-prong regions of phase space, we will demand that the scalings defining the different regions be separated by at least a single power of e , which are parametrically different. We therefore restrict ourselves to the range of angular exponents We expect that for α < 2 our effective field theory description will begin to break down, while as α is increased above 2 it should improve.

Factorization and effective field theory analysis
In each region of phase space identified in section 2, hierarchies of scales associated with the particular kinematic configuration of the jet appear. These include the soft subjet energy fraction z sj in the soft subjet region of phase space, or the splitting angle θ 12 of the collinear subjets. Logarithms of these scales appear at each order in perturbation theory, and need to be resummed to all orders to achieve reliable predictions. To perform this resummation, we will prove factorization theorems in each region of phase space by developing an effective field theory description which captures all the scales relevant to that particular region of phase space. These effective field theories are formulated in the language of SCET [79][80][81][82], but include additional modes which are required to describe the dynamics of the scales associated with the jet's particular substructure. Resummation is then achieved by renormalization group evolution within the effective theory. In this section we discuss each of the effective theories required for a description of the e phase space. For each region of the phase space, we present an analysis of the modes required in the effective field theory description and present the factorization theorem. We also provide a brief discussion of the physics described by each of the functions appearing in the factorization theorem. Field theoretic operator definitions of the functions, as well as their calculation to one-loop accuracy, are presented in appendices.

QCD background
Three distinct factorization theorems are required to describe the full phase space for massive QCD jets, corresponding to the soft haze, collinear subjets, and soft subjet configurations. Detailed expositions of the factorization theorems for the collinear subjets and soft subjet configurations have been presented in refs. [76,77], but here we review the important features of the factorization theorems to keep the discussion self-contained.

JHEP05(2016)117
Throughout this section, all jets are defined using the e + e − anti-k T clustering metric [93,99] with the Winner-Take-All (WTA) recombination scheme [72,100]. To focus on the aspects of the factorization relevant to the jet substructure, we will present the factorization theorems for the specific case of e + e − → qq. The factorization theorem for gluon initiated jets is identical to the quark case, and can be performed using the ingredients in the appendices. The extension to the production of additional jets or pp colliders will be discussed in section 7.

Collinear subjets
An effective field theory describing the collinear subjets configuration was first presented in ref. [77] and is referred to as SCET + . We refer the interested reader to ref. [77] for a more detailed discussion, as well as a formal construction of the effective theory. To our knowledge, our calculation is the first, other than that of ref. [77], to use this effective theory.

Mode structure
The modes of SCET + are global soft modes, two collinear sectors describing the radiation in each of the collinear subjets, and collinear-soft modes from the dipole of the subjet splitting. These are shown schematically in figure 6. The additional collinear-soft modes, as compared with traditional SCET, are necessary to resum logarithms associated with the subjets' splitting angle. This angle, which is taken to be small, is not resolved by the long wavelength global soft modes.
The parametric scalings of the observables in the collinear subjets region were given in section 2.2 and are: Although the measurement of two 2-point energy correlation functions is required to be able to distinguish the soft and collinear subjets, they are redundant in the collinear subjets region from a power counting perspective, due to the relation e Operator definitions, and one-loop calculations for the operators appearing in the factorization theorem of eqs. (3.17) and (3.18) are given in appendix C.

Soft haze
The soft haze region defines the upper boundary of the e (β) 2 , e (α) 3 phase space. In this region of phase space jets consist of a single hard core, with no resolved subjets. A factorization theorem describing this region of phase space has not been presented elsewhere, but can be straightforwardly formulated in standard SCET involving only n andn collinear sectors.
As discussed in section 2.2, the power counting in the soft haze region depends sensitively on the relative values of α and β, and therefore so does the structure of the factorization theorem. Since, from eq. (2.10), we restrict ourself to α ≥ β, we will for simplicity only discuss the factorization theorems valid in this case. Factorization theorems for other values of α and β can be determined by performing a similar analysis.

Mode structure
In the soft haze region the observables have the power counting e (α) It is also interesting to consider the case α = β because in the soft haze region it is not necessary to measure two different 2-point energy correlation functions, unlike in the two-prong region of phase space. In the case that α = β, we have instead, where the second term in the expression for e (α) 3 is no longer power suppressed. This will modify the factorization theorem between the two cases.
In both cases, the scaling of the modes is then given by with β = α in the second case. Here E J is the energy of the jet and the subscripts denote the light-like directions with respect to which the momenta is decomposed. This scaling should be recognized as the usual power counting of the collinear and soft modes for the angularities with angular exponent β [74,110].

Factorization theorem
The factorization theorem in the soft haze region of phase space can now be straightforwardly read off from the power counting expressions of the previous sections. We state it both for the case α = β and α > β. For α > β, we have where we have suppressed the convolution over the out-of-jet measurement B, to focus on the structure of the in-jet measurements. For α = β, the factorization theorem takes an interesting form 5 S nn e s 2 , e s 2 , e s 3 , R, B , 5 When calculating the tail of the D2 distribution, one might be tempted to marginalize over e     where again the convolution over B has been suppressed. A brief description of the functions appearing in the factorization theorems is as follows: • H nn Q 2 is the hard function describing the underlying short distance process. In this case we consider e + e − → qq.
• Jn(B) is the jet function describing the collinear modes for the recoiling jet.
• J n (e c 2 ) is the jet function describing the collinear modes for the jet in the n direction.
• S nn e s 2 , e s 2 , e s 3 , R, B and S nn e s 2 , e 2 , e (α) 3 , R, B are soft functions describing the global soft radiation from the nn dipole. In the case of α = β, an additional energy correlation, e s 2 , is measured on the soft radiation, with an angular factor of 2α, which multiplies against the collinear contribution to e (α) 2 when contributing to e (α) 3 . These also carry the jet algorithm constraints denoted by R, and any out-of-jet measurements B.
These functions, and a schematic depiction of the radiation which they define are indicated in figure 8. In appendix F, we give operator definitions of these functions and the leadingpower expression for the e (α) 3 measurement operator in the soft function. There are several interesting features about the factorization theorems of eqs. (3.29) and (3.30). First, the soft functions are multi-differential, in that they require the simultaneous measurement of multiple quantities. Such multi-differential jet and soft functions have been discussed in detail in ref. [74,75]. One other interesting feature of the factorization theorem of eq. (3.30), for the case of equal angular exponents, is the appearance of 3 . This product structure follows from the power counting of eq. (3.25) which describes the properties of the 3-point energy correlation function in the soft and collinear limits. It is important to note that this product form does not violate soft-collinear factorization, since only the knowledge of the total e (α) 2 of the soft or collinear sector is required. The soft contribution to the 3-point energy correlation is first non-vanishing with two real emissions. Therefore at one-loop, the factorization theorem of eq. (3.29) reduces exactly to the factorization theorem for the multi-differential angularities studied in refs. [74,75], whereas the factorization theorem of eq. (3.30) reduces to the factorization theorem for a single angularity. In this paper, we will not perform the two-loop calculation necessary to obtain a non-trivial contribution to the three point energy correlation function. Instead, we will obtain an approximation to the cross section in this region by taking a limit of our factorization theorems in the two-prong region of phase space. This is possible, because as we will show in section 4.3 by studying the fixed order distributions for the observable D 2 , there is no fixed order singularity in the soft haze region of phase space in the presence of a mass cut. This implies that the resummation is not needed to regulate a fixed order singularity. This will be discussed in section 4.4.2. The field theoretic definitions of the functions appearing in the factorization theorem of eq. (3.29) as well as power expansions of the measurement operators are collected in appendix F. However, because of the fact that we do not explicitly use the results of the soft haze factorization theorem in our calculation, we simply refer the reader to refs. [74,75] for the calculations of the one-loop functions relevant to the factorization theorems of eqs. (3.29) and (3.30), and leave for future work the full two-loop calculation.

Refactorization of the global soft function
In each of the factorization theorems required for the description of QCD background jets, namely the collinear subjets, soft subjet, and soft haze factorization theorems, there is a global soft function, which is sensitive to both the in-jet measurement of the energy correlation functions, as well as the out-of-jet measurement B. To ensure that all large logarithms are resummed by the renormalization group evolution, we must perform a refactorization of the soft function [60,62,[110][111][112]. This ensures that the only logarithms which appear in a given soft function that are sensitive to both in-jet and out-of-jet scales are true non-global logarithms (NGLs) [78], which first appear at two-loop order in the calculation of a particular soft function. 6 Here we focus on the refactorization of the soft subjet and collinear subjets factorization theorems of sections 3.1.1 and 3.1.4, which will be used in our numerical calculation. For both of these factorization theorems, we can write the soft where we have explicitly indicated the renormalization scale µ dependence [113]. The nonglobal part of the soft function S NGL e (α) 3 , B; R is first non-trivial at two-loop order, beyond the accuracy to which we explicitly calculated the soft functions in this paper. Furthermore, the anomalous dimension of the soft function factorizes to all orders in perturbation theory as and therefore the renormalization group kernels factorize as well. Briefly, this occurs because renormalization group consistency relates the soft anomalous dimension to the sum of all the other anomalous dimensions, each of which can be associated with the in-jet or out-of-jet contributions. 7 While similar refactorizations of the global soft function have been discussed previously, and used in numerical calculations (see especially ref. [62] for a detailed discussion), we will discuss it here for completeness. The refactorization of the global soft function plays a role in our numerical results and is particularly important in appropriately separating scales in the global soft function of the soft subjet factorization theorem of section 3.1.4. In ref. [76] the structure of the one-loop calculation of the soft subjet factorization theorem was discussed in detail, with a particular focus on the dependence on the angle ∆θ sj between the soft subjet and the boundary. There it was found that the while the out-of-jet soft function contained dependence on the angle between the soft subjet and the boundary, ∆θ sj , this dependence vanishes in the in-jet contribution to the soft function due to a zero bin subtraction. Renormalization group consistency is achieved since the ∆θ sj dependence associated with the in-jet region is carried by the boundary soft function. Therefore, the refactorization of the global soft function for the soft subjet factorization theorem allows the soft function to be separated into a piece with ∆θ sj dependence, and a piece with no ∆θ sj dependence, and is crucial for resumming all large logarithms associated with this scale. The one-loop anomalous dimensions, split into out-of-jet and in-jet contributions, as well as canonical scales for both the in-jet and out-of-jet soft functions are given in appendix B, appendix C, and appendix D. Further details of this refactorization, and in particular a discussion on the dependence on ∆θ sj is also given.
For completeness, we also give the final refactorized expressions for the factorization theorems for the collinear subjets and soft subjet factorization theorems that will be used when presenting numerical results. For the collinear subjets factorization theorem, we have

JHEP05(2016)117
while for the soft subjet factorization theorem, we have In this form, each function in eqs. (3.33) and (3.34) contains logarithms of a single scale, which can be resummed through renormalization group evolution.

Boosted boson signal
In this section we discuss the effective field theory and factorization theorem relevant for the hadronically-decaying boosted boson signal. For concreteness, we will consider the case of a boosted Z boson decaying to a massless qq pair; however, the extension to other color-neutral boosted particles is trivial. We will work in the narrow width approximation, setting the width of the Z boson Γ Z = 0. Corrections to this approximation are trivial to implement, as they do not modify the structure of the factorization, and are expected to have a minimal effect.
A factorization theorem for the N -subjettiness observable τ (β) 1,2 [63,64,95] measured on boosted Z jets was presented in ref. [41]. This factorization theorem was obtained by boosting an appropriately chosen e + e − event shape. A factorization theorem can also be formulated using the SCET + effective theory, 8 where the collinear-soft mode, which was described in section 3.1.1, corresponds to the boosted soft mode of the e + e − event shape. We will take this second approach, as it is in line with the general spirit of this paper, of developing effective field theory descriptions of jet substructure configurations. However, the approach of relating to boosted e + e − event shape variables is useful for relating results to higher order calculations known in the literature. Despite the fact that the factorization for the energy correlation functions in the signal region follows straightforwardly from that of ref. [41], or from the SCET + factorization theorem of section 3.1.1, we will discuss it here for completeness.
We assume the process e + e − → ZZ → qqll, where l is a lepton to avoid having to describe additional jets, although the extension to two hadronically-decaying Z bosons is trivial. The factorization theorem is then similar to that presented in section 3.1.1, however, there are no global soft modes since the Z is a color singlet. The scaling of the collinear and collinear-soft modes are identical to those given in section 3.1.1, so we do not 8 Here we have slightly extended the usage of the SCET+ nomenclature beyond that which it was originally used in ref. [77]. In particular, in the case of the signal distribution, there are no global soft modes, and the matching to the effective theory proceeds in quite a different way than for the case of a two prong QCD jet as originally considered in ref. [77]. Nevertheless, because the effective theory contains a collinear-soft mode, we will refer to it as SCET+. repeat them here. The factorization theorem is given by As with the factorization theorem in section 3.1.1, we have chosen to write the factorization theorem in terms of e 3 , and the energy fraction of one of the subjets, z. A brief description of the functions appearing in eq. (3.35) is as follows: • H(Q 2 ) is the hard function describing the production of the on-shell Z bosons in an e + e − collision. It also includes the leptonic decay of the Z boson. Following ref. [41] we assume that the Z boson is unpolarized and so its decay matrix element is flat in the cosine of the boost angle. Non-flat distributions corresponding to some particular decay or production mechanism are straighforward to include.
• P Z→qq n→na,n b z; e (α) 2 describes the decay of the on-shell Z boson into a qq pair with momenta along the n a and n b axes.
• J q na z; e c 3 , Jq n b 1 − z; ec 3 are the jet functions describing the collinear radiation associated with the two collinear subjets.
is the collinear-soft function describing the radiation from the qq dipole formed by the two collinear subjets.
The basic structure of the factorization theorem, and the radiation described by the different functions, as well as their scalings, are shown schematically in figure 9. Operator definitions, and one-loop calculations for the operators appearing in the factorization theorem of eq. (3.35) are given in appendix E. Because the collinear soft modes are boosted, the collinear soft function does not require a refactorization, as was necessary for the global soft functions, in section 3.1.10.
It is important to emphasize the distinction between our treatment of a boosted Z jet, where we presented a single factorization theorem, and a massive QCD jet, where three distinct factorization theorems were required. While it is obvious that the soft haze region does not exist for a boosted Z jet, the soft subjet region does. However, unlike the case of a massive QCD jet, where the soft subjet region is enhanced by a factor of 1/z sj from the eikonal emission factor, no such enhancement exists for the Z decay. Indeed, it was shown in ref. [41] that the effect of the jet boundary, which would arise from the soft subjet configuration, is power suppressed by 1/Q. While it would be potentially interesting to analytically study the jet radius dependence for the signal distribution using the soft subjet factorization theorem, this is beyond the scope of this paper. We will therefore neglect jet radius effects and write the factorization theorem in eq. (3.35) with no R dependence.
The factorization theorem of eq.  is also required. Unlike for the case of a massive QCD jet, where this region is described by the soft haze factorization theorem, for a boosted Z boson, an accurate description of this region requires matching to the fixed order Z → qqg matrix element. Since the boost of the Z boson is fixed, this corresponds to a hard gluon emission from the qq dipole. In the numerical results shown throughout the paper, we have performed this matching to fixed order, directly within the SCET + effective theory. The fixed order cross section for D (α,β) 2 onto which the result of the factorization theorem was matched, was calculated numerically by boosting the leading order e + e − → qqg matrix element and performing a Monte Carlo integration. This allows for the consideration of general angular exponents α and β in which case the required integrals are difficult, if not impossible, to evaluate analytically.

A factorization friendly two-prong discriminant
The approach to two-prong discrimination taken in this paper is to use calculability and factorizability constraints to guide the construction of an observable. Having understood in detail the structure of the e 3 phase space, along with the effective field theories describing each parametric region, we now show how a powerful two-prong discriminant, D 2 , emerges from this analysis naturally. After defining the D 2 observable, we discuss some of its interesting properties, and show that the factorization theorems of section 3 can be combined to give a factorized description of the observable over the entire phase space.

Defining D 2
The goal of boosted boson discrimination is to define observables which distinguish between one-and two-prong jets. As a simplification, we will take the view that both collinear and soft subjets should be treated as two-pronged by the discriminant, while soft haze jets should be treated as one-pronged. Treating both the collinear and soft subjets as two-pronged immediately implies that a marginalization over the soft subjet and collinear subjet factorization theorems will need to be performed to obtain a prediction for the twoprong discriminant. This will be discussed in section 4.4. A more sophisticated observable could take advantage of the different fraction of signal and QCD jets in the soft subjet and collinear subjets regions of phase space, and we will give a simple example of such an observable in section 5.7.
We will consider discriminants, which we denote D plane, as shown schematically in figure 10. Such observables can be calculated by marginalizing the double differential cross section [70] dσ dD (α,β) 2 (4.1) For the observable D (α,β) 2 to be calculable using the factorization theorems of section 3, the curves over which the marginalization is performed in eq. (4.1) must lie entirely in a region of phase space in which there is a description in terms of a single effective field theory (up to the marginalization over the collinear and soft subjets). Stated another way, the contours of D (α,β) 2 must lie either entirely in the one-prong region of phase space, or entirely in the two-prong region of phase space. This condition is also natural from the perspective that D (α,β) 2 provide good discrimination power, a point which has been emphasized in refs. [66,67]. If the contours do not respect the parametric scalings of the phase space, the marginalization cannot be performed within a single effective field theory. A more sophisticated interpolation between the different effective field theories, along the lines of refs. [74,75] is then required.
In section 2, a power counting analysis was used to show that for 3α/β > 2, the one-and two-prong regions of phase space are parametrically separated, with the contour . This implies that, parametrically, the optimal two-prong discriminant formed from e (β) 2 and e (α) 3 is This extends the definition of ref. [66], which considered the observable D (α,α) 2 , with equal angular exponents. To simplify our notation, we will often not explicitly write the angular exponents α and β, referring to the observable simply as D 2 .
The D 2 observable takes small values for a two-prong jet and large values for a oneprong jet. Its contours in the e phase space are shown schematically in figure 10, along with illustrative Monte Carlo generated spectra for both boosted Z jets and massive QCD jets in e + e − collisions. A more detailed discussion of the discrimination power of D 2 , as well as the details of the Monte Carlo generation, will be given in section 5.

Sudakov safety of D 2
One interesting feature of the D 2 observable is that it is not IRC safe without an explicit cut on e (β) 2 . For every value of D 2 , the contour over which the double differential cross section is marginalized passes through the origin of the phase space, where the soft and collinear singularities are located. This feature is shown in figure 10a. At every fixed order in perturbation theory, this gives rise to an ill-defined (divergent) cross section. However, a resummed calculation of the double differential cross section regularizes the singular region of phase space, and leads to a finite distribution for the D 2 observable. This property is referred to as Sudakov safety [70,73]. Because Sudakov safe observables are not calculable in fixed order perturbation theory, they do not generically have an α s expansion, and we will show that the D 2 spectrum exhibits a particularly interesting dependence on α s .
The regularization of the fixed order singularity in the double differential cross section is achieved by the all orders resummation of logarithmically enhanced terms in the perturbative expansion. In the effective field theory description, this resummation is achieved by renormalization group evolution, and its properties are therefore determined by the form of the SCET anomalous dimensions. To illustrate how the α s dependence arises from the structure of the renormalization group evolution in SCET, we consider the soft subjet factorization theorem of section 3.1.4 in the leading logarithmic (LL) approximation. The cusp pieces of the anomalous dimensions for the different functions appearing in the factorization are given in Laplace space by (see appendix C) to denote the Laplace conjugate to e (α) 3 , and we have kept only IR scales in the logs. Furthermore, we have kept only the terms proportional to C A so as to resum only the physics associated with the soft subjet. The hard matching coefficient for the soft subjet production is given by the tree level eikonal emission factor Solving the renormalization group equations, and running all functions to the hard scale Q, we then find that in the soft subjet region of phase space the multi-differential cross section can be written to LL accuracy as exhibiting a familiar Sudakov form.
A complete calculation of the D 2 spectrum requires marginalizing over both the soft subjet and collinear subjet configurations, which we discuss in section 4.4. However, to demonstrate the α s behavior in the simplest manner, we will consider just the soft subjet effective theory. In particular, we will fix the angle of the soft subjet, but allow it to be arbitrarily soft, so as to probe the singular region of phase space. The result is then representative of the contribution from the soft subjet region of phase space. An exactly analogous behavior occurs for the contribution from the collinear subjets region of phase space.
Fixing θ sj to satisfy n · n sj = 1/2 (and thereforen · n sj = 3/2), and restricting to α = β for simplicity, the 2-point energy correlation function in the soft subjet region of phase space is simply e The corresponding D 2 distribution is then obtained by marginalizing the multi-differential cross section of eq. (4.8) where, in the second line, we have fixed θ sj and so we do not integrate over it. Inserting the multi-differential cross section and fixing θ sj , we then have where the sj superscript denotes that this is representative of a contribution from the soft subjet region of phase space. Importantly, because the soft subjet is defined by requirements on IRC safe measurements, the cross section in eq. (4.11) is a well-defined and in principle measurable quantity. The α s dependence in this distribution of D 2 is very surprising. Because D 2 is defined with respect to the 3-point energy correlation function, one would naïvely expect that D 2 only makes sense for a jet with at least three partons. Indeed, if we make an explicit cut on z sj , for example, then D 2 is IRC safe, and first non-zero for a jet with three partons at O(α 2 s ). However, because D 2 without a cut on z sj is not IRC safe, this intuition fails, and in a fascinating way. By resumming the large logarithms of z sj to all orders and then marginalizing, the D 2 distribution calculated in eq. (4.11) actually starts at O(α s )! Including emissions to all orders has effectively generated a non-trivial distribution for D 2 at one order lower in α s than when it is first, naïvely, non-zero. Other examples of Sudakov safe observables in the literature have expansions in √ α s [70,73] or are even independent of α s [71][72][73]. To our knowledge, D 2 is the first example of a Sudakov safe observable for which all-orders resummation reduces the order in α s when the observable's distribution is first non-zero. 9 We re-emphasize that though the distribution of D 2 in eq. (4.11) is a Taylor series in α s , it is impossible in purely fixed-order perturbation theory to systematically calculate it.

Fixed-order D 2 distributions with a mass cut
Although D 2 is not IRC safe without a cut on e (β) 2 , leading to its interesting Sudakov safe behavior, in experimental analyses a jet mass cut will be always be applied. We will therefore be most interested in this case. In figure 11a we show a schematic depiction of the e (α) 2 , e (β) 3 phase space in the presence of a mass cut for α = β = 2, along with contours of the D 2 observable. As is indicated in the figure, the mass cut removes the origin of the phase space, making D 2 IRC safe and calculable in fixed-order perturbation theory. It is therefore interesting to study the singularity structure of the fixed-order perturbative expansion of D 2 in the presence of a mass cut.
In figure 11b we show both the leading order (α 2 s ) (LO) and the next-to-leading order (α 3 s ) (NLO) fixed-order distributions of the D (2,2) 2 observable as measured on the most energetic hemisphere jet in e + e − → dijets events at 1 TeV center of mass energy, and with a jet mass cut of m J ∈ [80, 100] GeV, in anticipation of our application to boosted Z boson discrimination. However, the detailed range of the mass cut window is irrelevant to the arguments of this section. NLOJet++ [114][115][116][117][118] was used to generate the distributions. The fixed-order D 2 distribution diverges at small values, and its sign in this region flips order-by-order, characteristic of the Sudakov region. This behavior makes clear the necessity of resummation in the small D 2 region. However, importantly, there is no divergence or other structure at large values of D 2 . Instead, the distribution exhibits a tail extending to large values both at LO and NLO, and this behavior is expected to persist to higher orders. This long tail arises from the fact that the upper boundary of the phase space Figure 11. a) A schematic depiction of the e 2 , e phase space in the presence of a mass cut, along with contours of the D 2 observable. b) Leading order (through α 2 s ) and next-to-leading order (through α 3 s ) distributions for the D 2 observable in the presence of a mass cut as measured on hemisphere jets in e + e − collisions.
is parametrically far, of distance ∼ 1/e (α) 2 , from the two-prong region of phase space. A schematic depiction of the singularity structure in the e 3 phase space is shown in figure 11a. The observation that a fixed-order singularity exists only at small values of D 2 is important for the resummation of the observable in the presence of a mass cut. In particular, while resummation in the soft subjet and collinear subjet factorization theorems are necessary to regulate a fixed-order singularity, the soft haze factorization theorem presented in section 3.1.7 is not.
The fixed-order behavior of the D 2 observable is in some ways much more similar to that of a traditional jet or event shape than might naïvely be expected. However, there are some important differences. In particular, a mass cut of 80 < m J < 100 GeV has been applied, which is comparable to the location of the Sudakov peak in the mass for a jet of energy 500 GeV. Therefore, unlike in the case of a traditional jet shape, where there is a transition from a region where resummation is important to a far tail region where a fixed order calculation provides an accurate description, in this case, for all values of D 2 , there is an overall Sudakov suppression due to the mass cut, in addition to the divergence at small values of D 2 . This is however, a small effect in the fixed order distribution compared to the divergence at smaller values, and most importantly, does not require regularization, as it is regulated by the mass cut.

Merging factorization theorems
A complete description of the D 2 observable for background jets requires combining the three factorization theorems presented in section 3. This involves both the merging of the soft subjet and collinear subjets factorization theorems, which must be performed before -32 -

JHEP05(2016)117
the marginalization over the D 2 contours, as well as the matching between the small D 2 description of the resolved two-prong region and large D 2 description of the unresolved region. We will discuss how the matching is accomplished for these two cases in turn.

Merging soft and collinear subjets
The region of phase space in which two subjets are resolved by the measurement is described by two distinct factorization theorems. These two regions of phase space are separated by the measurement of the two 2-point energy correlation functions, e 2 . However, in the calculation of D 2 , both regions are treated as two-pronged, and the additional 2-point energy correlation function must be marginalized over. Since each effective theory can only be used within its regime of validity, a merged description, valid in both the soft subjets and collinear subjets region of phase space, is required. To accomplish this, we introduce a novel procedure for merging the two factorization theorems.
At a fixed e 2 phase space, which was shown in figure 5c. This phase space has also been studied in the context of two angularities measured on a single jet in refs. [74,75]. In this case factorization theorems involving only collinear and soft modes exist on the boundaries of phase space, and an additional collinear-soft mode is required in the bulk of phase space. New logarithms exist in the bulk of the phase space, so called k T logarithms [74], which can either be captured by the additional collinear-soft mode proposed in ref. [75], or by the interpolation procedure of ref. [74]. In this case, the factorization theorems involving only the collinear and soft modes do not extend beyond the boundaries of the phase space, and they cannot be directly matched onto one another, as this would neglect the resummation of the k T logarithms, which are not present in either factorization theorem. We will now argue that the case of interest in this paper, namely of two resolved subjets, is different. In particular, the soft subjet and collinear subjets factorization theorems extend from the boundaries of phase space, and already contain all the modes required for a description in the bulk of the phase space. In particular no additional modes exist in the bulk region of the phase space. This implies in particular that a description of the entire phase space region can be obtained by a proper merging of the collinear subjets and soft subjet factorization theorems, which is the approach that we will take.
To see that no additional modes are present in the bulk of the phase space, it is sufficient to look for modes which transition between the modes present in the effective theory descriptions in the soft subjet and collinear subjets regions of phase space, and which contribute at leading power. When transitioning from the collinear subjets region of phase space to the soft subjet region of phase space, as is shown schematically in figure 12a, the collinear modes of one of the jets become the soft subjet and boundary soft modes of the soft subjet factorization theorem. On the other hand, the collinear-soft modes transition to the global soft modes. However, one could possibly be concerned that there exist additional modes which appear as collinear-soft modes on the boundary of phase space where the collinear subjets exist, but which transition to soft subjet modes instead of global soft modes. However, one can immediately see that such modes cannot exist, since the energy fraction of the soft subjet modes is set by the e 2 measurement, while the energy fraction of Figure 12. a) A schematic depiction of the transition between the soft subjet and collinear subjets regions of phase space. b) Distribution of the energy fraction of the gluon subjet as predicted by the collinear subjets effective theory, the soft subjet effective theory, and the merged description.
The collinear zero bin of the soft subjet is also shown.
the collinear-soft modes is set by the e 3 measurement. Since e 3 is fixed, and the transition is occurring only in the e 2 phase space, such modes cannot exist. This implies that all contributing modes already exist in either the soft subjet, or collinear subjets factorization theorems. This is a crucial difference from the case of the double differential angularities, which in some sense simplifies the analysis. Since no additional modes exist in the bulk of the phase space, the factorization theorems can be extended from the boundaries, and can be matched onto each other. This will allow for the resummation of all large logarithms. We will now discuss in more detail our implementation of this matching, after which we will see that our argument, presented here based on power counting, for the absence of additional modes, is explicitly realized through our merging procedure.
This suggests then the procedure we will use for interpolating between the collinear subjets and soft subjet factorization theorem, as sketched in ref. [76], where the soft subjet factorization theorem was originally introduced. It proceeds by implementing a zero bin subtraction [119] in factorization theorem space (the meaning of this will become clear shortly) to remove double counting in the overlapping region between the effective theories. This is a non-trivial and novel example of the zero bin procedure, and demonstrates the general utility of its approach.
Recall that in a standard SCET factorization, the cross section is written as a convolution of a jet function, which describes the collinear physics, and a soft function, which describes the soft physics. To achieve this mode separation without introducing a double counting, the soft limit of the jet function must be subtracted, which is referred to in the literature as a zero bin subtraction. Here we extend this approach to the case of two distinct factorization theorems which describe different regions of a multi-differential phase space, the soft subjet and collinear subjets effective field theories, but which overlap in the bulk of -34 -

JHEP05(2016)117
the two-prong phase space. It is important that here we only focus on the two-prong region of phase space; the matching to the one-prong region of phase space will be discussed in section 4.4.2. To perform the matching in the two-prong region of phase space, inspired by the zero-bin procedure, we will write the cross section as a sum of the contributions from the soft subjet factorization theorem and the collinear subjets factorization theorem, with a zero bin contribution to remove the overlap between the effective theories. Explicitly, we write where we have suppressed that at this stage the cross section is still differential in the kinematics of the subjets, so that our notation is not overly cumbersome. The cross section in the soft subjet or collinear subjets regions of phase space are denoted by sj and cs subscripts, respectively. Here the zero bin contribution, which removes the double counting, is given by σ sj | cs . Explicitly, σ sj | cs is obtained by taking the limit of the soft subjet factorization theorem in the power counting of the collinear subjets factorization theorem.
The anomalous dimensions and one-loop matrix elements for the collinear zero bin of the soft subjet factorization theorem are given in appendix D. Each of the three contributions to the cross section given in eq. (4.12) are associated with their own factorization theorem. However, the contributions to the cross section with the clearest physical interpretation are σ cs and the combined term (σ sj − σ sj | cs ), which we will refer to the as the zero bin subtracted soft subjet contribution. It is the contribution which can be interpreted over the entire phase space as the contribution from a soft subjet, and all logarithms contained in this expression are of soft scales. We specifically subtract the collinear-bin of the soft subjet factorization, and not the soft-bin of the collinear factorization. This is due to the need to cancel the contributions from the boundary soft modes of the soft subjet factorization in the collinear region. Since no analogous mode to the boundary softs is found in the collinear resummation, any soft expansion would miss this contribution, resulting in a logarithm being resummed in an inappropriate collinear region of phase space. This is in contrast to what happens when comparing the two subtractions in the soft region. So long as one uses the relative transverse momentum of the subjets as the splitting scale of the collinear factorization, the collinearbin of the soft subjet does match the soft-bin of the collinear factorization in the soft region. This is the result of the merging of various soft scales. In the soft jet collinear-bin, the expanded boundary softs and global soft scales naturally merge, and in the soft-bin of the collinear jets, the global softs and collinear-softs also naturally merge in the soft region. This can be explicitly verified with the canonical scales given in appendix G. Thus the collinear-bin of the soft subjet is the appropriate subtraction throughout phase space, to remove double counting at all points.
Having defined our merging procedure, implemented through the zero bin, we can now revisit our argument for the absence of additional modes, previously given by power counting, which can be verified from an explicit calculation. Taking the collinear-bin of the soft subjet factorization, and the soft-bin of the collinear subjet factorization, one finds identical fixed order expressions, as well as a one-to-one mapping of the anomolous Figure 13. a) Distribution of the energy fraction of the gluon subjet as predicted by the collinear subjets effective theory, the soft subjet effective theory, the collinear zero bin, and the matched description. A zoomed version at small z is shown in b).
dimensions between these two re-expanded factorizations. With the merging of the soft scales in the "bins" of the primary factorizations as one enters the soft region then implies they are numerically equivalent. No new logarithms appear in the bulk of phase space, unlike the case of two angularities [74]. This emphasizes that the collinear-soft region is a genuine overlap between the factorizations, with no new structures not already found in the factorizations.
To see visually the effect that this matching has, it is interesting to look at the distribution of the energy fraction of the one of the subjets. In figure 12b, we plot the distribution of the gluon subjet's energy fraction as computed in the collinear subjets and soft subjet factorization theorems, as well as the energy spectrum for the matched cross section of eq. (4.12) and zero bin contribution. The energy spectrum is cumulative D 2 ≤ 2, which is the majority of the two-prong region, and for simplicity we have fixed the jet mass m J = m Z . The matched contribution smoothly interpolates between the spectrum for the collinear subjets at large values of z, where the collinear subjets factorization theorem is valid, and captures all logarithms of the splitting angle, and that for the soft subjet factorization theorem at small values of z, accurately resumming large logarithms of z. It is also important to note that for large z, the zero bin contribution matches exactly onto the soft subjet contribution, removing its contribution in this region. One can also see that the collinear-bin of the soft subjets cancels the collinear contribution to the soft region, up to power corrections, as argued above. We find that the collinear subjets provides a good description over a large range of values, with the soft subjet factorization theorem only required at small values of z.
In figure 13a, we show the energy spectra at cumulative D 2 ≤ 0.6, along with a zoomed version at small values of z, in figure 13b. This figures makes clear that our matched prediction, computed using our zero-bin approach, reproduces correctly the behavior of -36 -

JHEP05(2016)117
the collinear subjets at large values of z, and the soft subjet factorization theorem at small values of z. In particular, in figure 13b, we see that below z ∼ 0.05, the soft subjet and matched predictions are indistinguishable.
Although we will not study this case explicitly in this paper, we have also performed the matching for gluon jets, where the dominant contribution comes from g → gg splitting. This case is somewhat interesting due to the fact that the Bose symmetry of the final gluons guarantees that the z distribution is symmetric about z = 0.5, leading to peaks in the z distribution due to soft singularities at both z = 0 and z = 1. Nevertheless, the same matching procedure works identically in this case, and this procedure could therefore also be straightforwardly applied for studying substructure in gluon jets, as would be required for a complete calculation at the LHC.
We have shown here the matched subjet energy spectra for the particular choice of jet radius R = 1 at a center of mass energy of 1 TeV for quark jets, as this is the particular case that we will focus on throughout the rest of the paper. However, we have investigated the properties of the matching away from these parameters. It is important to note that our procedure for merging factorization theorem must be carefully treated at small R. This manifests itself as a breakdown in the zero bin procedure. In particular, for a fixed value of e (α) 2 , if R is small, then the power counting e (α) 2 ∼ z sj is invalidated. In other words, for small R there does not exist a region of phase space which contributes to e (α) 2 for which z sj is sufficiently small that the soft subjet expansion is valid.
We can bound the specific R that eliminates the soft subjet region by considering the minimum energy fraction accessible to a subjet at a fixed e (α) 2 : As a necessary condition for a soft subjet, one must fulfill the condition: and so R ∼ 1 for the soft subjet to contribute. To eliminate the soft subjet then requires R 1 and to still have valid collinear subjet regions requires that R and e (α) 2 are related as: Finally, one should distinguish a fixed mass jet from a fixed e (α) 2 . In the case α = 2, since e we can open or close the soft subjet region. This appears in the zero bin by the fact that the zero bin subtraction is greater in all regions than the soft subjet, leading to a negative total cross section. We find numerically that this occurs for R < 0.5 for the case of m J = 90 GeV, and Q = 1 TeV. This value depends fairly sensitively on m J and Q, or equivalently e (α) 2 . In this case, only the collinear subjets factorization theorem should be used, and it is valid throughout the entire available phase space. In this paper we focus primarily on the case of fat jets, defined with R = 1, and therefore it is necessary to perform the matching between the soft subjet region and -37 -

JHEP05(2016)117
the collinear subjets region for jets of energy 500 GeV. However, in section 5.3, we perform a brief survey of different R values, comparing our analytic predictions with distributions from Monte Carlo generators. A more phenomenological study of the importance of the matching for different physics processes of interest for an e + e − collider, the LHC, or even a possible 100 TeV collider, where even higher boosts can be achieved, would be interesting, but is well beyond the scope of our initial investigation and can be straightforwardly treated using our techniques.
While we have used a zero bin procedure to perform the matching between the collinear subjets and soft subjet factorization theorems, it is also possible to develop a dedicated effective field theory valid when the soft subjet becomes collinear. This effective field theory is related to our zero bin contribution, and has been developed in refs. [120,121]. While we believe that this approach is nice in principle, for the observable D 2 , we find that such an effective field theory has a vanishing region of validity, as can be seen from the zero bin contribution in figure 12b, and figures 13a and 13b. We therefore believe that our use of the zero bin, as generalized to distinct factorization theorems, represents a natural approach to the merging of the distinct factorization theorems. However, we acknowledge that this is an observable dependent statement, and there may be cases where there is a sufficiently large region of overlap between the soft subjet and collinear subjets effective theories, and in this case it might prove useful to have a separate effective field theory description which is valid in the case that the soft subjet becomes collinear.

Matching resolved to unresolved subjets
An important feature of the D 2 observable is that its contours respect the parametric scaling of the phase space, as emphasized in figure 10. This implies that the marginalization over the contours defining the observable can be performed at small D 2 entirely within the merged effective theory of section 4.4.1, and at large D 2 within the soft haze effective field theory. Hence the matching between these two different descriptions can be performed at the level of the D 2 distribution instead of at the level of the double differential cross section, which is a great simplification, and primary feature of the D 2 observable.
The soft haze factorization theorem presented in section 3.1.7 first contributes to the shape of the D 2 distribution at two emissions, the first order at which e (α) 3 can be non-zero (technically at next-to-next-to-leading logarithmic prime order, NNLL , in the logarithmic counting). Since our focus is on an initial investigation of the factorization properties of two-prong discriminants, the necessary two-loop calculation is beyond the scope of this paper. Naïvely, this implies that since the merged effective field theory describing the two-prong region of phase space is only valid for D 2 1, our predictions should not be extended beyond D 2 1. However, we will argue that because of the structure of fixed order singularities for the D 2 observable, extending our two-prong factorization theorems to large D 2 will provide an accurate description of the D 2 distribution for a wide range of E J and R.
As shown in section 4.3, there does not exist a fixed order singularity at large D 2 . In particular, this implies that if extended into this region, the factorization theorems valid at small D 2 will not diverge. Furthermore, one in fact expects that they provide a reasonable -38 -JHEP05(2016)117 description of the shape. They contain both an overall Sudakov factor for the e (β) 2 scale of the jet, and also provide a description of the internal structure of the jet in terms of splitting functions (in the case of the collinear subjets factorization). While the splitting function does not exactly reproduce the matrix elements in the soft haze factorization theorem, it provides a good description of them. We believe that this is a consistent approach which suffices for this initial investigation.
Perhaps the most important fixed order correction not captured in the subjet factorition for D 2 is simply the endpoint of the distribution, which arises from the kinematic boundaries of the phase space. Since we will normalize our distributions to 1, in order to compare to the Monte Carlo generators, the height of the peak is correlated with the endpoint. Matching to the soft haze region would give the resummed distribution the correct endpoint in the tail, and thus can shift the peak up in general. This endpoint is sensitive to the specific R and E J of the jet, as well as to the values of the angular exponents α and β. Recall that since the Monte Carlo generators respect momentum conservation, they always terminate their distributions before the physical endpoint of the spectrum. We will also see how this disagreement in the tail region changes as a function of R and E J in sections 5.3 and 5.4 respectively. However, for the case of dijets produced at a center of mass energy of 1 TeV, with a jet mass cut of 80 < m J < 100 GeV, as is relevant for boosted boson discrimination, and on which we primarily focus throughout this paper, we will see that this discrepancy in the tail region is minimal, and we will find good agreement between our analytic calculations and the Monte Carlo predictions. It would of course be interesting to perform the complete two-loop calculation in the soft haze region of phase space; however, we believe that this would have a minor effect for a substantial range of parameter space. Nevertheless, the proper inclusion of this region of phase space would also be interesting from a resummation perspective, as it would require matching between two distinct factorization theorems involving a different number of resolved jets, instead of the more familiar case of matching a resummed distribution to a fixed order calculation. We leave further investigations of this to future work.

Numerical results and comparison with Monte Carlo
We now present numerical results for signal and background distributions for the D 2 observable in e + e − collisions. We give a detailed comparison with Monte Carlo, at parton level in sections 5.1 through 5.4 and including hadronization in section 5.5. We then study the discrimination power of D 2 analytically in section 5.6, and comment on the optimal choice of angular exponents. In section 5.7 possible observables which go beyond D 2 , and separately resolve the soft subjet, and collinear subjets region of phase space, and how these could be used for possible improvements to boosted boson discrimination.
Throughout this section we use FastJet 3.1.2 [93] and the EnergyCorrelator Fast-Jet contrib [93,94] for jet clustering and analysis. All jets are clustered using the e + e − anti-k T metric [93,99] using the WTA recombination scheme [72,100], with an energy metric. 10 10 We thank Jesse Thaler for use of a preliminary version of his code for WTA in e + e − collisions. This code is now available in the FastJet contrib.

Comparison with Parton-level Monte Carlo
Previous studies of boosted boson discrimination with ratios of IRC safe jet observables have relied entirely on Monte Carlo simulations. While the implementation of both the perturbative shower and hadronization are well-tuned to describe simple event-wide observables, jet substructure observables probe significantly more detailed correlations. For the particular case of observables sensitive to two-prong structure, their discrimination power is sensitive to the description of massive QCD jets in the phase space region where the jets are dominated by a resolved splitting. One might naïvely expect that this region of phase space is sensitive to the implementation of the parton shower model, and we will see that this is indeed the case. While a comparison to recent LHC data on jet substructure observables (for example: [11,12,14,25,26,31]) is possible, the lack of analytic calculations means that it is difficult to disentangle perturbative from non-perturbative effects. In this section we compare the results of our analytic calculation for D 2 with a number of Monte Carlo generators at parton level, focusing in particular on the small D 2 region. 11 This allows for a detailed probe of the simulation of two-prong jets in QCD by the perturbative shower (for a discussion of some other variables, see ref. [122,123]). A large number of implementations of the perturbative shower exist, and are implemented in popular Monte Carlo generators (for reviews, see e.g. [124][125][126][127][128]). Some examples include Pythia [97,98], a p T -ordered dipole shower; Vincia [85][86][87][88][89][90], Sherpa [129,130], Ariadne [131], and Dire [132], dipole-antenna showers; and Herwig++ [133][134][135][136], an angular-ordered dipole shower. 12 As representative of these different Monte Carlo shower implementations, we will use the following Monte Carlo generators throughout this section: All Monte Carlos were showered with default settings except for the caveats listed below and requiring two-loop running of α s in the CMW scheme [138,139] with α s (m Z ) = 0.118. The different shower evolution variables within the Vincia Monte Carlo enables a study of their effects. For background distributions, we generate e + e − → dijets at 1 TeV center of mass energy and study the highest energy R = 1.0 anti-k T jet in the event. For signal distributions in Pythia and Vincia, we generate e + e − → ZZ events with both Zs 11 One should always be wary of comparisons of Monte Carlo generators at parton level which employ different hadronization models. Our comparisons at parton level presented in this section are to set the stage for fully hadronized comparisons in the following section. However, we take the view that a parton shower should achieve, to the greatest extent possible, a clean separation between perturbative and nonperturbative physics, and therefore should provide an accurate description of observables both at parton and hadron level. 12 Herwig++ also has the option for a dipole-antenna shower implementation [137] though we will not use it here.  The details of the scale variations used to make the uncertainty bands will be explained in section 5.2, but the pinch in the uncertainties should not be taken as physical. The pinch comes from unit normalizing the distributions, and is common in analyses in which scale variations are applied to normalized distributions (see, e.g., refs. [61,62,110]). All Monte Carlos have similar distributions as measured on signal jets, though Herwig is more peaked at small values than the other generators. Our analytic prediction, shown with perturbative scale variation, agrees well with the Monte Carlo generators. On background jets, however, the distributions are distinct, especially at small values of D 2 . Small D 2 is the region where the jet has a two-prong structure, but unlike for signal jets, for background jets that structure is not generated by a hard matrix element. In the case of collinear subjets, it is generated by a hard splitting function, while for a soft subjet, it is generated by an eikonal emission. In the Monte Carlos, small D 2 is the region that is most sensitive to the cutoff effects and other infrared choices. As we will show in following sections, by adjusting unphysical infrared scales, differences between the Monte Carlos at parton level can be reduced and essentially eliminated.
For reference, in appendix I we show a collection of e related to the jet mass by eq. (2.8), is set by a single emission, the agreement between the different generators, particularly at parton level, is significantly better than for the D 2 observable. This further emphasizes the fact that the D 2 observable offers a more differential probe of the perturbative shower, going beyond the one emission observables on which Monte Carlo generators have primarily been tuned.
In the following sections we will study the partonic D 2 distributions in more detail. We will restrict ourselves to comparing and contrasting p T -ordered Vincia and Pythia for a few reasons. First, as exhibited in figure 14a, these Monte Carlos represent the largest spread in their predicted D 2 spectra. Herwig, while it performs very similarly to Vincia, has a different hadronization model than Pythia and Vincia. So, directly comparing Pythia and Vincia minimizes any implicit hadronization effects when comparing the Monte Carlos at parton level. There are still differences due to the cutoff of the perturbative shower, which will be discussed in section 5.2.

Monte Carlos and perturbative scale variation
The fact that, in particular, the p T -ordered Vincia distribution for D 2 as measured on background agreed with our calculation while the Pythia distribution disagreed in the small D 2 region can be understood and quantified further. The bulk of the disagreement between our analytic calculation and Pythia, illustrated in figure 14a, occurs near the peak of the D 2 distribution. It is well-known that for many observables perturbative uncertainties tend to be significant in the peak region of the distribution. Therefore, it is possible that the difference between the p T -ordered Vincia and Pythia D 2 distributions can fully be explained by large perturbative uncertainties. In this section, we will show that by adjusting the cutoff of the parton level shower, the differences between Vincia and Pythia can be significantly reduced.
To estimate perturbative uncertainties in our resummed analytic calculation, the standard procedure is to vary the scales that appear in the calculation by factors of 2. This is at the very least a proxy for the sensitivity of the cross section on these scales. Because our factorization theorems contain many functions, as well as merging of distinct factorization theorems, in principle there are numerous scales that could be varied, a complete analysis of which is beyond the scope of this paper. A complete list of the variations considered as well as the resummation procedure can be found in appendix G, while here we only summarize. In all factorizations theorems, we vary the subjet splitting scales, the in-jet soft radiation scales, the out-of-jet soft radiation scales, as well as where the freeze-out for the Landau pole occurs in the running of α s . We do not separately vary the scale in the soft subjet factorization theorem and the collinear zero bin to ensure that the zero bin subtraction is implemented correctly. The scale variation band for the total cross section is then taken as the combined band for all possible combinations of these scale variations. The soft subjet cross section displays a particular sensitivity to the out-of-jet scale setting, since the running between the boundary soft modes and the out-of-jet modes forces the soft subjet energy spectrum to vanish at the jet boundary, 13 though the fixed order cross 13 As explained in ref. [76], this is connected with the buffer region of ref. [142]. section probes the soft divergence in this region. Thus we also consider several different schemes for handling the out-of-jet scale setting. We believe that our scale variation bands are representative, and this is supported by the agreement with the Monte Carlo.
Having understood the perturbative uncertainty bands, we now discuss in more detail the discrepancy between the different Monte Carlo generators arising at small values of D 2 , as exemplified by the difference between the p T -ordered Vincia and Pythia distributions. To understand the origin of this discrepancy, we begin by understanding the effect of the soft subjet region of phase space in our analytic calculation. This is possible due to our complete separation of the phase space using the energy correlation functions. In particular, because we have formulations of distinct factorization theorems in the soft subjet and collinear subjets regions of phase space, we can make an analytic prediction for the contribution arising just from the collinear subjets region of phase space. In figure 15 we show a comparison of the D 2 distribution for background QCD jets as computed using our complete factorization theorem, incorporating both the soft subjet and collinear subjets region of phase space, as compared with the calculation incorporating only the collinear subjets region of phase space. Comparing the two curves, we are able to understand the effect of the soft subjet region of phase space. In particular, we see that the soft subjet has a considerable effect on the distribution at small values of D 2 , giving rise to a more peaked distribution, with the peak at smaller values of D 2 , as compared to the result computed using only the collinear subjets region of phase space. Although the perturbative error bands are large, the systematic effect of the soft subjet region of phase space is clear.
One further feature of the D 2 distributions, which is made clear by figure 15, is that the full D 2 distribution is not the result of a single Sudakov peak, and therefore our intuition about the behavior of different orders in the perturbative expansion, and the behavior of scale variations from traditional event shapes fails. In particular, while it is generically the -43 -

JHEP05(2016)117
case for traditional event shape distributions that lower order resummed results overshoot in the peak region, it is not at all clear that this behavior should be true for D 2 , and indeed it is not observed. Instead, the contribution from the collinear subjets alone is expected to undershoot the peak of the D 2 distribution, since it does not incorporate the soft subjet region of phase space. The final contribution is then obtained as a superposition of two distinct Sudakov peaks, and can therefore behave quite differently from traditional event shapes.
Monte Carlo descriptions of the perturbative shower should provide a similar description of collinear physics, but can differ in their description of soft wide angle radiation. Some of these differences were discussed in section 5.1. As discussed earlier, because Vincia is a dipole-antenna shower, it should accurately describe both the hard collinear and soft wide-angle regions of phase space. Because small values of D 2 are sensitive to both collinear and soft physics, the fact that the Pythia distribution at small D 2 is distinct suggests that its description of soft wide-angle physics is the reason. 14 The difference observed in our analytic calculation arising from the soft subjet region of phase space is similar to that observed between the p T -ordered Vincia and Pythia Monte Carlo distributions. It is therefore interesting to investigate whether the difference in Monte Carlo distributions can arise exclusively from different descriptions of wide angle soft radiation.
We will show that for the D 2 observable and jet samples we studied, most of the difference can be accounted for by differences in the treatment of unphysical infrared scales at parton level in the Monte Carlos. Since we perform this comparison at parton level, there is some ambiguity in effects due to the perturbative cutoff of the shower, and those arising from different descriptions of wide angle soft radiation. In particular, the Monte Carlos will in general have different low-scale p T cutoffs at which the perturbative parton shower is terminated. Varying this scale can potentially greatly increase or decrease the number of soft emissions because the value of α s in this region is large. In particular, for the versions of Pythia and Vincia that were use to generate events in figure 14, the cutoff in Pythia is 0.4 GeV, while the cutoff in Vincia is 0.8 GeV. Indeed, these are the default values for these showers. Therefore, we expect that the Pythia parton shower produces more soft emissions than Vincia, which would increase the value of D 2 , and potentially also contribute to the observed difference.
To attempt to disentangle the effects of the shower cutoff from differences in the modeling of soft radiation, in figure 16, we consider Monte Carlo predictions of the D 2 distribution as measured on QCD jets, with different jet radii, namely R = 0.5, 0.7, 1.0, 1.2. By using different jet radii, we can control the importance of the soft subjet region of phase space. With small jet radii, the soft subjet region of phase space does not exist, while it becomes increasingly important as the jet radius is increased. Analytic predictions for the D 2 dis- 14 Part of the reason for why Pythia seems to not correctly describe the soft, wide-angle region of phase space may be due to the fact that while it uses kinematics of dipoles in its shower, it still uses the Altarelli-Parisi splitting functions as an approximation of the squared matrix element. The dipole and its emission is then boosted to the appropriate frame, which may over-populate the soft wide-angle region of phase space as compared to the eikonal matrix element. We thank Torbjörn Sjöstrand and Peter Skands for detailed discussions of this point.  ciently many observables on a jet, we are able to isolate distinct phase space regions and study in detail the extent to which Monte Carlo parton showers reproduce the physics in the different regions. D 2 , or similar jet substructure observables, could therefore be powerful tools for tuning Monte Carlos, both to formally-accurate perturbative calculations, as well as data. In the remaining sections of the paper, we will use the default shower cutoffs in the Monte Carlo generators, as was done in figure 14, and will not show uncertainty bands on our Monte Carlo distributions from varying this parameter.

Analytic jet radius dependence
As demonstrated in the previous section, the region of small D 2 is a sensitive probe of the dominant soft or collinear structure in the jet. It is therefore interesting to study the jet radius dependence of D 2 analytically, because the relative size of soft subjet and collinear subjets contributions to D 2 will depend on the jet radius. At large jet radius, as shown earlier, the soft subjet region is an important contribution at small D 2 , but as the jet radius decreases, the collinear subjets should dominate. In this section, we will study the jet radius dependence of D 2 and compare our analytic calculation to Monte Carlo. This will also demonstrate that our analytic calculation accurately describes the R dependence of the D 2 distribution. As in the previous section, we will restrict this study to p T -ordered Vincia and Pythia showers, and will take the jet radius to be R = 0.5, 0.7, 1.0, 1.2, which are representative of a wide range of values of experimental interest. Larger values of R can be straightforwardly studied with our approach, but are of less phenomenological interest. It is expected that for smaller values of R logarithms of R may become numerically important [47,143,144], so we do not consider them here.
Comparisons of parton level Monte Carlo results from both p T -ordered Vincia and Pythia to our analytic calculation are shown in figure 17. Since we scan over a range of jet radii, perturbative uncertainties for each R value are not as extensively explored as earlier with R = 1, and are only meant as a rough estimate of the perturbative uncertainty. Our focus here is simply to show that the scaling behavior with R between our analytic calculation and the Monte Carlos agree. There is excellent agreement between the Monte Carlo results and our analytic calculations over the entire range of R values. For R 1, there is some disagreement in the position of the peak of the distribution between the generators, though, as shown earlier, this can be accounted for by adjusting the shower cutoffs. In the peak region, hadronization will play an important role, smearing out differences between Monte Carlos. The effect of hadronization, and its implementation in our analytic calculation, will be discussed in section 5.5. For jet radii of R = 0.7, 1.0, 1.2 our analytic calculation consists of both collinear subjets and soft subjet contributions. For R = 0.5, however, we only include the contribution from collinear subjets, which is guided by our matching procedure between the collinear subjets and soft subjet factorization theorems, as discussed in section 4.4.1. For a fixed jet mass, as the value of R is decreased, the region of validity of the soft subjet factorization theorem vanishes rapidly. For jet masses in the range 80 < m J < 100 GeV, and Q = 1 TeV, we find that between R = 0.7 and R = 0.5 the region of validity of the soft subjet rapidly shrinks to zero, and there should not be a transition between the collinear subjets factorization theorem and the soft subjet factorization theorem. Because of this, for the value of R = 0.7, our perturbative error bands are more extensive, and are taken as the envelope of curves both that include the matched soft jet, and curves that do not. While this is certainly over conservative in the error estimate, we have included this to emphasize this point. This feature is also seen explicitly in the plots of figure 17, where the region of disagreement between the different Monte Carlo generators is squeezed towards zero. A similar effect occurs as the energy (or p T ) of the jet is increased with a fixed jet mass, which will be discussed in section 5.4.
Throughout the remainder of this paper, we will study the case R = 1 exclusively, because both collinear subjets and soft subjet regions of phase space must be included and that radius is relevant to a large number of jet substructure studies using fat jets.

Analytic jet energy dependence
In addition to studying the dependence on the jet radius as a probe of the importance of the soft subjet and of the Monte Carlo description of the shower, it is also interesting to study the dependence of the D 2 distribution on the energy of the jet, with a fixed mass cut. For highly energetic jets, one expects that the soft subjet will play a negligible role, as the region of validity of the soft subjet factorization theorem shrinks as the energy of the jet is increased, as long as the mass of the jet is kept fixed. On the other hand, since we keep the jet radius used in the clustering fixed, the angular separation of the collinear particles decreases with energy, but the phase space for wide angle global soft radiation increases considerably. This radiation is present both in the collinear subjets and soft haze factorization theorems. It is also of course present in the soft subjet factorization theorem, although we have argued that we expect this to give a small contribution. Studying the jet energy dependence therefore probes the behavior of the generators in a fashion complementary to the R dependence.
In this section, we study the perturbative D 2 distribution for center of mass energies ranging from 500 GeV to 2 TeV, for a fixed jet radius of R = 1, and with a fixed mass cut of 80 < m J < 100 GeV. This region of energies covers the majority of the phenomenologically interesting phase space available at the LHC. We will also perform a more detailed study at LEP energies in section 6. For our resummation, we require (amongst other things), that e (α) 2 1. For the case of α = 2 for which we will be most interested, this corresponds to the assumption e For a mass cut around the Z pole mass, this expansion is valid throughout the range of energies we consider. The case when e (2) 2 1, but not parametrically so, is outside the scope of this paper.
In figure 18 we show distributions for the D 2 observable as obtained from Monte Carlo simulation, and compared with our analytic calculation. As in section 5.3, we restrict to p T -ordered Vincia and Pythia at parton level. The perturbative scale variations for each energy value are less extensively explored and are only meant to provide a rough estimate of the perturbative uncertainty. The evolution of the difference between the Vincia and Pythia generators is again quite fascinating, with the discrepancy between the generators increasing significantly with energy, to the point that at 2 TeV the qualitative shape of the distributions doesn't agree. In particular, the behavior at small D 2 is completely different between the two generators, with Vincia having a large peak, which is not present in

Pythia.
As discussed in section 5.2, this discrepancy between the generators is dominantly due to differences in the treatment of the parton shower cutoff. As the energy is increased with a fixed jet mass and jet radius, emissions that contribute to D 2 are forced to have smaller and smaller energy. As evidence that this is indeed the cause of the discrepancy, we have checked that the conclusions of section 5.3 remains true at higher energy, as long as the jet radius is taken to scale as R ∼ 2m J /p T , so that it constrains the wide angle soft radiation. For example, for R = 0.2 at 2 TeV, we find excellent agreement between the D 2 distributions as generated by Pythia and Vincia. 15 Because the fact that the 15 We include this plot in appendix I, figure 35, for reference. disagreement is so large between the generators, and is arising from the modeling of soft radiation, this may be an excellent observable to study soft radiation and color coherence in parton showers. As a reference, in appendix I we show distributions of the e 2 observable, measured at both 500 GeV, and 2 TeV for both the Vincia and Pythia Monte Carlos, and at both parton and hadron level. Unlike for the D 2 observable, since e (2) 2 is set by a single emission, excellent agreement is observed for the e (2) 2 observables between Pythia and Vincia both at parton level, emphasizing that D 2 offers a more differential probe of the perturbative shower than single emission observables.
Our analytic predictions at 2 TeV, as shown in figure 18, are intermediate between the Pythia and Vincia results. They exhibit a peaked structure at small values of D 2 , but not to the extent seen in the Vincia distribution. We believe that this is largely due to the normalization of the distributions, and the fact that we do not match to fixed order in the tail of the distribution. Since this tail becomes longer at higher energies, a larger disagreement in the peak region is also seen. On the other hand, at 500 GeV, our analytic prediction has a large peak. This is evidence that because the D 2 spectrum is much more sharply peaked at 500 GeV, higher order resummation may be more important in the peak region. However, the relatively good agreement between analytics and Monte Carlo shows that our factorization theorem is able to accurately capture the energy dependence over a large range of energies. It is important to note that hadronization will remove some of the discrepancies in the D 2 distributions between the Vincia and Pythia generators, especially at high energies, where it will smear out the peak at low values of D 2 . While this improves qualitatively the behavior of the distributions, discrepancies in the shape still remain. This will be discussed in detail in section 5.5, along with its incorporation into our analytic calculation.

JHEP05(2016)117
For comparison to precision analytic calculations and interpreting data, it is vital that Monte Carlo generators provide accurate descriptions of both the perturbative and nonperturbative aspects of QCD jets, and not compensating for perturbative discrepancies by the tuning of non-perturbative parameters. This is especially important for disentangling non-perturbative effects from perturbative effects, the latter of which should in principle be under much better control, and for extracting reliable information about non-perturbative QCD from jet physics.
Throughout the rest of the paper we will focus on jets with radius R = 1 at a center of mass energy of 1 TeV.

Impact of hadronization
Hadronization plays an important role in a complete description of any jet observable, and a description of non-perturbative effects, preferably from a field-theoretic approach, is required to compare with experimental data. An advantage of the factorization approach taken in this paper is that it allows for a clean separation of perturbative and nonperturbative physics. Non-perturbative effects enter the factorization theorems presented in section 3 through the soft function, which describes the dynamics of soft radiation, both perturbative and non-perturbative, between the jets. For a large class of additive observables, the treatment of non-perturbative physics in the soft function is well-understood, and can be incorporated using shape functions [83,84,[145][146][147]. Shape functions have support over a region of size Λ QCD , and are convolved with the perturbative soft function. In the tail region of the distribution, where the observable is dominated by perturbative emissions, they reduce to a shift. For a large class of observables, this shift is determined by a universal [148,149] non-perturbative parameter multiplied by a calculable, observable dependent number [149][150][151]. Similar shape functions have also been used to incorporate the effects of pile-up and the underlying event at hadron colliders [152].
The effect of non-perturbative physics on multi-differential cross sections has not been well-studied. For the double differential cross section of two angularities, ref. [70] considered using uncorrelated shape functions for each angularity individually, but it is expected that a complete description would require a shape function incorporating non-perturbative correlations between observables. For the particular case of the D 2 observable, we will argue that a single parameter shape function can be used to accurately describe the dominant non-perturbative effects, and in particular, that a study of multi-differential shape functions with non-perturbative correlations, is not required. Of course, to justify the use of a shape function requires the observable in question to be infrared and collinear safe. Therefore, we will only consider non-perturbative corrections to D 2 in the presence of a mass cut on the jet.
In section 4.3 we performed a study of the fixed order singular structure of the D 2 observable in the presence of a jet mass cut. Importantly, we showed that D 2 only has a singularity at D 2 = 0, with its behavior at all other values regulated by the mass cut. Non-perturbative corrections to the D 2 observable will play an important role only when the soft scale becomes non-perturbative, which as just argued, for a perturbative mass cut of the form studied in this paper, only occurs as D 2 → 0. Recall that the D 2 observable is which is not additive. However, in the two-prong region of phase space, namely D 2 → 0, the value of e (β) 2 is set to leading power by the hard splitting, and so D 2 effectively reduces to an additive observable. In this region of phase space the description of non-perturbative effects in terms of a shape function can therefore be rigorously justified from our factorization theorem, and it can be applied directly to the D 2 distribution. For large values of D 2 , it is not additive, and the use of a shape function cannot be formally justified. However, in this region, a shape function is not required, as any singular behavior is regulated by a mass cut. We therefore will use a shape function that falls off exponentially at large values of D 2 . We believe that this is a self-consistent approach until non-perturbative corrections to multi-differential cross sections are better understood.
In the two-prong region of phase space, we have shown that two distinct factorization theorems, namely the soft subjet and collinear subjets, are required, and in section 4.4.1 we showed how these two descriptions can be merged to provide a complete description of the two-prong region of phase space. Importantly, the two factorization theorems describing the two-prong region of phase space have soft functions with different numbers of Wilson lines. The collinear subjets soft function is a two-eikonal line soft function, while the soft subjet soft function has three eikonal lines. Since the shape function describes the nonperturbative contribution to the soft function, in general we should allow for two distinct shape functions, with independent parameterizations. The zero-bin merging procedure in section 4.4.1 would then be performed on the non-perturbative cross sections, after convolution with the appropriate shape function. However, at the level of perturbative accuracy which we work, and because we will simply be extracting our shape function parameters by comparing to Monte Carlo, the use of distinct parameterizations of different shape functions for both the soft subjet and collinear subjets soft functions would introduce many redundant parameters. To simplify the situation in this initial investigation, we will choose to use the same parametrization of the shape function, and the same nonperturbative parameters for both soft functions. This allows for the non-perturbative corrections to be described by a single parameter, and as we will see provides an excellent description of the Monte Carlo data. Because we use the same shape function for both the soft subjet, and collinear subjets soft functions, it also implies that the shape function can be applied after the zero bin merging procedure, namely, directly at the level of the D 2 distribution.
As a simple parametrization of a shape function for D 2 , we follow ref. [152] and consider where is the energy and Ω D ∼ Λ QCD is a non-perturbative scale. Note that while we will use the same value of Ω D for the signal and background distributions, it will have very different effects on the two distributions, which will arise naturally from the power counting in the different factorization theorems, as will be shown in this section. The function of -51 -

JHEP05(2016)117
eq. (5.2) satisfies the required properties that it is normalized to 1, has a finite first moment Ω D , vanishes at = 0, and falls off exponentially at high energies [146]. More general bases of shape functions are discussed in ref. [147], although we find that the single parameter shape function of eq. (5.2) is sufficient to describe the dominant effects of hadronization.
As discussed above, we will use the shape function of eq. (5.2) for both the collinear subjets and soft subjets factorization theorems, with the same value of Ω D in both cases. Because we have enforced this simplification to reduce the number of parameters, it is then most interesting to focus on Ω D for the collinear subjets factorization theorem, which has two eikonal lines. In this case, we can show that we can relate the Ω D parameter to universal non-perturbative parameters appearing in e + e − → dijet factorization theorems, which have been measured in experiment. Therefore, throughout the rest of this section, we will focus on deriving scaling relations for Ω D , assuming we are working in the collinear subjets factorization theorem. Again, we wish to emphasize that this is merely a simplification we have made to reduce the number of parameters, and a more general treatment could be performed, but we will see that with only the single Ω D , with properties derived assuming the collinear subjets factorization theorem, excellent agreement with Monte Carlo is observed.
The effect of non-perturbative physics as modeled by the shape function is very different for background or signal distributions. For background, when D 2 is small, the contribution to e (α) 3 from a non-perturbative soft emission is where is the energy of the non-perturbative emission and E J is the energy of the jet, as shown in eq. (3.5). The non-perturbative contribution to D 2 is therefore 2 ) 3α/β .

(5.4)
In terms of the shape function, the non-perturbative distribution of D 2 for background jets can then be written as a convolution: 16 where σ np and σ p denote the non-perturbative and perturbative cross sections, respectively. We can estimate the scale at which the global softs of the collinear subjets factorization theorem become non-perturbative from the scaling of the modes given in eq. (3.4). Rewriting this scaling in terms of the center of mass energy of the e + e − collision, Q, and 16 In this initial investigation we do not include a gap in our shape function, which would implement a minimum hadronic energy deposit, as expected physically [146]. Such gapped shape functions, and their associated renormalon [153] ambiguity [154] have been studied for arbitrary angular exponents [155], and could be straightforwardly incorporated in our analysis. However, we observe excellent agreement with our single parameter shape function, which we therefore find to be sufficient for our purposes.
where we have assumed a jet mass, m J = m Z , as relevant for boosted Z discrimination. Taking Λ QCD = 500 MeV, we find that the global soft scale enters the non-perturbative regime at D 2 1. Restricting to β = 2, in the collinear subjets region of the background jet phase space, the non-perturbative distribution of D (α,2) 2 is then where we have used Because we consider fixed-energy jets with masses in a narrow window, e 2 is just a number and can be removed by appropriate change of variables. Making this change, we then have where the non-perturbative parameter in the shape function is effectively modified tõ The non-perturbative parameter Ω D still has implicit dependence on the angular exponent α. Because the global soft modes have the lowest virtuality and can only resolve the back-to-back soft Wilson lines in the n andn directions, we can use the results of refs. [150,151] to extract the α dependence. By the boost invariance of the soft function 17 along the n −n directions and the form of the observable e

JHEP05(2016)117
where Σ is a universal non-perturbative matrix element of two soft Wilson lines and all dependence on α has been extracted. 18 We have normalized the matrix element such that the coefficient is unity for α = 2. We will shortly discuss the extent to which the values of Ω D we obtain from comparison with the parton shower agree with the known values of this universal non-perturbative matrix element. For signal jets, the lowest virtuality mode in the jet are the collinear-soft modes. Unlike the global soft modes of the collinear subjets factorization theorem, which did not resolve the substructure of the jets, allowing us to relate the non-perturbative parameter appearing in the shape function to that appearing in dijet event shapes, the collinear soft modes in the signal factorization theorem resolve the jet substructure. However, since the decaying boson is a color singlet, there are still only two eikonal lines present in the factorization theorem. Boost invariance of the soft function will therefore again allow us to relate the non-perturbative parameter for the signal distribution to that appearing in dijet event shapes. This is similar to the argument used in ref. [41] to calculate the signal distribution for 2-subjettiness.
A non-perturbative collinear-soft emission contributes to e (α) 3 as where now is the energy of the non-perturbative collinear-soft emission, as shown in eq. (3.6). The non-perturbative contribution to D 2 for signal jets is therefore where in the second line we have used the parametric relationship between e in the collinear subjets region. Convolving with the shape function, the non-perturbative distribution for signal jets is then It is important to note how the different scales for the soft radiation in the case of the signal and background jets leads to different behavior of the D 2 distributions after 18 In this section we ignore the effects of hadron masses, and their associated power corrections of O(mH /Q), where mH is the mass of a stable hadron in the jet. While these power corrections can also be incorporated through the shape function, in general, they break the universality of the non-perturbative matrix element, Σ [91,92]. In particular, eq. (5.12) is no longer in general true, for a Σ that is independent of the angular exponent α [91,92]. This depends on the precise definition of the energy correlation functions for massive particles. However, the value of Σ can still be extracted from dijet event shapes in the same universality class as a particular angularity [92]. Furthermore, ΩD has a scale dependence from renormalization group evolution, ΩD = ΩD(µ), although this dependence is logarithmic, and is therefore small compared to our uncertainties. We will discuss briefly the impact of hadron masses and the renormalization group evolution of ΩD in section 6, and in appendix H.

JHEP05(2016)117
hadronization. In particular, from eqs. (5.10) and (5.15) one can determine the shift in the first moment of the D 2 distribution caused by hadronization, which we will denote by ∆ D . Restricting to the case α = β = 2 for simplicity, we find that for the background distribution, whereas for the signal jets, we have Since Ω D should be of the scale 1 GeV, we see that for signal jets, the shift in the first moment due to hadronization is highly suppressed, and behaves differently than a traditional event shape due to the boost factor, while for background jets, since m J E J , the effect of hadronization is significant. We will see that both of these features, which are consequences of the power counting of the dominant modes, are well reproduced in the Monte Carlo simulations.
Comparisons between the hadron-level distributions of D (2,2) 2 from our analytic calculations and the Monte Carlos are presented in figure 19 for background and figure 21 for signal jets. For background distributions, we compare our perturbative calculation convolved with the shape function, as defined in eq. (5.5). Both Vincia and Pythia use the same hadronization model, but Herwig++ uses a distinct hadronization model, and therefore we allow for a different shape parameter, Ω D , for the two cases. For the case of Pythia and Vincia, we choose to extract the value of Ω D by fitting to the hadronized distribution for p T ordered Vincia. However, we will shortly discuss the level of ambiguity in Ω D arising from this extraction. For jets with an energy of 500 GeV and mass of 90 GeV, we find that the choice Ω D = 0.34 ± 0.03 GeV provides the best agreement of our perturbative calculation with p T ordered Vincia, while Ω D = 0.41 ± 0.03 GeV provides the best agreement with Herwig++. The errors assigned here come only from the fitting itself, and are due to the statistical uncertainties of the Monte Carlo distributions due to the finite width of the histogram bins. These errors do not take into account any other uncertainties; for example, whether one should perform the fit to hadron level Vincia or Pythia. This level of agreement between the non-perturbative parameters extracted from Pythia and Herwig++ is comparable to more detailed studies, such as ref. [92]. A comparison of the distributions of figure 19 before and after hadronization shows that hadronization has a considerable effect on the background distributions, particularly at small values of D 2 , as expected from eq. (5.16). This effect, which in the Monte Carlos is realized through tuned hadronization models, is well described by the single parameter shape function. Importantly, as discussed above, if different shape parameters were used for the collinear subjets and soft subjets factorization theorems, they would be nearly degenerate in the fit at the level of perturbative accuracy that we work, which is why we have made the simplification of working with a single non-perturbative shape parameter.
We have argued that the non-perturbative parameter Ω D in the collinear subjets factorization theorem can be related to a universal non-perturbative matrix element of two

������++ ������ �����
(d) Figure 19. A comparison of the D (2,2) 2 distributions for background QCD jets from our analytic prediction and the various hadron-level Monte Carlos. σ p denotes the parton level perturbative prediction for the distribution and σ np = σ p ⊗ F D is the perturbative prediction convolved with the non-perturbative shape function. The values of the non-perturbative parameter Ω D used are also shown.
soft Wilson lines. Such non-perturbative matrix elements appear in the factorization theorems of a large class of e + e − event observables, and has therefore been measured from data at LEP. 19 While the value of Ω D that we have determined for the two parton showers is by no means precise, it is interesting to compare our value with those extracted from precision studies of e + e − collider observables which have been performed in the literature. Using the particular case of α = β = 2, and converting to our normalization, a recent extraction of the non-perturbative parameter from an N 3 LL analysis of the C-parameter event 19 An extremely large literature exists on such measurements, and their theoretical interpretation, to which we cannot do justice in this brief section. We refer the reader to, for example, refs. [83,84,[156][157][158][159][160][161] and references therein. shape using LEP data, and including power corrections and hadron mass effects [91,92], gives a value of Ω D = 0.28 GeV [160,161]. This agrees well with our values extracted through comparison with Monte Carlo. Going forward, with the goal of increasing both the precision and understanding of jet substructure, the ability to relate the dominant non-perturbative corrections to the D 2 observable to known non-perturbative parameters measured in e + e − is a valuable feature, and that further study on the non-perturbative corrections to multi-differential cross sections is of great importance.
Many of the features of the background distributions which were present before hadronization in figure 14a persist after convolution with the shape function. However, they are greatly reduced, and they become difficult to disentangle from modifications to the nonperturbative shape parameter at the order we work. In particular, from figure 19, we see that for the choices of Ω D that we have used, both Vincia showers agree well with our analytic calculation. On the other hand, with the chosen value of Ω D , the D 2 distribution in Pythia is systematically pushed to higher values as compared with our calculation.
To try and asses the extent to which this can be accommodated for by adjusting the value of Ω D , in figure 20 we show plots of both Pythia and Vincia with p T ordering compared with our analytic results for two different values of the shape parameter. The values Ω D = 0.34 GeV and Ω D = 0.47 GeV were chosen to give best agreement with the Vincia and Pythia distributions, respectively. This figure makes clear that the disagreement between the D 2 distributions as generated by the two Monte Carlo generators can largely be remedied by using different values of the non-perturbative parameter. We note also that the effect of changing the non-perturbative parameter is of course similar to that of changing the perturbative cutoff of the shower, as was discussed in section 5.2, making it difficult to disentangle these two effects.
This plot also gives a feel for the extent to which Ω D can be varied before significant disagreement is seen between the analytic calculation and a given Monte Carlo distribution.

������++ ������ �����
(d) Figure 21. A comparison of the D (2,2) 2 distributions for signal boosted Z jets from our analytic prediction and the various hadron-level Monte Carlos. σ p denotes the parton level perturbative prediction for the distribution and σ np = σ p ⊗ F D is the perturbative prediction convolved with the non-perturbative shape function, although for the signal this has a negligible effect. The values of the non-perturbative parameter Ω D used are also shown.
Performing the perturbative calculation to higher accuracy would help to resolve some of these ambiguities in the value of the shape parameter, by reducing the perturbative uncertainty on the shape of the distribution, as well as its normalization. Throughout the rest of this paper, when comparing our analytic predictions with Vincia or Pythia, we will use the value Ω D = 0.34 GeV as obtained from our fit to hadron level p T -ordered Vincia. However, one should keep in mind the level of sensitivity to this parameter. In particular, for the application of boosted Z discrimination, we will see that the discrimination power of the observable will depend sensitively on the shape of the D 2 distribution below the peak, and will therefore exhibit great sensitivity to the value of the shape parameter.

JHEP05(2016)117
For the signal distributions, shown in figure 21, we use the same choice of nonperturbative parameters as for the background distributions. From eq. (5.17), we have seen that for the jets with E J = 500 GeV, the non-perturbative shift is expected to be of the order 1/500, and is therefore completely negligible to the level of accuracy that we work, and the equality of the non-perturbative parameters between the signal and background distributions is not tested. For the signal distributions, we see excellent agreement between the theory prediction and all the Monte Carlo generators. Due to the sharp peak in the distribution, we expect higher order resummation is necessary to provide a more accurate description right in the peak region, where the perturbative uncertainty in our calculation becomes large. Due to the fact that the distributions are normalized, this uncertainty also manifests itself in the tail of the distribution. It is known how to calculate the signal distribution to higher accuracy [41], and so we do not consider this issue further here. The effect of the shape function on our analytic results are consistent with all of the Monte Carlos, whose signal D 2 distribution is changed only slightly (i.e., only in the lowest bins) after hadronization.
We conclude this section by emphasizing how the choice of variable can greatly facilitate comparisons with Monte Carlos. An important feature of the D 2 observable is that it cleanly separates phase space regions dominated by different physics. In particular, it separates the region of phase space where a subjet is formed from that where no subjet is formed, as well as separating the regions of phase space where hadronization is important from those where it plays a minor role. This enables these effects to be cleanly disentangled, and provides a sensitive probe of their modeling. We therefore believe that the observable D 2 could play an important role in the tuning of Monte Carlo generators for jet substructure studies, and could be used to complement some of the observables proposed and studied in refs. [122,123]. 20 Furthermore, the observable D 3 [67], which is sensitive to three-prong substructure within a jet also provides a clean separation of two-and three-prong regions, and could be used to provide an even more detailed understanding of jet substructure and the perturbative shower evolution.

Analytic boosted Z discrimination with D 2
In this section, we use our analytic calculation, combined with the non-perturbative shape functions of section 5.5, to make complete predictions for the discrimination power of D (2,2) 2 for hadronically-decaying boosted Z bosons versus QCD quark jets at an e + e − collider. We present comparisons of our calculation to the results of fully hadronized Pythia, Vincia, and Herwig Monte Carlos. Here, we also present Monte Carlo results from scanning over a range of values for the angular exponent α that is consistent with our factorization 20 Note that refs. [122,123] used the observable C2, also formed from the energy correlation functions, which was proposed in ref. [65]. Unlike D2, C2 does not cleanly separate the two-prong region of phase space from the one-prong region of phase space. A detailed discussion of this point can be found in ref. [66]. The clean separation of the one-and two-prong regions of phase space is the essential feature of the D2 observable, which allows for its precise theoretical calculation and its sensitivity to the shower implementation.   distributions for the four different Monte Carlo generators and our analytic calculation, including hadronization. Here we show the complete distributions, including the long tail for the background distribution. Although we extend the factorization theorem beyond its naive region of applicability into the tail, excellent agreement with Monte Carlo is found. theorem. Analytic results for boosted boson discrimination were also presented recently in ref. [46] for groomed mass taggers, as well as an analytic study of the optimal parameters.
In figures 22 and 23 we overlay the distributions for D (2,2) 2 as measured on signal and background for each Monte Carlo sample, and compare with our analytical calculations including the non-perturbative shape function contributions. Figure 22 shows the complete D 2 distributions, including the long tail of the background distribution, while figure 23 shows a zoomed in version, focusing on small values of D 2 , as is most relevant for signal versus background discrimination. A representative cut on the D 2 distribution, as could be used to select a relatively pure sample of boosted Z bosons, is also indicated. In general, the agreement between the Monte Carlos, for both signal and background distributions, and our calculation is impressive. This holds true both for the overall shape of the distributions,  including the long tail of the background distribution, and for the detailed shape at small values of D 2 . It is also important to note that the perturbative uncertainties remain under control, even in the small D 2 region, as seen in figure 23. The uncertainty bands do not incorporate variations in the non-perturbative parameter Ω D . There are however, some small deviations between the analytic predictions and the Monte Carlo distributions. The background distribution in Pythia is pushed to slightly higher values than our calculation. This implies that the signal versus background discrimination power as predicted with Pythia will be overestimated. The most conservative prediction for the signal versus background discrimination power is from Herwig, whose background distribution is nearly identical to our calculation. That Pythia tends to be optimisitic and Herwig tends to be pessimistic with respect to discrimination power has been observed in several other jet substructure analyses [23,[65][66][67].  An important feature of the D 2 distributions, made clear by figure 23, is that in the region of interest relevant for boosted Z discrimination, the background distribution is deep in the non-perturbative regime. Therefore, although the perturbative uncertainties are small, the effect of the shape function, and variations of the non-perturbative parameter Ω D , is large. Estimates of the uncertainties due to the form of the shape function, or the use of more complicated functional forms, along the lines of ref. [147] are well beyond the scope of this paper. An advantage of our factorization approach is that we are able to achieve a clean separation of perturbative and non-perturbative effects, and demonstrate relations between the non-perturbative matrix elements appearing in our factorization theorems and non-perturbative matrix elements which have been measured with other event shapes, by using their field theoretic definitions. This separation is essential for understanding discrimination performance in the non-perturbative region, which we see is required for jet substructure studies related to boosted boson discrimination. Importantly, though, D 2 seems to take advantage of the different hadronization corrections to signal and background jets, and the overlap of the signal and background regions of D 2 decreases significantly in going from parton-level to fully hadronized jets.

JHEP05(2016)117
In figure 24, we have used these raw distributions to produce signal versus background efficiency curves (ROC curves) by making a sliding cut in D 2 . The ROC curve from each Monte Carlo sample as well as our analytic prediction from our calculated signal and background distributions are shown in both logarithmic plot and linear plot in figures 24a and 24b, respectively. The band around our analytic prediction should be taken as representative of the signal versus background efficiency range from varying the perturbative scales. 21 For the analytic predictions, we use Ω D = 0.34, as obtained from our fit to the p T

JHEP05(2016)117
ordered Vincia shower. Consistent with the distributions in figure 22, the Monte Carlos are in qualitative agreement with our analytic prediction for the ROC curve. In general, our analytic prediction seems to give an optimistic prediction for the discrimination power, however, this is driven by the fact that our resummed prediction for the signal distribution is more peaked. It would be interesting to perform the NNLL resummation for the signal, which should significantly reduce the uncertainty in the signal calculation, particularly in the peak region, where the perturbative uncertainties in our present calculation are quite large. Because of the fact that the distributions are normalized, an improved behavior in the peak of the distribution could also improve the agreement in the tail of the signal distribution, which is currently systematically low, due to the fact that the peak is systematically high. This could enable a conclusive understanding as to the discrepancy between the different Monte Carlo generators for both signal and background distributions. In particular, our analytic calculations suggest that the Herwig++ generator provides pessimistic predictions for the discrimination power of the D 2 observable due to the underestimation of the peak height for the signal distribution, and it would be interesting to understand this further. Due to the importance of analytically understanding the discrimination power of jet substructure observables, such a calculation is well motivated. For the case of α = β = 2, the required perturbative components could be obtained following relations to e + e − event shapes as were used in ref. [41].
One feature made clear by the linear ROC curve in figure 24b is the increase in perturbative uncertainty with increasing Z efficiency. As emphasized earlier, this is due to the fact that for the region of interest for Z discrimination, one is probing values of D 2 which are below the peak of the background distribution, and therefore in the non-perturbative regime. As the Z efficiency is increased, one enters the peak region of the background distribution, where the perturbative uncertainty is largest, causing a corresponding increase in the uncertainty band for the ROC curve. However, we do not include uncertainties due to the non-perturbative parameter Ω D or from the shape function, in figure 24b, which are the dominant sources of uncertainty in this region.
To demonstrate that is indeed the case, in figure 25 we show ROC curves in both linear and log scales for two different values of the non-perturbative shape parameter. The values of Ω D where chosen by varying our central value of Ω D = 0.34 GeV by ±0.15 GeV (and rounding to nice numbers). We have also shown the distributions from the Herwig++ and Pythia generators as representative of the ROC curves generated by the Monte Carlo generators. This figure makes clear that in the region of efficiencies of interest for boosted Z tagging, one is extremely sensitive to the D 2 distribution in the deeply non-perturbative region, and this uncertainty swamps the perturbative uncertainty. To be able to improve the accuracy in this region will require detailed comparisons with Monte Carlo, data, and analytic calculations, to allow for a clean separation of the non-perturbative parameter from perturbative modifications to the shape of the distribution.
To further understand the discrimination power of the D 2 observable, in figures 26a and 26b we show the background rejection rate at 50% and 75% signal efficiency as a function of α, the angular exponent of the 3-point energy correlation function in D 2 . Below about α = 4/3, all rejection rates dramatically decrease as α decreases, while above about α = 4/3, the QCD rejection rate in all Monte Carlo samples is impressively flat. This is consistent with our power counting analysis of the e 2 , e tion remains accurate (indeed our power counting becomes more valid in this region), the signal distribution becomes extremely sharply peaked, which is difficult to describe, and sensitive to normalization. Due to the fact that this region is also of less phenomenological interest, both because the large angular exponent makes the observable sensitive to pile up contamination, and because both power counting and Monte Carlo analyses indicate that optimal performance is achieved for α = 2, we have decided not to focus on this region. It would be potentially interesting to see if higher order resummation would be sufficient to describe the sharply peaked signal distribution in this region, as well as to test the universality of the non-perturbative power corrections.
One further interesting feature of figures 26a and 26b is the correspondence between the perturbative scale variations, and the spread in the curves from the different Monte Carlo generators, which agree well at both 50% and 75%. For the case of p T -ordered Vincia as compared with virtuality ordered Vincia, this correspondence is precise, as the difference between the Monte Carlos can be viewed as a scale variation, and identical hadronization models are used.

Discrimination in the two-prong regime
Throughout this paper, we have emphasized that the discrimination of boosted hadronically decaying Z bosons (or W or H bosons) from massive QCD jets is effectively a problem of discriminating one-from two-prong jets. We have demonstrated that the observable D 2 is powerful for this goal. However, in the formulation of our factorization theorem for calculating the distribution of D 2 , we needed to perform additional 2-point energy correlation function measurements on the jet to separate contributions from soft subjet and collinear subjets contributions to background. While indeed the signal jets are dominantly two-pronged, we further know that those prongs are dominantly collinear, and do not have parametrically different energies. Therefore, we are able to further discriminate signal from background jets in the two-prong region of phase space by exploiting additional measurements that can isolate the soft subjet and collinear subjet configurations. A detailed analysis of this is beyond the scope of this paper, but here, we will demonstrate in Monte Carlo that such a procedure is viable.
To investigate this, we measure the observable D (2,2) 2 on jets on which a tight mass window cut has been applied. Other angular exponents for D 2 can be used also, but here we only measure D 2 to define two-prong jets. We restrict to the two-prong region of phase space by requiring that D (2,2) 2 < 2.5. Then, on the jets that pass these cuts, we measure two, 2-point energy correlation functions, e  is undefined. We will study the distribution of e 2 ) β/2 . In the two-prong region, the lower bound is set by the soft subjet while the upper bound is set by collinear subjets. Therefore, e (β) 2 for signal jets will peak near 2 β−2 (e (2) 2 ) β/2 , while background QCD jets will fill out the full range. We illustrate this in figure 27 on the hadronized Pythia sample with the appropriate cuts applied. We show plots of the distributions of e This demonstrates a simple example of an observable which goes beyond the simple one vs. two prong picture of jet substructure, asking more differential questions about the subjets themselves. In particular, it could be used both to further improve the discrimination power of boosted boson discriminants, and to study in detail the QCD properties of subjets.

Looking back at LEP
In this section, we consider the D 2 distribution for QCD jets in e + e − collisions at the Z pole at LEP, for which a large amount of data exists. While the use of D 2 for boosted boson discrimination is not possible, nor relevant, at LEP, this will emphasize the sensitivity of D 2 as a probe of two-prong structure in jets. We will suggest the importance of using variables sensitive to two emissions off of a primary quark in tuning Monte Carlo generators to LEP data.

JHEP05(2016)117
Our definition of the energy correlation functions in eq. (2.1) makes implicit assumptions about the treatment of hadron masses, which we have ignored to this point. The definition given there is an E-scheme treatment of hadron masses [91,92], but we could equally well define p-scheme energy correlation functions as: where p i denotes the three-momenta of particle i. For massless particles, this definition is identical to that of eq. (2.1), and so our perturbative analytics would be unchanged by using this definition or the definition of eq. (2.1). 22 The definitions of eq. (2.1) and eq. (6.1) differ for massive particles. In particular, the energy correlation functions as defined in eq. (6.1) have the advantage that they vanish for low momentum or collimated particles regardless of whether these particles are massless or massive, which is not true of the definition in eq. (2.1). Because of this, we expect that the energy correlation functions as defined in eq. (6.1) are less sensitive to hadron mass effects and that kinematic restrictions on the energy correlation functions remain the same before and after hadronization, so that the phase space studied in section 2.2 assuming massless particles is not significantly modified. At LEP energies, hadronization will also have a larger effect on the D 2 spectrum than at 1 TeV. However, a particularly important aspect of our all orders factorization theorem is that it isolates perturbative and non-perturbative physics contributions. In this section we will again implement non-perturbative effects into our analytic calculation using the shape function defined in eq. (5.2). There are two effects which determine how the shape function depends on the jet mass, m J , and the center of mass energy, Q. First, for a fixed valued of Ω D , the shift in the first moment of the D 2 distribution was given in eq. (5.16), which we recall here for convenience, by This has dependence on both m J and Q (through E J ), and for the jets we consider at LEP, this is a considerably larger shift than for the 1 TeV jets studied in section 5.5. This scaling is a non-trivial prediction of our factorization framework, and we will see that it is well respected when we perform a comparison of our analytic results with Monte Carlo. Furthermore, the parameter Ω D has a logarithmic dependence on a renormalization scale, Ω D = Ω D (µ), through renormalization group evolution [92], which is briefly reviewed in appendix H. However, this effect is small compared with the linear change in the first moment with E J for a fixed m J /E J . A numerical estimate for the effect of the running of Ω D (µ) is given in appendix H. At the level of accuracy to which we work in this paper, we cannot probe this logarithmic running, although we will see that our results are consistent with it. 22 As will be discussed shortly, the differences in our analytic calculation due to hadron masses will arise through non-perturbative effects, namely the shape function.

JHEP05(2016)117
The definition of the energy correlation functions given in eq. (6.1) also has an effect on the universality of the non-perturbative parameter Ω D , when hadron mass effects are included. Power corrections due to hadron mass effects are of order O(m H /Q), where m H is a light hadron mass, and are therefore of the same order as the leading O(Λ QCD /Q) power corrections. In the p-scheme definition of the energy correlation functions which we have chosen in eq. (6.1), it is no longer possible to extract the dependence on the angular exponent alpha from Ω D , as was done in eq. (5.12). However, to the accuracy to which we work, we expect this to be a negligible effect, and furthermore, the case α = 2 is of most phenomenological interest, and is the case we have focused on exclusively in this paper. Furthermore, even in the presence of hadron mass effects, it is still possible to extract the parameter Ω D from dijet event shapes in the same universality class [92]. This exhibits the benefits of the factorization approach both for separating perturbative and non-perturbative effects, and for relating non-perturbative parameters to maintain predictivity.
One further distinction between the case of boosted Z discrimination and the measurement of QCD jet shapes at the Z pole is that while a tight mass cut is natural for boosted Z discrimination, it is not natural in jet shape analyses. However, our shape function analysis, as derived in section 5.5, is valid at a fixed jet mass (or correspondingly fixed value of e (β) 2 ). This is clear from both eq. (5.7) and from the equation for the shift in the first moment in eq. (5.16). However, we emphasize that the non-perturbative parameter Ω D is unique, and the scaling of the non-perturbative shift with the jet mass is fully determined. To achieve an analytic prediction for the non-perturbative D 2 spectrum inclusive over the jet mass mass, one must calculate the perturbative D 2 spectra differentially in the jet mass, convolve with a shape function for each value of the jet mass, and then integrate over the jet mass. While this is in principle straightforward, it is computationally intensive, and is beyond the scope of this paper. Instead, we will enforce a jet mass cut of 8 < m J < 16 GeV. This mass cut was chosen because it is near to the Sudakov peak of the jet mass distribution for this jet energy and the m J in this range are set by low scale, but still perturbative, emissions.
Similar to what we did in our numerical analysis at 1 TeV, we begin in figure 28 by comparing our analytic prediction for the D 2 spectrum with the distributions from parton level Monte Carlo. In figure 28a, we show a comparison of our complete analytic calculation, including perturbative scale variations, along with Monte Carlo predictions from both Pythia and p T -ordered Vincia, which we take as representative of the different Monte Carlo generators. We use a jet radius of R = 1.4 to approximate hemisphere jets. We find good agreement with the predictions of the Vincia Monte Carlo. It is important to emphasize, however, that at LEP energies, non-perturbative effects are large, and therefore a comparison with parton level Monte Carlo is difficult due to large uncertainties in the treatment of the shower cutoff. We also show, in figure 28b, a comparison of our analytic prediction, including only the collinear subjets region of phase space, with both Monte Carlo generators. The difference between the analytic predictions in figure 28a and figure 28b emphasizes the large effect played by the soft subjet at LEP energies. Unfortunately, due to large hadronization corrections, the treatment in Monte Carlo of the soft subjet region is difficult to disentangle from the treatment of non-perturbative physics. In figure 29b we show our analytic prediction for the non-perturbative spectra using the shape function. An alternate view of the perturbative spectrum is shown in figure 29a for reference, and to show the overall shape of the perturbative distribution. We have used a valued of Ω D = 0.50 GeV, which was obtained by fitting to the Vincia Monte Carlo. There is considerable uncertainty on this value, probably of the order ±0.3 GeV due to the wide mass window, which is probably slightly large for the naïve application of our shape function. Furthermore, as demonstrated in section 5.5, there is some ambiguity in the value of Ω D , depending on whether it is extracted from hadron level Pythia or Vincia, which is of this same order. However, this value is consistent with Ω D = 0.34 GeV as extracted from our analysis at 1 TeV. Although it is expected that the logarithmic running of the Ω D parameter will decrease its value slightly, this effect is expected to be small. The amount by which it is expected to decrease depends on another non-perturbative parameter, but is estimated in appendix H that Ω D should decrease by approximately 0.1 GeV between our predictions at 1 TeV and those at LEP energies. This is an important consistency check on our results, but due to the large uncertainty, we cannot claim to probe this running over the scales that we have considered. The analytic perturbative spectrum is also shown for reference. Good overall agreement with both Monte Carlo generators is observed, and the discrepancy between the Pythia and Vincia generators which was present at parton level is reduced, although still non-negligible. As was discussed in section 5.5, it could also be compensated for by a modification of the non-perturbative shape parameter. In particular, the effect of hadronization is well captured by non-perturbative shape function. Hadronization has a significantly larger effect on the D 2 observable at Z pole energies than at 1 TeV. This demonstrates the consistency of our implementation of the non-perturbative corrections through the shape function, which predicts the scaling of the shift in the first moment through eq. (6.2).
Unlike for the D 2 distributions at 1 TeV, where the effect of hadronization was well described only by a shift in the first moment, at LEP energies the hadronization also has a non-trivial effect on the shape of the distribution. This can clearly be seen by comparing the dashed perturbative spectrum and the non-perturbative results in figure 29b. While our factorization of non-perturbative effects in terms of a shape function is completely generic, it is only the first moment of the shape function which is universal, with the full non-perturbative shape function being in general observable dependent. However, the modification in the shape of the D 2 spectrum due to hadronization effects seems to be quite well captured by the shape function of eq. (5.2). In our plots we do not include any uncertainties due to the form of the non-perturbative shape function, despite the fact that they are the dominant effect throughout most of the hadronized distribution. More general shape functions, and a study of their associated uncertainties could be studied along the lines of ref. [147], although this is beyond the scope of this paper, and could only be justified if the perturbative components of our calculation were computed to a higher level of accuracy.
Since the D 2 spectrum is sensitive to the emissions from the gluon subjet, it is sensitive to the radiation pattern generated by a gluon, and could potentially be used to improve the Monte Carlo description of gluons and the modeling of color coherence effects. In contrast to most observables which have been used for tuning Monte Carlos to LEP data, such as -70 -

JHEP05(2016)117
the jet mass which is set by a single emission, D 2 requires two emissions off of the initiating quark to be non-zero, and therefore can be used as a more detailed probe of the perturbative shower. Although non-perturbative effects play a large role for jets in this energy range, we have shown that our factorization theorem allows us to cleanly separate perturbative from non-perturbative effects, which could be useful when tuning Monte Carlo generators, allowing one to disentangle genuine perturbative effects which should be well described by the Monte Carlo shower, from effects which should be captured by the hadronization model. We believe that higher order calculations of QCD jet shapes sensitive to three particle correlations, such as D 2 , and their use in Monte Carlo tunings is therefore well motivated.
For reference, in appendix I we show a collection of e 2 distributions measured at the Z pole, at both parton and hadron level for both the Vincia and Pythia event generators. Unlike for the D 2 observable, the Vincia and Pythia generators agree both at parton and hadron level to an excellent degree. This is of course expected due to the fact that these Monte Carlos have been tuned to LEP event shapes, but further emphasizes the fact that D 2 , and other observables sensitive to additional emissions, provide a more detailed probe of the perturbative shower.
7 Looking towards the LHC Throughout this paper, we have restricted our analysis to e + e − colliders so that we could ignore subtleties with initial state radiation, pile-up and other features important at hadron colliders. However, it is precisely for including these effects that a rigorous factorization based approach to jet substructure, such as that presented in this paper, will prove most essential. In this section, we discuss the extension to the LHC and in particular to what extent conclusions for e + e − colliders holds for the LHC.
The energy correlation functions have a natural longitudinally-invariant generalization relevant for pp colliders, which is given by [65,66] e (β) Here p T J is the transverse momentum of the jet with respect to the beam, p T i is the transverse momentum of particle i, and n J is the number of particles contained in the jet. The boost-invariant angle R 2 ij = (φ i −φ j ) 2 +(y i −y j ) 2 is defined as the Euclidean distance in the azimuth-rapidity plane. For central rapidity jets, which we will restrict ourselves to in this section, the power counting discussion of section 2 is unmodified. Therefore, the same conclusions for the form of the optimal observable, D 2 , as well as the range of angular exponents, apply. A simplified version of the D (α,β) 2 variable, restricted to have equal angular exponents α = β, was used in ref. [66], for jet substructure studies at the LHC.
It is in principle straightforward to extend the factorization theorems for D 2 to hadronic colliders, where D 2 is measured on a single jet in an exclusive N -jet event. Factorization -71 -

JHEP05(2016)117
theorems for exclusive N -jet production defined using N -jettiness [95,162] or with a p Tveto [163,164] on additional radiation exist and could be combined with the factorization theorems of section 3 to describe the jet substructure. We now briefly discuss how each of these factorization theorems can be interfaced with the presence of additional eikonal lines, representing either additional jets or beam directions in pp collisions.
Recall from section 3.1.1, that the collinear subjets factorization theorem is formulated as a refactorization of the jet function for a particular jet in the n direction, and it is therefore insensitive to the global color structure of the event, seeing only the total color. Intuitively, the collinear-soft modes are boosted, and therefore all additional Wilson lines in the event are grouped in then direction. Furthermore, the global soft modes, which resolve the global color structure of the event do not resolve the jet substructure. This property of the collinear subjets factorization theorem has the feature that it can be trivially combined with a factorization theorem with an arbitrary number of eikonal lines, without complicating the color structure. All that is then required, apart from the substructure components, is the addition of an additional measurement function in to the global soft function. Indeed, this extension has been discussed in detail in ref. [77]. This same property is of course also true for the soft haze factorization theorem, as no additional Wilson lines are required to describe the jet substructure in the first place.
However, for the soft subjet factorization theorem, the presence of additional Wilson lines does significantly complicate the factorization from a calculational perspective. In particular, since the subjet is soft, arising from a refactorization of the soft function, it is emitted coherently from the N -eikonal line structure as a whole, requiring a proper treatment of all color correlations, which becomes complicated with even a few additional Wilson lines. A conjectural proposal for the all orders soft subjet factorization theorem with N -eikonal lines was given in ref. [76], where the soft subjet factorization theorem was first proposed and studied in the large N c limit. However, more work is required to understand its structure, and an efficient organization of the color correlations at finite N c . Furthermore for the soft subjet factorization theorem, the final soft function has an additional eikonal line, since the jet substructure is resolved by the long wavelength global soft modes, further complicating the calculation (although there has recently been some progress in the computation of soft functions [165,166]). We emphasize however, that these are purely technical complications, and believe that the extension to a calculation of jet substructure in pp would be well worthwhile for improving our understanding of analytic jet substructure. Furthermore, depending on the relevant boosts and jet radii, the techniques of this paper could be used to identify whether the soft subjet factorization theorem plays an important role, or could be formally neglected, simplifying the calculation in more complicated cases.
For these reasons, a full calculation in pp is well beyond the current scope of this initial investigation. We will instead restrict ourselves to a brief Monte Carlo study comparing the properties of D 2 in e + e − and pp to show that the distributions exhibit similar features. In figure 30 we compare the Monte Carlo predictions for D parton-level process pp → qq and signal events from pp → ZZ → qqqq events, where q denotes a massless quark, with Pythia 8.205 at the 13 TeV LHC. 23 Jets are clustered with the anti-k T algorithm with radius R = 1.0, and using the WTA recombination scheme, with a p T metric. We cut on the transverse momentum of the hardest jet, requiring p T ∈ [450, 550] GeV, and on the jet mass requiring m J ∈ [80,100] GeV. These are chosen to be similar to the cuts on the jets for the case of e + e − , although they are of course not identical, and strict comparisons should not be made between the two cases. The shapes and general features of the D 2 distributions at the two colliders are very similar. There is a relative scaling between the D 2 distributions in e + e − and pp due to the different observable definitions. The e + e − definition uses the 1 − cos(θ ij ) measure of eq. (2.1), while the pp definition uses the boost invariant definition in terms of R ij , as in eq. (7.1). Since the e (α) 3 observable correlates particles of separation up to 2R, where R is the jet radius, for α = β = 2, this gives an expected factor of 4 difference between the two cases, as is approximately observed in figure 30.
The similar behavior of the e + e − and pp distributions suggests that a complete a calculation using our techniques would provide an excellent description of the D 2 distribution at a hadron collider, as we have found for e + e − . Such a calculation would also be interesting to better understand the effects of initial state radiation on the D 2 distribution. A simple setting where this calculation would be feasible, for example, would be to consider measuring the D 2 distribution on a jet recoiling against a color-singlet such as a W , Z 23 Since we only briefly mention the case of pp colliders, we do not perform a systematic study of the variation of the D2 distribution in pp with different Monte Carlo generators, as we did for the case of e + e − . However, we believe that this is essential in any jet substructure study at pp, as we expect variations will be present, as in the e + e − case. It would be particularly interesting to compare a pT -ordered dipole-antenna shower, such as was recently implemented for pp in Dire [132], with the Pythia and Herwig++ generators which are more commonly used in jet substructure studies at the LHC.
- 73 -or H boson, as was used in ref. [62] to perform a NNLL calculation of the jet mass. Although the effects of non-global logarithms would need to be understood, and could play an important role, recent progress in this area suggests that this issue could be addressed, either by direct resummation of the NGLs [76,[167][168][169][170], or through the use of jet grooming algorithms which remove NGLs [43,44,71]. While it is truly uncorrelated with the jet, the effect of radiation from pile-up on D 2 could also be mitigated using similar jet grooming algorithms.

Conclusions
In this paper we have presented a novel approach to the factorization of jet substructure observables, and applied it to the identification of two-prong substructure. Instead of starting with a given two-prong discriminant, we used the energy correlation functions as a basis of IRC safe observables to isolate the possible subjet configurations. We then studied the phase space defined by these IRC safe observables and proved all orders factorization theorems in each region of phase space. This procedure naturally identified an observable, D 2 , which we argued provided optimal discrimination power, and which preserved the factorization properties of the individual factorization theorems describing different regions of the phase space defined by our basis of observables. We showed that a factorized description of this observable could be obtained by merging the different factorization theorems, and introduced a novel zero bin procedure in factorization theorem space to implement this merging. An important benefit of this approach is that our factorization theorems are valid to all orders in α s at leading power and therefore provide a systematically improvable description of D 2 .
Using our factorized description of the D 2 observable, we presented a numerical study of our results at an e + e − collider, for both the signal and background distributions, resulting in analytic boosted Z boson versus massive QCD jet discrimination predictions. We compared with a variety of Monte Carlo generators, and demonstrated that the low D 2 region, where a hard two-prong substructure is resolved, is a sensitive probe of the Monte Carlo parton shower description. We also studied the effect of non-perturbative corrections, showing that they can be well-described using a simple shape function, and related the single parameter of this shape function to a universal non-perturbative matrix element measured at LEP. This is vital for comparing our calculation with data.
Because our calculation presents the first factorized description of a two-prong discriminant jet observable in both signal and background regions, there are a large number of directions for future study which are of great interest. First, our calculation was presented in the context of jets produced in e + e − collisions. For applications at the LHC, where jet substructure plays a vital role, it is important to extend the calculation to jets produced at a pp collider. The factorization theorem we presented straightforwardly generalizes to pp colliders with only complications due to soft radiation from the beams and the more complicated color structure of the hard interaction. The treatment of both these effects are well-understood and their inclusion in a jet substructure calculation would allow the first precision comparisons of calculations with data.

JHEP05(2016)117
An interesting potential application of our factorization theorems, and merging procedures, which describe in a more differential way the substructure of jets, is to improve jet shape based subtraction schemes for QCD calculations at NNLO and beyond. Quite recently, subtractions based on the N -jettiness observable [95] have been used to perform NNLO calculations in QCD [171][172][173]. This allowed, in particular, the calculation of W , H +1 jet at NNLO [171,172] (H +1 jet at NNLO was also calculated using more traditional subtraction techniques in [174]). The use of more differential subtractions based on more differential factorization theorems would allow for more local, and potentially numerically more efficient subtractions.
It would also be interesting to apply our calculation approach to other observables. For example, the N -subjettiness observables [63,64] are used extensively in jet substructure studies at the LHC, and it would be of significant phenomenological relevance to obtain a factorized description of these observables. The approach presented in this paper could also be extended to study more differential observables, such as those used for boosted top discrimination, which can resolve three subjets. A generalization of the D , which resolves three prong structure was introduced in ref. [67] (see also ref. [175] where it was used for boosted top discrimination at a 100 TeV collider). The D , and hence should be calculable with similar techniques. A rigorous factorization will also prove essential in this case, allowing for the separation of perturbative and non-perturbative physics, as well as effects associated with the finite top width [111,176]. More generally, we anticipate that the approach to the factorization of jet substructure observables presented in this paper will allow for the construction of more powerful jet substructure discriminants and will enable a more detailed analytic understanding of the substructure of high energy QCD jets.

JHEP05(2016)117
factorization theorems were presented in an attempt to appeal to a broader audience, and so as to not distract the reader with technical complications. In these appendices, we give the operator definitions of the functions appearing in the factorization theorems of section 3, and calculate the functions to one-loop accuracy.
In this appendix we begin by summarizing some notation and conventions. The factorization theorems presented in this paper are formulated in the language of SCET [79][80][81][82]. We assume that the reader has some familiarity with the subject, and will only define our particular notation, and review the definition for common SCET objects. We refer readers unfamiliar with SCET to the reviews [177,178].
SCET is formulated as a multipole expansion in the momentum components along the jet directions. Since we take the jet directions to be lightlike, it is convenient to work in terms of light-cone coordinates. We define two light-cone vectors with n a unit three-vector, which satisfy the relations n 2 =n 2 = 0 and n ·n = 2. We can then write any four-momentum p as A particle in the n-collinear sector has momentum p close to the n direction, so that its momentum scales like (n · p,n · p, p n⊥ ) ∼n · p (λ 2 , 1, λ), with λ 1 a small parameter. The parameter λ is a generic substitute for the power counting parameters in the different factorization theorems presented in section 3, and since our factorization theorems involve multiple scales, there are generically multiple distinct λs.
In the effective field theory, the momentum of the particles in the n-collinear sector are multipole expanded, and written as wheren ·p andp n⊥ are large momentum components, which label fields, while k is a small residual momentum, suppressed by powers of λ. This gives rise to an effective theory expansion in powers of λ. SCET fields for quarks and gluons in the n-collinear sector, ξ n,p (x) and A n,p (x), are labeled by the lightlike vector of their collinear sector, n and their large momentump. We will write the fields in a mixed position space/momentum space notation, using position space for the residual momentum and momentum space for the large momentum components. The residual momentum dependence can be extracted using the derivative operator i∂ µ ∼ k, while the large label momentum is obtained from the momentum label operator P µ n . Operators and matrix elements in SCET are constructed from collinearly gaugeinvariant quark and gluon fields, defined as [79,80]

JHEP05(2016)117
The ⊥ derivative in the definition of the SCET fields is defined using the label momenta operator as iD µ n⊥ = P µ n⊥ + gA µ n⊥ , (A. 6) and is a Wilson line of n-collinear gluons. We use the common convention that the label operators in the definition of the SCET fields only act inside the square brackets. Although the Wilson line W n (x) is a non-local operator, it is localized with respect to the residual position x, and we can therefore treat χ n,ω (x) and B µ n,ω (x) as local quark and gluon fields when constructing operators. The operator definitions for jet functions in these appendices are given in terms of these collinear gauge invariant quark and gluon SCET fields.
Our operator definitions will also involve matrix elements of eikonal Wilson lines, which arise from the soft-collinear factorization through the BPS field redefinition at the Lagrangian level [81]. The Wilson lines extend from the origin to infinity along the direction of a lightlike vector, q, specifying their directions. Explicitly Here P denotes path ordering, and A is the appropriate gauge field for any sector which couples eikonally to a collinear sector with label q (for example collinear-soft, soft, boundary soft), and the color representation has been suppressed. All Wilson lines are taken to be outgoing, since we consider the case of jet production from e + e − collisions.
Throughout this paper we have considered the production of two jets, one of which has a possible two-prong substructure, in an e + e − collider. This implies the presence of at most three Wilson lines in the soft or collinear soft function. With only three Wilson lines, all possible color structures can be written as a sum of color-singlet traces. In the more general case, with more than three Wilson lines, the soft function is a color matrix which must be traced against the hard functions, which are also matrices in color space, appearing in the factorization theorem for the cross section (see e.g. refs. [110,179] for more details).
In appendix B through appendix E we will give operator definitions for the functions appearing in the factorization theorems in terms of matrix elements of the SCET operators, χ n,ω (x) and B µ n,ω (x), as well as products of soft Wilson lines. These matrix elements can be calculated using the leading power SCET Lagrangian, which can be found in refs. [79][80][81][82], or by using eikonal Feynman rules in the soft functions, and known results for the splitting functions to calculate the jet functions [180]. We will use the latter approach, as it considerably simplifies the calculations at one-loop. In this appendix we collect the calculations relevant to the calculation in the collinear subjets region of phase space, and explicitly show the cancellation of anomalous dimensions. The calculation follows closely that of ref. [77], with the exception of the form of the measurement function. Nevertheless, the calculation is presented in detail, as the SCET + effective theory has not been widely used.

B.1 Kinematics and notation
For our general kinematic setup, we will denote by Q the center of mass energy of the e + e − collisions, so that Q/2 is the energy deposited in a hemisphere. i.e. the four-momenta of the two hemispheres are We will also denote the energy in a jet at intermediate stages of the calculation by E J , but we will write our final results in terms of Q.
We work in the region where one hemispherical jet splits into two hard subjets, assume the power counting z ∼ 1 2 , with z being the energy fraction of one of the jets. We further assume the power counting relations between the energy correlation functions valid in the collinear subjets region, as discussed in section 2.2. We adopt the following notation to describe the kinematics of the subjets In the collinear soft region of phase space, we have n a · n b 1. When performing expansions, we can work to leading order in n a · n b , and must use a consistent power counting. It is therefore useful to collect some kinematic relations between vectors which are valid at leading power. These will be useful for later evaluations of the measurement function and integrand at leading power. These kinematics satisfy the following useful relations n · n a = n · n b = n a · n b 4 (B.7) n · n a =n · n b = 2 , (B.8) n ⊥a,b ·n ⊥a,b = −n ⊥a,b · n ⊥a,b =n ⊥a,b ·n ⊥a,b = n a · n b 2 .

JHEP05(2016)117
For a particle with the power counting of collinear sector a or b, we have the following simplified relations which are true to leading order in the power counting. Finally, we label the energy fractions carried in each subjet by where the second relation is true to leading power. The value of e (α) 2 is given to leading power by the subjet splitting In the collinear soft region of phase space, the 3-point energy correlation function is dominated by the correlation between two particles in different subjets, with a third collinear, soft, or collinear-soft particle. Depending on the identity of the third particle, the power counting of the observable is different. We begin by collecting expressions for the e (α) 3 observable for a single soft, collinear-soft, or collinear emission, which will be required for the one-loop calculations.
For three emissions, with momenta k 1 , k 2 , k 3 , the general expression for the three point energy correlation function is For an emission collinear with one of the subjets, where we have the splitting p a,b → k 1 +k 2 , we can write e (α) 3 entirely in terms of k 1 ·k 2 , n a ·n b , andn a ·k 1,2 , because there is a hierarchy between the opening angle of the dipole, and the opening angle of the splitting. At leading power it is given by For a soft emission off of the dipole, with momentum k, which cannot resolve the opening angle of the dipole, we have n a · k → n · k , n b · k → n · k , (B.18) -79 -

JHEP05(2016)117
at leading power. We then find e (α) where we have used the full expression for the energy of the soft particle, as it is not boosted.
For a third collinear-soft emission k off of the p a,b partons, for which there is no hierarchy between the opening angle of the dipole and the opening angle of the emission (i.e. a collinear soft emission), e (α) 3 is given by e (α) For the SCET operators involved in the matching calculation, we follow the notation of ref. [77], defining which is the usual SCET operator for e + e − → dijets, and which is the SCET + operator describing the production of the collinear subjets. Throughout this section, we will not be careful with the Dirac structure of the operators, as it is largely irrelevant to our discussion. With this in mind, we have not made the Lorentz indices explicit on the operators. Here we have chosen to write the Wilson line corresponding to the gluon in the fundamental representation. Note that the two stage matching onto SCET + makes it clear that the partonic configuration in which the two collinear subjets are both quarks is power suppressed. In the operators O 2 , O 3 , we have used Y to denote soft Wilson lines, and X, V to denote collinear-soft Wilson lines. In the definitions of the factorized functions below, we will refer to all Wilson lines as S, as after factorization, no confusion can arise.

B.2 Definitions of factorized functions
The functions appearing in the collinear subjets factorization theorem of eq. (3.8) have the following SCET operator definitions: • Hard matching coefficient for dijet production where C Q 2 , µ is the Wilson coefficient obtained from matching the full theory QCD currentψΓψ onto the SCET dijet operatorχ n Γχn When accounting for the Lorentz structure, there is a contraction with the leptonic tensor, which we have dropped for simplicity. See ref. [110] for a detailed discussion.

JHEP05(2016)117
• Hard splitting function: 2 , z a , µ is the Wilson coefficient in the matching from O 2 to O 3 , namely the relation between the following matrix elements (B.26) • Jet function: For simplicity, we have given the definition of the quark jet function. The gluon jet function is defined identically but with the SCET collinear invariant gluon field, B n a,b ,⊥ , instead of the collinear invariant quark field.
• Soft function: S nn e (α) • Collinear-soft function: In each of these definitions, we have defined an operator, E 3 (α) , which measures the contribution to e (α) 3 from final states, and must be appropriately expanded following the power counting of the sector on which it acts, as was shown explicitly in eq. (B.16), eq. (B.19), and eq. (B.20). These operators can be written in terms of the energy-momentum tensor of the full or effective theory [181][182][183][184], but we can simply view them as returning the value of e (α) 3 as measured on a particular perturbative state. The soft function is also sensitive to the jet function definition, which is included through the operator Θ R . To simplify the notation, we have strictly speaking only defined in the in-jet contribution to the soft function. Additionally, we assume that some IRC safe observable is also measured in the out-of-jet region, although this will play little role in our discussion, so we have not made it explicit.

B.3 Hard matching coefficient for dijet production
The hard matching coefficient for dijet production, H(Q 2 , µ), appears in the factorization theorems in each region of phase space. H(Q 2 , µ) is the well known hard function for the production of a qq pair in e + e − annihilation. It is defined by where C Q 2 , µ is the Wilson coefficient obtained from matching the full theory QCD currentψΓψ onto the SCET dijet operatorχ n Γχn. This Wilson coefficient is well known (see, e.g., refs. [77,110,185,186] ), and is given at one-loop by The branch cut in the logarithms must be taken as −Q 2 → −Q 2 − i . The hard function satisfies a multiplicative RGE, given by where γ C (Q 2 , µ) is the anomalous dimension for the Wilson coefficient, which is given to one-loop by

B.4 Hard splitting function
The hard splitting function can be calculated using known results for the one-loop splitting functions [187] or from the result for e + e − → 3 jets [188]. However, since at leading power the measurement of the 2-point energy correlation functions define the energy fractions and splitting angle, it is simplest to change variables in the results of ref. [77], where the hard splitting function matching was performed for jet mass. Using the notation t = s qg , and x = s qq /Q 2 , ref. [77] gave the matching coefficient to one-loop as We can now perform a change of variables to rewrite this in terms of e (α) 2 , and the subjet energy fractions, using the leading power relation of eq. (B.13), and the kinematic relations valid in the collinear subjets region of phase space. We find Note that the hard splitting function depends on the partons involved in the split, which in our case we have taken to be q → qg, and therefore singled out z q , which is the energy fraction of the quark jet (defined identically to z a , z b ). Throughout the rest of this appendix, we will, whenever possible, write results in terms of z a , and z b for generic partons, using general Casimirs. Since we consider the case q → qg, we will calculate the jet functions for both quark and gluon jets, and therefore the results in this appendix are sufficient to treat general two-prong substructure, where the prongs are associated with generic partons by using the hard splitting function for other partonic splittings.
For completeness, we also present the one-loop results for g → gg and g → qq splittings. While one-loop, and even two-loop, splitting helicity amplitudes exist in the literature [187,189,190], to our knowledge, the one-loop unpolarized splitting functions have not not been explicitly written down before. Using the results from refs. [187,190], the one-loop function for the g → gg splitting is (B.37) Here, we have expressed the result in terms of the numbers of colors, N , of the gauge theory and number of active quarks, n F . Note that C A = N . The virtuality of the splitting is where a and b denote the final-state gluons in the splitting. Its anomalous dimension to one-loop is

JHEP05(2016)117
For the one-loop result of the g → qq splitting, we have Note that, in terms of the number of colors,

B.5 Global soft function
In this section we calculate the global soft function. The global soft modes can resolve the boundaries of the jet, so the jet algorithm constraint cannot be expanded. However, the soft modes do not resolve the dipole of the collinear splitting. The global soft function therefore has two Wilson lines in the n andn directions. A general one-loop soft function can be written as where T i is the color generator of leg i in the notation of refs. [191,192], and the sum runs over all pairs of legs. Here we have only the contribution from i, j = n,n, but we still perform this extraction of the color structure to keep the results generic. The one-loop integrand for the soft function is given by and where here we have extracted the normalization factor following the expression for the three point energy correlation function in the soft power counting, given in eq. (B.19). The first Θ-function in eq. (B.43) implements the jet algorithm constraint, which is simple for a single emission. To simplify notation, we also use the following shorthand for the measure for a positive energy, on-shell, collinear particle

JHEP05(2016)117
To perform this integral, it is convenient to make the change of variables which factorizes the jet algorithm constraint and the measurement function. The integrals can then be evaluated using standard techniques. Performing all the integrals but the u integral, and transforming to Laplace space, e 3 →ẽ This can be integrated exactly in terms of hypergeometric functions, where we have used both a Pfaff and an Euler transformation to extract the singular behavior from the hypergeometric function. We therefore havẽ Expanding in (throughout these appendices we use the HypExp package [193,194] for expansions of hypergeometric functions) and separating in divergent and finite pieces, we find where Li 2 is the dilogarithm function.

B.6 Jet function
To calculate the jet function, we use the approach of ref. [180] and integrate the appropriate splitting functions against our measurement function. In the power counting of the jet -85 -

JHEP05(2016)117
function, we can expand the jet algorithm constraint The one-loop jet function in the n a direction is then given by (B.53) The two particle collinear phase space is given by [195] and and which includes both the g → gg and g → qq contributions. Explicitly, the integrand is then given by where we have extracted the normalization factor for simplicity, following the expression of eq. (B.16) for the three point energy correlation function in the power counting for the emission of a single collinear particle. Furthermore, note that we have used Q J = z a Q in this expression.

JHEP05(2016)117
The integrals can be performed using standard techniques, and we find, after transforming to Laplace space, e (α) 3 →ẽ (α) 3 , for the jet function in the n a directioñ for gluon jets, and for quark jets respectively. The jet function for the n b direction can be trivially found from a → b.
Here we have used L J,a α ẽ (α) 3 to denote the logarithm appearing in the jet functions. The argument of this logarithm depends on the subjet energy fraction. We indicate the specific logarithm for the subjet via the notation (B.62)

B.7 Collinear-soft function
We now calculate the collinear-soft function. The collinear-soft modes couple eikonally to the collinear sector, and so the collinear-soft function has the one-loop form where T i is the color generator of leg i in the notation of refs. [191,192], and the sum runs over all pairs of legs. Since the collinear-soft modes resolves the dipole from the collinear splitting, there are three Wilson lines, n a , n b ,n to which the collinear-soft modes couple. We calculate separately the contributions arising from the pair of legs n a , n b , and from the pairs n a,b ,n. In both cases the integral involves the jet algorithm constraint. In the power counting of the collinear-soft modes, this constraint can be expanded as If this expansion was not performed, the contribution of the collinear soft modes sensitive to the jet radius R, would be removed by a soft zero bin subtraction.

JHEP05(2016)117
n a , n b contribution: we begin by calculating the contribution from the emission between the n a , n b eikonal lines. The integrand is given by where we have extracted the normalization factor for simplicity, following the expression of eq. (B.20) for the three point energy correlation function in the power counting for the emission of a single collinear-soft particle.
To perform the calculation, we go to the light-cone basis defined by n,n. We then have n a · k = n · n a 2n · k +n · n a 2 n · k + k ⊥ · n a⊥ = n · n a 2n · k +n · n a 2 n · k − (n · kn · k) 1/2 |n a⊥ |cos θ , (B.67) where θ denotes the angle between the particle k and the n axis. In the above kinematic relations, we have made use of the fact that sincen ∼n a +n b , k ⊥ · n b⊥ = −k ⊥ · n a⊥ . Rewriting the integrand for a positive energy gluon in terms of θ, we find In the collinear soft region of phase space, we power count n a ·n b 1. We can therefore work to leading power in n a · n b in the integrand. Using the relations of eq. (B.7)-eq. (B.9), and expanding to leading power in n a · n b , we have Note that in our power counting, n·k ∼ n a ·n b , so that this expression scales homogeneously.
To perform the integral, we make the change of variables

JHEP05(2016)117
We then have The one loop expression for the collinear soft function can then be written The v integral is straightforward. Transforming to Laplace space, e 3 →ẽ (α) 3 , we find The θ integral can be performed exactly in terms of hypergeometric functions using The remaining integral in w is given by (B.80) Re-mapping the integral to the unit interval, we have  The first integral can be done in terms of hypergeometric functions, while the second can be done using plus functions (for a detailed discussion of their properties, see e.g. [147]), and applying the identity = 1 (2α − 2) + αlog(2) α − 1 + log(2) + −π 2 α 2 + 36α 2 log 2 (2) + 3π 2 α − 24αlog 2 (2)−2π 2 12(α − 1) .

(B.87)
Expanding in , and keeping only the divergent piece, as relevant for the anomalous dimensions, we find 3 ) = α s π 1 (α − 1) 2 + 2 α s π 2αlog(2) + log µN CS e γ E (ẽ As with the n a · n b contribution, we expand the integrand to leading power in n a · n b using n a · k n b · k = n · k + n a · n b 8n · k 2 − n a · n b 2 (n · kn · k) cos 2 θ , (B.92) n a ·n = 2 , (B.93) n a · k = n a · n b 8n · k + n · k − (n · kn · k) 1/2 n a · n b 2 cos θ . (B.94) To perform the integral, it is again convenient to make the change of variables n · k = v, n · k = vw n a · n b 8 .

JHEP05(2016)117
We then have n a · k n b · k = v 2 n a · n b 8 2 (1 − w) 2 + 4w sin 2 θ , (B.96) n a · k = n a · n b 8 v + vw n a · n b 8 − v 2 w n a · n b 8 1/2 n a · n b 2 cos θ = v n a · n b 8 1 + w − 2 √ w cos θ . (1−2α) poles, unlike the n a n b contribution, which are evident in the u → 1 and u → 0 limits respectively. We need to do the integral to O( ) to get the finite pieces, but only O( 0 ) to get the anomalous dimensions, which is sufficient for now. We have We then have Extracting just the divergent pieces so as to get the anomalous dimensions, we find where, as for the logarithm in the n a n b contribution, eq. (B.89), the logarithm that appears is . (B.114) Note that for both then n a andn n b contributions, and unlike for the n a n b contribution, we have 1/ contributions both of the soft form 1/ (1 − 2α), and of the collinear form, 1/(1 − α). This will be crucial to achieve the cancellation of anomalous dimensions, as required for the consistency of the collinear subjets factorization theorem.
It is interesting to note that this structure is very different than that which appeared for the case of the N -subjettiness observable in ref. [77]. In this case only a single angular exponent appears throughout the calculation, unlike both the 1/(1 − 2α) and 1/(1 − α) that we find here, and the divergent pieces of then n a andn n b contributions vanish.

B.8 Cancellation of anomalous dimensions
We now review the renormalization group evolution of each of the functions in the factorization theorem, and show that sum of the anomalous dimensions vanishes, as required for renormalization group consistency.
The hard function satisfies a multiplicative RGE, given by is the anomalous dimension of the dijet Wilson coefficient. Explicitly The anomalous dimension of the hard splitting function H 2 can be extracted from ref. [77] by performing a change of variables. It satisfies a multiplicative RGE µ d dµ H 2 (t, x, µ) = γ H 2 (t, x, µ)H 2 (t, x, µ) , (B.118) with anomalous dimension Here β 0 is defined with the normalization (B.120) -95 -

JHEP05(2016)117
Converting to e Since the anomalous dimensions of the jet, soft and collinear-soft functions are written in terms ofẽ (α) 3 , z a , z b , and n a · n b , for demonstrating cancellation of anomalous dimensions, it is convenient to replace e (α) 2 in eq. (B.121) with its leading power expression from eq. (B.13). We then have Note that 1 − z q = z g . The jet functions satisfy multiplicative RGEs in Laplace space (they satisfy convolutional RGEs in e Here C g,q is the appropriate Casimir (C A for gluon jets and C F for quark jets), and with γ g,q the standard functions For subjet b, we simply have a → b. Similarly, the soft function satisfies a multiplicative RGE in Laplace space The functions appearing in the soft subjet factorization theorem of eq. (3.17) have the following SCET operator definitions: • Hard matching coefficient for dijet production H(Q 2 , µ) = |C(Q 2 , µ)| 2 , (C.1) where C Q 2 , µ is the Wilson coefficient obtained from matching the full theory QCD currentψΓψ onto the SCET dijet operatorχ n Γχn qq|ψΓψ|0 = C Q 2 , µ qq|O 2 |0 . (C.2) As before, we have neglected the contraction with the Leptonic tensor.
• Soft subjet jet function: J nsj e The definitions of these functions include measurement operators, which when acting on the final state, return the value of a given observable. The operator E 3 (α) measures the contribution to e (α) 3 from final states, and must be appropriately expanded following the power counting of the sector on which it acts. Expressions for the expansions in the power counting of the different sectors will be given shortly, after kinematic notation has been set up. The operators Θ FJ , and Θ O constrain the measured radiation to be in the jet or out of the jet, respectively, and will be defined shortly.

C.2 Kinematics and notation
For our general kinematic setup, we will denote by Q the center of mass energy of the e + e − collisions, so that Q/2 is the energy deposited in a hemisphere. i.e. the four-momenta of the two hemispheres are We are now interested in the regime where there is a wide angle soft subjet carrying a small energy fraction, and an energetic subjet, carrying the majority of the energy fraction. We will label the lightcone directions of the energetic subjet by n,n, and the lightcone directions of the soft subjet as n sj ,n sj . We will use the variable z sj to label the energy fraction of the soft subjet, namely In this region of phase space, to leading power the value of the two point energy correlation function is set by the two subjets, and is given by e (α) 2 = 2 α/2 z sj (n · n sj ) α/2 . (C.10) The action of the measurement function E 3 (α) on a arbitrary state for each of the factorized sectors contributing to the 3-point energy correlation function measurement is given by BS X bs = k∈X bs N BSn sj · k Q n sj · k n sj · k α 2 X bs , (C.13) where, for simplicity, we have extracted the normalization factors N SJ = 2 5α/2 (n · n sj ) α , N HJ = 2 5α/2 z sj (n · n sj ) α , (C.15) N BS = 2 2α z sj (n · n sj ) α , N S = 2 1+3α/2 z sj (n · n sj ) α/2 . The in-jet restriction, Θ FJ , is given by Θ FJ (k) = Θ tan 2 R 2 − n · k n · k . (C.17) The jet restriction must also be expanded following the power counting of the given sector.
We will see that this is actually quite subtle for the soft subjet modes, since the angle between the soft subjet axis and the boundary of the jet has a non-trivial power counting.
In particular, the expansion of Θ FJ (k) is different for the soft subjet jet and boundary soft modes, and will demonstrate the necessity of performing the complete factorization of the soft subjet dynamics into jet and boundary soft modes. Finally, since we are considering the case where the out-of-jet scale B is much less than the in-jet scale, the operator must also be included in the definition of the soft subjet functions. This operators vetoes out-of-jet radiation above the scale B. The explicit expression for Θ O (B) expanded in the power counting of each of the factorized sectors can be found in ref. [76].

C.3 Hard matching coefficient for dijet production
The hard matching coefficient for dijet production, H(Q 2 , µ), is identical to that for the collinear subjets factorization theorem by hard-collinear-soft factorization, and is given in eq. (B.30).

C.4 Hard matching for soft subjet production
The hard matching coefficient H sj (z sj , θ sj ) is determined by the finite parts of the logarithm of the soft matrix element for a single soft state H sj (z sj , n sj ) = tr 0|T {S n Sn}|sj sj|T {S n Sn}|0 fin .
(C. 18) The virtual corrections of the effective theory cancel the IR divergences of this matrix element, giving a finite matching coefficient. This matrix element can be calculated from the square of the soft gluon current [196,197], which is known to two loop order [198,199]. The tree level hard matching coefficient for the soft subjet production is given by H sj(tree) nn (z sj , n sj ) = α s C F πz sj n ·n n · n sj n sj ·n .
(C. 19) The results of ref. [197] can be used to determine the soft subjet production matching from an arbitrary number of hard jets at one loop.

JHEP05(2016)117
phase space. This is because the power counting is identical in the two cases and the jet functions are only sensitive to the color of the jet that they describe. Therefore we do not repeat them here.

E.5 Collinear-soft function
The power counting for the signal is identical to the power counting for the collinear subjet region for the QCD background. However, the collinear-soft function contains only Wilson lines along the collinear subjet directions. The collinear-soft function for the QCD background was calculated in pairs of dipoles in appendix B.7, and therefore the contribution from a collinear-soft exchange between the n a and n b Wilson lines can simply be extracted from that calculation. The result for this contribution is given by (2) α − 1 +log(2)+ −π 2 α 2 +36α 2 log 2 (2)+3π 2 α−24αlog 2 (2)−2π 2 12(α − 1) , where we recall that the normalization factor is given by as defined in eq. (B.66). Also note that we have factored out the color generators, so that the collinear-soft function is defined as Since there is no global-soft function the cancellation of anomalous dimensions, to be discussed shortly, requires that only 1/(1 − α) contributions appear in the collinear soft function, as is observed.