Jet shapes for boosted jet two-prong decays from first-principles

Several boosted jet techniques use jet shape variables to discriminate the multi-pronged signal from Quantum Chromodynamics backgrounds. In this paper, we provide a first-principles study of an important class of jet shapes all of which put a constraint on the subjet mass: the mass-drop parameter ($\mu^2$), the $N$-subjettiness ratio ($\tau_{21}^{(\beta=2)}$) and energy correlation functions ($C_2^{(\beta=2)}$ or $D_2^{(\beta=2)}$). We provide analytic results both for QCD background jets as well as for signal processes. We further study the situation where cuts on these variables are applied recursively with Cambridge-Aachen de-clustering of the original jet. We also explore the effect of the choice of axis for $N$-subjettiness and jet de-clustering. Our results bring substantial new insight into the nature, gain and relative performance of each of these methods, which we expect will influence their future application for boosted object searches.


Introduction
In recent years jet substructure studies have received unprecedented attention and have been the focus of many theoretical and experimental studies. Most of this research has been carried out in the direct context of boosted new particle searches at the LHC. For reviews and detailed studies we refer the reader to Refs. [1][2][3][4] and references therein. The basic ideas that underpin such studies are simple to understand. A high p T resonance with a mass m p T will exhibit collimated decays where in a significant fraction of events the decay products would be reconstructed in a single "fat" jet. Tagging signal jets and removing jets arising from QCD background will thus rely crucially on detailed information about the jets themselves. In this context it is clear that valuable information will be obtained by studying the internal structure of jets in some detail.
Let us for example contrast the two-pronged hadronic decays of an electroweak boson (W/Z/H) with 1 → 2 QCD splittings. QCD emission probabilities are infrared enhanced, favouring soft splittings, and hence a QCD jet would typically consist of a single hard prong. On the other hand decays of electroweak bosons show no preference for soft splittings and this results in a more symmetric energy sharing which gives rise to jets with a characteristic two-pronged internal structure. Another important difference results from the colour neutral nature of electroweak bosons which results in a strong suppression of radiation at angles that are large compared to the opening angle between the hard decay products. Soft large-angle radiation in a signal jet would thus typically arise from emissions that are uncorrelated with the decay of the electroweak boson in question i.e. from initial state radiation (ISR) and underlying event (UE) as well as from pile-up. Such radiation serves to degrade signal peaks making them less visible and also pushes up the masses of background jets. It is therefore also desirable to eliminate this radiation. In the above context the two principal aims of a substructure analysis therefore emerge as identification of two hard prongs (tagging) and removal of uncorrelated soft radiation (grooming).
In recent years there have been many tools developed to achieve the above aims of tagging and grooming jets. These include the mass-drop+filtering methods [5], trimming [6] and pruning [7,8] amongst a whole host of other techniques. Monte Carlo event generator studies involving several of these techniques can be found in Refs. [1][2][3][4] and the original references.
Somewhat more recently there has also been the emergence of jet shape variables that directly attempt to quantify the N -pronged nature of a fat jet. Examples include the N -subjettiness variables [9][10][11] and the N -point energy correlation functions (ECFs) [12,13], both of which are designed to take on small values for particle configurations corresponding to N collimated subjets of a fat jet, which one can naturally associate to an N -pronged decay. These techniques typically put constraints on the gluon radiation patterns in a jet. We expect this to have a good discriminating power both at small and large angles because gluon radiation is different for colour-neutral bosons compared to coloured QCD jets. At small angles, gluon radiation tends to be larger in QCD jets, made of a mixture of quarks and gluons, than in resonances, which decay mostly into quarks. At large angles, this is an even bigger effect since one expects a strong suppression of the radiation from collimated colour-neutral resonance decays compared to QCD jets. It is interesting to notice at this stage that the large-angle region, which shape variables try to constrain, is also the region that is sensitive to initial-state radiation and the underlying event. One typically uses grooming techniques to mitigate these effects and, therefore, one may wonder about the effectiveness of shape variable constraints when combined with grooming.
For studies involving two-pronged (W/Z/H) signal jets the N -subjettiness ratio τ are known to provide good discrimination between signal and background, where β is a parameter (angular exponent) that enters the definition of both variables. We shall provide precise definitions of these variables in the following section. 1 There have also been several detailed studies carried out for both τ 21 and C 2 in the literature. Again, nearly all of these studies have been done using Monte Carlo event generator tools. As examples we refer the reader to the work carried out in the original references [9,11] while for more recent studies also including the implementation of these variables in multivariate combinations we refer to Ref. [4].
In contrast our principal aim here is to carry out analytical calculations for the above variables, based on the first principles of QCD. Such calculations have, for instance, been carried out for the mass-drop, pruning and trimming methods [14] and provided considerable new insight into the performance of those tools over and above what could be gained purely from Monte Carlo methods. We would therefore expect a similar level of information from analytical studies of the shape variables considered here. For our calculations in this paper we shall make the choice of β = 2, i.e. focus on τ (β=2) 21 and C (β=2) 2 for which calculations are relatively straightforward to perform. Detailed numerical studies of the dependence on β have been carried out in particular for C (β) 2 , in Ref. [12]. These studies found that in the transverse momentum range p T ∈ [400, 500] GeV for jet masses relevant to W/Z/H tagging, optimal β values ranged between 1.5 and 2. For larger masses the optimal β values were found to be smaller. An analytical understanding of the β dependence of discrimination power would also be desirable but is left to future work.
As we shall show explicitly later in the article, cuts on τ (β=2) 21 and C (β=2) 2 effectively serve to constrain subjet masses. Another similar variable, that has been far less investigated in the literature, is the parameter µ 2 of the mass-drop tagger (MDT) [5]. This is obtained by declustering a jet into two subjets and taking the ratio of the squared jet mass for the heavier subjet to that for the original jet. The original mass drop tagger uses a cut on µ 2 along with an energy cut designed to discriminate against soft splittings i.e. the y cut parameter of the MDT. It was shown in Ref. [14] that in fact in the presence of the y cut condition the dependence on µ 2 could essentially be neglected. In the present article we instead study the dependence on µ 2 without any y cut requirement and compare the discriminating power it provides, to that from similar variables i.e. τ (β=2) 21 and C (β=2) 2 . Note that while the standard mass-drop tagger recurses, successively undoing the last step of a Cambridge/Aachen clustering, until the cut on µ 2 (and the y cut condition) is satisfied here we study both recursive and non-recursive variants for each of the shape variables.
We carry out analytical studies for the jet mass distributions of QCD background jets with cuts on shape variables v < v max , with v = τ 21 , C 2 and µ 2 . We also study the probability for signal jets to pass the same cuts. We define ρ = m 2 /(p 2 T R 2 ), with m being the jet mass and work in the limit ρ 1 (relevant for boosted object studies) and v max 1 which is desirable to separate two-pronged structures from QCD background. Our analytical results aim only to capture leading-logarithmic accuracy although we also retain several sources of next-to-leading logarithmic corrections. We test our analytical results by comparing to fixed-order results from EVENT2 [15,16] to results from parton shower Monte Carlos and additionally carry out pure Monte Carlo studies of the impact of non-perturbative corrections. Since non-perturbative corrections are found to be large, we further examine with Monte Carlo studies the impact of grooming with SoftDrop [17]. This shows an important reduction of the nonperturbative effects. To avoid diluting the main message of this paper with additional technical considerations, we defer the study of groomed jet shapes to a forthcoming work.
Note that some level of analytic understanding for jet shapes already exists. For example, studies of the lowest-order Energy-Correlation Functions, C β 1 , have been carried out in Ref. [12]. Also, in the framework of Soft-Collinear Effective Theory (SCET) [18][19][20] and its extension SCET + [21], results for N -subjettiness have been obtained at the N 3 LL accuracy for signal jets [22] and studies of the Energy-Correlation Functions C β 2 and D β 2 [23] appeared as the present paper was being finalised. In contrast, rather than providing a high-accuracy calculation of a given method, the main aim of our work is a transparent comparison of different shapes for both signal and background jets with phenomenological applications in mind.
With that in mind, it is however interesting to compare our approach and results to what is obtained for D 2 in Ref. [23]. Besides using different approaches (SCET-based v. more standard pQCD language), the main difference between this work and Ref. [23] is that, to the best of our understanding in terms of the variable ρ and D 2 , the latter provides a NLL resummation 2 in ρ, regardless of the value of D 2 while our approach assumes small D 2 and treats log(D 2 ) and log(ρ) on an equal footing. 3 Therefore, the calculation in Ref. [23] has likely a higher accuracy, at least in the region used in many phenomenological applications. However, it is limited to D 2 while our main goal here is to discover the source of and address the main diffferences between various shapes. The results of Ref. [23] require at least four numerical integration (compared to a single one for our results), which, keeping in mind our purposes, makes a physical interpretation more involved.
This article is organised as follows: In the next section we provide detailed definitions of the shapes mentioned above. Following this, in section 3, we discuss the general form of the results obtained for all the shapes under consideration, both for signal and background jets. In section 4 we perform the detailed calculations for background jets for both non-recursive and recursive variants for each shape variable. In the same section we compare the expansion of our results to fixed-order results from EVENT2, as a check on our calculations. We also carry out comparisons to results from Pythia with only final state radiation (FSR) turned on, to give a direct comparison against our calculations. In section 5 we perform the calculations, checks and comparisons to Monte Carlo for signal jets. Following this, in section 6 we study the impact of non-perturbative corrections where we note the significant contributions from initial state radiation and the underlying event in particular. In order to obtain better control over such effects we combine shape variable studies with grooming using SoftDrop and study the impact on both signal and background efficiencies. In section 7 we discuss our findings in detail including an assessment of the comparative performance of all the shapes studied here. Finally we present our conclusions.

Radiation-constraining jet shapes
Among a large family of jet shapes, this paper will identify and focus on a series of variables all of which place constraints on the subjet mass. In this category, we will study the following three variables: where the sum runs over all the constituents of a given jet and a 1 , . . . , a N denote the partition axes. While the choice β = 1 is more common in experimental studies at the LHC -likely because of an expected smaller sensitivity to nonperturbative effects -, analytic studies have thus far mostly focused on β = 2. As argued eariler, the latter is expected to give better discriminative power. We decided to choose β = 2 for the present study because in that case, τ N acts like a measure of the subjet mass which allows for a direct comparison with the massdrop µ 2 cut. 4 w To fully define τ 21 , we still need to specify our choice for the partition axes a 1 , . . . , a N in (2.1). We shall consider the following three options: 5 the optimal axes which should minimise τ N ; the k t axes obtained by clustering the jet with the k t algorithm [29][30][31] and taking the N exclusive subjets; the generalised-k t axes with p = 1/2 (gen-k t (1/2)) obtained by clustering the jet with the generalised-k t algorithm (see Section 4.4 of [32]), with its extra parameter p set to 1/2, and taking the N exclusive subjets. 4 The choice β = 1 would fall in another category of observables, together with energy-correlation functions with β = 1 and Y-splitter [25]. A calculation similar to the one in this paper can be performed, although the situation is often more complicated. We leave the study of these variables for future work together with a comparison of the performance of the "β = 1" and "β = 2" shapes. 5 See also Refs. [26][27][28] for recent studies of axis choice for N -subjettiness.
The third option is new and leads to similar performance to the optimal axes at much smaller computational cost. The motivation to look into gen-k t (1/2) axes is that its distance measure behaves again like a mass, as does τ β=2 21 , and we can expect the resulting axes to be very close to the optimal axes. More generally, for τ β 21 with a generic β, we would expect the generalised-k t axes with p = 1/β to give a close-to-optimal result.
• a version of the mass-drop parameter [5], µ 2 which, given two subjets j 1 , j 2 in a given jet j is defined as µ 2 = max(m 2 j 1 , m 2 j 2 )/m 2 j . In its original formulation, the cut on µ 2 was applied in a recursive de-clustering of a jet obtained with the Cambridge/Aachen (C/A) algorithm [33,34]. The present definition of µ 2 is however defined non-recursively, i.e. as a cut that the jet j satisfies, or not, without any further de-clustering if it does not. Similarly to the definition of the N -subjettiness axes, we need to specify the procedure to separate the jet j into two subjets j 1 , j 2 . We will denote by µ 2 p the result obtained by undoing the last step of a generalised-k t clustering, with extra parameter p, of the jet j. We shall concentrate on µ 2 1/2 , since it follows the ordering in mass, and µ 2 0 since it corresponds to the historical choice. 6 • the energy correlation function double ratio. Here we again use β = 2, which will be kept fixed here, and define [12], and work with C 2 = e 3 /e 2 2 . Note that, at the order of accuracy targeted in this paper, we can alternatively use the recently-proposed D 2 = e 3 /e 3 2 , [13], since, up to our accuracy, they only differ by a rescaling by the total jet mass.
For any of these three shapes, v, a cut of the form v < v cut is expected to show good performance in discriminating two-pronged boosted objects from standard QCD jets. Note also that, if the cut is not satisfied, the jet is discarded.
Additionally, we shall also consider the cases where one of the three shape constraints introduced above is applied recursively. By this we mean that, for a shape v, we apply the following procedure: 1. recluster the jet j with the C/A algorithm, 2. compute v from j; if v < v cut , j is the result of the procedure and exit the loop, 3. undo the last step of the clustering to get two subjets j 1 and j 2 , define the hardest of j 1 and j 2 (in terms of their p t ) as the new j and go back to 2.
This is of course motivated by the original mass-drop tagger proposal [5], where a cut was placed on the µ 2 parameter. We have to note that, here, the recursion follows the hardest branch, as suggested in the modified version of the mass-drop tagger [14], rather than the most massive one, as in the original proposal.

Generic structure of the results
For QCD jets, there are two basic physical quantities that we will be interested in: the jet mass distribution after applying a given fixed, recursive or not, cut on one of the shapes described in the previous section; or the distribution of a jet shape for a given fixed value of the jet mass. The latter situation only applies to the non-recursive cases.
For signal jets, we are interested in jets of a fixed mass so the calculation will mostly focus on what fraction of these jets satisfy the constraint on the jet shape v, hence on the distribution of v for an object of a given mass. Jets which fail the constraint on v will be discarded.
Our calculations apply to the boosted regime, where the jet transverse momentum is much larger than its mass. In that context, it is convenient to introduce ρ = m 2 /(p t R) 2 , with R the radius of the jet. The boosted regime means that we can take the limit ρ 1. Furthermore, in this work, we shall focus on two-pronged decays, where we expect that the radiation-constraining shapes introduced above would be smaller for signal jets than for the QCD background. It is therefore natural to start the study of these shapes in the limit where they are small. In the following we shall thus also assume that the cut on the shape is small compared to 1. In this limit, we focus on the leading double logarithm 7 for which soft and collinear emissions can be considered as strongly ordered and the mass of the jet is dominated by the strongest of these emissions. Throughout the paper, we will therefore assume that this emission, dominating the mass of the jet, occurs at an angle 8 Rθ 1 and with a fraction z 1 of the jet transverse momentum p t . This has to satisfy the constraint for QCD jets we can neglect the (1 − z 1 ) factor which would only lead to subleading power corrections in ρ.
All the shapes, v, that we consider put constraints on additional emissions. This means that we can always consider, as a starting point, a system made of two partons -the "leading parton p 0 " initiating the jet and the "first, leading, emission p 1 " which sets the jet mass for QCD jets, or the two prongs of a massive boson decay for signal jets -and study additional radiation from this system.
In the leading-logarithmic approximation, the constraint on radiation will always take the form of a Sudakov suppression coming on top of the mass requirement. For QCD jets, the mass distribution with a cut on v can always be written as In the above R mass (ρ) is the Sudakov resumming the leading log(1/ρ) contributions to the plain jet mass and R v (z 1 , ρ) the extra contribution coming from the additional cut on v.
In the approximation we shall be working at, instead of P (z 1 ), it is sufficient to consider its leading logarithmic contribution from its 2C R /z 1 term and a subleading hard collinear contribution 2C R B i δ(z 1 − 1), where C R is the colour charge of a jet initiated by a parton of flavour i and B i is the integral of the non-singular part of the splitting function: Eq. (3.1) can therefore be replaced by Note however that keeping the full integration over the splitting function is sometimes useful in comparing background and signal efficiencies and can lead to potentially large subleading corrections. 9 For all the analytic plots in this paper, where the integration over z 1 is done numerically, we have decided to keep the exact P (z 1 ) splitting function and use Eq. (3.1).
If instead we want to obtain the probability to satisfy the cut on the shape v for a jet of a given mass one get (for the non-recursive versions): with R mass being the derivative of R mass wrt log(1/ρ). Note that the shapes we consider all require at least three particles in the jet to be non-zero, meaning that the distribution dσ/dρ| <v -or, equivalently, the double-differential distribution in both the mass and the shape, d 2 σ/dρdv -starts at order α 2 s . Conversely, Σ(v) will start at order α s , since it is normalised to the jet mass which itself starts at order α s .
At fixed coupling, the integration over z 1 can usually be carried out analytically. This however does not bring any additional insight on the underlying physics mechanisms and so will not be done explicitly. For the sake of clarity, we will give fixedcoupling results in the main body of the text, see Section 4, and defer the full results, including running-coupling corrections, to Appendix A (more precisely, Appendix A.2 for QCD jets). The analytic results presented for the radiator function R v in the main text therefore correspond to a fixed-coupling (modified) LL accuracy, i.e. they include the leading logarithms as well as the corrections due to the hard collinear splittings (the "B terms" in the forthcoming equations). Note that we treat logarithms of the shape and the jet mass on an equal footing. Hence, by leading logarithms, we mean, for fixed coupling, double logarithms of any kind, i.e. in either the shape or the jet mass or both. For the figures and the comparisons to Monte-Carlo simulations, we will also include the (leading order) running-coupling contributions as well as a few relevant NLL effects, discussed in Section 4.7 and Appendix A.
For signal jets, we will directly be interested in the efficiency, i.e. in the fraction of jets (of the original jet mass) that will satisfy the constraint on v. This can be written as where the signal "splitting function" P sig (z 1 ) is assumed to be normalised to unity. Again, we can either decide to keep the full integration over z 1 or, at our level of accuracy, keep only the dominant part without any z 1 dependence and the first log(1/z 1 ) and log(1/(1 − z 1 )) corrections. Note that here z 1 can no longer be neglected in the constraint on the jet mass, ρ = z 1 (1 − z 1 )θ 2 1 . For the illustrative fixed-coupling results given in Section 5, we will only keep the first corrections in log(1/z 1 ) and log(1/(1−z 1 )), while for the full results including running-coupling corrections given in Appendix A.3, we will include these factors in the resummation, mainly for simplicity reasons.
Given these basic expressions, our main task is to compute the Sudakov factors R v for all the shapes under consideration. We do that in the next two sections.

Calculations for the QCD background
The results below give the generic expression for the Sudakov form factor assuming one works in the (modified) leading-log approximation. It is helpful to clarify the notations once and for all: We assume, as stated before, that the angles are normalised to the jet radius R and we work with a jet initiated by a parton of flavour i. For a fixed mass ρ and momentum fraction z 1 , we have θ 2 1 = ρ/z 1 .

τ 21 cut (pure N -subjettiness cut)
We first consider the case where we impose a cut τ 21 < τ cut on the N -subjettiness of a jet of a given mass ρ. We are interested in the limit τ cut 1. 10 The first step is to find an expression for τ 21 in the limit where emissions are strongly ordered in angle and transverse momentum fraction. For this, let us assume that the second leading emission occurs at an angle θ 2 , wrt the leading parton p 0 , (initiating the jet) and carries a transverse momentum fraction z 2 of the leading parton.
The expression obtained for τ 21 in this limit depends on the choice of axes. It is useful to consider three specific options: • the optimal axes [11] which minimise τ 2 , • the k t axes, which take the 2 exclusive k t subjets as axes, • the gen-k t (1/2) axes, which also takes exclusive subjets as axes, except that this time, we use the generalised k t algorithm with p = 1/2.
We defer most of the technical discussions regarding how to obtain τ 21 for the above choices to Appendix B.1. In the end, the k t axes choice leads to a more complex phasespace, while the optimal and gen-k t (1/2) options are equivalent to taking the leading parton and the emission setting the mass (emission p 1 ) as axes, clustering emission p 2 with whichever axis is closest, and both lead to up to corrections which are beyond the LL accuracy we aim for here. 11 In what follows, we shall concentrate on the generalised k t axes choice since they are simpler than the optimal axes. Furthermore, we also have to consider secondary emissions, where the radiation is emitted from the gluon (z 1 , θ 2 1 ) itself. If z 2 denotes the fraction of the (first emitted) gluon energy carried by the extra emission at an angle θ 12 , with θ 12 < θ 1 due to angular ordering, we find where the different normalisation wrt Eq. (4.2) is purely due to z 2 being normalised to the gluon energy fraction z 1 .
In the limit of small τ 21 , additional emissions at smaller mass do not affect the result. The one-gluon emission will thus exponentiate according to eq. (3.1) and we get where the first line takes into account emissions from the leading parton p 0 while the second accounts for secondary gluon emissions from the first emitted gluon p 1 . The arguments of the strong coupling are given as factors multiplying the "natural" scale of the problem, p t R. The phase-space corresponding to the primary emissions is represented in Fig. 1a. For simplicity, we shall only quote results with a fixed coupling approximation in the main body of the paper. Results with a proper treatment of the running-coupling  corrections are presented in the Appendices. In this case, the final exponent does not depend 12 on z 1 and we find where, for quark jets, we have C R = C F and B i = B q = −3/4 while for gluon jets we

µ 2 cut
As for the case of N -subjettiness, we first have to find, given the emissions p 1 and p 2 with p 1 giving the dominant contribution to the mass, what is the value of the mass-drop parameter µ 2 . Since µ 2 is defined by undoing the last clustering step, it will depend on the jet algorithm we use to (re-)cluster the jet. The Cambridge/Aachen algorithm is a common choice but does not work here. Indeed, undoing the last step of a Cambridge/Aachen clustering would separate the emission at the largest angle from the rest of the jet, regardless of the transverse momentum of that emission. This is not infrared safe. We further discuss infrared-safety issues in Appendix C. Instead, we shall define µ 2 by undoing the last step of a generalised-k t clustering with p = 1/2. The motivation for this is the same as the motivation for the axes choice in the previous section: the generalised-k t algorithm with p = 1/2 follows closely the ordering in mass. To keep things unambiguous, we shall denote by µ 2 p the massdrop parameter obtained by undoing the last step of a generalised-k t clustering with parameter p. The (infrared-unsafe) case of a C/A clustering would correspond to µ 2 0 while we will be interested in µ 2 1/2 , although the calculation can be performed for any positive p.
Again, we leave the technical details of the calculation for Appendix B.2. In a nutshell, the hard parton and the first emission (setting the mass) will form two subjets, and the second emission, setting the subjet mass, will be clustered with whichever of these two subjets is closest. In the end, keeping in mind that, to our leading-logarithmic accuracy we can assume strong ordering in angle (θ 2 θ 1 or θ 2 θ 1 ), we find for θ 2 < θ 1 or (θ 2 > θ 1 and θ 2 < θ 12 ), for (θ 2 > θ 1 and θ 2 > θ 12 ), for secondary emissions.
There is a crucial difference between mass-drop and N -subjettiness: the latter can be seen as (1/p t ) j∈subjets m 2 j /p t,j which has an extra 1/p t,j compared to µ 2 1/2 . This leads to different expressions whenever the jet with the largest mass is not the one with the largest p t . The secondary emissions and large-angle radiations will therefore give additional suppressions for N -subjettiness compared to the mass-drop.
With similar arguments, it is easy to realise that additional emissions with smaller masses will not affect this calculation, so that, at leading-logarithmic accuracy, the lowest order simply exponentiates according to eq. (3.1). The vetoed phase-space for emissions is represented in Fig. 1b and we get For a fixed coupling approximation, we find

C 2 cut
For two strongly-ordered emissions p 1 (z 1 , θ 1 ) and p 2 (z 2 , θ 2 ), such that z 1 θ 2 1 z 2 θ 2 2 , one finds, for primary emissions, which is the same result as the one we obtained in the N -subjettiness case with an extra factor max(θ 2 1 , θ 2 2 ). 13 For secondary emissions, θ 12 θ 1 , hence θ 2 θ 1 and we have (with z 2 measuring the momentum fraction wrt emission 1) The corresponding phase-space is represented in Fig. 1c and gives For a fixed coupling approximation, one finds If we decide to work with D 2 = C 2 /ρ rather than C 2 , and define L d = log(1/D 2 ) = L e − L ρ , we get, assuming L d > 0,

Recursive τ 21 cut
We now move to the same calculations as above but apply the cut recursively declustering a C/A jet until the cut is met (see Sec. 2). The calculation of the shapes mostly remains unchanged but the recursion will affect the allowed phase-space for emissions. As before, let us assume that p 1 (θ 1 , z 1 ) is the emission that dominates the mass after the recursion procedure has been applied and see what constraints on the phase-space the cut imposes on additional emissions p 2 (θ 2 , z 2 ). 13 Contrary to what we have for µ 2 1/2 (see Appendix. D), Eq. (4.9) is continuous for θ 1 = θ 2 . Using the exact expression for θ 12 in the region θ 2 ≈ θ 1 will therefore not lead to (single) logarithmically enhanced terms.
For emissions at angles θ 2 smaller than θ 1 , the de-clustering will reach p 1 before p 2 , which corresponds to the same situation as for the non-recursive case. In fact it remains true for all shape variables under consideration in this paper that for such angular configurations the results from the recursive and non-recursive variants coincide.
Differences occur for emissions at angles larger than θ 1 . The physical reason for that comes from emissions at angles larger than θ 1 and which would dominate the mass, i.e. for which z 2 θ 2 2 > z 1 θ 2 1 . In the non-recursive case, these emissions are forbidden by our constraint on the jet mass and this is included in the Sudakov suppression for the jet mass R mass (ρ) in Eq. (3.1), which imposes that the mass of the jet is truly dominated by the (z 1 , θ 2 1 ) emission. In the situation where the cut on the shape is applied recursively, some extra care is needed since some of these emissions -that are vetoed in the non-recursive case because they would lead to a larger jet mass -can be simply discarded by the recursive procedure. In such a case, they should no longer be forbidden.
Compared to the non-recursive case, the vetoed region at large angle is therefore reduced.
In the above discussion, we tacitly assumed that we were working with the genk t (1/2) axes or with the optimal axes, but the argument is more general. We could also define τ 21 using the exclusive C/A axes, automatically available from the declustering procedure. Indeed, in that case, all emissions with z 2 θ 2 2 < ρ/τ would fail the cut on τ 21 and be discarded. We will come back to that point later on. Again, the lowest order result simply exponentiates and the Sudakov suppression, depicted in Fig. 2a is where we have subtracted R mass (ρ) which has already been included in (3.1). For a fixed coupling approximation, this gives The situation is mostly the same as for the recursive τ 21 cut. Here, the use of a recursive criterion allows to use either the subjets naturally given by the C/A declustering or the gen-k t (1/2) subjets. The results presented in this section are valid for both µ 2 0 and µ 2 1/2 , although, as we will see in the next paragraph, different axes choice yield the same answer for the mass distribution in different ways, and would give different answers for other observables. As before, for θ 2 smaller than θ 1 , the declustering has no effect and the results are as obtained in Sec. 4.2. The complication related to the clustering distance for θ 2 θ 1 is absent here because of the declustering, and only emissions with z 2 θ 2 2 > ρ/µ 2 have to be vetoed. In all other cases, either the mass-drop condition fails and the emission is simply discarded, or the mass-drop condition is satisfied but the mass of the jet remains z 1 θ 2 1 . 14 E.g., for the natural choice, µ 2 0 , all emissions in the region z 2 θ 2 2 < ρ/µ 2 0 will fail the condition and be discarded before the recursion continues. That said, the only remaining difference between a recursive µ 2 cut and a recursive τ 21 cut will be in the extra factor z 1 in the secondary emissions (see, e.g. Sec. 4.2) and we find For a fixed coupling approximation, we get where the C R contribution is the same as for the recursive τ 21 cut and the C A contribution is the same as for the non-recursive µ 2 1/2 cut.

Recursive C 2 cut
Again, the calculation unfolds as for the two recursive cases above with a contribution from "failed" conditions for θ 2 > θ 1 and a standard constraint for θ 2 < θ 1 . In the first case, e 2 (resp. e 3 ) is set by emission p 2 (resp. p 1 ) and θ 12 ≈ θ 2 . In the second case, e 2 (resp. e 3 ) is set by emission p 1 (resp. p 2 ) and θ 12 ≈ θ 1 , yielding 14 As for the axes choice in N -subjettiness, these regions will differ for µ 2 0 and µ 2 1/2 .
The Sudakov exponent will ultimately be given by For a fixed coupling approximation, we obtain

Towards NLL accuracy
In this article, as we have stated before, we are aiming to achieve only a (modified) leading-logarithmic description of the shape variables we study here. This level of approximation has already been demonstrated to capture the main physical features of various jet tagging and grooming tools (see e.g. Refs. [14,37] ). Nevertheless it may ultimately prove important to extend the scope of our current studies in various directions. One potential reason for this could be that here we study tools that have some broad similarities e.g. all of them place constraints on subjet masses. In order to understand in more detail the differences between these tools it would be helpful to increase the accuracy of our analytical predictions, so that differences that may arise beyond LL effects are effectively highlighted. We would also expect such differences to show up in the Monte Carlo event generator studies, like those carried out below, since event generators would partially capture many sources of subleading corrections.
Secondly we do not study here the question of optimal values of cuts on subjet variables, mainly confining ourselves to the region with both v cut and ρ 1. To meaningfully explore the dependence on v cut and ρ over a broader range of values of the variables concerned, one may need to carefully investigate effects beyond leadinglogarithmic level including the role of hard non-logarithmically enhanced contributions.
With such future developments in mind we discuss below several extra ingredients that are required to reach NLL accuracy: soft-and-large-angle contributions, multiple emissions, the two-loop β function for α s , finite z 1 corrections and non-global logarithms [38].
For the figures where we compare to Monte Carlo simulations, we will include multiple emission effects (numerically important; see below for their effect on the radiator function), two-loop running coupling corrections (trivial to add, see Appendix A.1) as well as finite z 1 corrections (important for the physics discussion; see Appendix A.4).
We have not included in our analytic results contributions which are power-suppressed in the jet radius R. Although they would be relevant for a full phenomenological prediction, and can be substantial at the peak of the distributions (see e.g. Section 5 of [39]), these are expected to have little impact when comparing the discriminative power of different jet shapes. Moreover, they would be further reduced by the combination with a grooming procedure which, as we argue in Section 6, is the natural future direction of this work.
Soft-and-large-angle radiation. A source of single-logarithmic corrections comes from radiating soft gluons at large angles. This would correspond to all the limits beyond the strict collinear ordering that we have adopted until now i.e. it can come from either The first two regions would give single-logarithmic corrections proportional to R 2 . In the small-R approximation we have adopted so far, these would further be suppressed. At the same order of accuracy, one would also have to include contributions coming from initial-state radiation and potential colour-correlation with the recoiling partonic system [39]. Taking these into account would also add single-logarithmic contributions to the mass distributions. This significantly complicates the discussion, especially for signal jets, where the mass would no longer be identical to the boosted heavy-boson mass and we would have to impose a certain window around the signal mass. In practice, therefore, one usually applies these techniques together with some grooming procedure which would drastically change this discussion. Some first results have already been obtained in [40] for grooming techniques and we reserve for future work the addition of radiation constraints to that discussion. We will comment on that a bit further in Section 6.
The situation for θ 1 ∼ θ 2 is a bit more involved and we show in Appendix D that it would only contribute to single-logarithmic corrections suppressed by θ 2 1 . These contributions are also at most proportional to R 2 , although since radiation constraints tend to take most of their discriminative power from the large-angle region θ 2 > θ 1 , it makes sense to consider a region θ 1 R. In that case, the contribution from the θ 1 ∼ θ 2 region would be even further suppressed.
Multiple emissions. Multiple gluon emissions also bring single-logarithmic corrections to our results and we briefly discuss below how to account for them for the non-recursive variants of the shapes. They correspond to cases where several gluon emissions, (z 2 , θ 2 ), . . . , (z n , θ n ), are only strongly ordered in angle and give similar contributions to the shape v, i.e. when v(z 2 , θ 2 2 ; z 1 , θ 2 1 ) ∼ · · · ∼ v(z n , θ 2 n ; z 1 , θ 2 1 ). This will come with a single-logarithmic correction α n−1 It is important to realise that we will keep working in the v 1 limit and so neglect the contribution where all the z i θ 2 i , i ≥ 2, are of the same order as z 1 θ 2 1 . This would also give a single logarithmic correction of the form α n s L n ρ f n (v). Up to power corrections, we can take f n constant and this correction would therefore simply be equivalent to the multiple-emission correction to the plain jet mass, cancelling against the corresponding normalisation in the spectrum of v. 15 So, from now on, we focus on the region where all the z i θ 2 i , i ≥ 2, are much smaller than z 1 θ 2 1 and compute the corresponding correction to R v (z 1 ) for a fixed z 1 .
The case of N -subjettiness and energy-correlation functions are mostly straightforward. In the kinematical configurations under consideration, the (optimal or gen-k t ) N -subjettiness axes will still align with the jet axis and with the emission (z 1 , θ 1 ) setting the mass. At a given z 1 , both τ 21 and C 2 will therefore be additive and the correction to The situation is a bit more involved for the mass drop parameter. Had we defined µ 2 as (m 2 j 1 + m 2 j 2 )/m 2 , µ 2 would have been additive and the similar conclusion as for τ 21 and C 2 would have been reached. Since µ 2 is defined as a maximum over the two subjets rather than a sum, we should instead use the fact that the condition µ 2 < µ 2 cut will be satisfied if both m 2 j 1 < µ 2 m 2 and m 2 j 2 < µ 2 m 2 . In practice, the emissions will either be clustered with the original hard parton or with the emission setting the mass. How exactly the particles in the jet are sifted in these two sets can depend non-trivially on the details of the clustering. If we take as an approximation, the assumption that particles behave independently, they will be clustered with the hard parton or the emission setting the mass according to which is geometrically closer, in a way similar to the heavy-jet mass in e + e − collisions [41]. If we split R µ 2 1/2 (z 1 ) in two contributions according to whether the emissions are clustered with one or the other of the subjets, and each of these two parts become additive and we obtain the following correction to R µ 2 This is however only an approximation and we leave a more precise treatment for future work. At this stage, it can also be seen as the fact that, compared to Nsubjettiness and energy-correlation functions, the mass-drop parameter is more delicate to tackle analytically.
Before going to comparisons with Monte Carlo simulations, we can observe that the two axes of 2-subjettiness can be viewed as partitioning the jet in two subjets, one with the jet constituents closer to the hard parton, one with those closer to the emission setting the mass. If instead of summing over all particles in the jet we were summing independently over the contributions of each of the two subjets and defining a modified 2-subjettiness as the maximum of these two contributions, the resummation of multiple emissions for that observable would follow Eq. (4.23). However, since Γ(1 + R 0 )Γ(1 + R 1 )/Γ(1 + R 0 + R 1 ) < 1 we should expect this variant of 2-subjettiness to perform worse than its original definition. Conversely, defining the mass-drop parameter as (m 2 j 1 + m 2 j 2 )/m 2 j would not only make its analytic behaviour simpler but could also translate into a slightly more efficient tool.
Two-loop running coupling. The inclusion of the two-loop β function is purely a technical complication. In the results presented in Appendix A, we have included their effects.
Finite z 1 corrections. Finite z 1 corrections would typically give contributions to R(z 1 ) like α s log(1/v) log(1/z 1 ) or α s log(1/v) log(1/ (1 − z 1 )). The first of these two terms, integrated over the 1/z 1 part of the splitting function corresponding to the first emission, will give a double-logarithmic contribution that we already have included. The second term, as well as the first term integrated over the non-singular contributions to the P (z 1 ) splitting function will become important at NLL accuracy. Indeed, after integration over z 1 , they would give corrections proportional to α s L v which contribute at the single-log accuracy. To properly include these corrections, it is sufficient to integrate over the full P (z i ) splitting function (rather than just including the finite piece as a B i term) and to keep the full z 1 dependence when we calculate the shapes in order to get single-logarithmic corrections to R(z 1 ).
The corresponding results are presented in Appendix A.4. It is interesting to note that their calculation allows for a nice physical discussion of similarities and differences between background and signal jets. Unless explicitly mentioned, these results will be used for the figures in this paper.
Non-global logarithms. Non-global logarithms are known to be difficult contributions to handle, especially if we want to go beyond the large-N c approximation, where a general treatment is still lacking. We will not provide an explicit calculation of their contribution in this paper. We note however that it might be beneficial to apply grooming techniques such as SoftDrop which are known to eliminate the contributions from non-global logarithms.

Comparison with fixed-order Monte-Carlo
As a partial cross-check of our results, the expressions obtained above can be expanded in a series in α s and compared to EVENT2 [15,16] simulations. Here we compare the (non-recursive) τ 21 , µ 2 1/2 and C 2 distributions at order α s . Note that since we are using the N -subjettiness implementation from FastJet contrib, we have to use pp coordinates (transverse momentum, rapidity and azimuth) rather than e + e − ones (energy and polar coordinates). 16 To maximise the efficiency and provide quark jets with a monochromatic p t , events are rotated so that their original 2 → 2 scattering gives 2 jets at y = 0. 17 After that rotation, jets are reconstructed with the standard (pp) anti-k t algorithm [42] with R = 0.4. 16 Alternatively, we could have used an e + e − implementation of the jet shapes (and clustering) together with unmodified e + e − events. Such an implementation is already readily available in the fastjet-contrib implementation of Energy Correlation Functions. This would however give the same logarithms as in our pp study so we decided to stay with a single coordinate system throughout this paper. 17 Given the block structure of EVENT2 events, each event can be uniquely associated with a corresponding event with 2 partons in the final state. The latter can be used to define the event rotation. Another approach would be to rotate the event so as to align its thrust axis at y = 0.
On the analytic side, we take the fixed-order results 18 , expand (3.5) to first order in α s , and perform the z 1 integration.
For N -subjettiness, starting from (4.5) we get For the mass-drop parameter, we use (4.8) and reach Finally, for the energy correlation function, we start from (4.12) and obtain The comparison with EVENT2 is presented in Fig. 3 where we have plotted the shape distributions at order α s together with our analytic prediction. In these plots, a constant factor α s /(2π) has been factored out. From Fig. 3, we see that this difference goes at least to a constant at large L v , meaning that we do control the leading logarithmic behaviour.
In principle, one can also wonder if the constant term can be obtained from an analytic calculation, which is, strictly speaking, beyond our leading-logarithmic accuracy. For example, we have included in equations (4.24)-(4.26) corrections coming from the hard part of the splitting function. However, we have neglected large-angle contributions proportional to R 2 and expected to be small for R = 0.4, as well as possible finite z 1 corrections. It is unclear from Fig. 3 whether or not this fully accounts from the apparent constant value observed at large L v . In this respect, it is also interesting to note that, contrary to the jet mass where besides the logarithmic and constant terms we would only have power corrections, the constant term in the L v expansion  has some corrections proportional to 1/L ρ , coming from the normalisation of the shape distributions by the jet mass cross-section (see Eq. (3.5)). These terms can make the convergence slower.
To extract more precise information, we have fitted, in each bin of the jet mass, the coefficient of L v and the constant term. This has been done in each colour channel and reported in Fig. 4. Again, we see a good agreement for the linear rise with L v as well as for the constant terms proportional to C A and N f . The slow convergence of the C F term is related to the above discussion.
More precise statements would require going to larger values of L v and L ρ . This is difficult to explore due to limited machine precision.

Comparison with parton-shower Monte-Carlo
Our resummed analytic results can be directly compared to parton-shower Monte Carlo event generators such as Pythia [43] or Herwig [44]. To do this, we have generated QCD dijet events in 14 TeV pp collisions simulated with Pythia. We have selected anti-k t (R=1) jets with a transverse momentum of at least 3 TeV.
For our analytical predictions, we have used the results from Appendix A.4, which, unless explicitly mentioned otherwise, include all the computed global NLL corrections discussed in Section 4.7. We have fixed α s (M z ) = 0.1185 with N f = 5 and frozen the coupling at µ f r = 1 GeV. 19 In Fig. 5, we compare the analytic results obtained for the distribution of Nsubjettiness, the mass-drop parameter and the energy-correlation functions, at a given jet mass, with the same distributions obtained with Pythia at parton-level, including only final-state radiation. First of all, if we look at the large L v region, where our analytic description is valid, we see that it does reproduce nicely the Pythia simulations. However, at smaller L v , Pythia tends to produce more peaked distributions than what we obtain analytically. 20 In any case, the main message that one has to take from this comparison is that the generic ordering between the different shapes is well captured by our analytic calculations.
Instead of plotting the distributions themselves, we can instead look at the mass distributions. This has the advantage that we can also consider the recursive versions of the cuts on the shapes. In Fig. 6, we plotted the ratio of the mass distribution obtained after a given cut, L v > 2.4, applied recursively (dashed lines) or not (solid lines) on our three shapes, divided by the jet mass distribution without applying any cut. Globally, our analytic calculations tends to reproduce the main features of the Monte Carlo simulations, although they show longer tails at small masses. Note that for these plots, we have used D 2 instead of C 2 since, compared to the latter, the former peaks at values of L v closer to the other two shapes. Furthermore, since we have not computed multipleemission corrections for the recursive versions of the shape constraints, we have also left aside the multiple-emission corrections to the non-recursive versions for the analytic results plotted in Fig. 6. It is interesting to notice that including the multiple-emission corrections for the non-recursive shapes tends to reduce the tails towards small mass, bringing more resemblance to the Pythia results. We could expect a similar behaviour for the corresponding recursive versions. Finally, we want to investigate how the three shapes we have considered are affected by initial-state radiation (ISR) and non-perturbative effects such as hadronisation and the Underlying Event (UE). To get an insight about the importance of these effects, we have looked, for each jet mass, at the cut on L v that has to be applied to obtain a 25% tagging rate compared to the plain jet mass. This is plotted in Fig. 7 where we see  Energy correlation FSR parton hadron hadr+UE Figure 7: As a function of the jet mass, value of the cut on a given shape, log(1/v cut ) which would correspond to a 25% tagging rate. Results correspond to dijet events obtained with Pythia with p t,jet > 3 TeV. The various curves correspond to different levels of the simulations. The three plots, from left to right, correspond to N -subjettiness, the mass-drop parameter and the energy-correlation function.
that, as expected, the cuts are quite sensitive to ISR and the UE, with hadronisation effects remaining relatively small.
We attribute this behaviour to the sensitivity of the shapes to soft and large-angle radiation. We also see that the energy correlation function tends to be more sensitive to these effects than N -subjettiness and the mass-drop parameter.
These conclusions however have to be taken with a bit of care since the mass of the jet itself will also be subject to the non-perturbative effects. In practice, one would rarely use such a cut without some additional grooming of the jet, limiting the nonperturbative effects at least on the reconstruction of the jet mass. We will come back to this point later, in Section 6.

Calculations for the signal
We now turn to the case of signal jets, i.e. jets coming from boosted colourless objects that decay into a qq pair (or a pair of gluons), like a W , Z or Higgs boson, or a photon.
As already briefly discussed in Sec. 3, the splitting of such a boosted object X into a qq pair differs from a QCD gluon emission in the sense that it does not diverge as 1/z at small transverse-momentum fraction. This means that, although we are still in the regime ρ 1 and we shall still consider the limit of small v for all jet shapes v we study in this paper, now L 1 = log(1/z 1 ) is no longer large. As for the case of QCD jets, we shall write the results as a function of z 1 , see eq. (3.6), but now we will keep the correction in z 1 and 1 − z 1 . These finite z 1 corrections would generate single-logarithmic terms under the form of contributions with one logarithm of z 1 or 1 − z 1 and one logarithm of ρ or v. It is illustrative to expand out results in series of log(1/ρ) and log(1/v) to see explicitly how these terms appear. We shall do this in this Section and use a fixed-coupling approximation to better highlight the physics behind our calculation. In Appendices A.3 and A.4, we give the results with a running coupling. In that case, we found it easier to keep the z 1 dependence without making an explicit series expansion, knowing that both results are equivalent at single-logarithmic accuracy.
Besides the careful inclusion of the z 1 and 1−z 1 dependence, the calculation follows the same logic as what has been done above and mostly consists of two copies of the contribution from "secondary emissions" in the QCD case, one for each of the decay products of the boosted colourless object. The contributions from each parton will just differ by the replacement z 1 ↔ (1 − z 1 ). For simplicity, we still use L 1 = log(1/z 1 ) and additionally introduce L − = log(1/(1 − z 1 )).
Finally, as was already seen to be the case for the secondary emission contributions for QCD jets, the results presented in this section apply invariantly for the recursive or non-recursive versions of the shapes.

τ 21 cut
Following the same construction as in Section 4.1, we find that for an emission off the parton carrying a momentum (1 − z 1 )p t , we have This leads to . For a fixed coupling approximation, and keeping only the first non-trivial terms in L 1 and L − , terms we find

µ 2 cut
As for the case of QCD jets discussed in Section 4.2, expressions for µ 2 differ from the N -subjettiness ones due to the fact that the p t normalisations are different.
For an emission off the parton carrying a momentum (1 − z 1 )p t , we have This leads to Note that formally the Θ constraint above will result in the condition Θ(µ 2 < (1−z 1 )/z 1 ) but this will only lead to power corrections in µ 2 and can hence be neglected.
For a fixed coupling approximation the extra contributions from the two legs thus cancel, giving Note that in the case of the signal, the calculation for µ 2 0 would lead to the same result. However, other effects like soft and large-angle gluon emissions that we have neglected here would appear at the same order and lead to an infrared divergence for µ 2 0 .

C 2 cut
This time for emissions off the parton carrying a momentum (1 − z 1 )p t , we find This leads to For a fixed coupling approximation, we get Again, formally the extra factor z 2 1 (1 − z 1 ) will enter in the Θ(L e > L ρ ) condition but its effect is only power corrections and then can be neglected.

Integration over the z 1 splitting
For most of the splitting relevant for phenomenological studies, the splitting function in terms of z 1 is expressed as z k 1 (1 − z 1 ) k or as a linear combination of such terms (typically, only k = 0 and k = 1 are needed for W/Z/H or photon signals).
Introducing B 2 (x) = B(x, x) = Γ 2 (x)/Γ(2x), the integration over z 1 can be performed in the fixed-coupling approximation, using with p a number varying from one shape to another.

Comparison with fixed-order Monte-Carlo
Similarly to what was presented in Section 4.8 for QCD jets, we can compare our results with EVENT2 simulations. In this case, we boost the event along the z axis and rotate it to obtain boosted photons decaying to a jet at y = 0. The expansion of the above results to first order in α s gives, after integration over In the above expressions, a γ = 3 2 a 0 − 1 2 a 1 = 13 6 with a 0 = 2 and a 1 = 5 3 . The comparison of these analytic results with EVENT2 simulations is presented in Fig. 8 and shows a good agreement. It is also interesting to notice that the convergence seems faster than it was for QCD jets, probably due to the fact that here the jet mass is fixed.

Comparison with parton-shower Monte-Carlo
As for the case of the QCD background jets, we want to compare our analytic calculations to parton-shower Monte Carlo simulations. This time, we used Pythia to generate ZZ events with both Z bosons decaying to hadrons. To match the jet selection of Section 4.9 in the case of QCD jets, we have selected anti-k t (R = 1) jets with p t ≥ 3 TeV and artificially varied the mass of the Z boson to scan over the ρ range. The distributions obtained for the shapes are plotted on Fig. 9 for Z bosons decaying hadronically. As for the case of QCD jets, we see a good overall description of the features of the distributions and of the differences between the three shapes, particularly in the large L v region which is targeted by our calculation.
Based on the results for both the signal and the QCD background, we have plotted a set of ROC curves on Fig. 10 obtained by varying the cut on the three shapes for a given value of the jet mass. Note that here, the signal and background efficiencies are normalised to the sample of jets that are within the mass window under investigation. The main result here is that a cut on the energy correlation function is more efficient at rejecting the QCD background than a cut on N -subjettiness, itself performing a bit better than a cut on the mass-drop parameter. This behaviour is clearly seen in both the Pythia simulations and our analytic calculations. 22 We leave a detailed discussion of this comparison for Section 7.

Non-perturbative effects and combination with grooming
We have already seen in Section 4.9 and in Figure 7 that initial-state radiation and non-perturbative effects can have a large impact on the shapes we have studied. One difficulty in trying to assess these effects is that they do not only affect the different shapes we are interested in but also the jet mass and hence our selection of a sample of jets with a mass lying within a given window.
To make a physically meaningful comparison, we have to adapt our normalisation of the background and signal efficiencies compared to what we used to produce Figure 10. Instead, we shall now compute the efficiencies as the fraction of the jets passing the initial p t cut which satisfy both the constraint on the mass and the constraint on the shape. In such a case, as the cut on the shape increases, the signal and background efficiencies progressively increase to ultimately reach an endpoint, common to all shapes, where just the cut on the mass is effective.
As before, we work with anti-k t jets with R = 1 and impose a p t cut of 3 TeV. For the signal, we used a massive Z boson with a mass of 217 GeV and impose the constraint on the mass that 5 < log(p 2 t R 2 /m 2 ) < 5.5. 23 Here the background is taken as quark-only to match with the results presented in the previous sections.  Figure 11: Effects of the initial-state radiation (green), hadronisation (blue) and Underlying Event (black) on the ROC curves, compared to pure final-state radiation (red). In all cases, we impose that 5 < log(p 2 t R 2 /m 2 ) < 5.5. The left, central and right columns correspond to τ 21 , µ 2 1/2 and C 2 , respectively. For the top row, the mass and shape constraints are imposed on the plain, ungroomed, jet. For the plots on the bottom row, we have first applied a SoftDrop procedure with β = 2 and z cut = 0.1 before imposing the mass and shape constraints.
The top row of Fig. 11 show the ROC curves obtained for our three shapes starting from events including only final-state radiation effects at parton level (in red) and adding successively initial-state radiation (in green), hadronisation effects (in blue) and the Underlying Event (in black). We clearly see large deviations from what we observe for pure FSR results, noticeably when adding initial-state radiation and the Underlying Event. Concentrating on the endpoint of these curves, where the cut on the shapes has no effect, we see that these effects are already present when applying the initial mass cut.
In practice, when working with large-R jets, one usually first applies a grooming procedure in order to obtain, at the very least, a good resolution on the jet mass. The bottom row of Fig. 11 shows the same plot as on the top row, now obtained by first grooming the jet with the SoftDrop procedure [17], using z cut = 0.1 and β = 2, before imposing the cut on the mass and on the shapes. Although this reduces the performance observed on events with pure final-state radiation, this has two positive effects: (i) it stabilises remarkably the ROC curves against initial-state radiation and non-perturbative effects, and (ii) at full parton level it even gives better performance than without the grooming procedure. Again, the ordering between the three shapes remains the same, albeit with strongly reduced differences compared to the plain jet case.

Discussion and conclusions
In this paper, we have provided a first-principles comparison of the performance of three common jet-shapes -N -subjettiness, the mass-drop parameter and Energy-Correlation Functions -used to discriminate boosted two-prong decays from QCD jets. In order to ensure infrared safety, we have defined the mass-drop parameter based on the subjets obtained via a clustering with the generalised k t algorithm with the extra parameter p set to 1/2. Similarly, for N -subjettiness, we find that using the exclusive gen-k t (p = 1/2) algorithm is an efficient alternative to the more complicated optimal axes. The usage of the gen-k t algorithm is closely connected to the fact that it respects the ordering in mass, which is helpful in our situation where we work at a fixed jet mass and study shapes that have a mass-like behaviour.
The main observation from our analytical results and simulations involving only final-state radiation is that there appears to be a clear ordering in the discriminating power of the shapes we have studied: the energy-correlation function ratio is more powerful than the N -subjettiness ratio which, in turn, is more powerful than a cut on the µ 2 parameter.
Our results indicate a Sudakov suppression of both the signal and the background for v 1. This suppression is however more powerful for the background for two major reasons. Recall that, since we work at a fixed jet mass, both the QCD jets and the signal jets can be seen as two-pronged objects. 24 A cut on the shape thus constrains additional radiation from that system. Given that, discrimination power comes from constraints on radiation at angles smaller and larger than the opening angle between the two prongs. For large angles, the cut on the shape only affects the background due to the colour-singlet nature of the signal. At small angles, the radiation from each of the two prongs is proportional to their colour factors, which tend to be larger  Figure 12: Background fake rate for a 25% signal efficiency as a function of the jet mass. As above, we used R = 1 and p t,jet > 3 TeV for the Pythia simulation (left plot) and p t = 3 TeV, for the analytic calculation (right plot).
for QCD jets, involving gluons in their two-prong decay, than for resonances mostly decaying to quarks. 25 Since we know from experience with quark-gluon discrimination that exploiting differences in colour factors only lead to moderate discrimination power [12,[46][47][48], we expect that the large-angle effect would be the main source of difference in tagging two-body decays. The ordering in discrimination power between the different shapes can also be understood from that viewpoint. Say we work at a given signal efficiency. The corresponding cut on the shape would determine the constraints on small-angle radiation for both the signal and the background (up to colour-factor effects discussed above). Once this is fixed, one has to look at the constraint put on the large-angle radiation for QCD jets. In that region, it is clear from our results, that the radiation veto imposed by a cut on C 2 is more constraining than that imposed by a cut on τ 21 , itself more constraining than a cut on µ 2 . This can be deduced from Fig. 1: fixing the signal efficiency amounts to fix the rejected region at small angle and once this is held equal for all three shapes, the vetoed region at large angle shows a clear ordering between C 2 , τ 21 and µ 2 . 26 This statement can be made more quantitative from our analytic results. First, the difference between τ 21 and µ 2 mostly comes from the large-angle region where gluon emissions are clustered with the gluon setting the mass. The extra z 1 factor in the expression for µ 2 compared to τ 21 , see Eq. (4.2) v. (4.6), results in a smaller vetoed region for µ 2 . Parametrically, this region scales like α s log(1/θ 2 1 ) log(1/v) ∝ α s log(1/ρ) log(1/v). This can be deduced algebraically from our results by fixing the signal efficiency and computing the background for the corresponding cut (with additional α s log 2 (1/v) terms also coming from the small-angle region). In the case of C 2 , the constraint at large angle now becomes proportional to θ 4 2 , see Eq. (4.9), and this translates into an additional vetoed region compared to τ 21 which is proportional to α s log 2 (1/θ 2 1 ) ∝ α s log 2 (1/ρ). In conclusion, we expect the ordering between the shapes to be more visible when increasing the boost of the jet. This difference should also grow faster with p t /m when comparing C 2 and τ 21 than for τ 21 and µ 2 . This is indeed what is observed from both pure-FSR Monte-Carlo studies and from our analytic calculations, as seen in Fig. 12, where we have plotted the background rejection rate for a 25% signal efficiency as a function of log(1/ρ) = log(p 2 t R 2 /m 2 ). 27 Note that our explanation of the differences between C 2 and τ 21 is consistent with a similar observation made in [12] but our more detailed analytic treatment allows for more quantitative understanding.
The next important observation is that, without grooming, the shapes are significantly affected by ISR and non-perturbative effects, UE in particular. These modeldependent effects can be substantial enough to wash out or even invert the differences between the shapes observed from pure FSR and analytic studies (see e.g. the top row of Fig. 11). This is due to the impact of these effects on both the mass resolution for the jet -mostly for signal jets -and the sensitivity of the shapes themselves. Since ISR and UE mostly affect the soft-and-large-angle region, we expect C 2 to be more affected than τ 21 , itself more affected than µ 2 (see the discussion above) and this is indeed what we observe from Monte Carlo studies.
Furthermore, we have seen that applying a grooming procedure on the jet before computing its mass and values of the shapes largely improves the robustness against ISR and non-perturbative effects, also restoring the ordering between the shapes observed with pure FSR. Again, this can be interpreted as grooming cutting away a part of the soft-and-large-angle region. This increased robustness however comes at a price in that reducing the soft-and-large-angle region using grooming also reduces the discriminating power of the shape cuts. In practice, there will be a trade-off between sheer efficiency and robustness against model-dependent effects. We reserve the detailed study of an optimal combination of a shape cut with a proper grooming procedure for future work.
In addition, note that working at a fixed jet mass ensures that our results are infrared-and collinear safe because it fixes automatically the value of τ 1 and e 2 . If we were to impose a cut on the shapes without fixing the jet mass, our results would still be finite after integration of (3.1) over ρ because the infrared region is killed by the plain mass Sudakov. This is an example of Sudakov-safe observables [49,50]. It is interesting to note that, after integration over the jet mass, we recover a distribution that can be expressed as a series in √ α s log(1/v), similar to what was obtained for ratios of angularities in [49].
The arguments above can be applied when comparing the recursive and nonrecursive versions of the shapes: the recursive versions have a smaller vetoed region at large angle while retaining the same small-angle region as their corresponding nonrecursive version. Thus, although the recursive versions have the advantage of being less sensitive to ISR and non-perturbative effects, they have a smaller discriminating power. A combination of a non-recursive cut on the shape with a proper grooming of the jet is expected to perform better while at the same time limiting non-perturbative effects.
Another key aspect of our results is that a cut on the shapes leads to an exponential suppression of the signal efficiency. This has to be contrasted with two-prong taggers like the mass-drop tagger, trimming or pruning which would only give a linear suppression [40]. This means that although it initially seems natural to work in the small v limit, in practice one will not be able to take the cut on v too small. Computing corrections for finite v could then become relevant for this discussion.
Finally, there are several other developments that can be made based on this study. In this paper, we have focused on a subset of jet shapes sensitive to the mass of the subjets. It would be interesting to extend this study to more generic jet shapes, e.g. studying the β dependence of energy-correlation-function ratios and N -subjettiness ratios. On the more formal side, we could also refine our calculations to include effects such as the initial-state radiation and finite jet radius contributions as well as attaining full NLL accuracy, optionally matched to a fixed-order calculation.
Larkoski. MD wishes to thank the IPhT and the French ANR for hospitality and financial support. MD's work is supported in part by the Lancaster-Manchester-Sheffield Consortium for Fundamental Physics under STFC grant ST/L000520/1. This work was also in part supported by the ERC advanced grant Higgs@LHC, by the French Agence Nationale de la Recherche, under grant ANR-10-CEXC-009-01, and by the Paris-Saclay IDEX.
A Results with running coupling: QCD background Results including running-coupling corrections can be straightforwardly obtained from the expressions before integration over z 2 and θ 2 given in Section 4. The running of the coupling is expressed wrt its value α s ≡ α s (p t R) taken at the physical scale of the problem, p t R, using the CMW scheme as appropriate for resummations [51,52]. We also freeze the coupling at a scale µ fr , giving (A.2) To keep the notations concise, we introduce λ x = 2α s β 0 L x where L x denotes any symbol we have introduced in (4.1) and L fr = log(p t R/µ fr ) = log(1/μ fr ).

A.1 Basic building blocks
It is helpful to introduce a few building blocks that will greatly help in writing the several results below in a short and understandable way.
The most basic building block we shall use is the integral over a "triangle" bounded by a maximal angle, a constant k t ∝ zθ line (upper or lower bound) and a constant generic line of constant zθ α , as represented on Fig. 13. Expressed as a function of the minimal and maximal k t scales of this triangle, this triangle can be written as The exact expressions for these integrals depend on the positions of k min and k max compared toμ fr . For k min >μ fr we find, introducing L min = log(1/k min ), λ min = 2α s β 0 L min and similar quantities associated with k max , where the B i term for α < 1 only has to be included if the "triangle" upper edge corresponds to z = 1. For k min <μ fr but k max >μ fr , one obtains In that expression, we have introduced α s,1-loop (k t ) = α s /(1 − 2α s β 0 log(p t R/k t )), the running-coupling at 1-loop, which multiplies the contributions proportional to B i in the frozen region. This reflects the fact that contributions proportional to β 1 B i and KB i , coming from the 2-loop corrections to the running of α s are subleading. They are not included, neither in the frozen region, nor in the running-coupling region. And, finally, for k max <μ fr , one gets From this fundamental building block, we can build two derived objects which will be used to describe all the expressions we have below. The first one is again a triangle bound by a maximal angle, a maximal zθ α line and a minimal zθ β line, see the right plot of Fig. 13. This can be seen as a superposition of two of the above triangles. Again, we can express this new object as a function of the minimal and maximal k t scales on the maximal-angle side of the triangle, and, assuming α < β, we get The last object we shall use is a "parallelogram" bounded by a minimal and a maximal angle and two parallel lines of constant zθ α , assuming here α > 1, see again the right plot of Fig. 13. This is expressed as a function of the maximal k t scale k 1 (at the minimal angle) and the maximal and minimal k t scales, k 2 and k 3 at the maximal angle. We can view this as a function of three of our basic triangles Note that we will often substitute the k t scale with their logarithm, log(1/k t ) and it is worth keeping in mind that the maximal k t would correspond to the minimal log(1/k t ).

A.2 Results for the QCD background
Now that we have building blocks corresponding to the integration of Sudakov factors over basic phase-space regions, we can use them to find simple expressions for the Sudakov factors corresponding to the shapes we are studying.
The phase-space regions will correspond exactly to the regions we have already used for the fixed-coupling calculation given in the main text, so we just list the results here.
N -subjettiness. This is the most simple result because the phase-space just corresponds to a triangle for the primary emissions and another one for the secondary emissions: where the negative term subtracts the Sudakov factor for the plain jet mass which has been factored out in our expressions. 28 T 0β (k max = 1, k min ) is related to the radiator given in Appendix A.1 of Ref. [53].

Mass-drop (non-recursive).
Here we split the result in a part, R 0 clustered with the main parton and a part, R 1 , clustered with the emission setting the mass.
The total Sudakov R µ 2 1/2 is the sum of these two contributions. Energy correlation function. For C 2 we have to disentangle two cases depending on whether we have a contribution from emissions at small angles of not: This expression can be trivially expressed as a result for D 2 replacing L v by L v − L ρ .
Recursive N -subjettiness. Here, the phase-space constraints can take three different forms. Remember also that we do subtract the Sudakov factor corresponding to the plain jet mass.
R τ,rec (z 1 ) Recursive Mass-drop. The expression is the same as for the recursive N -subjettiness cut,Eq. (A. 15), except that the second argument of the C A term should be Lρ−L 1 2 + L v instead of Lρ+L 1 2 + L v and that term comes with a Θ(L v > L 1 ).
Recursive energy correlation function. Again, we have three different situations This expression can be trivially expressed as a result for D 2 replacing L v by L v − L ρ .

A.3 Results for the signal
As previously, it is fairly straightforward to use the "triangular" building blocks to express our findings. Note also that, compared to the results presented for fixedcoupling in the main text, we have not expanded our results to first order in z 1 and 1 − z 1 . This would only lead to more complicated expressions without changing the formal accuracy of our results. Remember also that for the case of signal jets and at NLL (and small-R) accuracy, the results are the same for the recursive and nonrecursive versions of the shapes.
N -subjettiness (recursive or non-recursive). From the expression in eq. (5.2) it is easy to find Mass-drop (recursive or non-recursive). As for the fixed-coupling case, the only difference between N -subjettiness and a µ 2 1/2 cut lies in the z 1 and 1 − z 1 corrections. Figure 14: Three topologies potentially contributing to the emission of the gluon dominating the value of the shape, starting with a massive two-pronged object. Left: small-angle emission from the prong carrying a fraction 1 − z 1 of the jet p t ("prong 1"), centre: small-angle emission from the prong carrying a fraction z 1 of the jet p t ("prong 2"), right: large-angle emission from the parent object ("parent").
We find Energy correlation function (recursive or non-recursive). Again, the expression for C 2 looks very similar, except for the logarithms involving z 1 . We find This expression can be trivially expressed as a result for D 2 replacing L v by L v − L ρ .
A.4 Including finite z 1 corrections: QCD (background) and signal jets We have argued in Section 4.7 that if we wish to achieve NLL accuracy it is mandatory to include all finite z 1 and 1 − z 1 factors in our expressions for the shapes, with z 1 the fraction of the jet transverse momentum carried by the emission that dominates the mass of the jet. The main reason behind that is that they can be raised to powers of order α s log(1/v) which would give single-logarithmic corrections after integration over z 1 . In this Section, our main goal is to discuss these extra source of NLL terms. As a fringe benefit of this discussion, we will at the same time provide a unified description of the signal and background distributions, allowing for interesting interpretations of the results obtained in this paper.
If we want to properly include the finite z 1 corrections we first need to carefully identify the origin of the gluon emissions. In the collinear limit, sufficient to capture all the finite z 1 corrections, colour coherence indicates that we can encounter three situations, represented in Fig. 14. The first two situations correspond to gluon emissions at small angle θ 2 θ 1 from the splitting of either the hardest or the softest of the two prongs (carrying respectively a fraction 1 − z 1 and z 1 of the jet transverse momentum). These are the first two plots of Fig. 14 and will be referred to as the "prong 1" and "prong 2" topologies respectively for the 1 − z 1 and z 1 case. The third option corresponds to gluons emitted at large angle θ 2 θ 1 from the parent parton in the jet. This is represented on the rightmost plot of Fig. 14 and will be called the "parent" topology in what follows. In that approach, the distribution for QCD jets will receive contributions from all three topologies -the first and third weighted by C R and the second, corresponding to secondary emissions, weighted by C A -while signal jets coming from the decay of colour-neutral bosons would only receive contributions from the first two topologies, both weighted by C R .
For each of the three topologies, one then has to find the expression for the shape in the soft and collinear limit for the gluon emission, 29 and impose that the first emission (z 1 , θ 1 ) dominates the mass. The Sudakov factors for a given mass ρ, splitting momentum fraction z 1 and shape cut v, would then take the following form for each topology: where the splitting function would be the one of a quark, a gluon, or simply 0 for emissions from a colour-neutral object, and ρ = z 1 (1 − z 1 )θ 2 1 . In practice, the two "prong" contributions are the same as the ones we have computed in the case of signal jets, up to the constraint that the (z 1 , θ 1 ) emission dominates the mass. This last term is irrelevant for signal jets as it would only contribute to a constant. For QCD jets it is however crucial to impose it for the emissions from the hard prong since, there, the z 1 1 region can give rise to large logarithms. Strictly speaking, the finite z 1 corrections should only be kept in the expression for the shapes and the mass constraint in the emission from the soft prong is subleading 29 Meaning in particular that one can discard the 1 − z 2 factors. follows: where the last two Θ constraints come from the fact that the first term has to be positive and larger than the second term. Note that the second term in each of these three expressions is the same and come from the kinematic constraint than the second emission (z 2 , θ 2 ) does not dominate the mass. The results for the "prong 2" topology have not been given explicitly but can be directly obtained from the "prong 1" topology by inverting L 1 and L − which corresponds to inverting z 1 and 1 − z 1 .
For the emissions from the parent object, we find in a similar way B Details for the computation the shape value In this Appendix we give all the technical details related to the calculation of the leading-logarithmic expressions for each of the shapes we consider.

B.1 N -subjettiness calculation and axes choice
We need to justify the result in Eq. (4.2). For N -subjettiness with β = 2, we do not have to worry about recoil effects and we can focus on E-scheme recombinations, which uses 4-momentum sum of the particles. We consider a hard parton(p 0 ) accompanied by two emissions, p 1 and p 2 , of transverse momentum fraction z 1 and z 2 respectively emitted at angles θ 1 and θ 2 . We work in the strongly-ordered limit where we can assume that the mass (and τ 1 ) are dominated by the first emission: ρ = τ 1 ≈ z 1 θ 2 1 , neglecting a subleading (1 − z 1 ) power correction, with the axis defining τ 1 aligned with the jet axis.
For τ 2 , three different situations are possible: confusion, we must stress that the discussion below only applies to the non-recursive version of the µ 2 parameter and that the recursive aplication of a µ 2 p cut is infrared-safe for any p.
That said, let us consider a jet with three particles: a hard parton, a first emission with momentum fraction z 1 at an angle θ 1 and a second emission with momentum fraction z 2 at an angle θ 2 , with z 1 θ 2 1 > z 2 θ 2 2 and θ 2 θ 1 . This corresponds to the leading-order (O(α 2 s )) configuration for a jet with m 2 = (z 1 θ 2 1 + z 2 θ 2 2 )p 2 t and with a generic µ 2 = z 2 θ 2 2 /(z 1 θ 2 1 + z 2 θ 2 2 ) (using Cambridge/Aachen de-clustering). At the next order of the perturbation theory, one would have to include real emissions of gluons with momentum fraction z 3 and angle θ 3 as well as the corresponding virtual corrections and the soft divergence z 3 → 0 is supposed to cancel between the real and virtual contributions. However, for θ 3 θ 1 and z 3 → 0, the virtual contribution would give µ 2 virt = z 2 θ 2 2 /(z 1 θ 2 1 + z 2 θ 2 2 ) as for the 2-particle configuration, but the real emissions would give µ 2 real = 1 because of the Cambridge/Aachen de-clustering. This would lead to an infrared unsafety at µ 2 virt . This situtation can happen at any value of µ, depending on the original three-particle configuration.
Although we have not made an explicit calculation, one might expect that the Sudakov R µ 2 p would receive a contribution proportional to (α s /p) log 2 (1/θ 2 1 ), with θ 2 1 = ρ/z 1 , which diverges in the limit p → 0.

D Soft and large-angle emissions
In all the calculations we have performed so far, we have included hard collinear splittings which correspond to the terms proportional to B i and B g in our results. At the same order we could also have single-logarithmic contributions coming from soft and large-angle emissions. In practice, keeping the same notations as above, this means working in the approximation z 2 z 1 without assuming any specific ordering between θ 1 and θ 2 .
This can affect the calculations above at various levels: either through changes in the approximation used for the shape, where so far we have assumed a strong ordering, or through modifications of the matrix element for soft gluons at large angles. 30 Let us first discuss the first effect. Since the expressions we have used so far are correct when θ 2 θ 2 or when θ 2 θ 1 we only have to worry about the region θ 2 ∼ θ 1 . For N -subjettiness and the energy correlation functions, the correct expression in that region will only differ from the asymptotic one used so far by a constant, not enhanced by any parametrically large quantities. As a consequence, if we compute the difference to what has already been included, the integration over z 2 will at most bring a constant. Then, the angular integration over θ 2 ∼ θ 1 will also at most bring a constant giving an overall NNLL subleading correction, as already briefly discussed in Section 4.1.
We are therefore left with potential single logarithms coming from the matrix element for the emission of soft and large-angle gluons. Taking the case of a quark jet, we therefore have to compute the following generic expression: 2) If we focus on the single-logarithmic contribution, we can subtract the doublelogarithmic piece, C F /θ 2 02 + C A /θ 2 12 Θ(θ 12 < θ 01 ), and set v(z i , θ i ) = (z 2 θ 2 2 )/(z 1 θ 2 1 ) in what remains so that the z 2 integration yields a log(1/v). This gives where we have used the fact that d 2 θ 2 /θ 2 02 = d 2 θ 2 /θ 2 12 . Up to subleading corrections, we can extend the θ 2 integration to infinity and show, e.g. using dimensional regularisation, that it vanishes. In the end, there are therefore no soft and large-angle single-logarithmic corrections to what we have computed earlier in the text.

E Further comparisons
In this last appendix, we provide a few additional comparisons between our analytic predictions and Monte-Carlo simulations. One-loop v. two-loop running coupling. First, in Sections 4.9 and 5.6, we have used a one-loop running of α s , with α s (M Z ) = 0.1383, for Pythia simulations, and compared that to analytic calculations including two-loop corrections and using α s (M Z ) = 0.1185. In the case of our analytic calculation, this choice is motivated by the fact that two-loop corrections are easily included and we then used the world-average value [54] at the Z-boson mass. For the Pythia simulation, we simply kept the default which is a one-loop running. We could also have run Pythia with a two-loop running of the coupling and impose α s (M Z ) = 0.1185. We did not do that in the main text because that can only safely be done with a retuning of other parameters in Pythia (mostly for the non-perturbative effects). It is however interesting to check that this difference in the treatment of the running of the strong coupling does not come with large effects. The result is presented in Fig. 15, where we see that this is indeed a small effect which does not alter in any way the conclusions of this paper. We also see from that figure that the size of the effect is similar in Monte-Carlo simulations and in our analytic predictions.
Note also that another interesting check of our results is to compare our fixed-order results with Pythia simulations also done with a fixed coupling. Although we do not show explicit plots here, this comparison shows similar features as the ones observed with a running-coupling prescription.
Dependence on the jet transverse momentum. Throughout this paper, we have shown results for jets with a large transverse momentum of 3 TeV. Here, we briefly show that our calculations remain valid for less boosted jets, closer to those used in today's phenomenological analyses.
In Fig. 16, we show ROC curves obtained from Pythia simulations and our analytic calculations, for three different jet transverse momenta: 3 TeV, 1 TeV and 500 GeV. For this comparison, we have kept the ratio m/p t fixed, i.e. considered a mass of 358, 120 and 60 GeV respectively for each of the three p t scales. We see that the dependence on the jet p t is mild, which is expected since the result only depend on p t through the p t R scale entering in α s . Our conclusions are therefore also valid for jets of more moderate transverse momentum. Note that the small differences observed in Pythia simulations between different jet p t are well reproduced by our analytic calculation.