Factorization and resummation for groomed multi-prong jet shapes

Observables which distinguish boosted topologies from QCD jets are playing an increasingly important role at the Large Hadron Collider (LHC). These observables are often used in conjunction with jet grooming algorithms, which reduce contamination from both theoretical and experimental sources. In this paper we derive factorization formulae for groomed multi-prong substructure observables, focusing in particular on the groomed D2 observable, which is used to identify boosted hadronic decays of electroweak bosons at the LHC. Our factorization formulae allow systematically improvable calculations of the perturbative D2 distribution and the resummation of logarithmically enhanced terms in all regions of phase space using renormalization group evolution. They include a novel factorization for the production of a soft subjet in the presence of a grooming algorithm, in which clustering effects enter directly into the hard matching. We use these factorization formulae to draw robust conclusions of experimental relevance regarding the universality of the D2 distribution in both e+e− and pp collisions. In particular, we show that the only process dependence is carried by the relative quark vs. gluon jet fraction in the sample, no non-global logarithms from event-wide correlations are present in the distribution, hadronization corrections are controlled by the perturbative mass of the jet, and all global color correlations are completely removed by grooming, making groomed D2 a theoretically clean QCD observable even in the LHC environment. We compute all ingredients to one-loop accuracy, and present numerical results at next-to-leading logarithmic accuracy for e+e− collisions, comparing with parton shower Monte Carlo simulations. Results for pp collisions, as relevant for phenomenology at the LHC, are presented in a companion paper [1].


Introduction
The efficient and robust identification of hadronically-decaying boosted electroweak bosons is a problem of fundamental importance at Run 2 and into the future of the Large Hadron Collider (LHC) physics programme. The identification of boosted hadronically-decaying H/W/Z bosons requires two key ingredients: observables for discriminating multi-prong jet substructure, and a method for removing contamination within the jets. There has been substantial progress in both of these aspects of jet physics over the past few years. 1,2 One of the most powerful observables for discrimination of two-prong substructure is D 2 [14,15], based on the n-point energy correlation functions [16]. D 2 parametrically separates jets 1 For a recent review of theory and machine learning aspects of jet substructure, see [2]. 2 A summary of studies from the LHC using jet substructure can be found at https://twiki.cern.ch/ twiki/bin/view/AtlasPublic and http://cms-results.web.cern.ch/cms-results/public-results/ publications/. For the Run 2 experimental status, see the performance documentation in [3][4][5][6][7][8][9][10][11][12][13].

JHEP02(2018)144
with one-and two-prongs, and has been used extensively during Run 2 by ATLAS [17][18][19][20][21][22][23][24][25][26]. Because of the high-luminosity environment of the LHC, the robust measurement of the substructure of jets requires methods for removing contamination from underlying event, pile-up, or other sources that are mostly uncorrelated with the hard scattering. Of these so-called jet grooming techniques, the modified mass drop (mMDT) [27,28] and soft drop [29] groomers are theoretically most well-understood. Soft drop groomed mass distributions of jets initiated both by light QCD partons and by hadronically decaying Z bosons were measured recently by CMS [30]. Many theory advancements in understanding these observables and techniques have been made recently. The groomed jet mass and several related single prong observables have been calculated, both for QCD jets and electroweak bosons, up to next-to-leading logarithm (NLL) [27][28][29]31], and very recently for top jets [32]. Refs. [33,34] presented the first jet substructure calculation to next-to-next-to-leading logarithmic accuracy (NNLL) matched to fixed-order at O(α 2 s ) for the mMDT or soft drop groomed jet mass. A number of two-prong observables, both ungroomed [35] and groomed [36,37], have been computed at leading-logarithmic accuracy (LL), and the observable D 2 was calculated for QCD jets and boosted Z bosons produced in e + e − collisions to next-to-leading logarithmic accuracy (NLL) [15]. These calculations have helped to put the understanding of jet substructure observables on firmer theoretical footing, and have inspired a number of new jet substructure techniques currently used at the LHC.
In this paper, we present the first theory calculations to NLL accuracy for groomed jets on which two-prong observables like D 2 are measured, and present a systematically improvable framework for their calculation. This is achieved using factorization formulae for groomed two-prong jet observables derived using the techniques of soft-collinear effective theory (SCET) [38][39][40][41][42]. These allow for the resummation of logarithmically enhanced terms in all regions of phase space using the renormalization group evolution of field theoretic operators, significantly simplifying higher order calculations, and also provide operator definitions of non-perturbative effects. We will illustrate our framework by performing a calculation of the D 2 observable on jets produced in e + e − collisions, although our approach is more general, and could be applied to related observables, such as N 2 [43]. In a companion paper [1] we present D 2 distributions for mMDT/soft drop groomed jets produced in pp collisions for processes of phenomenological relevance for jet substructure at the LHC.
Schematically, the soft drop groomer steps through the clustering history of a jet and removes those branches in the jet which fail the requirement (1.1) Here, E i is the energy of branch i, θ ij is the angle between branches i and j, and R is the jet radius. z cut and β are parameters of soft drop; in this paper, we will only consider β = 0, for which soft drop coincides with the mMDT groomer (Due to their equivalence we will use their names interchangeably.). Typical values of z cut are z cut ≈ 0.1. When eq. (1.1) is satisfied, the procedure terminates. On jets that have been groomed with soft drop, we These properties imply that the full mMDT/soft drop D 2 is only very weakly dependent on the jet mass m J and the jet energy E J . This is a desirable property, especially for experimental applications. The explicit design of observables that are independent of mass and p T (or energy) cuts has been a subject of recent research [43,47]. Due to the final property, quark and gluon jet distributions can be individually extracted from D 2 distributions of jets produced in different processes. The outline of this paper is as follows. In section 2, we define the mMDT and soft drop groomers appropriate for both jets produced in e + e − and pp collisions, and define the energy correlation functions and the ratio observable D 2 . In section 3, we will review the previously published factorization formulae for D 2 and groomed jet mass and derive new factorization formulae for the groomed D 2 cross section in e + e − collisions. Collinear factorization enables us to use all of the results from e + e − collisions to formulate the factorized cross section for the groomed D 2 observable for jets produced in pp collisions in section 4. From the form of the factorization formula, we are immediately able to make all-orders statements regarding properties of the distribution. We will review those properties that are identical to those of the groomed jet mass and discuss in some detail reduced hadronization corrections for the D 2 distribution in section 5. In section 6, we present numerical results, comparing JHEP02(2018)144 our NLL predictions in e + e − collisions to parton shower Monte Carlo. A comparison of robust qualitative features of the distribution derived from the factorization formula for pp are compared with parton shower Monte Carlo predictions in section 7. We conclude in section 8. Technical details and calculations are presented in appendices.

Observables
In this section, we review the definitions of the observables and grooming procedures that will be studied in this paper. As mentioned in the introduction, we will restrict our focus to grooming with mMDT [27,28] or equivalently, soft drop with angular exponent β = 0 [29]. Definitions of mMDT and energy correlation functions will be presented for jets produced in both e + e − and pp collisions.

Modified mass drop/soft drop grooming
The modified mass drop/ soft drop groomer proceeds in the following way. The identified jet is reclustered with the Cambridge/Aachen algorithm [48][49][50], which orders emissions in the jet by their relative angle. Then, starting at the widest angle, the two branches following from each splitting in the jet are required to satisfy an energy fraction constraint. For branches i and j in a jet produced in e + e − collisions, this requirement is where E i is the energy of branch i. For a jet from pp collisions, the requirement is where p T i is the transverse momentum with respect to the proton beam of branch i. If these requirements are not satisfied, the softer (lower energy/transverse momentum) branch is removed from the jet, and the procedure iterates to the next splitting on the harder branch. The process terminates when the two branches in a splitting in the jet satisfy eq. (2.1) or eq. (2.2), as appropriate. The cut parameter z cut is typically taken to be z cut ∼ 0.1. In this paper, we will formally assume that z cut 1, so that emissions that fail these requirements are necessarily soft. Once a jet has been groomed, any observable can be measured on its remaining constituents.

Energy correlation functions and D 2
For powerful identification of hadronic decays of electroweak bosons, we use the energy correlation functions [16]. 4 The n-point energy correlation function is sensitive to radiation about n−1 hard cores in the jet. Therefore, for this application, we use the two-and threepoint energy correlation functions, which are sensitive to radiation about one or two hard For jets produced in pp collisions, the energy correlation functions are simply modified as e (α) 2 pp Here R ij is the distance between particles i and j in the pseudorapidity-azimuth angle plane. For jets that are central (p T J ∼ E J ) and if all emissions in the jet are collinear, the definitions of the two-and three-point energy correlation functions for jets in pp collisions are equivalent to those for e + e − collisions. The angular exponent α in the definition of the energy correlation functions is a parameter that controls sensitivity to wide-angle emissions. For jets that consist of massless particles, the two-point energy correlation function in e + e − collisions reduces to a function of the jet mass, m J , if α = 2: (2.7) It has been shown that the optimal observable, formed from e 3 , for discrimination of boosted hadronic decays of electroweak bosons (with a hard two-prong substructure) from QCD jets (typically with a single hard prong) is a particular ratio of the energy correlation functions called D (α) 2 [14,15]. D (α) 2 is defined as (2.8) In this paper, for jets which have been groomed with mMDT, we subsequently measure D 2 . For brevity, we will often denote D (α) 2 generically as D 2 .

Phase space and behavior of groomed D 2
In this section, we use the power counting techniques of [14] to review the phase space structure of a jet on which the observables e

JHEP02(2018)144
We begin with a brief review of the structure of the e (α) 2 and e (α) 3 phase space without grooming, as was considered in detail in [14,15]. With no grooming the dominant emissions in a one-prong jet with e where z s is the characteristic energy fraction of soft emissions and θ cc is the characteristic angle of collinear emissions. Depending on the assumptions made about the relative scaling of z s and θ cc , one finds the upper and lower boundaries of the phase space for one-prong jets. The upper boundary is the so-called soft haze region where To determine the region of phase space where a two-prong jet lives, it is sufficient to consider a jet with two hard, collinear prongs, with all other radiation at much lower energy. In this case, the scaling of the contributions to e Here, θ ab is the angle between the hard prongs, θ cc is the characteristic angular size of each of the hard prongs individually, z cs is the energy fraction of collinear-soft radiation emitted from the dipole of the two hard prongs, and z s is the energy fraction of soft radiation emitted at large angles. The requirement that the hard prongs are well-defined restricts the energy fraction of the collinear-soft radiation to be small: That is, two-prong jets have measured values of e  3 ) plane are illustrated in figure 1a. We now consider how this phase space is modified in the presence of grooming. First, we consider the parametric scaling of contributions in the one-prong region of phase space. We assume that there exists radiation in the groomed jet whose energy fraction is set by z cut at a characteristic angle θ sc from the jet core, in addition to the collinear modes. The cc + θ α cc θ 2α sc z cut + θ 3α sc z 2 cut . (2.14) From this we can determine the boundaries of the phase space by imposing different relationships between the characteristic angles and energy fractions. As in the ungroomed case, the lower bound of this phase space region occurs when collinear emissions dominate the value of e (α) This is the same lower boundary as with ungroomed jets. This implies in particular, that D 2 remains a powerful discriminant, even after grooming has been applied. The upper boundary of this region is more interesting. Assuming that the contributions to e (α)

JHEP02(2018)144
We refer to the region of phase space near this upper boundary as "collinear-soft haze". Note that, assuming that two-prong jets have two hard prongs, their phase space is unchanged from the ungroomed case. This is quite interesting. Since we formally assume z cut 1, there is a separation of the collinear-soft haze region from the lower boundary of groomed one-prong jets. However, both boundaries have the same cubic relationship between e 2 ) 2 . From this analysis, it is straightforward to determine the maximal value of D 2 with and without grooming. For the ungroomed jet, the maximal value of D 2 is when With a more careful analysis (discussed in ref. [14]), one can derive the factor of 1/2 in the location of the endpoint. Note that for α = 2 this maximum value is sensitive to both the energy and mass of the jet: (2.20) The endpoint of the D 2 distribution formally increases without bound as the energy of the jet increases, for a fixed mass cut. 5 On the other hand, when the jet is groomed, we find Therefore, when a jet is groomed with mMDT or soft drop, the endpoint of the D 2 distribution is independent of both the jet mass and energy. This property will be one part of the reason why the groomed D 2 distribution is incredibly robust to changes in energy and/or mass cuts. To demonstrate that this scaling is satisfied in simulation, in figure 2 we plot the distribution of jets in the (e 3 ) phase space plane as simulated in parton shower Monte Carlo. The details of the Monte Carlo simulation will be described in section 6. Here we use the angular exponent α = 1 and impose an upper cut on the mass of m J < 100 GeV, which more clearly illustrates the phase space regions. The same general features are present for other values of α. On these jets we then measure e 2 ) 3 /z cut , as illustrated in figure 2b. This demonstrates that our parametric scaling analysis of the phase space is satisfied by parton shower simulation. More detailed tests will be provided in section 6, when we study the structure of the D 2 distribution in our analytic calculation, and in parton shower Monte Carlo. 5 Of course, this isn't quite true because there is a characteristic mass scale of QCD. Even perturbatively this isn't true because the Sudakov factor will exponentially suppress low masses.
3 ) phase space plane as simulated in parton shower Monte Carlo. Here, α = 1 and the mass of the jet is restricted as m J < 100 GeV. (a) Ungroomed jets, that extend up to e The location of the endpoint for the groomed D 2 distribution also has important consequences for its calculation. In particular, for the ungroomed D 2 distribution, the endpoint is at ∼ 1/e (α) 2 . In the limit e (α) 2 1, this is formally large, and can be neglected. This is what was done in ref. [15]. However, for the groomed D 2 distribution, the endpoint of the distribution is at ∼ 1/z cut . Since we assume z cut e (α) 2 , we must compute the matrix element in this region of the phase space, and match it to our resummed calculation, to accurately predict the endpoint of the distribution.
3 Factorized cross section in e + e − collisions In this section we present factorization formulae for mMDT/soft drop groomed D 2 . These allow for a systematically improvable calculation of the D 2 distribution, and the resummation of logarithmically enhanced terms in all regions of phase space to be resummed by renormalization group evolution. The factorization formulae are presented in the language of SCET [38][39][40][41], an effective field theory describing soft and collinear radiation in the presence of a hard scattering. For the case of jet substructure observables, where multiple hierarchies are present within the jet, extensions of SCET are required. These have been developed in refs. [15,[54][55][56], and were discussed in detail in the context of the D 2 observable in ref. [15]. In this section we will restrict ourselves to giving physical descriptions of the functions appearing in the factorization formulae. Field theoretic definitions, and one-loop calculations, are given in the appendices.
Our approach to deriving the factorization formulae will closely follow the techniques used to study the groomed jet mass and the ungroomed D 2 observable. In particular, we -9 -will begin from the factorization for the groomed jet mass cross section, and then perform a refactorization of the resolved substructure. Because of this, we begin in section 3.1 with a review of the factorization formulae in both these cases. We then present the factorization for groomed D 2 in section 3.2. Throughout this section, we will restrict ourselves to the case of e + e − collisions for simplicity. In section 4, we will then show that the grooming algorithm allows us to trivially extend this factorization formula to the case of pp collisions.

Review of known results
Factorization formulae are known for both the groomed jet mass [33,34] and for the ungroomed D 2 observable [15]. Since our factorization for the groomed D 2 observable will rely heavily on ingredients from both these analyses, we begin by reviewing the essential ingredients of the factorization formulae for these two cases. The discussion will be brief, and more details can be found in the respective papers.

Groomed jet mass factorization formula
A factorization formulae was presented in SCET for the soft dropped two-point energy correlation functions e (α) 2 , and was used to calculate the distribution to NNLL order [33,34]. Throughout this section we will always take the soft drop parameter β = 0. The case β > 0 follows an identical logic, and is discussed in detail in refs. [33,34].
The factorization formula is valid in the limit e (α) 2 z cut 1. It can be derived through a multi-stage matching procedure from the standard SCET involving a global soft function and jet functions. The first stage of the matching is a soft and collinear factorization, with the soft virtuality set by Qz cut , and the collinear virtuality set by e (3.1) The soft drop grooming has isolated the jet dynamics from the rest of the event, due to the angular ordering of the algorithm. However, this factorization still contains large logarithms within the collinear sector. These can be resummed by refactorizing into a collinear-soft function, which allows for the resummation of all logarithms of e (α) 2,R . The final factorization formula for measuring e (α) 2 in each of the groomed hemispheres in e + e − collisions is given by The physical interpretation of the functions entering this factorization formula are as follows (field theoretic definitions can be found in [33,34]): • H(Q 2 ) is the standard hard function, describing in this case the production of two back to back jets in an e + e − collision.
• S(z cut ) is the global soft function. It describes wide angle soft radiation, which is removed by the groomer. It is therefore independent of the observable, and depends just on z cut .  2 ) is a jet function describing collinear radiation. Since this radiation is energetic, it is not affected by the groomer, so that this function does not depend on z cut .

JHEP02(2018)144
• S(e (α) 2 z α−1 cut ) describes collinear soft radiation, which contributes to the observable, but is sensitive to the groomer. It can be shown that it depends only on the scales z cut and e (α) 2 through the combination e (α) 2 z α−1 cut , as indicated by the argument of the function. The multi-stage matching procedure is shown in figure 3, which also shows the virtualities of the modes contributing to the factorization formula. The results for all functions appearing in the factorization formula of eq. (3.2) allowing for resummation up to NNLL were computed in refs. [33,34] 3.1.2 Ungroomed D 2 factorization formula In ref. [15] a factorization formula was presented for the D 2 observable. For a two-prong substructure observable such as D 2 , multiple kinematic regimes with distinct hierarchies exist, each of which contribute to a different region of the multi-dimensional phase space discussed in section 2.3. The approach taken in ref. [15] was to identify all parametric regions of phase space where hierarchies occur, and to develop distinct effective field theories describing each of these regions. The different effective field theories can then be pieced together to give a complete description of the entire phase space region.
In ref. [15], three phase space regions were required to provide a description of the D 2 observable: • Soft Haze: the jet does not have a resolved substructure, and is formed from unresolved soft and collinear radiation, as in figure 4a. Here the factorization formula involves multi-differential jet and soft functions as developed in refs. [57][58][59].  is formulated in the SCET + theory of ref. [54]. In addition to the standard soft and collinear modes of SCET, it involves collinear-soft modes emitted from the dipole formed by the two subjets. In this factorization formula, the modes describing the radiation within the subjets are not sensitive to the presence of the jet boundary.
• Soft Subjet: the jet is formed of a single highly-energetic subjet, and a wide-angle subjet with energy fraction z sj 1, as shown in figure 4c. The effective field theory description of this region of phase space was first presented in ref. [55]. Its complexity arises due to the fact that the soft subjet is sensitive to the presence of the jet boundary.
A smooth transition between the collinear subjets and soft subjet regions of phase space was achieved using a zero bin-like procedure to remove any overlap. A similar approach was advocated in ref. [56].

Groomed D 2 factorization formula
Having reviewed the factorization formula for the soft dropped energy correlation functions, as well as for the D 2 observable, we can now combine these two approaches to provide a factorized description for groomed D 2 . This will be accomplished by refactorizing an analogous parent effective theory to the expression eq. (3.1) in the different parametric regions:     The merging and region analysis will be similar to that performed in ref. [15] for D 2 without soft drop. Before giving a detailed discussion of each of the factorized expressions -12 -

JHEP02(2018)144
in the different phase space regions, we give a brief overview of the different regions of phase space that can contribute and the dynamics occurring in each region, as well as comparing them to the three phase space regions which contributed to ungroomed D 2 , as shown in figure 4.
To describe the D 2 distribution of a jet on which the soft drop grooming algorithm has been applied, we will similarly need three regions of phase space. Note, however, that since all the factorizations will appear as refactorizations of the eq. (3.3), all components of the factorized expression which contribute to D 2 will be collinear in nature. This will significantly simplify the analysis. In particular, the wide-angle soft subjet region of phase space is completely removed from contributing to the observables by the soft drop algorithm. In the soft subjet region of phase space, we would have e (α) 2 ∼ z sj . However, by assumption, we take e (α) 2 z cut , and therefore, the wide angle soft subjet is removed by the soft drop algorithm. This region of phase space will instead be replaced by a collinear-soft subjet which has characteristic energy fraction z cs ∼ z cut . The effective field theory description for this hierarchy is new, and will be described in section 3.2.3.
The three phase space regions that will contribute to the D 2 observable as measured on a soft dropped jet are shown schematically in figure 5. A brief description of each of the different phase space regions is as follows: • Collinear-Soft Haze: the jet does not have a resolved substructure. It is formed entirely from unresolved collinear-soft radiation. This is shown schematically in figure 5a.
• Collinear Subjets: as shown in figure 5b, in the collinear subjets region of phase space, the jet consists of two subjets of approximately equal energies, and a small opening angle, surrounded by collinear-soft radiation.
• Collinear-Soft Subjets: the jet is formed of two subjets, of parametrically different energies, with the softer jet energy set by z cut , but where the opening angle between the jets is still assumed to be small. Unlike the previous two phase space regions, because z cut sets the energy of the soft jet, there is no additional collinear-soft radiation at a wider angle than the soft subjet. This is shown schematically in figure 5c.
It is interesting to contrast the different phase space regions for the D 2 observable with and without the soft drop grooming algorithm applied, as shown in figures 4 and 5. These configurations are similar, with the exception that the wide angle radiation is removed by the soft drop algorithm, so that only collinear-soft radiation remains. Importantly, this radiation is boosted along the direction of the jet. It is therefore not sensitive to the directions of other jets in the event, all of which appear boosted in the opposite direction, and it is also not sensitive to the radius of the jet. This will lead to a large degree of universality for the soft dropped D 2 distributions, and simplify their calculation in the presence of additional jets.
We now discuss each of the phase space regions in figure 5 in detail, and present factorization formulae describing the radiation in these different regions of phase space. These factorization formulae will allow for the radiation at each hierarchical scale to be described by a different function, allowing for large logarithms in the perturbative calculation to be resummed. A complete description of the groomed D 2 distribution can then be obtained by merging these different factorization formulae. We will discuss how this is done in sections 3.2.4 and 3.2.5.

Unresolved substructure: collinear-soft haze
We begin by discussing the factorization in the region of phase space where the jet has no resolved subjets. The factorization formula in this region of phase space will follow almost identically the soft-haze factorization formula of ref. [15], except that the soft function will be replaced by a boosted collinear-soft function due to the implementation of the soft drop algorithm. Recall that in this region of phase space we have collinear modes and collinear-soft modes, and the power counting for the observables is In this region of phase space, e in this region of phase space is then given by An illustration of the multi-stage matching for this factorization theorem is illustrated in figure 6.
A similar factorization formula was proposed in ref. [15] for describing the unresolved region of phase space for the D 2 observable without the soft drop algorithm. This region also first contributed to the observable at NNLL order, and was therefore not considered. This was because the endpoint of the ungroomed distribution is 1/e (α) 2 1, and therefore the distribution has a smooth long tail, which can be well-approximated by simply extending the factorization formulae from the two-prong region of phase space. However, in the case that the soft drop algorithm is applied, it was shown in section 2.3 that the D 2 distribution has an upper boundary at D max 2 = 1/(2z cut ). This endpoint feature is not described by the factorization formulae in the two-prong region of phase space, as it is expanded away. Matrix elements in the collinear-soft haze region of phase space are required to describe this kinematic feature. We will therefore compute the fixed-order matrix elements at O(α 2 s ) and match within the effective theory.
The most convenient way to calculate the D 2 distribution in the soft haze region to O(α 2 s ) is to integrate the appropriate 1 → 3 splitting functions, as described in appendix D. This is equivalent to calculating the D 2 distribution in the parent theory of eq. (3.3). One can explicitly check that one reproduces the matrix elements of the collinear-soft haze factorization when two of the emissions in the 1 → 3 splitting functions are taken to be soft, and when two are taken to be collinear and one is soft. These contributions reproduce the two-loop soft function, and the convolution between the one-loop jet and one-loop soft functions within the factorization formula of eq. (3.5).
A critical feature of the 1 → 3 splitting functions, as shown in appendix D, and by extension, also the collinear-soft haze factorization formula, is that at O(α 2 s ) all the e dependence merely becomes a multiplicative factor to the shape of the D 2 distribution in the collinear-soft haze region. This implies to N 3 LL logarithmic counting in the e (α) 2 logarithms, that the e (α) 2 spectrum is simply multiplicative to the normalized D 2 distribution. This is consistent with the arguments given in section 2.3 about the endpoints of the groomed and ungroomed D 2 distributions. The endpoint of the ungroomed D 2 distribution is set by the value of e (α) 2 at fixed order, so the functional dependence of the ungroomed D 2 spectrum is highly nontrivial. One would have to convolve the Sudakov resummation of the e (α) 2 spectrum with the ungroomed D 2 distribution as a function of e (α) 2 in order to accurately describe even the normalized endpoint of the ungroomed D 2 distribution. The grooming procedure decouples the shape of the endpoint from the value of e (α) 2 , significantly simplifying the calculation of the D 2 distribution at large values of D 2 . We explain in more detail the importance of these observations when considering the matching between resolved and unresolved limits in section 3.2.5.
As a check of the splitting function integration, we also compute the D 2 distribution with EVENT2 [60] and then match to the factorization formulae for the two-prong phase space regions.

Resolved substructure: collinear limit
Here, we will determine the factorization formula in the limit when the jet has two relatively hard collinear subjets. To derive this factorization formula, we must return to the parent theory of eq. (3.3) (for brevity, we just focus on one hemisphere): Now, on this soft dropped jet on which we have measured e (α) 2 , we additionally measure e (α) 3 , with the assumption that e 2 ) 3 . In this limit, and using the mode decomposition outlined in ref. [15], we can factorize the jet function into a hard, collinear splitting: Here, z is the momentum fraction of one of the subjets, and H 2 (z, e 2 ) is a function that depends on e (α) 2 that describes the hard, collinear splitting. 6 3 ) are the jet functions that describe the collinear radiation off of the two hard prongs in the splitting. C s (e (α) 3 , z cut ) is the collinear-soft function that describes relatively soft radiation emitted off of the dipole formed by the two hard prongs. In contrast to the ungroomed D 2 distribution, there is no global soft contribution (and thus for e + e − , no two-eikonal line soft function depending on e (α) 3 ), as the jet has already been isolated by the grooming procedure. The factorization formula in the two-prong collinear limit is then: We could stop with this factorization, and begin calculating the resummation of D 2 ; however, it is worthwhile to further analyze the structure of the collinear-soft function.
As e (α) 3 → 0, this forces the energy of the soft modes within C s (e (α) 3 , z cut ) to zero. All emissions generated off of the eikonal lines within the collinear-soft function can only contribute to the observable by being clustered with one of the legs of the hard prongs, before the legs themselves are clustered together. Otherwise, the emission will be at too low an energy scale and too wide of an angle to be included in the groomed jet. Correspondingly, emissions that do contribute to e (α) 3 ) cannot be emitted at too wide of an angle. If these emissions are not first clustered with one of the two hard prongs, then they are necessarily groomed away. Therefore, we separate out two angular regions of the collinear soft function and write: Again, we emphasize that the constraint θ < θ ab or θ > θ ab is schematic; the precise constraint will depend on the detailed clustering history, however it is purely geometrical 2 zcut A schematic of the multistage matching procedure in the collinear limit of resolved substructure. The function incorporating non-global collinear effects is not shown, but is discussed in the text.
in its implementation. The last function is independent of the renormalization group, and encodes local-to-the-jet non-global correlations. The presence of the hard splitting with an opening angle set by e (α) 2 implies that an effective jet area is created within the two prong region. This will lead to non-global correlations between emissions that are groomed away, but emit into this opening angle, and the emissions which come off of the primary hard legs.
An illustration of the origin of these non-global logarithms (NGLs) is illustrated in figure 8. In this figure, a hard collinear quark and collinear gluon (denoted by the curly curve with a line through it) sets the mass of the jet, and then their dipole emits a softcollinear gluon. This soft-collinear gluon has sufficiently low energy and fails soft drop, but re-emits another soft-collinear gluon that is clustered into the hard collinear particles. Such a re-emission is non-global in origin, as it is simultaneously sensitive to the infrared scales z cut and e (α) 3 . However, all particles in this picture are collinear, as the jet was already isolated from the rest of the event in the first stage of matching. Therefore the resulting NGLs only depend on the fact that the jet was initiated by a hard quark (in general, they depend on the flavor structure of the splitting). Because of this universality, these non-global logarithms are significantly less worrying than more standard NGLs that occur in (ungroomed) jet mass distributions. Techniques have been developed for systematic calculation of NGLs [55,[61][62][63][64][65][66][67][68], and these NGLs associated with the soft drop procedure have interesting features not previously encountered due to the clustering history. We have performed some preliminary estimations of these NGLs, and find their numerical effect is small, well within our uncertainties for the purely global (Sudakov) contributions. At leading logarithmic accuracy in the large-N c limit, they can be computed using an extension of the Monte Carlo algorithm of Dasgupta and Salam [46], which is described in appendix E.  With these replacements, the factorization formula now becomes where the functions are as follows is the hard function, in this case for e + e − → dijets.
• S(z cut ) is the soft function describing wide angle soft radiation which has been soft dropped.
2 ) is a hard function describing the production of the two collinear subjets.
3 ) are the jet functions for the collinear subjets.
3 , z cut ) describes the entanglement between the groomed soft-collinear emissions and the two-prong region.
The hierarchy of scales of the functions in this region of phase space is illustrated in figure 7. The calculations of the functions in this factorization formula to one-loop accuracy are presented in appendix A. There, we also demonstrate the consistency of this factorization formula by showing that the sum of anomalous dimensions is indeed 0. Figure 9. Structure of emissions at one-loop order that contribute to the hard matching function in the soft subjet factorization formula for groomed jets. The emission at a wider angle will be groomed away, while the other emission sets the mass of the jet. (a) corresponds to the Abelian emission of gluons, and will contribute proportional to C F ; (b) corresponds to non-Abelian gluon emissions, and will contribute proportional to C A ; and (c) corresponds to a gluon splitting to quarks, which contributes proportional to n F .

Resolved substructure: soft limit
When the jet has two subjets whose energies are hierarchically separated we can determine the form of the appropriate factorization formula in the same manner as in the previous section. As in that case, we start with the parent jet function from eq. (3 .3): This preliminary factorization has removed the collinear subjet contributions, but has not distentangled all the soft scales. This requires a matching procedure that cannot be implemented at the level of the amplitude, but must be performed at the amplitudesquared. 7 The matching procedure is complicated by essentially the same physics that determines the NGLs encountered in the collinear-subjets of the resolved region, but now we must take the energy scale of one of the legs to be just above z cut , and hence not parametrically separated from the soft-collinear emissions which are groomed away just below z cut . Thus we write 3 , z cut ) denotes hard matching contributions where additional Wilson lines and jet functions are introduced to capture the non-global correlations. The function C s (e (α) 3 , θ < θ ab ) is the same as found in eq. (3.10). As it stands, this factorization is sufficient to resum all large global (Sudakov) logarithms, and to leading logarithmic accuracy in the NGLs, the function C sj−NG s (e (α) 3 , z cut ) is identical to that found in eq. (3.10). 8 There are a few things to note about this factorization formula. First, there is a jet function J sc (e (α) 3 ) that describes collinear radiation off of the soft subjet in the larger jet. To leading power, this subjet is always a gluon and is identical to the z → 0 limit of the JHEP02(2018)144 corresponding gluon jet function in the factorization formula in the case of hard, collinear subjets. Additionally, there is the identical collinear-soft function as in the hard collinear subjets factorization formula. Because we assume that D (α) 2 z cut , emissions that set the value of D (α) 2 must be at parametrically lower energies than either of the subjets. Therefore, the soft drop constraint on these emissions is just a geometric constraint that enforces the emissions to first cluster with one of the hard subjets. This geometrical constraint is necessarily independent of the energy of the subjets of the larger jet, and therefore this collinear-soft function is identical to that which appears in eq. (3.10).
The novel part of this factorization formula is the hard matching function, 2 , z cut ) that describes the production of the soft subjet. This function has now two contributions relevant for a NLL resummation, or one-loop calculation. First, there are the standard virtual contributions, which just correspond to the z → 0 limit of the corresponding matching coefficient in eq. (3.10). There is, however, a new contribution to the matching function in this factorization formula. Because we apply soft drop, it is possible that there is an initial emission in the jet that fails soft drop, and so does not seed the production of a soft subjet. However, a secondary emission could then pass soft drop, and produce the soft subjet. These different configurations are shown in figure 9.
The calculation of this two emission contribution to the hard matching function 2 , z cut ) is presented in appendix B.1, but we will describe its features here. We must consider all possible pairs of soft emissions which are reclustered in such a way that the first angular-ordered emission fails soft drop, while the second passes. This is the reason why we explicitly show the z cut dependence in this function. Note that the constraint that one emission fails soft drop while the other passes eliminates the collinear singularity when the emissions become close in angle. If the two emissions are sufficiently close in angle compared to their collective angle to the hard jet core, then they will be clustered together first, which is forbidden by assumption. This implies that the contribution to this hard function from the emission of a soft quark-anti-quark pair does not contribute to NLL order. The emission of soft gluons will contribute at NLL order. Figure 9 shows a schematic picture of these two-emission contributions to the hard matching function.
We therefore find that the complete factorization formula for a soft dropped groomed jet with a soft subjet is 3 , z cut ) .
(3.13) A brief review of the different functions entering the factorization is as follows is the hard function, in this case for e + e − → dijets.
2 , z cut ) is a hard function describing the production of the soft subjet.
3 ) is the jet function for the energetic jet. Figure 10. A schematic of the multistage matching procedure in the soft-collinear subjet region of phase space. The function incorporating non-global collinear effects is not shown, but is discussed in the text.
3 , z cut ) describes the non-global correlations arising from groomed softcollinear emissions.
• S(z cut ) is the soft function describing wide angle soft radiation which has been soft dropped.
For future use, we record the virtualities of the different modes, which are also shown in figure 10: . (3.14) The scalings of these modes will play an important role when studying the behavior of non-perturbative power corrections. Note that while the functions in the factorization theorems in the collinear and soft subjet regions are similar, the hierarchy of scales is different. These differences are present in figures 7 and 10, and are related to the different sensitivities to infrared scales in the two factorization theorems. For example, the J sc (e

Merging collinear and soft resolved limits
To perform a complete calculation, we must merge our description of the different resolved regions. We merge between the soft-collinear subjet and collinear subjet region by subtracting their overlap. This gives: However, to NLL accuracy, including NGLs, we can show that the collinear factorization suffices to capture all large logarithms with the appropriate scale setting. First we note that the tree-level results of the subtracted hard matching of the collinear factorization and the soft-collinear subjet agree. Then all one must check is that the resummation in the collinear sector arising from running S c (z cut , θ > θ ab ) naturally merges with the resummation in the soft-collinear sector of the function H sj 2 (z, e 2 , z cut ). For this to happen, the natural renormalization scales of product: 2 , z cut ) when z → z cut . This is accomplished so long as we use the transverse momentum of the collinear splitting as the renormalization scale for the collinear hard splitting function. We then compare the scales take from appendix A.10 (for simplicity, we take α = 2): . (3.18) In the limit z → z cut , we have: showing that the two scales merge. Finally, we note that the sum of the anomalous dimensions in eq. (A.62) gives eq. (B.7) in the limit 1 − z q → z cut , that is, the collinear subjets approach the soft-collinear region. That this must be the case stems from the purely geometrical character of the soft drop constraint. Regardless of the relative energy scales between the emissions that sets the JHEP02(2018)144

Matching resolved and unresolved limits
In this section we discuss how the factorization formulae in the resolved and unresolved limits can be merged to provide a complete description of the entire D 2 distribution. We consider two distinct merging schemes: one using profile functions [69,70] to turn off the resummation at the endpoint of the distribution, and a second using only canonical scales for the resummation at all values of D 2 , never turning off the resummation. We scale set at the level of the cumulative distribution, and take the derivative for the differential cross section. When using profiles, we retain all the logarithms of µ over the natural scale of the function in the matching, jet, and soft-collinear/collinear-soft functions to order O(α s ). 9 Thus when we turn off the resummation by taking all scales to be the factorization scale, we are left with the singular terms of the fixed-order D 2 distribution. When resummation is fully turned on by the profile function, this trivializes the contribution from the factorized functions. The specific profile used and the canonical scale choices are summarized in appendix A. 10.
For all distributions (quark, gluon, and signal), the use of canonical scales in the resummation gives a resummed distribution that completely over-shoots the singular terms of the fixed order result throughout the range of D 2 , leading to an unphysical endpoint of the distribution much greater than 1/(2z cut ). If one were to additively match and normalize, the resulting curve would be equivalent to just the normalized resummed canonical prediction, with completely unphysical behavior in the large D 2 region, and with a peak much too low due to the broad tail. Thus we adopt a strategy of multiplicative matching for the distribution (3.20) Here "fo" stands for the fixed-order distribution, which is determined from the 1 → 3 splitting functions (as discussed in appendix D) or from EVENT2 [60]. Thus, regardless of whether we use canonical scale choices in the resummed distribution or profiles to turn off the resummation, the distribution will always terminate at the physical value 1/(2z cut ). The only subtlety is if the singular distribution has a zero in the physical range of D 2 . This occurs in some cases, and we are then forced to only use distributions where the resummation is turned off via profiles before this zero is reached. We find this to be the case generically for the signal distributions if z cut < 0.2, and for the quark distribution if z cut ≤ 0.05.
It is worthwhile to understand how the merging interplays with the resummation of the e (α) 2 spectrum, and the counting of logs of D 2 versus logs of e (α) 2 . As can be directly seen from appendix D, the D 2 spectrum in the large D 2 region, which is controlled by the factorization in the soft-haze region of section 3.2.1, is independent of the value of e (α) 2 . Since the D 2 spectrum at leading order is set by the two-loop matrix elements in the soft-haze region, we may write

21)
9 If we had also retained the constants, this would be equivalent to NLL .

JHEP02(2018)144
where F (D 2 , z cut ) reproduces the fixed order spectrum in D 2 . The subscript N 3 LL indicates that this expression is valid up to N 3 LL order. Although the fixed order distribution diverges at D 2 = 0, the plus distribution ensures that the singularity at D 2 = 0 is formally cancelled by the appropriate virtual corrections, so that we have The factor multiplying the fixed order D 2 spectrum is simply the groomed e (α) 2 spectrum to N 3 LL accuracy. Once we match the resummed D 2 spectrum to the fixed-order D 2 spectrum, we replace (3.23) The matched function satisfies the properties: The resummation ensures the integrability of the matched distribution, and the matching ensures that in the region of validity of the soft-haze factorization, we reproduce the softhaze spectrum. Thus we may simply replace the plus-distribution for the fixed order result within eq. (3.21) This result is still valid to the same logarithmic accuracy in the log counting for the e spectrum is multiplictative to the D 2 spectrum, we can correctly predict the shape of the distribution for both large and small D 2 without resumming any logs of e (α) 2 to at least N 4 LL accuracy, which is far beyond the practically achievable accuracy.
We stress that such a simple matching procedure does not work when considering the ungroomed D 2 distribution. For the ungroomed distribution in the soft haze region, we are forced to write where we now have a convolution in the e (α) 2 variable, denoted by the ⊗. We may still replace the fixed order distribution in D 2 with the matched distribution, both normalized to obey the correct D 2 sum rule, but now we must perform a convolution in e   Figure 11. The collinear subjets configuration for a boosted color singlet decay is shown in (a). The structure of factorization formula is shown in (b). Figure from [15].

Signal jets
In this section, we give the effective field theory description for a groomed hadronicallydecaying color singlet, which we take for concreteness to be a Z boson: A brief description of the functions appearing in eq. (3.28) is as follows: • H(Q 2 ) is the hard function describing the production of the on-shell Z boson.
2 , m 2 Z ) is a hard function describing the decay of the Z boson into a qq pair.
• J a (e

Factorized cross section in pp collisions
In this section we will discuss the extension of the e + e − factorization formulae of section 3.2 to pp collisions. In particular, we show that for phenomenologically relevant parameters for -26 -

JHEP02(2018)144
the jet mass and p T , the assumptions of the factorization formula hold, and no new ingredients are required to extend the factorization formula to pp. The only process dependence is carried by the quark, anti-quark and gluon fractions of the process. This will follow straightforwardly from the universality of collinear factorization and the fact that all the factorization formulae of section 3.2 were obtained through a refactorization of the jet or collinearsoft functions. For concreteness, in this section we will consider the factorization for the groomed D 2 observable in pp → Z +j. We identify the highest p T jet satisfying |η J | < η max , groom it with the mMDT/soft drop algorithm, and then measure D 2 on the groomed jet. It is important to emphasize that here we are completely inclusive over additional hadronic activity throughout the event. We do not need to apply any form of veto on out-of-jet radiation, as is sometimes imposed to study ungroomed jet mass (for example, see ref. [71]).
Since our factorization formula for the groomed D 2 observable is obtained as a refactorization of the cross section for mMDT/soft drop groomed e (α) 2 observable, we begin by summarizing its factorization in pp collisions. In ref. [34] it was shown that in the region where the factorization formula applies, namely e (α) 2 z cut 1, the cross section can be written as Here, it is important to emphasize that since we are inclusive over hadronic activity in the event, a strict factorization into jet and soft functions does not apply. Indeed, it is clear that this must be the case, since the number of jets in the event is not fixed. Nevertheless, eq. (4.1) shows that all dependence on the rest of the event can be absorbed into a process dependent normalization factor D k , which does not depend on the e (α) 2 observable. In general, D k depends on the minimum p T cut, the jet radius R, rapidity cuts, parton distributions, z cut , etc. The e (α) 2 observable is set by universal collinear physics described by the convolution between the collinear-soft function and the jet function. Since these are collinear matrix elements, they depend only on the collinear dynamics of the particular jet in question, and are independent of other jets in the event. In particular, global color correlations are absent.
The D k functions depend on the parton flavor, which must be summed over, an added complication of jets in pp collisions. While parton flavor is not in general an IRC safe quantity, due to the fact that soft partons can radiate flavor into or out of the jet, it was shown in ref. [34] that the parton flavor can be defined on soft dropped jets in the limit e (β) 2 z cut 1, where the factorization formula applies. On a soft dropped jet, we can define the flavor of the jet as where f q = 1, fq = −1, f g = 0, and J SD indicates the constituents of the jet after the soft drop algorithm has been applied. If f J = ±1, then the jet is defined as quark type, while if f J = 0, the jet is defined as gluon type. In the normalized distribution, the D k can therefore be interpreted as quark, anti-quark, and gluon jet fractions in the event sample -27 -
The factorization of D 2 in pp collisions now follows trivially from combining the factorization formula of eq. (4.1) for soft dropped e (α) 2 with e (α) 2 z cut 1 with the factorization formulae derived for e + e − in section 3.2. To proceed, starting from eq. (4.1), we refactorize the jet and collinear-soft functions, as appropriate. This also implies that the same process dependent functions D k also appear in the expression of the cross section of D 2 . We can then write in the collinear subjets region of phase space, in the soft subjet region of phase space, and in the collinear-soft haze region of phase space. Importantly, since the same D k factor appears in each of the factorization formulae in the different regions of phase space, we can then perform the marginalization separately over the different factorization formulae. We can therefore write, for the normalized distribution when summed over the factorization formulae: where the κ k can be interpreted as the fraction of jets in the sample with flavor k.

Consequences of factorization formulae
Given the factorization formulae developed in the previous sections, there are several fascinating consequences that immediately follow. Several of these have been noted before (see ref. [34]), and are consequences of the fact that mMDT or soft drop removes soft, wide angle radiation in a jet from contributing to the observables of interest. Here, we will briefly mention these general properties of mMDT and soft drop grooming, and discuss in some detail features that are new to measuring D 2 on these groomed jets. The absence of soft, wide angle radiation in the jet eliminates event-wide color correlations and NGLs of the groomed jet observables to all orders in α s . With the relative scaling that we have assumed between the two-point energy correlation function and z cut , e E 2 J . Note that this property requires that we restrict the jet mass appropriately and then measure D 2 , an observable which resolves further substructure of the jet.
For applications to the LHC, it is interesting to briefly consider the values of the jet mass and p T for which our factorization formula, and therefore this universality, holds. Observables such as D 2 are used at the LHC both to identify hadronically decaying W/Z/H bosons, as well as to search for new light particles with m m Z which decaying hadronically [76]. For many of these searches, the bulk of the data is for p T > 500 GeV, and extends up to approximately p T ∼ 1000 GeV. Using the condition e (2) 2 z cut 1, with z cut = 0.1, we expect that our factorization will begin to break down around p T = 500 GeV for m J ∼ m Z , if the value of z cut = 0.1 is fixed. For lower values of p T , one will become sensitive again to global color correlations from emissions with energy fraction greater than z cut , which do not fail the soft drop criteria, and can contribute to the observable. Taking as a concrete example a bin from p T = 600 − 800 GeV in which the D 2 observable has been -29 -JHEP02(2018)144 measured by ATLAS [77], for a jet mass of m J ∼ m Z , this has e (2) 2 0.02. For z cut = 0.1 the assumptions of our factorization formula safely hold. For lighter particles the p T range can be extended, or alternatively, the expansion parameter is smaller. We therefore find that our factorization applies for most of the p T range of phenomenological interest, and therefore so do our conclusions regarding the universality of the distribution. We believe that this understanding of universality derived from the factorization formula is one of the most important outcomes of our analysis.

Hadronization corrections suppressed by perturbative jet mass
The dominant non-perturbative corrections to a factorization formula arise from modes whose virtualities approach Λ QCD . A simple estimate of the size or importance of these non-perturbative effects follows from determining the value of the observable at which the lowest virtuality mode becomes comparable to Λ QCD . The mode with the lowest virtuality often corresponds to soft, wide angle emissions. So, by grooming them away with mMDT or soft drop, we can significantly reduce the effect of non-perturbative corrections and render the perturbative distribution more robust.
To see this for D (α) 2 on groomed jets, we first review the size of non-perturbative corrections in the ungroomed case. For concreteness, we will focus on the non-perturbative corrections to the collinear subjets factorization formula. In the ungroomed case the lowest virtuality mode is that of soft, wide-angle radiation; see section 3.1.2. Its virtuality was identified in ref. [15] and is where we have expressed e If we take α = 2 for concreteness, this can be rewritten in terms of the jet mass and energy as Therefore, perhaps surprisingly, as the jet energy increases for a fixed jet mass, nonperturbative corrections increase significantly. If we assume that E J = 500 GeV, m J = 100 GeV, and take Λ QCD = 1 GeV, then D As before, taking α = 2 for concreteness, we find that non-perturbative effects dominate when D become larger, for a fixed jet mass cut, as the energy of the jet is increased. However, by grooming the jet with mMDT or soft drop, non-perturbative corrections are independent of the jet energy! Physically this arises since after grooming the jet behaves loosely like a boosted event shape, and it is the jet mass that sets the scale. As long as the mass cut on the jet is perturbative, hadronization corrections are highly suppressed. Importantly, the distribution is perturbative well below D 2 ∼ 1, into the region where two-prong jets live. Taking the numerical values of z cut = 0.1, Λ QCD = 1 GeV, and m J = m Z , we find that dominant non-perturbative correction arises from the soft dropped soft subjet region of phase space, and we can estimate that non-pertubative effects becomes important at D (2) 2 ∼ 0.35. Non-perturbative corrections for the other regions of phase space in the factorization formulae are further suppressed, and so are ignored. A more detailed study of non-perturbative effects for the D 2 distribution is performed in a companion paper [1].
Combined with the fact that the distribution terminates at (as discussed in section 2.3) this implies that, for a fixed mass cut, the full distribution, including non-perturbative effects, of the mMDT/soft drop groomed D (α) 2 is largely independent of the jet energy! Unlike the ungroomed D (α) 2 distribution, which had both an upper endpoint and location of non-perturbative corrections that depended on the jet energy, the groomed D (α) 2 distribution has endpoints and non-perturbative corrections that are independent of the jet energy. We will demonstrate in sections 6 and 7 that both the NLL calculation of the distribution as well as the Monte Carlo simulation respect this prediction.

Grooming efficiency for signal jets
While we have focused on the properties of the mMDT/soft drop D 2 distribution for background (QCD) jets, jet grooming can have an effect on the signal distribution as well. For an unpolarized boosted Z boson that decays to a qq pair, the distribution of the energy fraction z of the quark, say, is approximately flat:

JHEP02(2018)144
This implies that when the boosted Z jet is groomed, a fraction 2z cut of the jets will have one prong removed by grooming. For these jets that lose one prong, they will also typically fail the mass cut, as well as no longer have a clear two-prong structure. Of course, for z cut 1, this is formally a small effect, but practically, if z cut 0.1, then about 20% of the Z jets could have a prong removed. This effect could have a large effect on the signal D 2 distribution.
While at leading-order the distribution of the energy fraction z is approximately flat, when all-orders effects are included the regions with z → 0 and z → 1 are suppressed by a Sudakov factor. When z → 0, for example, there is of course no divergence in the leadingorder Z decay matrix element. However, a gluon emitted off of the soft decay product will itself necessarily be soft, and result in a divergence at fixed-order. When all-orders effects are included, these soft gluon divergences arrange themselves into a Sudakov factor that suppresses the probability for a decay product to only carry a small fraction of the energy of the Z. At double logarithmic accuracy (DLA), this Sudakov factor is which can be derived from the Z boson decay matrix element at next-to-leading order. This Sudakov factor pushes decay products of the Z to have more equal energies, and reduces the fraction of Z jets that have a subjet that is removed by the jet groomer. That is, due to all-orders effects, hadronic decays of Z bosons can look more two-prong-like than their fixed-order description would suggest. In our analytic calculations for the prediction of the D 2 distribution on groomed signal jets, we include this resummation to NLL accuracy. The suppression of the z → 0 and z → 1 regions will be much larger than that suggested by the simple Sudakov factor that exists at DLA accuracy. Nevertheless, even at DLA accuracy, this suppression is nontrivial. With z cut = 0.1 and using the distribution of eq. (5.9), only about 15% of Z jets fail soft drop, as compared to 20% using eq. (5.8).

NLL predictions in e + e − collisions
In this section we use our factorization formulae to provide numerical results for the D 2 distribution in e + e − collisions. In section 6.1 we compare our result, expanded to fixed order, with the fixed order code EVENT2 to ensure that we reproduce the singular behavior of the D 2 distribution. In section 6.2 we compare our resummed results, matched to fixed order, with parton shower Monte Carlo.
In all of the plots that follow, we set the angular exponent in the definition of D 2 to α = 2. We do this for two reasons. First, when α = 2, the two-point energy correlation function e (α) 2 is proportional to the squared mass of the jet. Therefore, it is easy to enforce a mass cut on jets of fixed energy. Additionally, from the analysis of non-perturbative effects in section 5.2, with α = 2, for a given mass cut, there is no dependence on jet energy of the size of non-perturbative corrections. A study of mMDT groomed D 2 with exponent α = 2 is beyond the scope of this paper, though we expect that many of the features we observe to apply to groomed two-pronged observables more generally. e + e -→ qq, 1 TeV

Singular results and comparison with EVENT2
To verify that our factorization reproduces the singular behavior of the D 2 distribution as D 2 → 0, we can compare the results of our factorization formula, expanded to α 2 s , with the fixed order generator EVENT2 [60]. In figure 12a we show the result of EVENT2 in each of the color channels, compared with the expansion of our NLL formula. We see that at small values of D 2 , our NLL formula captures the singular structure of the EVENT2 distribution, as is required. Here we consider e + e − → dijets at 1 TeV with z cut = 0.1, but we have found similar agreement for other values of the z cut parameter, while verifying the independence on the center-of-mass energy and jet mass bin. This verifies the consistency of our factorization to O(α 2 s ). Due to the complexity of our factorization, this is a highly non-trivial check, and gives us confidence that have correctly incorporated all modes in the effective theory.
In figure 12b we show a linear plot of the D 2 distribution, comparing EVENT2 [60], Pythia 8.226 [78,79], and a calculation using the 1 → 3 splitting functions that is discussed in detail in appendix D. The details of the Pythia 8.226 result will be discussed in section 6.2. This figure illustrates two important points. First, as described in section 3, our factorization formula for the D 2 observable isolates the collinear physics. If we did not want to resum the small D 2 behavior, then this shows that the fixed order result can be computed, up to power corrections, using the 1 → 3 splitting function. This is seen by the excellent agreement between the result of EVENT2 and the result computed using the 1 → 3 splitting functions, shown in figure 12b. Second, our factorization formulae describe both the small D 2 region, where there is a resolved substructure, as well as the large D 2 region, where the substructure is unresolved. A correct description of the unresolved region of phase space, with the collinear-soft haze factorization of section 3.2.1, is required to describe the correct endpoint of the distribution, which occurs at 1/(2z cut ). In the

JHEP02(2018)144
collinear-soft haze factorization, we do not need to resum logarithms of D 2 , and therefore we can simply compute to fixed order, which is equivalent to a fixed order calculation using the 1 → 3 splitting function. In figure 12b, we see first of all that all three curves reproduce well the expected 1/(2z cut ) endpoint, and second, that the calculation based on the 1 → 3 splitting function describes relatively well the distribution at large values of D 2 , and in particular, the approach to the endpoint. 10 This is important, since it illustrates that already at LO one can have a reasonable description of the endpoint of the distribution, and that the phase space of the observable is already reasonably well filled out. In section 6.2 we will further study this in the matched distributions for different values of z cut .

Comparison with parton shower Monte Carlo
Having shown that we reproduce the singular structure of the D 2 distribution, in this section we compare our NLL resummed predictions multiplicatively matched to the leading order (LO) EVENT2 or 1 → 3 splitting functions with parton shower Monte Carlo. For QCD jets, we consider both e + e − → qq, as well as e + e − → gg, generated through an off shell Higgs, while for signal, we consider e + e − → ZZ events with both Zs decaying hadronically. The events were generated with MadGraph5 2.5.5 [80], and showered with Pythia 8.226 [78,79]. We also verified that similar results are obtained with Vincia [81][82][83][84][85], although for simplicity we do not show distributions from Vincia. Throughout this section we use FastJet 3.1.2 [86] and the EnergyCorrelator FastJet contrib [86,87] for jet clustering and analysis. All jets are clustered using the e + e − anti-k T metric [86,88] using the WTA recombination scheme [89,90], with an energy metric.
In figure 13 we show comparisons of our analytic predictions (on the left) with parton shower Monte Carlo results at parton level (on the right). Results are shown for both quark and gluon jets. We also highlight the region where non-perturbative effects from hadronization will have a significant impact on the distribution, as will be discussed shortly. The distributions are shown for three different values of the z cut parameter, namely z cut = 0.05, 0.1, 0.2. Overall, good agreement between the analytic calculation and the parton shower Monte Carlo is observed, and differences between quarks and gluons, as well as the behavior as a function of z cut are well reproduced. In particular, due to the inclusion of the fixed order corrections, the correct endpoint of the distribution is obtained in the analytic calculation. This is crucial for obtaining agreement of the distributions. We also note that, as observed for a wide range of groomed jet observables, the mMDT groomed D 2 distributions for quarks and gluons have a relatively small difference, as compared to its ungroomed counterpart.
It is interesting to consider the behavior as a function of z cut , and in particular the differences between the parton shower and analytic results as the value of z cut is increased, corresponding to a more aggressive grooming. In figure 13 we see that for smaller values 10 The precise behavior of the 1 → 3 splitting function calculation and EVENT2 at the endpoint becomes sensitive to the binning used in this region, since the distribution is rapidly vanishing. One must trade accuracy of reproducing the endpoint for numerical stability of the bins. We have found that using smaller binning always improves the agreement between EVENT2 and the 1 → 3 splitting functions, at the expense of having to run longer to achieve adequate accuracy and precision.   of z cut , where the grooming has a smaller effect, better agreement is observed between the analytic and parton shower results. We believe that this arises primarily due to two effects. First, since the endpoint of the distribution scales as 1/z cut , for smaller values of z cut , there is a less rapid transition between the resummation and fixed order regime, ensuring that there are well separated resummation and fixed orders regions. Secondly, we observe that for more aggressive grooming the LO fixed order prediction undershoots the Monte Carlo distribution at large values of D 2 . Since we are considering normalized distributions, this then translates into a large difference in the peak region, as seen most clearly for z cut = 0.2. We believe that this could be remedied by including higher order perturbative corrections, which would fill out the phase space better, more similar to the parton shower. Indeed, the LO predictions are known to undershoot in other grooming or D 2 studies [34,35,53,91,92]. It would be very interesting to include higher order corrections, but this is beyond the scope of the current paper. Although our focus in this paper is primarily on the calculation of groomed D 2 for QCD jets, in figure 14 we show a comparison of our analytic results with parton shower Monte Carlo for a hadronically decaying boosted Z. Excellent agreement is observed.
Finally, it is important to address the impact of hadronization corrections to the distribution. Although it was shown in section 5.2 that hadronization corrections are suppressed significantly by the grooming procedure, they still have a non-negligible impact on the D 2 distribution. In figure 15 we show the effect of hadronization on the groomed D 2 spectrum in parton shower Monte Carlo for both quark and gluon jets, and for the different values of z cut considered above. In figure 14b the effect of hadronization is shown for the boosted Z distribution. We see that in all cases, the effect of hadronization is quite minor, and does not dominate the shape of the distribution. It can be included in the analytic calculation using a model shape function [69,[93][94][95][96], although we will not pursue this further in this paper.
2 distribution for quark (left column) and gluon (right column) jets. Non-perturbative corrections are suppressed by the grooming procedure. A mass cut of m J ∈ [80, 100] GeV has been applied.

JHEP02(2018)144
A detailed study of the non-perturbative corrections for the D 2 distribution at the LHC, and their incorporation through shape functions, are presented in a companion paper [1].

Monte Carlo results in pp collisions
Analytic predictions of the D 2 distribution for pp processes of interest are presented in a companion paper [1]. In this section we perform a Monte Carlo study demonstrating the consequences of the power counting and factorization analysis presented earlier. For convenience, we recall here the major predictions of our analysis. We emphasize that these are robust predictions of the factorization formula, combined with the power counting analysis, which hold within the region of validity of the factorization formula, namely e (α) 2 z cut . They should therefore be independent of the details of the hadronization model or details of the perturbative shower. The non-trivial predictions are: • The endpoint of the D 2 distribution is fixed as 1/(2z cut ). This is independent of the jet mass, jet energy, hadronization, or the angular exponent used to define the energy correlation functions.
• The scale at which hadronization corrections become important is independent of the p T of the jet. It depends only on Λ QCD , the jet mass, and the value of the z cut parameter.
• The distributions depend only the quark vs. gluon fraction of the jets in the event, but are otherwise process independent. We have previously argued that the soft drop procedure also reduces the dependence of the distribution on the parton flavor.
We will see that each of these predictions is well reproduced by Monte Carlo parton shower in pp collisions. The parton-level samples in this section were generated at the 13 TeV LHC with MadGraph5 2.5.5 [80] and showered with Pythia 8.226 [78,79] with default settings. Jets were clustered with the anti-k T algorithm [88] in FastJet 3.1.2 [86] and the EnergyCorrelator and SoftDrop FastJet contribs [86,87] for jet analysis.

Signal distributions
We first show the distributions for signal jets. The signal jets are clustered from pp → ZZ events in which one Z boson is forced to decay to neutrinos, and the other to hadrons. In figure 16, we plot the groomed jet D 2 distributions of the hadronically-decaying Z boson. In these plots, the mMDT groomer (soft drop with β = 0) is used, with the parameter z cut = 0.1. A mass cut of 80 GeV < m J < 100 GeV is also imposed on the groomed jet. First, in figure 16a, we plot the D 2 distribution for various p T cuts on the jets. The signal distributions are stable with respect to p T , which might be expected as the Z boson is a color-singlet and has an intrinsic, Lorentz-invariant energy scale, m Z . Importantly, the grooming appears to successfully remove contamination radiation at wide angles in the jet that would distort the distribution, and potentially become more important at larger p T .
In figure 16b, we fix the transverse momentum range to p T > 500 GeV, and vary the clustered jet radius from R = 0.6 to R = 1.0. Essentially no effect on the distribution is  observed in changing the jet radius, corroborating the performance of the mMDT groomer in removing contamination radiation. Note also that with this p T range and these jet radii, the Z boson decay products are well-contained within the jet. For this p T range, the angular scale of the Z boson decay products R Z is approximately

Background distributions
It may have been expected that the signal distributions were robust under changes of jet parameters, both due to grooming as well as the intrinsic mass scale. Here, we will show that mMDT/soft drop grooming also renders the background distributions extremely robust to jet parameters. We study jets produced in two processes for our background: pp → Zj and pp → Hj. The production of the Z or H bosons in association with the jet enables a handle on the quark and gluon jet fractions. Because soft drop grooming formally makes quark and gluon jet definitions infrared and collinear safe, we could in principle extract the individual quark and gluon jet distributions of D 2 from these two samples; however, since separate quark and gluon distributions were studied in the context of e + e − in section 6, here we will focus only on the mixed distributions. To easily isolate the hadronic jet in these events, we force the Z boson to decay to neutrinos and the H boson to decay to photons. As with the signal events, a mass cut of 80 GeV < m J < 100 GeV is imposed on the groomed jet.
In figure 17, we plot the mMDT D 2 distributions for various jet p T cuts, for both the pp → Zj and pp → Hj samples. As predicted from our factorization formula, the background distributions are very weakly dependent on the jet p T . This is a consequence of the facts that the endpoint of the distributions are fixed at D max 2 = 1/(2z cut ) = 5 and that non-perturbative effects become important at a scale set by the jet mass, and not the jet p T .
Additionally, the distributions in the Z and H samples are very similar, demonstrating that quark vs. gluon flavor effects are small. This is a consequence of both the constraints on the threshold and endpoint kinematics as well as the mixture of quark and gluon jets in the two samples. The constraints on the bounds of the distribution were derived independent of jet flavor, and so the distribution has a relatively weak dependence on jet flavor.
Plots of the groomed D 2 distributions on the background jet samples with different jet radii are shown in figure 18. Here, the jet p T cut is fixed to p T > 500 GeV, while the -40 -

JHEP02(2018)144
jet radius ranges from R = 0.6 to R = 1.0. As observed with the signal distributions, there is extremely weak dependence on the jet radius, demonstrating that mMDT/soft drop is efficient at removing wide-angle radiation in the jet that would be sensitive to the jet radius. After grooming, all radiation in the jet is collinear, and the relevant angular scales are set by the ratio of the groomed mass to the jet p T . By construction, this angular scale is always less than the jet radius (see eq. (7.1)), and so the jet radius is never relevant.

Conclusions
In this paper we have performed a detailed study of the factorization properties of groomed multi-prong observables, focusing in particular on the D 2 observable with mMDT or soft drop β = 0 grooming. We derived factorization formulae which describe the observable to all orders in α s , and allow us to make powerful statements of phenomenological relevance about the behavior of the groomed D 2 observable. Most interesting are the fixed endpoint of the distribution at 1/(2z cut ), and the independence of non-perturbative corrections on the jet energy scale. Combined, these imply a remarkable robustness of the groomed D 2 observable, which is important for jet substructure applications.
We have introduced factorization formulae describing each region of phase space relevant for groomed boosted boson discrimination. Some of these factorization formulae follow by combining those which previously existed in the literature, however some are new. In particular, we derived a novel factorization describing the production of a soft subjet with energy the scale of the soft drop parameter z cut . This factorization has the interesting property that clustering effects enter into the hard matching coefficient for the production of the soft subjet. We computed the functions entering the factorization at one-loop, and showed renormalization group consistency of the factorization formulae. While we have focused on applying these factorization formulae to the particular case of soft dropped D 2 , we believe that they will be more generally applicable for describing groomed jet substructure observables, particularly those based on the energy correlation functions. This includes, for example, the N 2 [43] observable used by CMS [76,97], or more ambitiously, energy correlation based observables for boosted top tagging [16,43,98].
We performed a numerical study for e + e − → dijets at NLL order, considering both the case of e + e − → Z → qq and e + e − → H → gg, allowing us to understand the differences between the D 2 distributions for quark and gluon jets. Non-perturbative effects were found to be small, and good agreement was found between the predictions of the Monte Carlo parton shower, and the analytic calculation. Since the D 2 observable probes multiparticle splittings, it may prove useful for testing Monte Carlo generators that implement 1 → 3 [99,100] or 2 → 4 splittings [101].
Perhaps most interestingly, we have shown that due to the grooming procedure, our calculations extend straightforwardly to proton-proton collisions at the LHC. This allows for experimentally realistic jet substructure observables currently used at the LHC to be calculated with theoretical precision. In this paper we performed a Monte Carlo study, showing that the features predicted by the factorization formulae are reproduced by Monte Carlo parton shower generators. In particular, we focused on the robustness of the groomed -41 -JHEP02(2018)144 D 2 distribution as a function of the jet p T , the scaling of non-perturbative hadronization effects, and the partonic content of the jet. We also showed that the groomed D 2 distribution is process independent up to the quark-gluon fraction of the jet. These are experimentally desirable features, which can be derived from a first principles theoretical description, and make groomed D 2 a promising observable for QCD studies at the LHC, as well as putting its theoretical understanding as a jet substructure tagger on firm theoretical footing. Analytic results for D 2 distributions for relevant processes at the LHC are presented in a companion publication [1].

Acknowledgments
We thank Ben Nachman, Stefan Prestel and Matt Schwartz for helpful discussions.

A Ingredients for collinear subjets
In this appendix we present the one-loop calculation of all the functions appearing in the collinear subjets factorization formula.

A.1 Kinematics and notation
We begin by briefly describe the kinematics and notation. We follow closely [15]. We let Q be the center of mass energy of the e + e − collisions. The energy deposited in each hemisphere is therefore Q/2, and the four-momenta of the hemispheres are so that we have s = Q 2 . For the kinematics of the subjets, we will use the following notation

JHEP02(2018)144
These satisfy the relations n · n a = n · n b = n a · n b 4 ,n · n a =n · n b = 2 , (A.6) n ⊥a,b ·n ⊥a,b = −n ⊥a,b · n ⊥a,b =n ⊥a,b ·n ⊥a,b = n a · n b 2 . (A.7) For a particle in the collinear sector a or b, we have We will label the energy fractions carried in each subjet by where the second relation is true to leading power.
We can now compute the leading power expressions for the observables in these kinematics. The value of e (α) 2 is given by For three emissions, with momenta k 1 , k 2 , k 3 , the expression for the three point energy correlation function is For an emission collinear with one of the subjets, where we have the splitting p a,b → k 1 +k 2 , we have For a third collinear-soft emission k off of the p a,b partons, we have 14) The factorization formula in this region of phase space, which we repeat here for convenience, is: 3 , z cut ) .

JHEP02(2018)144
The convolutions over e (α) 3 can be turned into products by Laplace transforming with respect to the variable e (α) 3 . We denote the Laplace transformed variables and functions with a tilde. When Laplace transformed, the cross section can be written as In this appendix, we will present the expressions for these Laplace transformed functions.

A.2 Matrix element definitions
We give the matrix element definitions of all functions in the factorization formula in eqs. (A.15) and (3.13), and eq. (3.28). The factorization formulae presented in this paper are formulated in the language of SCET [38][39][40][41]. We refer readers unfamiliar with SCET to the reviews [102,103]. The jet functions are identical to [15], so we only give the soft matrix elements. They are defined in terms of soft Wilson lines Explicitly, we have . (A.20) Here a, b are the light-cone directions of the subjets, the operator Θ SD (a, b, z cut ) imposes an energy cut of Qz cut /2 on any emission that is not clustered into legs a, b before they themselves are clustered, and E 3

(α)
SDa,b returns only the energy-correlation function computed on all momenta that are clustered into a or b before a and b are clustered. This works out to be a purely geometrical constraint, for both the soft-collinear and collinearsoft functions, since all emissions are at a scale where they cannot pass soft drop on their own. T f are color matrices contracted into the Wilson lines, and in general depend upon the flavor structure of the splitting, as does the representation of the Wilson lines, that is, whether the 1 → 2 subjet splitting was g → gg, q → qg, or g → qq.
Throughout these appendices we will refer to the perturbative order of the calculation with a superscript (L), where L denotes the loop order of the calculations. For example, for a function F , we have where F (L) denotes the L-th loop correction to the function F . The superscript (T ) stands for "tree".

A.3 Hard function
The hard function for e + e − → dijets can be found in [54,[104][105][106]. To O(α s ) it is given by and its anomalous dimension is given by For notational simplicity, throughout these appendices we do not explicitly write the arguments of the anomalous dimensions.

A.4 Hard splitting function
The hard splitting function describing the q → qg splitting into two collinear subjets was first derived in the SCET + context in [54] using results from [107,108], while the g → gg splitting and g → qq was derived in [15], where all the coefficients were given in terms of the variables most useful for the current study. Taking as an example initial quarks splitting to O(α s ), the hard splitting function is given by and its anomalous dimension is

A.5 Jet functions
The jet functions in the collinear subjets region of phase space are identical to the case that no grooming is applied, and therefore are given in [15]. To O(α s ), the quark jet function in the direction n a is given bỹ

JHEP02(2018)144
and its anomalous dimension is The gluon jet function is given bỹ and its anomalous dimension is

A.6 Wide-angle soft function
The wide angle soft function in the collinear subjets region of phase space does not resolve the collinear splitting, and is therefore identical to that for the soft dropped mass in e + e − , which was derived in [34]. At O(α s ) it can be written as a sum over dipoles, and is given by Its anomalous dimension is (A.31)

A.7 Collinear-soft function
The collinear-soft function, C s (e (α) 3 , θ < θ ab ) is new, and we therefore calculate it to O(α s ) in this appendix. At O(α s ), it can be written as a sum over dipoles where the contribution from a given dipole is given by 3 . (A.33) The soft drop constraint in this region of phase space is given by (A.34) -46 -

JHEP02(2018)144
The soft drop constraint is easily understood: the emission is to be clustered into direction a or direction b of the splitting before the two legs are themselves clustered. The measurement function is We may take the light-cone direction of the parent jet to be n = 1 2 (n a + n b ), conjugate tō n. Decomposing k into the light-cone basis formed by n-n, we have n a · k = n a · n b 4n · k + n · k + n a · k ⊥ , (A.36) Here we have used n a ·n = n b ·n ≈ 2. We can write the dot product in the transverse plane as Here we choose a fixed but arbitrary direction in the transverse plane for the projection of the direction n a , defining the angle φ. Solving the on-shell condition and rescaling, we havē and the Lorentz invariants become We can now see that the integration can be efficiently represented by an integral in the transverse plane. The integration measure, having solved the on-shell conditions with these rescalings, is then given by We can now perform a shift k → k + 1 2n , to get the compact form n a · k = Qn a · n b z k 2 , (A.43) The measurement function then becomes 2 (n a · n b ) 2 16 z k 2 ( k +n) 2 , (A. 45) and the soft drop condition becomes We note that the reduction of the collinear-soft function to integrals over a transverse plane is not entirely unexpected, given the duality between time-like and space-like soft processes, see refs. [109][110][111].

A.7.1 Soft integrals in the transverse plane
We solve the measurement constraint using the z integral, and go to Laplace space. We note that the roles of a and b are interchangeable (this amounts to mapping k → − k −n), so that we have 3 , θ < θ ab ) = −C in T a · Tn Here we have introduced the prefactor

A.7.2 Isolating divergent contributions
We note that all integrals in eq. (A.47) only have a divergence at k 2 = 0, so that we may add and subtract to each color structure the integral The resulting subtracted integrals are all finite, and can be evaluated numerically as a Taylor series in in terms of two-dimensional integrals. We have found an analytic expression for all the divergent contributions. For instance, in the a-b dipole, we first perform the integral over the magnitude of the vector in the transverse plane, which after some algebra, gives the angular integral Here Cl 2 (x) is the Clausen function, which has a value of approximately 1.01494 at its maximum x = π 3 . The other dipoles are handled similarly. Putting in the appropriate factors for MS scheme, we find C s,ab (ẽ where the logarithm is defined as From these expressions it is straightforward to extract the anomalous dimension and perform the renormalization group evolution using standard techniques.

A.8 Soft-collinear function
Now we compute how radiation that is not clustered into the two hard prongs gets groomed. Such radiation is described by the function S c (z cut , θ > θ ab ). At one-loop order, the contribution from the generic dipole i, j is given by where Θ SD was defined in eq. (A.34), and the constraint on the energy fraction of the groomed radiation is given by We use the same coordinate system as above, and arrive at the following representation for each dipole where we have defined the constant (A.57) -49 -

JHEP02(2018)144
To isolate the k = 0 divergence, we now add and subtract the integral We then find that the divergences and µ-dependent logs have the structure where the logarithm is defined as It is again straightforward to extract the anomalous dimensions from the above results, and perform the renormalization group evolution.

A.9 Anomalous dimensions
Here we show that the anomalous dimensions sum to zero, showing the consistency of the factorization formula at the one-loop level. We label the anomalous dimension's flavor structure to distinguish different channels. For q → qg splitting, the anomalous dimensions calculated above are given by For g → gg splitting, we have One can deduce the anomalous dimension structure for g → qq using the appropriate color generators in the collinear-soft matrix elements, and the matching for the splitting given in [15]. To achieve NLL accuracy, one must include the contribution from the two-loop cusp anomalous dimension to the coefficient multiplying logarithmic terms in each of the anomalous dimensions. For e + e − → hadrons events in which we divide the event into hemispheres, groom each hemisphere, and then measure e 2 and e 3 on each hemisphere, the anomalous dimensions must satisfy One can verify that indeed this is satisfied, demonstrating consistency of the factorization at one-loop, or NLL accuracy. Due to the highly non-trivial combinations of scales appearing in the different functions, this provides a strong cross-check of our calculation and factorization formula.

A.10 Description of resummation, scale choices and profiles
Since we are resumming to NLL, the contribution to the cross section from each factorized function is given by the formula where γ F is the appropriate anomalous dimension for the considered function, and µ f is the scale where we run the function to. For canonical scale setting, µ f is where all large -51 -

JHEP02(2018)144
logarithms are minimized. To perform the resummation, we substitute in eq. (A.15) the resummed expression for each function. In general, so that we may make use of profiles to control the precise value of the resummation scale where needed, and so that we can match to the fixed order result, we keep all terms in the renormalized functions that explicitly contribute to the anomalous dimensions. Thus, when we turn off resummation, we will recover explicitly the differential cross section. When we resum we always scale set in the cumulative distribution. That is, we exponentiate all anomalous dimensions in Laplace space, perform the inverse Laplace transform with generic endpoints to the RG evolution, and integrate to get the cumulative distribution. Then we set all scales to their canonical values (which are given below), and take the derivative to get the differential distribution. For a detailed discussion of how to implement the resummation procedure in SCET, see [112].
The canonical scales, given as functions of D 2 , are When performing the resummation, we take the wide-angle soft scale as the common scale where we factorize. When assessing resummation uncertainties, we vary all scales up and down by a factor of two. To handle the Landau pole in the running coupling, we smoothly "freeze out" the running coupling as a function of its scale at 1 GeV, so that it is simply a constant function below this value, and vary this freeze out scale up and down by 0.5 GeV. In general we have very little sensitivity to the freezing scale of the running coupling. To achieve NLL accuracy, we also promote the coefficient of the logarithmic terms in the anomalous dimensions in appendix A.9 to two-loop accuracy, as given by the perturbative expansion of the cusp anomalous dimension, and the running of the coupling (including when integrating the anomalous dimensions).
To turn off the resummation, we use the simple profile This choice ensures that canonical resummation effectively dominates at D 2 ∼ 1, and that the resummation is turned off at the endpoint D 2 = 1/(2z cut ). As part of the uncertainty estimate (in addition to scale variations of the canonical resummation scales), these choices were varied by 50%. We found that these profiles were sufficient to numerically cancel any zero that developed by dividing by the singular result in eq. (3.20), when the singular result developed a zero in the physical range. A more sophisticated profile would ensure that such a cancellation would happen exactly, but we found negligible differences between the matched result and the fixed-orded cross section even in a small neighborhood about the zero, given our numerical accuracy and sampling of the matched spectra.
The differences between the canonical scales setting and the profile scale setting is illustrated in figure 19a. The profile scale setting shifts the peak of the distribution to larger values of D (2) 2 , and the canonical scheme is more consistent with the MC distribution found from Pythia. In figure 19b we give an example where the profile scale setting must be used when performing multiplicative matching, due to the singularity that would otherwise be induced by not turning off the resummation. Again, the profiled distribution has its peak shifted to larger values of the distribution when compared against Pythia. For the background distributions considered in this paper, only the quark jet with z cut = 0.05 needed a profile scheme.
Results in this appendix will be expressed in Laplace space. Note that the anomalous dimension of the hard function H(Q 2 ) is zero in QCD because the production of the Z boson occurs in the QCD vacuum.

C.1 Soft matrix elements
The soft matrix elements are simplified relative to the earlier case of QCD splittings, since we have no Wilson line in the direction of the recoil. We therefore have 3 , θ < θ ab )S c (z cut , θ > θ ab ) .

(C.5)
As before, S q are soft Wilson lines, as defined in eq. (A.17).

C.2 Hard decay function
The anomalous dimension of the hard decay function, H Z→qq 2 (e (α) 2 , m 2 Z ), is identical to the anomalous dimension for the hard function of e + e − → qq, at Q 2 = m 2 Z . We therefore have Identifying the mass of the jet with the two-point energy correlation function as this anomalous dimension can also be expressed as The tree-level matrix element for Z boson decay to quarks is well-known and was first calculated in ref. [114]. with J(y a , y b , y k 2 ) = 1 16δ 3 sech 2 y a δ sech 2 y b δ sech 2 y k 2 δ .
(D. 16) The numerical integral is then performed at = 0. To perform the integral, we randomly sample y a , y b , and y k 2 uniformly on an interval [−y max , y max ], while φ is uniformly sampled on [0, 2π]. The variable δ simply controls how often one samples the region where D 2 ∼ 1. The value of y max sets the minimal D 2 that we can integrate down to. For each generated phase space point we compute the value of corresponding value of D 2 and the weight w(y a , y b , y k 2 , φ) = SD z cut , z a , (1 − z a )z b , k 2 , φ Θ J R; e Then a histogram H D 2 , indexed by D 2 , is updated according to where ∆D 2 is the width of bin at position D 2 . We then divide all bins in the histogram by the total number of phase space points sampled. In the limit of infinitely narrow bins, infinite statistics, and y max → ∞, the final histogram H D 2 is proportional to the differential spectrum dσ de

JHEP02(2018)144
Aside from the transformation to the rapidity like variables y a , y b and y k 2 , which serve to smooth out the soft and collinear singularities, no further importance sampling is performed. To actually fix the normalization of the histogram, the simplest procedure is to compare to the analytic predictions in the singular region. For a fixed bin size, we can fit for the singular behavior of the histogram at a specific soft drop parameter z cut . Then we take the ratio to the analytic result in the limit D 2 → 0. This selects for the ratio of the double-logarithmic terms, and gives the relative normalization of the histogram to the singular result. This normalization is the same for all other values of z cut when using the same bin spacing, so that for all other z cut we can then test that the fixed order result is reproducing the analytic double and single logarithmic structure at small D 2 .

D.3 Singular cross section
To find the singular cross section for gluon splitting, g → gg, we sum the collinear-soft and jet function contributions. This gives Performing a similar calculation for q → qg gives

E Collinear non-global logarithms
In this appendix we give the appropriate modification of the Dasgupta-Salam Monte Carlo algorithm [46] for computing NGLs in the large-N c limit, changed to account for the phase space constraints of the soft drop procedure. The origin of these NGLs was discussed in section 3.2.2, and the lowest order diagram that contributes was shown in figure 8. Throughout this section, we define the eikonal factor as W xy (j) = x · y (x · j) (j · y) . (E.1) Here the round bracket is defined as (i · j) ≡ 1 − cos θ ij .

JHEP02(2018)144
We start the algorithm with a list of initial dipoles, D init , that depends on the flavor structure of the 1 → 2 splitting, and we generate a list of active radiating dipoles D and a list of emissions E that fail soft drop. That is, they are not clustered into the legs a and b of the hard 1 → 2 splitting. We introduce the phase space constraint given this list of emissions C ab j, E = Θ θ ab −θ aj Θ θ bj −θ aj i∈E Θ θ ij −θ aj +Θ θ ab −θ bj Θ θ aj −θ bj i∈E Θ θ ij −θ bj .

(E.2)
Here θ xy is the angle between x and y. In other words, we test emission j to see if it is closer in angle to either direction a or b than any other emission in the list E. If it is, it will pass soft drop by virtue of being clustered into the hard splitting. Finally, we introduce the one emission phase space that sets the resummation of the Sudakov or global logarithms C ab (j) = Θ θ ab − θ aj Θ θ bj − θ aj + Θ θ ab − θ bj Θ θ aj − θ bj . (E.3) Though written explicitly in terms of angles, this is the same phase space as eq. (A.34), since the Lorentz products can be simplified to an angular constraint once we factor out and cancel the energy scales. The idea of the algorithm is that in the leading logarithmic approximation for the NGLs, every emission not clustered into the a-b dipole is formally at an energy scale below Qz cut , and the emissions are ordered in energy. We therefore generate the emissions at each step according to a distributioñ Here D is the set of decaying dipoles, and D init the set of initial dipoles. After generating the emission, we update the list of real emissions E and the dipole list D, so long as the emission qualifies as a real emission, rather than a virtual process. The virtual subtraction is implemented via a veto algorithm following the original Dasgupta-Salam algorithm. We allow the initial dipoles to decay until we have an emission that is clustered into the a-b dipole, before it is clustered into any other emission, that is, C ab (j, E) = 1. This replaces the hemisphere condition to start a new event. Once this condition is met, we end the event and book the histogram. Thus, for each event we must track all the real emissions that have been created so far, and check for each new emission whether it clusters into the a-b dipole rather than any other emission generated so far. 11 The virtual subtraction condition, that is whether or not we treat the emission as a virtual correction and so reweight the event, is triggered when C ab (j) > 0 and j was emitted from a dipole containing an initial leg. We take the high scale for the NGL evolution to be µ Sc , and evolve down to the scale µ Cs , see appendix A.10.
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.