The soft drop groomed jet radius at NLL

We present results for the soft drop groomed jet radius $R_g$ at next-to-leading logarithmic accuracy. The radius of a groomed jet, which corresponds to the angle between the two branches passing the soft drop criterion, is one of the characteristic observables relevant for the precise understanding of groomed jet substructure. We establish a factorization formalism that allows for the resummation of all relevant large logarithms. It is based on demonstrating the all-order equivalence to a jet veto in the region between the boundaries of the groomed and ungroomed jet. Non-global logarithms, including clustering effects due to the Cambridge/Aachen algorithm, are resummed to all orders using a suitable Monte Carlo algorithm. We perform numerical calculations and find very good agreement with Pythia 8 simulations. We provide theoretical predictions for the LHC and RHIC.


Introduction
At present-day collider experiments such as the LHC and RHIC, highly energetic jets play an important role as precision probes of the Standard Model and beyond. In recent years, jet substructure techniques have become important tools in high energy particle and nuclear physics. One of the important techniques that have been developed is jet grooming, which is designed to remove soft wide-angle radiation from the identified jets. Algorithms that remove the soft contamination of jets allow for a more direct comparison of perturbative QCD calculations and data due to the reduced sensitivity to nonperturbative effects. Different grooming algorithms have been developed in the literature, such as [1][2][3][4]. In this work, we focus on the soft drop grooming algorithm of [4]. Both on the experimental [5][6][7][8] and the theoretical side [9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24], significant progress has been made recently in improving our understanding of soft drop groomed jet observables. In the heavy-ion community, soft drop groomed jet substructure observables have also received increasing attention from both experiment [25][26][27][28] and theory [29][30][31][32][33][34][35][36][37][38][39]. Jet grooming techniques can be used to isolate different aspects of jet quenching and may help to discriminate between different model assumptions [40].
One of the interesting features of soft drop grooming is that the radius of the groomed jet is adjusted dynamically, capturing only the hard collinear core of the jet [4], which we study in this work within perturbative QCD. We consider inclusive jet production $pp \to \text{jet} + X$, where jets are identified with the anti-$k_T$ algorithm [41] with a given radius $R$. Following the soft drop algorithm, the identified jets are then reclustered with the Cambridge/Aachen (C/A) algorithm [42,43]. The resulting angular-ordered clustering tree is then declustered recursively, where at each step the soft drop condition is checked: $\frac{\min[p_{T1}, p_{T2}]}{p_{T1} + p_{T2}} > z_{\rm cut} \left(\frac{\Delta R_{12}}{R}\right)^{\beta}$ (1.1). Here $p_{T1,2}$ are the transverse momenta of the two branches obtained at each declustering step and $\Delta R_{12}^2 = \Delta\eta^2 + \Delta\phi^2$ defines their geometric distance. Soft branches that fail the criterion are removed from the jet. The algorithm terminates when the criterion is met, and the particles in the remaining two branches constitute the groomed jet. The soft threshold $z_{\rm cut}$ and the angular exponent $\beta$ are fixed parameters that determine how aggressively soft radiation is removed. For $\beta = 0$, the soft drop algorithm reduces to the modified mass drop tagger (mMDT) [3]. Two variables that characterize important features of the soft drop groomed jet are the momentum sharing fraction $z_g$ and the groomed jet radius $R_g$. Their values are obtained from the kinematics of the two remaining branches when the soft drop algorithm terminates: $z_g = \frac{\min[p_{T1}, p_{T2}]}{p_{T1} + p_{T2}}$, $R_g = \Delta R_{12} = R\,\theta_g$ (1.2). Often the variable $\theta_g$ is used, which corresponds to the geometric distance of the two remaining branches normalized by the radius $R$ of the ungroomed initial jet. Note that unlike the external parameter $R$, which is the radius of the initial jet, the groomed radius $R_g$ is a distribution that is determined through the soft drop grooming procedure.
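The declustering procedure and the definitions in eqs. (1.1) and (1.2) can be illustrated with a minimal Python sketch. It assumes that the C/A clustering tree is already available as a binary tree. The `Branch` class and the `soft_drop` helper are hypothetical names introduced here purely for illustration and do not correspond to the interface of any particular jet library.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Branch:
    """Node of a (hypothetical) C/A clustering tree; leaves have no children."""
    pt: float
    eta: float
    phi: float
    left: Optional["Branch"] = None
    right: Optional["Branch"] = None


def delta_r(a: Branch, b: Branch) -> float:
    """Geometric distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [0, pi]."""
    dphi = abs(a.phi - b.phi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a.eta - b.eta, dphi)


def soft_drop(jet: Branch, R: float, z_cut: float, beta: float) -> Optional[Tuple[float, float]]:
    """Recursively decluster; return (z_g, R_g) or None if no splitting passes."""
    node = jet
    while node.left is not None and node.right is not None:
        b1, b2 = node.left, node.right
        z = min(b1.pt, b2.pt) / (b1.pt + b2.pt)
        dr = delta_r(b1, b2)
        if z > z_cut * (dr / R) ** beta:   # soft drop condition, eq. (1.1)
            return z, dr                   # z_g and R_g = R * theta_g, eq. (1.2)
        # Condition failed: drop the softer branch and continue with the harder one.
        node = b1 if b1.pt >= b2.pt else b2
    return None
```

The normalized variable follows as `theta_g = R_g / R`. Returning `None` for a jet without any passing splitting is a design choice of this sketch; other conventions for that case exist.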
Since the C/A algorithm first clusters particles that are closer in angle, the groomed jet radius $R_g$ defines the maximally allowed angle between two branches that can be clustered. Similar to the radius $R$ of the initial jet, the distance $R_g$ constitutes the radius of the soft drop groomed jet. By analyzing the active area of recursive $k_T$-type algorithms, it was found in [44] that jets have an area of the order $\mathcal{O}(\pi R^2)$. A similar analysis was performed for the groomed radius $R_g$ in [4], verifying that the active area of a soft drop groomed jet is of the order $\mathcal{O}(\pi R_g^2)$. In the phenomenologically relevant limit of $R_g \ll 1$ and $z_{\rm cut} \ll 1$, large logarithms may spoil the convergence of the perturbative series expansion in terms of the QCD strong coupling constant. In [4], the soft drop groomed radius was calculated within the modified leading-logarithmic (MLL) approximation. In this work, we extend the calculation to next-to-leading logarithmic (NLL) accuracy using a factorization formalism developed within Soft Collinear Effective Theory (SCET) [45][46][47][48][49], which is suitable for the extension to yet higher perturbative accuracy. Besides the resummation of logarithms of $R_g$, we also take into account logarithms of $R$ and the soft threshold parameter $z_{\rm cut}$. Based on the equivalence of the $R_g$ measurement and a jet veto on emissions between the splitting that satisfies the soft drop criterion and the boundary of the initial ungroomed jet, we establish the all-order factorization framework. Unlike, for example, the groomed jet mass distribution, non-global logarithms (NGLs) [50] directly contribute to the cross section starting at NLL accuracy. The use of the C/A algorithm introduces clustering constraints that give rise to clustering logarithms associated with both NGLs and global logarithms; the latter are referred to as Abelian clustering logarithms [51][52][53][54][55][56].
We resum the NGLs including clustering constraints and the Abelian clustering logarithms at leading logarithmic (LL) accuracy and leading color using a suitable Monte Carlo algorithm which we introduce here following the work of [50,57,58].
The remainder of this paper is organized as follows. In section 2, we outline the factorization formalism developed in this work based on the equivalence between the groomed radius measurement and a jet veto when $R_g \ll 1$ and $z_{\rm cut} \ll 1$. We identify the relevant NGLs and Abelian clustering logarithms and perform the relevant fixed-order calculations. In section 3, we introduce the Monte Carlo setup that allows for the all-order resummation of NGLs and clustering logarithms at LL accuracy needed to achieve the overall accuracy at NLL. Numerical studies and a comparison to Pythia 8 simulations are presented in section 4. We draw our conclusions in section 5 and present an outlook.

Factorization and resummation
In this section, we develop the factorization theorem for the soft drop groomed jet radius within SCET. We start from the cross section $\Sigma(\theta_g)$ differential in the transverse momentum $p_T$ and rapidity $\eta$ of the observed jet, but cumulative in the groomed jet radius, where any value below $\theta_g$ contributes. The distribution differential in $\theta_g$ can then be obtained as $\frac{d\sigma}{d\eta\, dp_T\, d\theta_g} = \frac{d}{d\theta_g} \frac{d\Sigma(\theta_g)}{d\eta\, dp_T}$ (2.1). We work in the limit where the observed jet is sufficiently collimated, $R \ll 1$, and we drop power corrections of the form $\mathcal{O}(R^2)$. Power corrections of this type are generally found to be small even for relatively large values of the jet radius [59]. In this limit, the production of an energetic parton in a hard-scattering event factorizes from the formation and evolution of the jet initiated by the produced parton. The hard-scattering process $ab \to c$ is described by hard functions $H^c_{ab}$, which are known analytically to next-to-leading order (NLO) [60,61]. The subsequent formation and evolution of the jet is described by a semi-inclusive jet function $G_c$ [62][63][64][65][66]. This separation is generally expected to hold to all orders due to the universality of the collinear limit in QCD [67]. We can thus write the cumulative cross section in $\theta_g$ schematically as $\frac{d\Sigma(\theta_g)}{d\eta\, dp_T} = \sum_{abc} f_a \otimes f_b \otimes H^c_{ab} \otimes G_c$ (2.2), where $f_{a,b}$ denote the parton distribution functions (PDFs) for finding partons $a, b$ in the colliding protons. Here, $\otimes$ denotes appropriate integrals over the longitudinal momentum fractions $x_{a,b}$ of the initial partons and over $z$, which is the fraction of transverse momentum contained in the observed jet relative to the scattered parton $c$. Note that the jet rapidity $\eta$ only appears in the hard functions $H^c_{ab}$ when subleading terms $\sim \mathcal{O}(R^2)$ are ignored. On the other hand, the entire dependence on $\theta_g$ and the grooming parameters is contained in the jet function $G_c$.
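The relation in eq. (2.1) between the cumulative and the differential distribution can be sketched numerically. The snippet below is only an illustration: `toy_cumulative` is a hypothetical Sudakov-like shape chosen for the example and is not the resummed result of this work, and the finite-difference step `h` is an arbitrary assumption.

```python
import math
from typing import Callable


def differential_from_cumulative(sigma_cum: Callable[[float], float],
                                 theta_g: float, h: float = 1e-5) -> float:
    """Central finite difference d Sigma / d theta_g, cf. eq. (2.1)."""
    return (sigma_cum(theta_g + h) - sigma_cum(theta_g - h)) / (2.0 * h)


def toy_cumulative(theta_g: float, a: float = 0.5) -> float:
    """Hypothetical toy shape exp(-a ln^2 theta_g); NOT the result of this paper."""
    return math.exp(-a * math.log(theta_g) ** 2)


def toy_differential_exact(theta_g: float, a: float = 0.5) -> float:
    """Analytic derivative of the toy shape, used only as a cross-check."""
    return toy_cumulative(theta_g, a) * (-2.0 * a * math.log(theta_g) / theta_g)
```

In practice the cumulative distribution is usually only known on a grid in $\theta_g$, in which case differences between neighboring grid points play the role of the finite difference used here.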
Single logarithms of the jet radius, $\alpha_s^n \ln^n R$, can be resummed by solving the renormalization group (RG) evolution equation (DGLAP) associated with the jet function $G_c$, which is schematically given by $\mu \frac{d}{d\mu} G_c = \sum_d P_{dc} \otimes G_d$ (2.3). Here, $P_{dc}$ denote the Altarelli-Parisi splitting functions, which can be computed order by order in $\alpha_s$. In the kinematic region where $z_{\rm cut} \sim \theta_g \sim \mathcal{O}(1)$, the factorization theorem in eq. (2.2) is sufficient to carry out calculations at fixed order in perturbation theory. In the phenomenologically relevant region where $z_{\rm cut} \ll 1$ and $\theta_g \ll 1$, logarithms of the form $\alpha_s^n \ln^{2n} \theta_g$ (for $\beta > 0$) and $\alpha_s^n \ln^{2n} z_{\rm cut}$ may spoil the perturbative convergence, and an all-order resummation is required. This can be achieved by a refactorization of the semi-inclusive jet function $G_c$ in order to separate the physics at different scales in the relevant kinematic regime. The associated RG evolution equations then allow for the resummation of all relevant large logarithms.

Refactorization of the semi-inclusive jet function
In this section we discuss the refactorization of the semi-inclusive jet function $G_c$ in the limit where both $z_{\rm cut} \ll 1$ and $\theta_g \ll 1$. We make use of power counting arguments to establish the refactorization. First, we consider energetic collinear radiation at the jet scale $\mu_H \sim p_T R$. At NLO, this contribution is given by out-of-jet radiation diagrams, see for example [68,69]. The scaling of the associated collinear mode in terms of light-cone momentum components is given in eq. (2.4). Second, we consider soft modes that describe wide-angle soft radiation within the jet at an angle $\theta \sim R$. If such radiation passed grooming with momentum fraction $z > z_{\rm cut}$, the scaling $\theta_g \ll 1$ would be violated. Therefore, this kind of soft radiation must fail the grooming condition, and it is thus independent of the $\theta_g$ measurement. The associated momentum scaling is given in eq. (2.5), where the superscript $\notin{\rm gr}$ indicates that the soft radiation considered here fails the grooming condition. The radiation associated with the two modes identified so far is taken into account by two functions, $H^n_{c\to i}$ and $S^{\notin{\rm gr}}_{i,n}$. Both are independent of the measured groomed jet radius, and the same modes were obtained in other factorization theorems of groomed jet substructure observables before, see for example [11,13,14]. At this point, we obtain the refactorized expression of the semi-inclusive jet function in eq. (2.6). There, the additional summation over $n$ and the symbol $\otimes_\Omega$ are introduced to account for NGLs [70,71], as discussed in more detail below. The remaining function $F_i$ contains the dependence on the groomed radius $\theta_g$. Here we need to consider both collinear and collinear-soft radiation [72]. The collinear radiation with momentum fraction $z \sim 1$ always passes the grooming condition at leading power. The collinear-soft radiation instead is sensitive to the grooming condition and has $z \sim z_{\rm cut}\theta_g^\beta \ll 1$.
In both cases, the characteristic angular scale is $\theta \sim R_g$, and the radiation described by $F_i$ is thus insensitive to the boundary of the initial ungroomed jet. Note that this situation differs from, for example, the mode decomposition when the jet mass is measured to be small, $m_J^2/p_T^2 \ll 1$. In that case, the angle of the collinear and collinear-soft radiation is set by $\theta \sim \sqrt{m_J^2/p_T^2/z}$, which depends on the scaling of the different momentum fractions. Because of this scaling, which is imposed by the small jet mass measurement, the collinear and the collinear-soft radiation can be treated as two independent sectors. For the soft drop groomed jet radius, we thus have two additional modes with the momentum scalings given in eqs. (2.7) and (2.8). The soft drop declustering algorithm makes a further separation of these two modes to all orders highly non-trivial. However, as will be demonstrated in section 2.2, there is a formal equivalence between the soft drop declustering algorithm and a jet veto procedure when $\theta_g$ is measured to be small. We can treat the groomed jet with radius $R_g$ as the signal jet, and the collinear-soft branches are subject to a veto condition where the veto parameter is set to $z_{\rm cut}\theta_g^\beta p_T$. With this equivalence we can further refactorize $F_i$ in eq. (2.6) using results from jet veto calculations, see for example [70,73,74,75,76]. We find that we can write $F_i$ in terms of a collinear function $C_i$ and a collinear-soft function $S^{\in{\rm gr}}_i$, see eq. (2.9). NGLs are accounted for by the convolution integrals denoted by $\otimes_\Omega$ and the additional sum over the directions of collinear emissions $m$. Here we follow the notation introduced in [70], see also eq. (2.6) above. Collinear final-state particles set the directions for a multi-Wilson line structure. We sum over these directions $n$, $m$ in eqs. (2.6) and (2.9), and $\otimes_\Omega$ indicates that angular integrals cannot be carried out independently, which gives rise to correlations between the different functions resulting in NGLs.
The NGLs in $z_{\rm cut}$ associated with the functions $H_{c\to i}$ and $S^{\notin{\rm gr}}_i$ in eq. (2.6) will affect the $\theta_g$ distribution only indirectly through the relative normalization of partonic channels. We note that the contributions from the correlation between the $\theta_g$ sensitive and insensitive modes are power suppressed [4]. This can also be seen from eq. (2.6), where $H_{c\to i}$ and $S^{\notin{\rm gr}}_i$ are fully decoupled from $F_i$. In addition, beyond NLO, clustering logarithms need to be taken into account due to the mismatch between the grooming operation acting on branches rather than individual partons and the use of the C/A algorithm. These contributions appear both in the soft function $S^{\notin{\rm gr}}_i$ and in the combination $C_i \otimes_\Omega S^{\in{\rm gr}}_i$. Due to the summation over the collinear emission history and the angular convolution structure, the analytical resummation using the refactorized cross section is usually difficult, and the approaches discussed in the literature typically resort to Monte Carlo methods [50,57,58,70,73]. Up to NLL, using the known jet veto results [50,51,53,74,75,77], we can write $F_i$ as in eq. (2.10). There, $\langle \ldots \rangle$ indicates that we performed the solid angle integration, which thus allows us to solve the RG evolution equations of the collinear and collinear-soft functions analytically.
The variable $t$ is defined in eq. (2.11). The NGLs due to the correlation of the radiation near the boundary of the groomed jet in eq. (2.10) are taken into account by the function $S^{\rm C/A}_{i,{\rm NGL}}(t, \theta_g)$, whose perturbative expansion in eq. (2.12) has coefficients $S^{\rm C/A}_{i,n}(\theta_g)$. The relevant configuration at NNLO is illustrated on the right side of Fig. 1, which takes into account correlations between emissions inside and outside the groomed jet but inside the initial ungroomed jet boundary. At NNLO, these NGLs are of the form $\sim \alpha_s^2 \ln^2(z_{\rm cut}\theta_g^\beta)$. As the phase space of the in-and-out configurations is affected by the C/A algorithm, the numerical size of the NGLs is reduced due to clustering effects. The C/A algorithm also introduces global Abelian logarithms. The function $A^{\rm C/A}_{i,{\rm Abel.}}(t, \theta_g)$ takes into account this contribution, which can be expanded perturbatively with coefficients $A^{\rm C/A}_{i,n}(\theta_g)$. In sections 2.5 and 2.6, we compute the leading NGLs and Abelian clustering logarithms at NNLO and determine the coefficients $S^{\rm C/A}_{i,2}$ and $A^{\rm C/A}_{i,2}$. In order to achieve the resummation at LL and leading color, the Abelian clustering logarithms and NGLs are captured simultaneously by a suitable Monte Carlo algorithm as discussed in section 3. To leading logarithmic accuracy, the NGLs resulting from the correlation of radiation inside and outside of the initial ungroomed jet can be resummed using the Monte Carlo results of [50]. Clustering corrections to this class of NGLs are power suppressed, as the ungroomed jet is identified with the anti-$k_T$ algorithm. The respective NNLO configuration $\sim \alpha_s^2 \ln^2 z_{\rm cut}$ is illustrated on the left side of Fig. 1.
Figure 1. Configurations that give rise to different NGL contributions at NNLO, $\sim \alpha_s^2 \ln^2 z_{\rm cut}$ (left) and $\sim \alpha_s^2 \ln^2(z_{\rm cut}\theta_g^\beta)$ (right). The inner ellipse denotes the area set by the groomed radius $R_g$ inside the original ungroomed jet.
To NLL accuracy, we can thus write the convolution structure in eq. (2.6) in terms of a non-global function $S_{\rm NGL}(t', z_{\rm cut})$. Here $t'$ is defined as $t$ in eq. (2.11), but with the lower integration limit replaced by $z_{\rm cut}\, p_T$. Therefore, $S_{\rm NGL}(t', z_{\rm cut})$ can be obtained directly from the Monte Carlo result in [50] to leading logarithmic accuracy and leading color. In order to realize the resummation at NLL accuracy, all components of the refactorized semi-inclusive jet function need to be calculated to NLO. The hard-collinear matching coefficients $H_{c\to i}$ at NLO can be found in [68,69]. The operator definition of the soft function $S^{\notin{\rm gr}}_i$ and its result at NLO can be found in [13]. Neither function directly affects the shape of the $\theta_g$ distribution, but both are important in order to determine the fractions of quark and gluon jets. The operator definitions of the remaining functions $C_i$ and $S^{\in{\rm gr}}_i$ that appear in eq. (2.9) can be readily obtained by including the soft drop grooming operation in the relevant functions, see [70]. In this work, we calculate the functions $C_i$ and $S^{\in{\rm gr}}_i$ in the refactorized expression in eq. (2.10) to NLO in order to achieve the resummation at NLL. The operator definitions and the results at NLO are presented in sections 2.3 and 2.4 below.

Equivalence between the soft drop groomed radius and the jet veto case
In this section, we show the equivalence between the soft drop declustering algorithm and a jet veto on emissions outside the groomed jet for small values of the groomed radius, $\theta_g \ll 1$, and $z_{\rm cut} \ll 1$, in which case the collinear and collinear-soft modes are well-defined. Non-trivial examples of this equivalence at NNLO are presented in Appendix A.
The measurement functions of the collinear and the soft sectors are identical for both cases, and we thus focus only on the collinear-soft radiation. We denote branches with collinear-soft scaling in the C/A clustering tree by $J_i$; these need to be tested against the soft drop criterion. The energetic collinear branch is denoted by $J$. It can also contain further collinear-soft radiation, and $J$ by itself is not necessarily the final groomed jet. It is sufficient to consider a single collinear branch, as two collinear branches always pass the soft drop condition. Due to angular ordering, the collinear-soft branches $J_i$ are not clustered together, i.e. $\theta_{J_i,J_j} > \theta_{J_{i(j)},J}$ for all $i$ and $j$. Here $\theta_{a,b}$ denotes the angular distance between the branches $a$ and $b$. Let us first consider the case of one collinear-soft branch $J_1$ for the cumulative distribution of $R_g$. The corresponding measurement function $M_1(J_1)$ can be written as in eq. (2.15), where "p" ("f") means that $J_1$ passes (fails) the soft drop criterion. The two contributions are given explicitly in eqs. (2.16) and (2.17), where we have used the fact that $p_{T,J} + p_{T,J_1} \approx p_{T,J}$ at leading power. One can directly see the equivalence between the soft drop procedure and the jet veto when there is only one collinear-soft branch from the form of the measurement function in eq. (2.18). If the separation of the collinear-soft branch $J_1$ from the collinear one $J$ is larger than $R_g$, i.e. outside the "signal jet", the branch is required to be below the jet veto threshold, in this case $z_{\rm cut}\,\theta^\beta_{J_1,J}\, p_{T,J}$. On the other hand, if $\theta_{J_1,J}$ is less than $R_g$, $J_1$ is within the energetic signal jet and thus will always be kept. To proceed, we first note that the complement $1 - M_1(J_1)$, eq. (2.19), requires $\theta_{J_1,J} > R_g$. More generally for multiple branches, the measure $1-\prod_i^N M_1(J_i)$ will require $\theta_{J_i,J} > R_g$ for at least one of the $J_i$ with $i = 1 \ldots N$.
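For orientation, the one-branch measurement function described in words above can be sketched as follows. This is a reconstruction from the prose only; the explicit form, in particular the arguments of the step functions, is an assumption and should be checked against the original equations.

```latex
\begin{align}
M_1(J_1\,\mathrm{p}) &= \Theta\!\left(p_{T,J_1} > z_{\rm cut}\,\theta_{J_1,J}^{\beta}\,p_{T,J}\right)
                        \Theta\!\left(\theta_{J_1,J} < R_g\right), \\
M_1(J_1\,\mathrm{f}) &= \Theta\!\left(p_{T,J_1} < z_{\rm cut}\,\theta_{J_1,J}^{\beta}\,p_{T,J}\right), \\
M_1(J_1) &= M_1(J_1\,\mathrm{p}) + M_1(J_1\,\mathrm{f})
          = \Theta\!\left(\theta_{J_1,J} < R_g\right)
          + \Theta\!\left(\theta_{J_1,J} > R_g\right)
            \Theta\!\left(p_{T,J_1} < z_{\rm cut}\,\theta_{J_1,J}^{\beta}\,p_{T,J}\right), \\
1 - M_1(J_1) &= \Theta\!\left(\theta_{J_1,J} > R_g\right)
                \Theta\!\left(p_{T,J_1} > z_{\rm cut}\,\theta_{J_1,J}^{\beta}\,p_{T,J}\right).
\end{align}
```

The third line makes the veto interpretation explicit: a branch inside $R_g$ is unconstrained, while a branch outside $R_g$ must lie below the threshold. The last line shows that the complement is only supported for $\theta_{J_1,J} > R_g$.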
For 2 collinear-soft branches, the measurement function is given in eq. (2.20), where we introduce the short-hand notation $\Theta(J_i)$, which denotes that $\theta_{J_i,J}$ is the largest angle of the $J_i$ relative to the collinear branch. In the first line, when $J_2$ fails, we proceed to test $J_1$ against the soft drop criterion (first term), while if $J_2$ passes, we stop (second term). In addition, we sum over all possible permutations. In the case of two branches this includes the configurations in which either $\theta_{J_2,J}$ or $\theta_{J_1,J}$ is the larger angle. The second term in the second line of eq. (2.20) vanishes due to the contradiction between the conditions $\theta_{J_2,J} > \theta_{J_1,J}$ and $\theta_{J_2,J} < R_g$, as required by $M_1(J_2\,{\rm p})$, and $\theta_{J_1,J} > R_g$. We note that the angular ordering of the C/A algorithm is crucial here to generate the conflict. For the anti-$k_T$ algorithm, the overall $\Theta(J_i)$ is replaced by the anti-$k_T$ distance metric, which reduces to $\Theta(J_i)$ up to power corrections, since the factor $\min(p_{T,J}^{-2\alpha}, p_{T,J_i}^{-2\alpha})$ is set by the energetic branch $J$ for every $J_i$ and the ordering is again determined by $\theta_{J_i,J}$; the contradiction is thus still obtained. When other jet algorithms are used, the $\Theta$ here will be replaced by a different ordering, and the conflicts could therefore be avoided, which would lead to a non-vanishing second term. This would eventually cause a difference between the jet veto and the soft drop declustering procedure. After carrying out the sum over the two permutations, we obtain the third line, which is an independent veto of the branches 1 and 2 when their separation from the collinear branch is larger than $R_g$. Thus the equivalence holds for 2 collinear-soft branches. Similarly, for 3 branches we obtain eq. (2.21). The first term in its first line states that if branch 3 fails the soft drop criterion, we proceed to test the remaining 2 branches until the procedure stops. The second term corresponds to the case where branch 3 passes the criterion and the algorithm terminates.
Following a similar argument as in the case of 2 branches, the second term in the second line vanishes and we get the last line, which demonstrates the equivalence for 3 branches. For arbitrary $N$, we find by induction the analogous result in eq. (2.22). This shows the equivalence between soft drop declustering and the jet veto procedure as long as $\theta_g \ll 1$ and $z_{\rm cut} \ll 1$.
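The equivalence argued above can be checked by brute force in a simplified toy setup. The sketch below assumes that each collinear-soft branch is characterized only by its angle `theta` relative to the collinear branch (in units of $R$) and its momentum fraction `z`, that angular ordering holds exactly, and that the collinear core has negligible angular size. The function names are hypothetical and the setup is an illustration, not the Monte Carlo algorithm of section 3.

```python
import random
from typing import List, Tuple

BranchList = List[Tuple[float, float]]  # (theta, z) for each collinear-soft branch


def soft_drop_theta_g(branches: BranchList, z_cut: float, beta: float) -> float:
    """Decluster from the widest angle inwards; stop at the first passing branch."""
    for theta, z in sorted(branches, key=lambda b: b[0], reverse=True):
        if z > z_cut * theta ** beta:
            return theta
    # Assumption: if every collinear-soft branch fails, the groomed radius is set
    # inside the collinear core, which is taken to have negligible size here.
    return 0.0


def cumulative_soft_drop(branches: BranchList, theta_cut: float,
                         z_cut: float, beta: float) -> bool:
    """Measurement via declustering: is the groomed radius below theta_cut?"""
    return soft_drop_theta_g(branches, z_cut, beta) < theta_cut


def cumulative_veto(branches: BranchList, theta_cut: float,
                    z_cut: float, beta: float) -> bool:
    """Measurement via an independent veto on every branch outside theta_cut."""
    return all(z <= z_cut * theta ** beta
               for theta, z in branches if theta >= theta_cut)


def random_branches(rng: random.Random, n: int) -> BranchList:
    """Random toy configuration with n collinear-soft branches."""
    return [(rng.uniform(0.01, 1.0), rng.uniform(0.0, 0.3)) for _ in range(n)]
```

Looping over many random configurations and comparing `cumulative_soft_drop` with `cumulative_veto` gives identical answers in this toy setup, which mirrors the independent veto structure found for 2, 3 and $N$ branches.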

The collinear function
The operator definition of the collinear function as it appears in eq. (2.9) is given in eq. (2.23) for quark jets, and a similar expression can be obtained for gluon jets, see [70]. Here the null four-vector is taken as $n = (1, \vec{n})$, where $\vec{n}$ points in the jet direction, and $|P_j(\{k_{X_c}\})\rangle\langle P_j(\{k_{X_c}\})|$ is the matrix for producing the collinear state $X_c$. The measurement $\Theta(R_g - r_g|_{\rm C/A})$ represents the C/A jet algorithm, which acts on the final collinear state $X_c$, requiring that the separation between the last two branches in the clustering history is less than $R_g$. At NLO, after performing the angular integration, the collinear function as it appears in eq. (2.10) is given in eq. (2.24), where $C_i$ on the right-hand side corresponds to $C_{F,A}$ for quarks and gluons, respectively. The other constants are given in eqs. (2.25) and (2.26), and the logarithm $L$ is defined in eq. (2.27), see also [75,78]. The natural collinear scale choice used to minimize the logarithmic contribution is given by $\mu_C \sim \theta_g\, p_T R$, and the anomalous dimensions $\gamma^{C_i}$ are given in eq. (2.28).

The collinear-soft function
The collinear-soft function as it appears in eq. (2.9) is defined at the operator level in eq. (2.29), with the null vector $n_i = (1, \vec{n}_i)$, where $\vec{n}_i$ is oriented along the propagation direction of the collinear radiation $i$ and $W_n$ is a Wilson line in the $n$-direction. Here $\Theta(R_g - r_g|_{\rm softdrop})$ encodes the soft drop grooming algorithm operating on the collinear-soft final state $X_{cs}$ with the knowledge of the eikonal directions $n_1 \ldots n_m$. In order to achieve the resummation at NLL accuracy, see eq. (2.10), we need the collinear-soft function at NLO. After again performing the angular integration, we find the result in eq. (2.30). The natural scale of the collinear-soft mode is indeed found to be $\mu^{\rm gr}_S \sim z_{\rm cut}\,\theta_g^{1+\beta}\, p_T R$, and the anomalous dimensions are given in eq. (2.31).
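The hierarchy of the natural scales quoted in the text can be made concrete with a short sketch. The function name `natural_scales` and the dictionary keys are hypothetical, and the parametric relations $\mu_H \sim p_T R$, $\mu_C \sim \theta_g p_T R$ and $\mu^{\rm gr}_S \sim z_{\rm cut}\theta_g^{1+\beta} p_T R$ are evaluated here with unit proportionality constants, which is an assumption made for illustration.

```python
from typing import Dict


def natural_scales(pt: float, R: float, theta_g: float,
                   z_cut: float, beta: float) -> Dict[str, float]:
    """Parametric scales (same units as pt), with unit prefactors assumed."""
    mu_h = pt * R                                       # hard-collinear (jet) scale
    mu_c = theta_g * pt * R                             # collinear scale
    mu_cs = z_cut * theta_g ** (1.0 + beta) * pt * R    # collinear-soft scale
    return {"mu_H": mu_h, "mu_C": mu_c, "mu_S_gr": mu_cs}
```

For $\theta_g \ll 1$ and $z_{\rm cut} \ll 1$ the three scales are strongly ordered, which is the origin of the large logarithms. For small jet transverse momenta the collinear-soft scale can approach the nonperturbative regime, so the returned values are also a quick diagnostic of where a perturbative treatment is expected to be reliable.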

Leading NGLs including C/A clustering effects
The leading NGLs of the $\theta_g$ distribution can be readily inferred from the equivalence with the jet veto case. The leading NGLs originate from correlated strongly ordered emissions when the harder emission is inside the groomed jet while the softer one is outside and vetoed. We adopt the notation of [50]. At next-to-next-to-leading order (NNLO), when clustering effects due to the C/A algorithm are ignored, we thus have the expression in eq. (2.32), where we introduced the notation $L = -\ln(z_{\rm cut}\theta_g^\beta)$, the polar angles $c_i = \cos\theta_i$ and $s_i = \sin\theta_i$ of the two emissions at NNLO measured with respect to the groomed jet axis, and their respective transverse momentum fractions relative to the total momentum of the jet, $x_i = k_{Ti}/p_T$. Here, we also replaced the veto condition by a simplified one, which is valid for the leading NGLs. Comparing with the structure in eq. (2.12), we would obtain the first coefficient $S_{i,2}$ in the absence of clustering. When clustering effects are included, this coefficient will be reduced, since any soft emission outside $R_g$ that is clustered into the groomed jet will not be subject to the veto condition. At NNLO, the clustering happens when the distance between the emissions inside and outside $R_g$ is smaller than the distance between the groomed jet axis and the radiation inside $R_g$. Therefore, we need to insert the constraint in eq. (2.36), and the phase space which generates the NGLs will thus be reduced. We thus have the modified expression in eq. (2.37) compared to eq. (2.34) above. In principle, the integral in eq. (2.37) can be evaluated numerically. Using the small angle approximation, we can approximate the distances $d_i$ and $d_{ij}$, and the relevant integral can then be approximated as in eq. (2.40), where we introduced the variable $\tilde\theta_i = \theta_i \cosh(\eta_J)/R_g$. Note that the variable change removes the dependence on $\eta_J$. Following the definition of the collinear-soft mode, the upper bound for the $\tilde\theta_2$ integral should be $\infty$.
Here we use instead $1/\theta_g = R/R_g$, keeping in mind that the radiation outside the groomed jet is within the original ungroomed jet with radius $R$, see Fig. 1. In addition, in the limit $\theta_g \to 1$ the associated NGLs in $F_i$ vanish, as the NGLs are proportional to the area of the veto region. The integral in eq. (2.40) can be performed analytically for $\theta_g \ll 1$, and we find that the coefficient of the non-global logarithm is significantly reduced due to the additional constraint. In the limit $\theta_g \ll 1$, the $\theta_g$ dependence of $S^{\rm C/A}_{i,2}$ is power suppressed and we find the result in eq. (2.41). Thus the size of the NGL is reduced due to clustering by a factor of 4/9. A similar reduction due to clustering effects was observed for example in [53] in the context of jet mass measurements. For general $R_g < R$, we find that the coefficient in eq. (2.40) evaluates to an expression that reduces to eq. (2.41) in the limit $\theta_g \ll 1$. The $\phi$ integral here can be done analytically, but the result is rather lengthy. The remaining $y$ integral can be evaluated numerically.

Leading Abelian C/A clustering logarithms
To study the Abelian C/A clustering effects, we start with two independent collinear-soft emissions with momenta $k_{1,2}$. In order to extract the leading Abelian logarithm, it suffices to consider the strongly ordered limit in which $p_T \gg k_1 \gg k_2$ or $p_T \gg k_2 \gg k_1$. The leading logarithms come from the configuration where the harder gluon is initially inside the groomed jet and the softer gluon is outside but within the initial ungroomed jet. The C/A clustering pulls the softer gluon into the jet and generates a mismatch with the real-virtual correction. In the small $\theta_g$ approximation, we write the NNLO contribution to the Abelian clustering using the same notation conventions as in eq. (2.32) above. Terms that are power suppressed by $\theta_g$ are omitted and can be found in [53]. The result is expressed, as before, in terms of $\tilde\theta_i = \theta_i \cosh(\eta_J)/R_g$. Performing the integral, we find the first Abelian corrections due to the C/A clustering. The remaining integral can be worked out analytically, but the result is rather lengthy. In the small $\theta_g$ limit, we find the result in eq. (2.47). Coefficients beyond $A^{\rm C/A}_{i,2}$ can be computed following the method developed in [53]. The authors of [53] further proposed an exponentiation of Abelian clustering logarithms by calculating higher order corrections and analyzing the structure of the perturbative series. In this work, we instead resum the Abelian clustering logarithms at LL and leading color using the Monte Carlo method discussed in section 3, where we compare to the fixed-order result derived here.
Figure 2. Comparison of the numerical size of the NGLs (blue) and the clustering logarithms (red) at NNLO with (solid) and without (dashed) subleading contributions as a function of $\theta_g$. The results are normalized to $(\alpha_s/\pi)^2 C_i C_A$ and $(\alpha_s/\pi)^2 C_i^2$ for the NGLs and the clustering logarithms, respectively. We choose the parameters $z_{\rm cut} = 0.1$, $\beta = 1$ as a representative example.

Subleading NGLs and clustering logarithms
Here we comment on the potential impact of subleading NGLs and Abelian clustering logarithms beyond single logarithmic accuracy. Due to the soft drop criterion, the corresponding jet veto parameter is $z_{\rm cut}\, d_i^{\beta/2}/R^\beta\, p_T = z_{\rm cut}\,\theta_g^\beta\, \tilde\theta_i^\beta\, p_T$, with $\tilde\theta_i$ as defined in the previous sections. Therefore, for instance, the $x_i$-integration bound in eq. (2.44) should be $z_{\rm cut}\,\theta_g^\beta\, \tilde\theta_1^\beta$ instead of $z_{\rm cut}\,\theta_g^\beta$. The latter is sufficient to get the leading logarithms as derived in the previous section. The additional $\tilde\theta_i$ dependence gives rise to part of the subleading logarithms. Similar reasoning applies to the NGL case. In the $\theta_g \to 0$ limit, we find at NNLO eq. (2.48) for the NGLs and eq. (2.49) for the Abelian clustering logarithms. In Fig. 2, we show a comparison of the leading NGL (blue) and clustering logarithms (red) at NNLO (dashed) and when the subleading terms in eqs. (2.48) and (2.49) are included (solid). The result is plotted as a function of $\theta_g$ for exemplary values of the soft drop grooming parameters, $\beta = 1$ and $z_{\rm cut} = 0.1$. We observe a moderate reduction of the NGL contribution when the subleading terms in eq. (2.48) are included. The subleading contribution of the Abelian clustering logarithms turns out to be almost negligible.
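The difference between the leading veto threshold and the one carrying the additional angular dependence can be sketched as follows. The helper names are hypothetical, and the small-angle relation $d_i^{1/2} \approx \theta_i \cosh\eta_J$ used to pass from the distance to the rescaled angle is the approximation quoted in the text, not an exact identity.

```python
import math


def theta_tilde(theta_i: float, eta_jet: float, r_g: float) -> float:
    """Rescaled angle theta_i * cosh(eta_J) / R_g, as defined in the text."""
    return theta_i * math.cosh(eta_jet) / r_g


def veto_threshold(theta_i: float, eta_jet: float, r_g: float, R: float,
                   z_cut: float, beta: float, pt_jet: float,
                   leading_only: bool = False) -> float:
    """Veto scale z_cut * theta_g^beta * theta_tilde^beta * p_T.

    With leading_only=True the theta_tilde dependence is dropped, which is the
    form that suffices for the leading logarithms.
    """
    theta_g = r_g / R
    leading = z_cut * theta_g ** beta * pt_jet
    if leading_only:
        return leading
    return leading * theta_tilde(theta_i, eta_jet, r_g) ** beta
```

For an emission located exactly at the groomed jet boundary the two forms coincide, while for emissions further out the threshold grows like the rescaled angle to the power $\beta$. For $\beta = 0$ the two forms are identical for all angles.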
Though not yet conclusive, the results in this section suggest that the impact of subleading NGLs and clustering logarithms is moderate. We thus expect that the numerical results for the soft drop groomed jet radius presented in section 4, which only include the leading NGLs and Abelian clustering logarithms to all orders, capture the dominant perturbative effects.

Comparison to results in the literature
In this section, we compare the calculation presented in this work to results available in the literature. In [4], the resummation of the cumulative $\theta_g$ distribution was realized at MLL accuracy. We show that our results reduce to [4] when only the leading logarithms are taken into account. The resummation in [4] is based on the coherent branching formalism, and the result can be expressed as in eq. (2.50). There, $\sigma_{\rm incl}$ denotes the inclusive jet cross section, $f_i$ are the leading-order quark/gluon fractions and $\Sigma_i(\theta_g)$ denote the respective resummed exponents that depend on $\theta_g$. On the right-hand side we leave the dependence on other variables besides $\theta_g$ implicit. At MLL accuracy for a fixed coupling constant, the resummed exponent can be written as in eq. (2.51), where the constants $\gamma_i$ are defined in eqs. (2.25) and (2.26). At MLL accuracy, running coupling effects are also taken into account. The improvements achieved in this work concern both the quark/gluon fractions $f_i$ and the resummed exponents $\Sigma_i(\theta_g)$. Here, the resummation is carried out at full NLL accuracy, including both global and non-global logarithms. In addition, clustering effects due to the C/A algorithm are taken into account. In order to recast the formalism developed here into the form of eq. (2.50), we separate the production of the jet from the jet substructure measurement as discussed in [64,79,80]. We start by rewriting the jet function $G_c$ in eq. (2.2) at fixed order as in eq. (2.52). There, $J^{(1)}_d$ is the $\mathcal{O}(\alpha_s)$ contribution of the semi-inclusive jet function as it appears in the inclusive jet cross section [64][65][66]. The functions $J_{cd}$ are related to the semi-inclusive jet functions except that we also keep track of the jet flavor $d$ such that $\sum_d J_{cd}(z, p_T R, \mu) = J_c(z, p_T R, \mu)$ (2.53). We would like to stress that only at leading order is the jet flavor $d$ the same as the final-state parton $c$ from the hard-scattering event. Note that the separation in eq. (2.52) is multiplicative and the functions $J_{cd}$ contain the complete $z$-dependence. The $z$-dependence is associated with out-of-jet radiation diagrams at NLO and it is the same for different jet substructure observables. We can now calculate the cross section for a jet of flavor $d$ as in eq. (2.54), where we use a more compact notation compared to the factorization in eq. (2.2) above. After summing over $d$ in eq. (2.54), the inclusive jet cross section $\sigma_{\rm incl}$ is obtained. Therefore, in our calculation the quark/gluon fractions $f_i$, see eq. (2.50), can be obtained systematically beyond leading order as in eq. (2.55), where the $\ln R$ resummation is also included. The resummed exponents beyond MLL accuracy are now obtained from the refactorized expression of the jet function $G_{q,g}$ after subtracting the NLO semi-inclusive jet function at fixed order, see eq. (2.52). Following the discussion in the sections above, we thus obtain eq. (2.56). The constants $\tilde H_i$ were calculated in [80]. After solving the evolution equations of the different functions and including NGLs and Abelian clustering logarithms, the resummation at NLL accuracy can be achieved, which includes logarithms of $\theta_g$, $R$ and $z_{\rm cut}$. The result for fixed scales at leading logarithmic accuracy is given in eq. (2.57). After making the canonical scale choices, listed for convenience in eq. (2.58), we recover the result for $\Sigma_i(\theta_g)$ in eq. (2.51) up to the constants $\tilde H_i$, which are 1 at leading order. As can be seen from eq. (2.57), the terms containing double logarithms of $z_{\rm cut}$ in the exponent of $\Sigma_i(\theta_g)$ can generally induce contributions to the QCD scale variations considered in the next section. Only for the central scale choice, or when the scales $\mu_{S^{\notin{\rm gr}}}$ and $\mu_{S^{\in{\rm gr}}}$ are varied simultaneously, do the contributions of these logarithms cancel completely. We further include NGLs and Abelian clustering logarithms to achieve full NLL accuracy.
Furthermore, we would like to stress again that an important feature of our approach is that it can be systematically extended beyond NLL accuracy.

The soft drop groomed radius in Monte Carlo
In this section, we present an algorithm for the large-N c leading-logarithmic resummation of the NGL distribution, including clustering effects. We also compare numerically to the resummed distributions in order to gauge the power corrections incurred in factorizing the NGLs of the soft function of eq. (2.6) from those of the collinear-soft function of eq. (2.9), as well as the range of validity of approximating the all-orders resummation by the two-loop leading NGLs and Abelian clustering logarithms calculated previously.

The Monte Carlo setup
As explained in section 2.2, the soft drop declustering angle operates as a jet veto algorithm. However, the jet being vetoed is simply the last branch to be declustered in the C/A algorithm once that branch is at an angular scale larger than R g . The ungroomed jet is defined by the anti-k t algorithm, so the jet has a hard angular boundary at R, whereas R g is the soft drop declustering angle. The Monte Carlo resummation in the large-N c limit then follows the general procedure found in [50,57,58]. We define:
• t as the MC time where ω is the energy of the emission established at the shower time t. In our case, the scale Q is set by p T R.
• A histogram H t indexed by t.
• R g as the subjet radius, R as the fat jet radius.
• D as the list of dipoles.
• k is the current number of emissions.
• n P is the direction of the first branch in the declustering procedure that passes soft drop. n · n P sets the current angular scale of the shower.
• E J is the list of emitted eikonal lines that cluster into either the jet direction n or n P .
All eikonal lines are of the form $n_i = (1, \hat{n}_i)$, so the Lorentz product n i · n j = 1 − cos θ ij simply measures the angle between the lines. Strong energy ordering implies the following clustering rule, ignoring recoil:
where ω i,j are the energies and n i,j are the null directions of the two emissions to be clustered. Since the shower is energy ordered, we only need to keep track of the order of emissions to know the relative energy scales. We initialize the shower as follows:
• t = 0.
The algorithm then proceeds as:
1. Take an MC time step t → t+∆t and create a new emission with energy and direction {ω k+1 , n k+1 }. For details, see App. 3.4.
3. Check if n k+1 · n j > n P · n, ∀n j ∈ E J .
• If at least one of these conditions fails, add n k+1 to E J . Go to step 1.
• If all these conditions are true, the emission is a new candidate for the declustering branch.
- If n k+1 · n > 2 sin²(R g /2), add 1 to the appropriate bin of H t and re-initialize the shower for a new event.
Finally, normalize the histogram by the number of events.

Vetoing the declustered branches
We construct E J such that, at each step in the shower, all lines within it will cluster into either n P or n before n P and n themselves cluster. Thus if we are given a new emission n k+1 such that n k+1 · n j > n P · n, ∀n j ∈ E J , then n and n P will cluster before n k+1 clusters into any of the established eikonal lines. Thus n k+1 and E J now define the two branches that are the first to be declustered under C/A, and the branch formed from E J will have direction n according to the clustering rule eq. (3.2). We then check whether the angle between these two branches is less than the desired R g . If it is, we redefine the branch n P to be n k+1 ; this is the new branch that sets the current declustering angle. If n k+1 is at too wide an angle from n, then the emission n k+1 sets the energy scale z cut p T R g . If we were to create subsequent emissions in the shower, they would have energy below z cut p T R g . If they created new branches, they would therefore fail soft drop, and if they were clustered into the branches which pass soft drop, they could not change the directions of those branches according to the clustering rule eq. (3.2). Thus n P cannot change, and the shower is over. If n k+1 · n j < n P · n for at least one n j ∈ E J , this emission in the shower will cluster into an emission that will eventually cluster into either n or n P before n and n P themselves cluster. Thus this emission will not change the direction of the two branches that pass soft drop.
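The decision taken for each new emission can be summarized in a few lines of code. The following Python sketch is an illustration only and not the implementation used for the results of this work: the function names `direction`, `null_dot` and `classify_emission`, the three return labels, and the convention that the list E J already contains n and n P are assumptions made here. It relies on n i · n j = 1 − cos θ ij , so that an opening angle θ corresponds to n i · n j = 2 sin²(θ/2).

```python
import math

def direction(theta, phi=0.0):
    """Unit three-vector at polar angle theta and azimuth phi (jet axis along z)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def null_dot(a, b):
    """n_a . n_b = 1 - cos(theta_ab) for null vectors (1, a) and (1, b)."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def classify_emission(n_new, n, n_P, E_J, R_g):
    """Classify a new emission (hypothetical helper, labels are assumptions).

    'absorbed'   : closer to some line in E_J than the current scale n_P . n,
                   so it clusters into n or n_P first; the caller adds it to E_J.
    'new_branch' : wider than the current scale but still within R_g of n;
                   the caller redefines n_P = n_new.
    'terminate'  : wider than R_g from n; the caller fills the histogram and
                   starts a new event.
    """
    scale = null_dot(n_P, n)  # current angular scale of the shower
    if any(null_dot(n_new, n_j) <= scale for n_j in E_J):
        return "absorbed"
    if null_dot(n_new, n) > 2.0 * math.sin(R_g / 2.0) ** 2:
        return "terminate"
    return "new_branch"
```

In a full shower this function would be called once per MC time step, with the emission generator and the histogram update supplied by the surrounding loop.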

Numerical results
Formally, the Monte Carlo algorithm described above resums the NGLs from both the collinear-soft function defined in eq. (2.9) and the soft function of eq. (2.6). However, in the small R g limit, these two functions factorize from each other. Thus, to isolate the NGLs from the collinear-soft function alone, we divide the histogram produced by the LL MC described above by both the hemisphere jet-mass NGL distribution of Ref. [50] (which corresponds to the NGLs of the soft function of eq. (2.6)) and the exponentiation of the one-emission contribution to the distribution, which removes any global contributions. These one-emission contributions are included in the anomalous dimension calculated in eq. (2.31). For an initial quark dipole the distributions for R g = 0.25, 0.15, 0.1, 0.05 are shown in Fig. 3, with an ungroomed jet radius of R = 0.8, and the small-t region is highlighted in Fig. 4. We have checked numerically that the gluon distribution with an adjoint dipole is well approximated by the square of the quark distribution, despite clustering effects which would spoil this relation at large-N c . In the comparison to the fixed order results, we include in the large-N c (C F → C A /2) limit both the contributions from eq. (2.42) and eq. (2.46), since the MC covers the whole soft phase space at large-N c and leading log. Thus the MC algorithm accounts for clustering effects off the primary emission, but only in the large-N c limit. Using the methods of [81], we could resum the Abelian clustering effects with the correct color structure, accounting for some of the subleading N c effects.
Since the MC includes the multiple-emission evolution in the out-of-jet region as well as the evolution in the groomed region, we can test this collinear factorization of the two soft functions. We can see that for multiple emissions at R g = 0.25, R = 0.8 the power corrections to the small R g limit of the collinear function are sizeable. However, for R g ≤ 0.15, R = 0.8, the small R g limit of the two-loop fixed-order NGL distribution describes the NGL distribution well for phenomenological values of t. Moreover, we have checked that the LL distribution for the collinear-soft function is independent of R once we are in the regime R g ≪ R. For example, the distribution for R g = 0.1 and R = 1.5708 is, up to statistical noise, almost identical to that for R g = 0.1 and R = 0.8. We use a shower angular cutoff scale of δ = 0.001, and have checked that the distributions are independent of this choice.

Evolving dipoles
We start with a list of dipoles D, where an element is given by {x, y}, with x and y the null directions forming the eikonal lines of the dipole. We let:
1. Increase t by ∆t.
2. Choose the dipole {x, y} ∈ D with probability P δ xy /P δ D .
3. Create an emission j with distribution W δ xy (j), such that the angles satisfy θ xj , θ yj > δ. This is equivalent to creating, uniformly at random in the rest frame of x, y, a direction j in rapidity and azimuth, then boosting back to the lab frame and checking the angular constraint.
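Steps 2 and 3 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in this work: the dipole weights are taken as given numbers (the explicit form of P δ xy is not reproduced here), directions are represented by unit three-vectors, and the helper names `opening_angle`, `choose_dipole` and `passes_angular_cutoff` are hypothetical. The generation of j in the dipole rest frame and the boost back to the lab frame are not shown.

```python
import math
import random

def opening_angle(a, b):
    """Angle between two unit three-vectors, with the cosine clamped to [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))

def choose_dipole(dipoles, weights, rng):
    """Pick one dipole with probability weight / sum(weights) (step 2)."""
    return rng.choices(dipoles, weights=weights, k=1)[0]

def passes_angular_cutoff(x, y, j, delta):
    """Angular constraint of step 3: theta_xj > delta and theta_yj > delta."""
    return opening_angle(x, j) > delta and opening_angle(y, j) > delta
```

A candidate direction j that fails `passes_angular_cutoff` would simply be discarded and regenerated.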

Phenomenology
In this section we present numerical results for the soft drop groomed jet radius θ g = R g /R at NLL accuracy. We start with proton-proton collisions at √ s = 13 TeV and we consider inclusive jets pp → jet+X with transverse momentum p T > 600 GeV in the central rapidity region of |η| < 1.5. Jets are reconstructed with the anti-k T algorithm and R = 0.8. For all numerical results presented in this section we choose the soft threshold parameter z cut = 0.1 and we present results for different values of the angular exponent β. We choose the NLO CT14 PDF set of [83] as default for all our numerical calculations. Since we consider the θ g differential cross section normalized to the inclusive jet production cross section, the impact of the choice of the PDF set is small.
In Fig. 5, we present the numerical results (dashed black, yellow band) for the θ g differential distribution θ g /σ incl dσ/dθ g , which is obtained by differentiating the cumulative cross section Σ(θ g ), see eq. (2.1). Before taking the derivative with respect to θ g , we choose the canonical scales as listed in eqs.
$\mu_{S\in gr} = z_{\rm cut}\,\theta_g^{\beta}\,\mu_C$ . (4.3)
As expected, we find that aggressive soft drop grooming (β = 0) yields a relatively flat distribution (multiplied by θ g ) of the soft drop groomed jet radius. Less aggressive grooming (β = 1, 2) instead gives a distribution that peaks at intermediate to relatively large values of θ g , which means that the groomed jet does not shrink as much in size compared to the initial ungroomed jet. Eventually, in the limit β → ∞, the groomer is removed and the distribution approaches a delta function at θ g = 1.
Figure 6. Pythia 8 results [82] for the soft drop groomed jet radius θ g for the same kinematics as in Fig. 5 above. We separately show the purely partonic result (black), including initial-state radiation (red), multi-parton interactions (green) and hadronization corrections (blue). [Horizontal axis: log 10 (θ g ); curves: Partonic, +ISR, +ISR+MPI, +ISR+MPI+had.]
For comparison, we also show Pythia 8.230 results using the default tune [82] (purple) in Fig. 5. Here we do not include nonperturbative hadronization or the contribution from the underlying event (UE). Below we study these effects in more detail. In general, we find very good agreement between the Pythia 8 simulation and our perturbative results at NLL accuracy concerning both the shape and the overall magnitude. We note that in the region θ g ∼ 1 perturbative power corrections play a role which is not captured by our factorization theorem. In principle, they could be included at fixed order by performing a matching calculation. In addition, we note that the perturbative resummation region ends when the lowest scale in the factorization theorem runs into the nonperturbative regime, $\mu_{S\in gr} \sim z_{\rm cut}\,\theta_g^{1+\beta}\,p_T R \sim \Lambda_{\rm QCD} \sim 1$ GeV. This corresponds to values of the soft drop groomed jet radius of $\theta_g \sim (\Lambda_{\rm QCD}/(z_{\rm cut}\,p_T R))^{1/(1+\beta)}$. Numerical results in this region are obtained by smoothly freezing the running of the QCD coupling constant above the Landau pole. We choose to freeze the running of α s at the scale 0.4 GeV for the numerical results presented here.
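To get a feeling for where the nonperturbative region sets in, the estimate above can be evaluated numerically. The short Python sketch below solves z cut θ g^(1+β) p T R = Λ QCD for θ g ; the function name `nonperturbative_onset` is hypothetical, and using the lower edge p T = 600 GeV of the LHC selection as a representative jet transverse momentum is an assumption made only for illustration.

```python
def nonperturbative_onset(p_T, R, z_cut, beta, lambda_qcd=1.0):
    """theta_g at which z_cut * theta_g**(1 + beta) * p_T * R equals lambda_qcd.

    p_T and lambda_qcd are in GeV; the result is dimensionless.
    """
    return (lambda_qcd / (z_cut * p_T * R)) ** (1.0 / (1.0 + beta))

# LHC setup of this section: R = 0.8, z_cut = 0.1, representative p_T = 600 GeV
for beta in (0, 1, 2):
    print(beta, nonperturbative_onset(600.0, 0.8, 0.1, beta))
```

With these inputs the onset lies at θ g of roughly 0.02, 0.14 and 0.28 for β = 0, 1, 2, respectively, so the nonperturbative region extends to larger θ g for less aggressive grooming.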
In Fig. 6, we study different perturbative and nonperturbative effects for the same kinematics as in Fig. 5. First, the purely partonic results are shown by the black line. We then include the contribution of initial-state radiation (ISR) as shown by the red line. In general, ISR leads to a shift of the distribution toward larger values of θ g . We note that there is no exact one-to-one correspondence between the different parton and hadron level contributions in Pythia and the QCD factorization theorem we develop in this work. ISR is power suppressed and not included in our factorization theorem. However, since ISR is a perturbative initial state contribution, we include it in the comparison in Fig. 5 above. Next, we include the underlying event contribution or multi-parton interactions (MPI), as shown by the green line in Fig. 6. As it turns out, MPI does not play a significant role for the jet kinematics and soft drop parameters considered here. Finally, we include hadronization effects for the θ g distribution as shown by the blue line in Fig. 6. Interestingly, hadronization leads to a shift of the distribution to lower values of θ g for soft drop grooming with β = 0, whereas it shifts the distribution to higher values for β = 1, 2. It will be interesting to study such effects in more detail in the future following the work of [84]. We conclude that the impact of the different effects that are not captured by the factorization theorem presented in this work is relatively small and that the soft drop groomed radius θ g is under good control within perturbative QCD.
Finally, in Fig. 7 we present results for jet kinematics relevant for the STAR experiment at RHIC at √ s = 200 GeV. For the setup of the analysis presented in [85], we show numerical results for three jet transverse momentum intervals 25 < p T < 30 GeV, 30 < p T < 40 GeV and 40 < p T < 60 GeV (left to right) with R = 0.4, z cut = 0.1 and β = 0. We show the theoretical result 1/σ incl dσ/dR g as a function of R g . As expected, the QCD scale uncertainty is relatively large for the lower jet transverse momentum intervals considered here.

Conclusions
In this work, we considered the soft drop groomed jet radius θ g = R g /R at next-to-leading logarithmic accuracy. The radius of a soft drop groomed jet is one of the key observables characterizing the impact of grooming on a jet and is calculable in perturbative QCD. It is defined as the opening angle of the splitting that satisfies the soft drop grooming condition and is related to the active area of the groomed jet. The extension of the calculation beyond leading-logarithmic accuracy required us to study the nontrivial all order structure of non-global logarithms which are affected by clustering constraints due to the use of the C/A algorithm. In addition, Abelian clustering logarithms need to be taken into account. An important ingredient to understand the factorization structure is the equivalence between the soft drop groomed radius measurement and a jet veto between the boundaries of the groomed and ungroomed jet. Within SCET we established a factorization theorem which allows for the resummation of logarithms of θ g , the jet radius R and the soft drop parameter z cut at NLL. We performed an explicit calculation of the non-global and Abelian clustering logarithms at fixed order. The all order resummation at leading logarithmic accuracy within the large-N c approximation was achieved by making use of a suitably designed Monte Carlo algorithm. We performed numerical calculations and compared our results to Pythia 8 simulations for LHC kinematics and found very good agreement. From these findings, we conclude that the soft drop groomed radius is under good perturbative control, as the overall impact of nonperturbative effects, mainly due to hadronization, appears to be relatively small for LHC kinematics. Numerical predictions for the STAR experiment at RHIC are also provided. Our results allow for precision comparisons to data from the LHC and RHIC which will further improve our understanding of soft drop groomed jet substructure observables.
The formalism developed in this work can be systematically extended beyond NLL accuracy and matched to fixed order calculations. In addition, it will be interesting to systematically investigate nonperturbative effects. Applications in heavy-ion collisions will further extend the use of the calculations performed in this work.

A Collinear-soft emissions at NNLO
Here we consider the phase space for collinear-soft emissions at NNLO as a non-trivial example to check and illustrate the equivalence between the soft drop groomed R g measurement and the jet veto constraint. We first introduce the notation for two collinear-soft partons i and j with momenta k i and k j as
where J denotes the eikonal direction set by the collinear mode. We further write the soft drop groomed radius R g phase space for one collinear-soft emission as
which is equivalent to
(A.3)
The first term indicates that an emission i is kept as long as it is within the cone set by R g , whereas it is vetoed if it is outside. This is the usual veto operation and manifests the equivalence at NLO. For future use, we also note that
At NNLO, we need to consider the three configurations shown in Fig. 8. We start with configuration a. In this case we can write the phase space measure as
where the overall factor indicates that θ 1J is the smallest distance. When declustering the jet, we thus first check emission 2 against the soft drop criterion. If 2 passes, it has to be within R g by construction and the algorithm terminates (first term in square brackets). If emission 2 fails, we continue to check whether emission 1 passes the soft drop criterion (second term in square brackets). To proceed, we write M a 2 as follows
$M_2^a = \Theta(\theta_{2J}-\theta_{1J})\,\Theta(\theta_{12}-\theta_{1J})\,\big[\Theta_{2,p}\,\Theta(R_g-\theta_{2J}) + M_1(k_2)\,M_1(k_1) - \Theta_{2,p}\,\Theta(R_g-\theta_{2J})\,M_1(k_1)\big]$
where we used the definition of M 1 (k 2 ) in eq. (A.2). Now we combine the first and the third term and use eq. (A.4) to obtain
The first term vanishes since there is a contradiction between the different conditions requiring θ 1J to be the smallest distance as well as θ 1J > R g and R g > θ 2J . Therefore, we find for the configuration a the result
This states that we separately veto emissions 1 and 2 as long as θ 12 is not the smallest overall distance, in which case they will be combined into a single branch first. Lastly, we consider the configuration c in Fig. 8 which can be written as
If the distance θ 12 is the smallest distance, the emissions 1 and 2 will be combined first. In the declustering procedure, the branch containing both emissions will be checked against the soft drop criterion. Adding up all three configurations, we find
which shows the equivalence discussed in the main text at NNLO. The extension beyond NNLO can be achieved in a similar way. For instance, in the case of 3 emissions, the only extra configuration one needs to consider is when neither 2 nor 3 emissions out of the 3 are clustered first. Otherwise, up to clustering, it is equivalent to the 1- or 2-emission case discussed here in detail. Following similar steps and arguments as we showed for configuration a of the 2-emission case, one again reaches the equivalence. At NNLO, we can also have 2 collinear emissions J 1 and J 2 along with 1 collinear-soft parton k 1 . The measurement function is readily found to be given by
which again is equivalent to the jet veto operation using the C/A algorithm. That is to say, when the soft parton is combined with the collinear radiation J 1 or J 2 , it will be kept. Otherwise, it will be vetoed when it is outside of the region set by R g .
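The argument for configuration a can also be checked by brute force. The Python sketch below compares, on a grid of two-emission configurations with θ 1J < θ 2J , the outcome of the sequential declustering (check emission 2 first, then emission 1) with the product of two independent one-emission vetoes. It is an illustration under stated assumptions: the soft drop condition is taken in the standard form z > z cut (θ/R)^β, the cumulative measurement is taken to accept a branch with angle at most R g , the condition θ 12 > θ 1J only restricts the region and does not enter the comparison, and all function names are hypothetical.

```python
import itertools

def passes_soft_drop(z, theta, z_cut, beta, R):
    """Assumed soft drop condition z > z_cut * (theta / R)**beta."""
    return z > z_cut * (theta / R) ** beta

def veto_single(z, theta, R_g, z_cut, beta, R):
    """One-emission veto: reject only if the emission passes soft drop outside R_g."""
    return not (passes_soft_drop(z, theta, z_cut, beta, R) and theta > R_g)

def declustering_config_a(z1, th1, z2, th2, R_g, z_cut, beta, R):
    """Configuration a (th1 < th2): emission 2 is declustered and checked first."""
    if passes_soft_drop(z2, th2, z_cut, beta, R):
        return th2 <= R_g  # declustering stops; the groomed radius is th2
    return veto_single(z1, th1, R_g, z_cut, beta, R)  # continue with emission 1

def count_mismatches(z_cut=0.1, beta=1.0, R=0.8):
    """Return (number of mismatches, number of configurations checked)."""
    angles = [0.05 * k for k in range(1, 16)]
    fractions = [0.01, 0.03, 0.05, 0.08, 0.2]
    radii = [0.1, 0.3, 0.5]
    mismatches = checked = 0
    for th1, th2, z1, z2, R_g in itertools.product(
            angles, angles, fractions, fractions, radii):
        if not th1 < th2:
            continue  # keep only the angular ordering of configuration a
        lhs = declustering_config_a(z1, th1, z2, th2, R_g, z_cut, beta, R)
        rhs = (veto_single(z1, th1, R_g, z_cut, beta, R)
               and veto_single(z2, th2, R_g, z_cut, beta, R))
        checked += 1
        mismatches += lhs != rhs
    return mismatches, checked
```

Both descriptions agree on every grid point because the only configuration in which they could differ, emission 2 passing inside R g while emission 1 passes outside R g , is excluded by θ 1J < θ 2J .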