The groomed and ungroomed jet mass distribution for inclusive jet production at the LHC

We study jet mass distributions measured in the single inclusive jet production in proton-proton collisions $pp\to \text{jet}+X$ at the LHC. We consider both standard ungroomed jets as well as soft drop groomed jets. Within the Soft Collinear Effective Theory (SCET), we establish QCD factorization theorems for both cases and we study their relation. The developed framework allows for the joint resummation of several classes of logarithmic corrections to all orders in the strong coupling constant. For the ungroomed case, we resum logarithms in the jet radius parameter and in the small jet mass. For the groomed case, we resum in addition the logarithms in the soft threshold parameter $z_{\text{cut}}$ which is introduced by the soft drop grooming algorithm. In this way, we are able to reliably determine the absolute normalization of the groomed jet mass distribution in proton-proton collisions. All logarithmic corrections are resummed to the next-to-leading logarithmic accuracy. We present numerical results and compare with the available data from the LHC. For both the groomed and ungroomed jet mass distributions we find very good agreement after including non-perturbative corrections.


Introduction
In high energy collisions the fundamental building blocks of Quantum Chromodynamics (QCD), quarks and gluons, lead to the formation of highly energetic collimated sprays of hadrons observed in the detectors which are known as jets [1]. The Large Hadron Collider (LHC) is currently the world's largest and highest energy particle collider where jets provide a unique opportunity to test the fundamental properties of QCD and to search for new physics beyond the standard model at the TeV scale. Therefore jet studies have become one of the most important topics both in the experimental and theoretical communities. One of the most studied benchmark processes at the LHC is the inclusive jet production cross section differential in the jet rapidity and the transverse momentum [2][3][4]. Over the past years, it has been realized that the internal structure of the identified jets contains additional valuable information. When additional measurements are performed on the identified jets in order to characterize and utilize the radiation pattern inside jets, the corresponding observables are generally referred to as jet substructure measurements [5]. For example, jet substructure techniques are used to improve our understanding of the QCD hadronization mechanism [6][7][8], to discriminate between quark and gluon jets [9] or to identify jets originating from the decay of boosted objects [10]. At the LHC heavy particles such as W/Z, Higgs, and top quarks are often produced with a high transverse momentum such that their decay products become collimated and thus are merged into a single jet. The radiation pattern of the produced jets contains information about the different decaying particles. In order to tag such boosted objects and to separate them from the QCD background, jet substructure techniques have proven to be an invaluable tool. In addition, jet substructure techniques are used increasingly for the search of new resonances from physics beyond the standard model. 
See for example [11] for a recent search for hadronically decaying vector resonances reported by the CMS collaboration relying on jet substructure techniques. Often several jet substructure observables are measured on a single jet in order to enhance the tagging efficiency, see for example [12]. In addition, jet substructure observables are increasingly being studied in heavy-ion collisions where they provide an important test of the hot and dense QCD medium [13].
One of the most prominent and most often used jet substructure observables is the jet mass distribution, which we address in this work in the context of inclusive jet production in proton-proton collisions at the LHC. We consider the cross section where the jet mass is measured for jets that are identified with a given transverse momentum $p_T$ and rapidity $\eta$ using a jet radius parameter $R$, and where the measurement is inclusive about everything else in the event, which is denoted by $X$. Thus, we have
$$pp \to \text{jet}(\tau; \eta, p_T, R) + X, \tag{1.1}$$
where we introduced the dimensionless variable $\tau$, which is related to the jet invariant mass $m_J$ as
$$\tau = \frac{m_J^2}{p_T^2}, \qquad m_J^2 = \Big(\sum_{i\in\text{jet}} p_i\Big)^2.$$
Here, $p_i$ are the four-momenta of all the particles inside the reconstructed jet. More specifically, we consider the normalized jet mass distribution
$$F(\tau; \eta, p_T, R) = \frac{d\sigma}{d\eta\, dp_T\, d\tau} \bigg/ \frac{d\sigma}{d\eta\, dp_T},$$
where the numerator and the denominator are the differential jet cross sections with and without the additional measurement of the jet mass, respectively. Traditionally, jet mass measurements have been performed on an inclusive jet sample, see for example the data sets in $p\bar{p}$ [14] and $pp$ collisions [15] by the CDF collaboration at the Tevatron and the ATLAS collaboration at the LHC, respectively. In addition, inclusive jet mass measurements have been performed by the ALICE collaboration in Pb-Pb and p-Pb collisions at the LHC [16].
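As a minimal illustration of the observable, the following sketch computes $m_J$ and $\tau = m_J^2/p_T^2$ from a list of constituent four-momenta. It assumes that the jet four-momentum is the plain sum of the constituent momenta (E-scheme recombination) and that $p_T$ is the transverse momentum of that sum; the function name and the input layout `(E, px, py, pz)` are hypothetical choices made for this example only.

```python
import math


def jet_mass_and_tau(constituents):
    """Return (m_J, tau) for constituents given as (E, px, py, pz) tuples.

    Assumption: the jet momentum is the four-vector sum of its constituents
    and tau = m_J^2 / p_T^2 with p_T taken from that sum.
    """
    E = sum(p[0] for p in constituents)
    px = sum(p[1] for p in constituents)
    py = sum(p[2] for p in constituents)
    pz = sum(p[3] for p in constituents)
    # Guard against tiny negative values from floating point rounding.
    m2 = max(E * E - px * px - py * py - pz * pz, 0.0)
    pt2 = px * px + py * py
    return math.sqrt(m2), m2 / pt2


if __name__ == "__main__":
    # Two massless particles in the transverse plane.
    parts = [(50.0, 50.0, 0.0, 0.0), (50.0, 0.0, 50.0, 0.0)]
    print(jet_mass_and_tau(parts))
```

A single massless constituent gives $\tau = 0$, while any angular spread among the constituents generates a non-zero jet mass.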
Although it is one of the simplest and most intuitive examples of jet substructure observables, the jet invariant mass spectrum serves as a benchmark observable for jet substructure studies and is therefore of great phenomenological relevance. The jet mass distribution is used to test parton showers in Monte Carlo event generators, to tag quark-gluon jets, and to search for boosted objects as outlined above. In addition, it is expected that jet mass measurements can shed new light on the jet quenching phenomenon observed in heavy-ion collisions. Even though jet mass measurements are of great phenomenological importance, current studies of the inclusive jet mass spectrum rely heavily on the assumption that the jet mass distribution is well modeled by Monte Carlo event generators. Recent studies by the ATLAS and ALICE collaborations suggest that this assumption should be treated with care. For instance, ATLAS found that the spectrum predicted by Pythia [17] is too soft in pp collisions whereas the one from Herwig++ [18] is too hard [15]. A similar situation was observed in heavy-ion collisions, where the jet mass distribution is over- or underestimated by Q-Pythia [19] and Jewel [20] depending on the out-of-jet radiation settings [16]. Therefore, jet mass calculations from first principles in QCD are needed in order to improve our understanding of the underlying mechanisms and to benchmark current models.
The jet mass distribution has been addressed several times in the literature from the theoretical side, where the efforts have focused mostly on exclusive jet configurations where additional constraints are imposed on the final state particles, see for example [21][22][23][24] and references therein. While it is advantageous in some situations to consider exclusive final state jets, it is important to note that the inclusive jet cross section can be measured with the highest statistics since all jets in a given transverse momentum and rapidity interval are taken into account without any further restrictions. So far only a few theoretical studies exist in the literature on the inclusive jet mass spectrum [25][26][27][28]. In [27], the jet mass distribution was calculated in the threshold di-jet limit, while [25,26] focused on process-independent jet functions. See also the theoretical studies in [29,30] on the jet mass distributions in γ/Z+jet and di-jet processes at the LHC, as well as the experimental measurements at the LHC [31]. In this work, we derive a complete factorization theorem from first principles in QCD allowing for a direct comparison with the inclusive jet mass data from the LHC. Using the QCD factorization theorem derived in this work, we are able to jointly resum single logarithms in the jet size parameter R and double logarithms in the jet mass m J .
As it turns out, the invariant jet mass distribution is very sensitive to the soft hadronic activity in the collisions recorded at the LHC. This is the case in particular at the highest energy collisions currently achieved at the LHC at $\sqrt{s} = 13$ TeV. The soft radiation includes pileup and the underlying event contribution like multi-parton interactions (MPI) [32]. The jet mass distribution as introduced above may, in fact, play an important role in order to disentangle the various contributions, see for example [22]. However, for many applications, it is important to remove the underlying event contribution in order to restore the understanding of the jet mass as being a direct measure of the mass associated with a highly energetic fragmenting parton or a boosted object that produces the observed final state jet. This can be achieved by considering the groomed jet mass, which we denote by $m_{J,\text{gr}}$. Various jet grooming techniques have been developed in the past decade, see for example [5] for an overview. The grooming procedure which we use in this work is the so-called soft drop grooming algorithm [33], which can be included in analytical calculations using QCD factorization theorems. The soft drop grooming algorithm is designed to remove wide-angle soft radiation from the jet by recursively declustering a jet and by removing soft branches from the identified initial ungroomed jet. The algorithm depends on the angular separation of the branches obtained at each declustering step, an angular exponent $\beta$, and a soft threshold $z_{\text{cut}}$ which sets the cutoff below which soft branches are removed from the jet. See section 3.1 for a more detailed description of the soft drop grooming algorithm. After all the soft branches of a given jet have been identified and removed from the jet, any observable may be measured on the remaining jet constituents [34][35][36].
An important feature of the grooming procedure is that it reduces the sensitivity to non-perturbative contributions and non-global logarithms (NGLs) [37,38] which we discuss in more detail below.
In this work, we focus on the soft drop groomed jet mass distribution, which we are going to compare to the ungroomed case as discussed above. Earlier work on groomed jet mass distributions can be found in [39][40][41]. Here, we derive a complete QCD factorization theorem that allows for the resummation of three important logarithmic corrections to all orders in the strong coupling constant $\alpha_s$: single logarithms in the jet size parameter $R$ and double logarithms in the jet mass $m_{J,\text{gr}}$, similar to the ungroomed case. In addition, we are able to completely resum logarithms in the soft threshold parameter $z_{\text{cut}}$, which was not achieved for jets in pp collisions before. Using the new framework developed for inclusive jet samples it is therefore possible to reliably determine for the first time the absolute normalization of a groomed jet observable up to NGLs. All resummations in this work are carried out at next-to-leading logarithmic accuracy. Throughout this paper, we derive QCD factorization theorems and the resummation of large logarithms within the framework of Soft Collinear Effective Theory (SCET) [42][43][44][45][46]. The experimental measurements of the jet mass distribution for soft drop groomed jets in pp collisions have been performed by both the CMS [31,47] and ATLAS [48] collaborations at the LHC.
Recently there has been a great interest in groomed jet observables in the heavy-ion community [49,50]. It is generally desirable to use grooming in order to directly probe the medium modification of highly energetic fragmenting partons that produce a jet in the final state which traverses and thus probes the QCD medium. On the other hand, one has to worry about the interference of the grooming procedure with the employed background subtraction method. In order to disentangle the interplay of the grooming procedure and the subtraction of the background, it is absolutely crucial to consider observables that are defined both with and without grooming. The jet mass distributions discussed in this work constitute an ideal starting point for further studies along these lines as they allow for a continuous transition between the groomed and ungroomed case. See for example [51], where the CMS collaboration reported on the results for the groomed jet mass distribution in Pb-Pb collisions at the LHC.
The remainder of this paper is organized as follows. In Sec. 2 we derive the factorization for the ungroomed jet mass distribution. In Sec. 3 we extend the obtained framework to include soft drop grooming. We emphasize similarities and differences between the groomed and ungroomed jet mass distribution. In Sec. 3.6 we briefly comment on the different NGLs that contribute to the groomed and the ungroomed jet mass spectrum and we comment on the relation of our newly derived factorization to earlier work in the literature. In Sec. 4 we present numerical results for both jet mass distributions and we compare to the available experimental results from the LHC. We summarize our results in Sec. 5 and present an outlook. Several detailed calculations of the relevant soft functions at one-loop order in the presence of soft drop grooming are presented in the Appendix A.
Factorization: the ungroomed jet mass
In this section, we present the factorization formalism of the ungroomed jet mass distribution for single inclusive jet production. We closely follow the arguments provided recently in [52], where jet angularities were discussed for inclusive jets. The jet mass distribution is a special case of the jet angularity observables $\tau_a$ with $a = 0$. See [52] for more detailed discussions. In the small jet mass limit, the factorization procedure involves two steps. The first step is a hard collinear factorization, which describes the production of a single inclusive jet with radius $R$. The second step deals with the factorization of the details of the jet substructure measurement (i.e. the jet mass $m_J$ or $\tau$) in terms of soft and collinear modes.

First step: hard collinear factorization
For the jet mass distribution measured in single inclusive jet production in pp collisions, the factorized cross section in the small jet radius limit can be written as
$$\frac{d\sigma}{d\eta\, dp_T\, d\tau} = \sum_{a,b,c} f_a(x_a,\mu) \otimes f_b(x_b,\mu) \otimes H_{ab}^c(x_a, x_b, \eta, p_T/z, \mu) \otimes G_c(z, p_T, R, \tau, \mu), \tag{2.1}$$
where $f_{a,b}$ denote the parton distribution functions (PDFs) of the colliding protons with momentum fractions $x_{a,b}$. The hard functions $H_{ab}^c$ describe the production of an energetic parton $c$ in the hard-scattering event, similar to the inclusive production of hadrons [53,54]. In fact, it was shown in [55] that the hard functions are exactly the same as those for single inclusive hadron production, $pp \to h + X$. The functions $G_c(z, p_T, R, \tau, \mu)$ are the semi-inclusive jet mass functions (siJMFs), which describe how a parton $c$ initiates the signal jet that carries a momentum fraction $z$ of that parton while, at the same time, the jet mass $\tau$ is observed. Following earlier work [52][56][57][58][59], the siJMFs are defined at the operator level in terms of the SCET gauge invariant collinear quark and gluon fields $\chi_n$ and $B^\mu_{n\perp}$, where $\mathcal{P}$ is the label momentum operator. Here, we have defined two light-like vectors $n^\mu = (1, \hat{n})$ and $\bar{n}^\mu = (1, -\hat{n})$ with $n^2 = \bar{n}^2 = 0$ and $n \cdot \bar{n} = 2$, where $\hat{n}$ is aligned along the jet axis. The operator $\hat\tau(J)$ represents the jet mass measurement for the observed jet $J$, with the measured value being equal to $\tau$. This first step of the factorization of the siJMFs $G_c$ from the hard functions $H_{ab}^c$ is the so-called hard collinear factorization [46], which specifically describes the production of a jet with rapidity $\eta$, transverse momentum $p_T$, and jet radius $R$. To derive this factorization, we work with parametrically small values of the jet size parameter, $R \ll 1$. In this case, we have two distinctive scales, $\mu_H$ and $\mu_J$.
The hard functions $H_{ab}^c$ live at the scale of the hard-scattering event, $\mu_H \sim p_T$, while the characteristic momentum scale for the siJMFs $G_c$ is set by the jet dynamics and is given by
$$\mu_J \sim p_T R.$$
When $R \ll 1$, the dynamics at these two distinctive scales do not interfere with each other and thus factorize. This is the intuitive argument behind the factorization formalism in Eq. (2.1). This first step of the factorization, the hard collinear factorization, is illustrated on the left-hand side of Fig. 1. We note that even though the factorized form of the cross section is derived for strictly $R \ll 1$, it was found that the factorization is a very good approximation even for fat jets with a relatively large jet radius of $R \sim 0.7$ and even above [60][61][62][63], as pointed out also in [24]. These observations imply that the power corrections of the form $\mathcal{O}(R^2)$ to the factorization theorem in Eq. (2.1) are in fact very small. While we do not have a general theoretical argument on the size of the power corrections, we further verify in Sec. 4 below that numerically this is indeed the case.
We find that the siJMFs $G_c$, as well as the corresponding hard functions $H_{ab}^c$, follow the usual timelike DGLAP evolution equations, which is consistent with the hard collinear factorization. See also [55,62,64]. We find
$$\mu \frac{d}{d\mu} G_i(z, p_T, R, \tau, \mu) = \sum_j \int_z^1 \frac{dz'}{z'}\, P_{ji}\Big(\frac{z}{z'}, \mu\Big)\, G_j(z', p_T, R, \tau, \mu),$$
where the $P_{ji}(z, \mu)$ denote the usual Altarelli-Parisi splitting functions. The DGLAP evolution equations for the siJMFs $G_c$ enable us to resum single logarithms in the jet size parameter, $\alpha_s^n \ln^k R$ with $k \le n$, which is achieved by evolving the siJMFs from the jet scale $\mu_J \sim p_T R$ to the hard scale $\mu_H \sim p_T$ [65,66].
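To get a rough feeling for the size of the logarithms that the DGLAP evolution between $\mu_J \sim p_T R$ and $\mu_H \sim p_T$ resums, one can estimate the leading expansion parameter $\alpha_s \ln(1/R)$. The sketch below is only an order-of-magnitude estimate and not the NLL setup used in this work: it assumes one-loop running of the coupling with $n_f = 5$ active flavors and the input value $\alpha_s(M_Z) = 0.118$, and all function names are hypothetical.

```python
import math


def alpha_s_one_loop(mu, alpha_mz=0.118, mz=91.1876, nf=5):
    """One-loop running coupling (assumed inputs: alpha_s(M_Z), fixed nf)."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    return alpha_mz / (1.0 + alpha_mz * b0 * math.log(mu**2 / mz**2))


def ln_r_expansion_parameter(pt, R):
    """alpha_s(mu_J) * ln(1/R) with mu_J = pt * R (pt in GeV)."""
    return alpha_s_one_loop(pt * R) * math.log(1.0 / R)


if __name__ == "__main__":
    for R in (0.8, 0.4, 0.2):
        print(R, ln_r_expansion_parameter(500.0, R))
```

The estimate grows as $R$ decreases, which is the regime where the resummation of $\ln R$ becomes numerically relevant.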

Second step: soft collinear factorization
The factorization formalism in Eq. (2.1) is only valid for $\tau \sim R^2$. When $\tau$ is parametrically much smaller than the jet radius squared, $\tau \ll R^2$, the jet mass distribution receives additional large logarithmic corrections originating from soft and collinear emissions that need to be resummed to all orders. In the small $\tau$ region, a second step of the factorization is required to resum logarithms of the form $\alpha_s^n \ln^{2n-k}(\tau/R^2)$ with $0 \le k \le 2n$. This can be achieved by introducing two additional modes which follow the jet mass constraints: collinear modes within the jet and the collinear-soft mode [67,68]. The collinear radiation within the jet has the momentum scaling
$$p_c = (p_c^-, p_c^+, p_{c\perp}) \sim p_T\, (1, \lambda^2, \lambda),$$
where $\lambda \sim \tau^{1/2}$. The collinear-soft radiation, indicated by the subscript "cs", has the following momentum scaling
$$p_{cs} = (p_{cs}^-, p_{cs}^+, p_{cs\perp}) \sim \frac{p_T\, \tau}{R^2}\, (1, R^2, R). \tag{2.8}$$
At the same time, any hard-collinear emission of the order $p_T R$ has to be outside the jet, as it would otherwise violate the hierarchy $\tau \ll R^2$, and thus does not contribute to the jet mass. In summary, we obtain the following factorization for the siJMFs [52]
$$G_c(z, p_T, R, \tau, \mu) = \sum_i H_{c\to i}(z, p_T R, \mu)\, C_i(\tau, p_T, \mu) \otimes S_i(\tau, p_T, R, \mu), \tag{2.9}$$
where $\otimes$ denotes a convolution in $\tau$. Here $H_{c\to i}(z, p_T R, \mu)$ are the hard matching functions, which describe how an energetic parton $c$ coming from the hard-scattering event produces a jet initiated by parton $i$ with radius $R$ carrying an energy fraction $z$ of the initial parton $c$. They are related to unconstrained radiation outside the jet, and thus they have the characteristic momentum scale $\mu_J \sim p_T R$, which is the jet scale. The relevant perturbative expressions and their renormalization group (RG) equations can be found in [52,69]. The collinear functions $C_i(\tau, p_T, \mu)$ take into account collinear radiation inside the observed jet. In their operator definition, the operator $\hat\tau_n$ is defined to count only the collinear radiation inside the jet. In fact, up to an overall normalization, the collinear functions are the same as the usual inclusive jet functions, which describe the measurement of the invariant mass of the jet [45,70].
The corresponding perturbative results are available at next-to-leading order (NLO) [71,72] and next-to-next-to-leading order (NNLO) [73,74]. For completeness, the renormalized collinear functions for quarks and gluons, $i = q, g$, at NLO are listed in Eq. (2.12), where we adopt the notation $C_i = C_{F,A}$ for quarks and gluons, respectively. The relevant constants $f_i$ and $\gamma_i$ are given in Eq. (2.14). From the perturbative NLO results one finds that the characteristic scale of the collinear functions, which eliminates all large logarithms at fixed order, is set by the jet mass,
$$\mu_C \sim p_T\, \tau^{1/2}.$$
The collinear functions satisfy the following RG equations
$$\mu \frac{d}{d\mu} C_i(\tau, p_T, \mu) = \int d\tau'\, \gamma_i^C(\tau - \tau', p_T, \mu)\, C_i(\tau', p_T, \mu),$$
with the anomalous dimensions $\gamma_i^C(\tau, p_T, \mu)$. The soft functions $S_i(\tau, p_T, R, \mu)$ that appear in the factorized expression of the siJMFs in Eq. (2.9) are defined at the operator level in terms of soft Wilson lines. Here $Y_n$ ($\mathcal{Y}_n$) is a soft Wilson line in the fundamental (adjoint) representation along the light-like direction $n^\mu$ of the jet, while $Y_{\bar n}$ ($\mathcal{Y}_{\bar n}$) is along the conjugated direction $\bar{n}^\mu$. Note that the operator $\hat\tau_s$ is defined to count only the soft radiation following the momentum scaling determined in Eq. (2.8). From the perturbative results for the renormalized soft functions at NLO, the natural momentum scale is obtained to be
$$\mu_S \sim \frac{p_T\, \tau}{R}.$$
The corresponding RG equations are given by
$$\mu \frac{d}{d\mu} S_i(\tau, p_T, R, \mu) = \int d\tau'\, \gamma_{S_i}(\tau - \tau', p_T, R, \mu)\, S_i(\tau', p_T, R, \mu),$$
where the anomalous dimensions $\gamma_{S_i}(\tau, p_T, R, \mu)$ are given in Eq. (2.23).
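The hierarchy of scales in the two-step factorization of the ungroomed jet mass can be made concrete with a few numbers. The following sketch assumes the scaling estimates $\mu_H \sim p_T$, $\mu_J \sim p_T R$, $\mu_C \sim p_T \tau^{1/2}$ and $\mu_S \sim p_T \tau / R$, with all order-one prefactors set to one; the function name and the example kinematics are illustrative choices only.

```python
def ungroomed_scales(pt, R, tau):
    """Natural scales (in the units of pt) for the ungroomed jet mass.

    Assumption: order-one prefactors are dropped, so these are scaling
    estimates rather than the scale choices of a numerical analysis.
    """
    return {
        "mu_H": pt,               # hard scale
        "mu_J": pt * R,           # jet scale
        "mu_C": pt * tau**0.5,    # collinear scale, i.e. the jet mass
        "mu_S": pt * tau / R,     # collinear-soft scale
    }


if __name__ == "__main__":
    scales = ungroomed_scales(pt=500.0, R=0.8, tau=1e-3)
    for name, value in scales.items():
        print(f"{name} = {value:.3f} GeV")
```

With these estimates the scales satisfy $\mu_C^2 = \mu_J\, \mu_S$, and for small $\tau$ the soft scale quickly approaches the non-perturbative regime, which is one reason why non-perturbative corrections matter for the ungroomed spectrum.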

Factorization: the groomed jet mass
In this section, we derive the factorization formalism for the soft drop groomed jet mass distribution for the single inclusive jet production in pp collisions. We first give a brief review on the soft drop grooming algorithm, and then derive the corresponding factorized expression that allows for the resummation of all relevant large logarithmic corrections.

Soft drop grooming
The soft drop grooming procedure recursively removes soft wide-angle radiation from an identified jet [33]. The algorithm starts by re-clustering the constituents of an anti-$k_T$ jet [75] with the Cambridge/Aachen algorithm [76,77], which yields an angular ordered clustering tree. One then steps backward through the clustering history of the jet and iteratively removes soft branches from the jet. At each de-clustering step the jet is separated into two subjets or branches (also referred to as proto-jets) with an angular separation $\Delta R_{ij} = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ in the $\eta$-$\phi$ plane and transverse momenta $p_{Ti,j}$. At each step the following soft drop grooming criterion is checked
$$\frac{\min[p_{Ti}, p_{Tj}]}{p_{Ti} + p_{Tj}} > z_{\text{cut}} \left(\frac{\Delta R_{ij}}{R}\right)^{\beta}. \tag{3.1}$$
The soft drop algorithm depends on two parameters: a soft threshold $z_{\text{cut}}$ and an angular exponent $\beta$. Here $z_{\text{cut}}$ sets the energy scale below which soft branches are removed from the jet. A typical value currently used by the experiments is $z_{\text{cut}} = 0.1$. The parameter $\beta$ determines the sensitivity of the grooming algorithm to the wide-angle soft radiation. If the above criterion is not satisfied, the branch with the smaller $p_T$ is removed from the jet. The procedure continues until the soft drop criterion is satisfied. The mass of the resulting groomed jet is usually referred to as the soft drop groomed jet mass, which we denote by $m_{J,\text{gr}}$. Correspondingly, we define the soft drop groomed $\tau_{\text{gr}}$ measurement as
$$\tau_{\text{gr}} = \frac{m_{J,\text{gr}}^2}{p_T^2}. \tag{3.2}$$
Note that in the denominator we still use the ungroomed jet transverse momentum $p_T$, instead of the $p_T$ of the groomed jet. This is because the ungroomed jet $p_T$ is an infrared and collinear (IRC) safe quantity, whereas the groomed analog is not IRC safe. See for example [40]. For $\beta = 0$ the soft drop grooming algorithm corresponds to the modified mass drop tagger (mMDT) [78]. Taking the limit $\beta \to \infty$ removes the groomer and the ungroomed jet mass distribution is recovered. We are going to discuss this limit in more detail below.
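The declustering loop described above can be sketched in a few lines. The example below is a toy implementation under stated assumptions: the Cambridge/Aachen clustering tree is taken as given (here it is built by hand), branches are combined by four-vector addition, and the class `Branch` and the helpers `particle`, `merge` and `soft_drop` are hypothetical names introduced for this illustration rather than the interface of any jet library.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Branch:
    """Node of a clustering tree; a leaf has no children."""
    px: float
    py: float
    pz: float
    E: float
    left: Optional["Branch"] = None
    right: Optional["Branch"] = None

    @property
    def pt(self) -> float:
        return math.hypot(self.px, self.py)

    @property
    def rapidity(self) -> float:
        return 0.5 * math.log((self.E + self.pz) / (self.E - self.pz))

    @property
    def phi(self) -> float:
        return math.atan2(self.py, self.px)

    @property
    def mass(self) -> float:
        m2 = self.E**2 - self.px**2 - self.py**2 - self.pz**2
        return math.sqrt(max(m2, 0.0))


def particle(pt, y, phi):
    """Massless particle from transverse momentum, rapidity and azimuth."""
    return Branch(pt * math.cos(phi), pt * math.sin(phi),
                  pt * math.sinh(y), pt * math.cosh(y))


def merge(a, b):
    """Combine two branches by adding their four-momenta."""
    return Branch(a.px + b.px, a.py + b.py, a.pz + b.pz, a.E + b.E, a, b)


def soft_drop(jet, z_cut=0.1, beta=0.0, R=0.8):
    """Step back through the tree until the soft drop criterion is met."""
    node = jet
    while node.left is not None and node.right is not None:
        a, b = node.left, node.right
        dphi = abs(a.phi - b.phi)
        dphi = min(dphi, 2.0 * math.pi - dphi)
        delta_r = math.hypot(a.rapidity - b.rapidity, dphi)
        z = min(a.pt, b.pt) / (a.pt + b.pt)
        if z > z_cut * (delta_r / R) ** beta:
            return node                      # criterion satisfied: stop
        node = a if a.pt >= b.pt else b      # drop the softer branch
    return node


if __name__ == "__main__":
    hard = merge(particle(300.0, 0.0, 0.0), particle(200.0, 0.1, 0.1))
    soft = particle(10.0, 0.5, -0.4)         # soft wide-angle emission
    jet = merge(hard, soft)
    print(jet.mass, soft_drop(jet, z_cut=0.1, beta=0.0).mass)
```

In this toy jet the soft wide-angle particle fails the criterion for $\beta = 0$ and is removed, so the groomed mass is smaller than the ungroomed one, while for a sufficiently large $\beta$ the same emission passes and the jet is returned unchanged.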

First step: hard collinear factorization
Following our discussion of the ungroomed case, the first step factorization for the groomed jet mass distribution takes the form
$$\frac{d\sigma}{d\eta\, dp_T\, d\tau_{\text{gr}}} = \sum_{a,b,c} f_a(x_a,\mu) \otimes f_b(x_b,\mu) \otimes H_{ab}^c(x_a, x_b, \eta, p_T/z, \mu) \otimes G_c^{\text{gr}}(z, p_T, R, \tau_{\text{gr}}, \mu; z_{\text{cut}}, \beta). \tag{3.3}$$
Here $G_c^{\text{gr}}$ are the groomed siJMFs that take into account the soft drop groomed jet mass measurement $\tau_{\text{gr}}$ for the observed jet. The groomed siJMFs have slightly modified operator definitions, in which the operator $\hat\tau_{\text{gr}}(J)$ represents the jet mass measurement in the presence of soft drop grooming as described above, with the measured value being equal to $\tau_{\text{gr}}$. This first step of the factorization in Eq. (3.3) is conceptually the same as Eq. (2.1) for the ungroomed case, where only the ungroomed siJMFs $G_c(z, p_T, R, \tau, \mu)$ are replaced by their corresponding groomed analog $G_c^{\text{gr}}(z, p_T, R, \tau_{\text{gr}}, \mu; z_{\text{cut}}, \beta)$. This factorization holds in the region in which $z_{\text{cut}} \sim 1$ and $\tau_{\text{gr}} \sim R^2$.
We note that the standard jet transverse momentum $p_T$ is set by the hard scattering dynamics at this step, i.e. it is associated with the hard functions $H_{ab}^c$ in the above factorization theorem, which is the same as that for the ungroomed jet mass factorization. Therefore it is consistent to use the ungroomed jet $p_T$ also for the case of the groomed jet mass distribution in Eq. (3.2). From the universality of the factorization formalism, the RG equations for the groomed siJMFs have to be consistent with those of the hard functions $H_{ab}^c$, and thus are the same as for the ungroomed case. Therefore, the groomed siJMFs satisfy again the usual DGLAP evolution equations that can be used to resum single logarithms in the jet size parameter,
$$\mu \frac{d}{d\mu} G_i^{\text{gr}}(z, p_T, R, \tau_{\text{gr}}, \mu; z_{\text{cut}}, \beta) = \sum_j \int_z^1 \frac{dz'}{z'}\, P_{ji}\Big(\frac{z}{z'}, \mu\Big)\, G_j^{\text{gr}}(z', p_T, R, \tau_{\text{gr}}, \mu; z_{\text{cut}}, \beta).$$

Second step: soft collinear factorization with soft drop grooming
In practice, the LHC experiments usually choose $z_{\text{cut}} \sim 0.1$ while $\tau_{\text{gr}}$ can be as low as $\mathcal{O}(10^{-4})$. A typical value for the jet radius parameter is $R = 0.8$, see section 4 below. Therefore, we are particularly interested in the factorization of the cross section in the region where $\tau_{\text{gr}}/R^2 \ll z_{\text{cut}} \ll 1$. In [40] finite $z_{\text{cut}}$ corrections were considered, which turn out to be very small for all practical purposes. Following a similar discussion as in [39], we now focus on the refactorization of the siJMFs in the presence of soft drop grooming. We start by identifying the relevant modes in order to derive a factorization theorem in the kinematic region of interest. Similar to the ungroomed case we have $\tau_{\text{gr}}/R^2 \ll 1$, which implies that only collinear and soft degrees of freedom are relevant to leading power. Therefore, in order to closely relate our discussion here with the ungroomed jet mass distribution discussed above, we study in detail how the soft drop grooming algorithm modifies the factorized structure obtained in Eq. (2.9).
Any hard collinear radiation at the scale $\mu_J \sim p_T R$ is captured by the hard matching functions $H_{c\to i}(z, p_T R, \mu)$. They correspond to energetic out-of-jet radiation contributions which are not affected by the soft drop grooming algorithm, which only deals with the in-jet dynamics. Therefore, the hard matching functions $H_{c\to i}$ are not modified in the presence of soft drop grooming.
The collinear radiation inside the jet is described by the collinear functions $C_i(\tau, p_T, \mu)$. To leading power in $z_{\text{cut}}$, the collinear functions are also not modified by the soft drop grooming algorithm, as all the energetic collinear in-jet radiation always passes the soft drop criterion. This can be understood as follows. Let us denote the energy fraction of the softer branch after a de-clustering step by $z$. For collinear modes, $z$ should generally satisfy $z \sim 1 \gg z_{\text{cut}}$, which means that the branch passes the soft drop criterion. The dynamics of other branches with the scaling $z \sim z_{\text{cut}} \ll 1$ are naturally captured by the soft functions. In the analytical calculations, one can show that after zero-bin subtraction [79], the $z_{\text{cut}}$-dependent contributions to the collinear functions are suppressed by powers of $z_{\text{cut}}$. Since we are working in the parametric limit $z_{\text{cut}} \ll 1$, one may safely neglect these power corrections of order $\mathcal{O}(z_{\text{cut}})$. The situation here is in fact very similar to the jet angularity calculations in [80]. One finds that the jet algorithm leads to a constraint on the parton branching fraction $z$ such that $z_{\text{lim}} < z < 1 - z_{\text{lim}}$, with $z_{\text{lim}} \sim \tau_a/R^2$. As was shown carefully in [80], these constraints lead to $z_{\text{lim}}$-dependent contributions, which are power suppressed precisely by $z_{\text{lim}}$ when $\tau_a/R^2 \ll 1$. The role of $z_{\text{lim}}$ in the angularity calculation is now taken by $z_{\text{cut}}$ [40] from the soft drop grooming algorithm, and thus the same conclusions hold. Finally, let us consider the soft radiation. We find that the soft radiation (or collinear-soft mode) contains particles which may or may not pass the soft drop grooming criterion. Since we are working with the hierarchy $\tau_{\text{gr}}/R^2 \ll z_{\text{cut}}$, soft radiation emitted at a relatively large angle will naturally fail the soft drop criterion. This can be understood as follows.
We choose to work in a reference frame where the jet has no transverse momentum component and denote its large light-cone component by $\omega_J$. Now we consider the situation where a soft particle with momentum $k$ and energy fraction $z = k^0/E_J = (k^+ + k^-)/\omega_J$ is radiated at an angle $\theta$ with respect to the jet axis. The soft drop criterion can then be written as
$$z > z_{\text{cut}} \left(\frac{\theta}{R}\right)^{\beta}. \tag{3.7}$$
For the large angle soft radiation inside the jet, we have
$$\theta \sim R, \qquad \text{i.e.} \qquad \frac{k^+}{k^-} \sim R^2. \tag{3.8}$$
If the soft radiation passes the soft drop criterion in Eq. (3.7), it would remain in the final groomed jet and thus contribute to the jet mass observable,
$$\tau_{\text{gr}} \sim \frac{k^+}{\omega_J} \sim z\, \theta^2. \tag{3.9}$$
Combining the above Eqs. (3.7), (3.8), and (3.9), one would have
$$\frac{\tau_{\text{gr}}}{R^2} \gtrsim z_{\text{cut}},$$
which violates the hierarchy $\tau_{\text{gr}}/R^2 \ll z_{\text{cut}}$. Therefore, the soft radiation at relatively large angles inside the jet will not contribute to the groomed jet mass $m_{J,\text{gr}}$ or $\tau_{\text{gr}}$. The precise momentum scaling of these soft emissions is given by
$$p_s^{\notin\text{gr}} = (k^-, k^+, k_\perp) \sim z_{\text{cut}}\, p_T\, (1, R^2, R),$$
where the superscript "$\notin$gr" emphasizes the fact that they do not pass the soft drop grooming criterion.
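On the other hand, the soft radiation that is emitted at smaller angles $\theta \ll R$ passes the soft drop criterion and will contribute to the observed groomed jet mass $\tau_{\text{gr}}$. In this case, with $k^+/k^- \sim \theta^2 \ll R^2$, following the same analysis as above, we obtain the momentum scaling for the more collimated soft radiation
$$p_s^{\text{gr}} \sim z\, p_T\, (1, \theta^2, \theta), \qquad \theta \sim R^{\frac{\beta}{2+\beta}} \left(\frac{\tau_{\text{gr}}}{z_{\text{cut}}}\right)^{\frac{1}{2+\beta}}, \qquad z \sim \frac{\tau_{\text{gr}}}{\theta^2}. \tag{3.12}$$
Here the superscript "gr" emphasizes the fact that the soft radiation passes the soft drop criterion and will thus end up within the final groomed jet. To summarize, we have the following refactorized expression for the groomed siJMFs
$$G_c^{\text{gr}}(z, p_T, R, \tau_{\text{gr}}, \mu; z_{\text{cut}}, \beta) = \sum_i H_{c\to i}(z, p_T R, \mu)\, S_i^{\notin\text{gr}}(p_T, R, \mu; z_{\text{cut}}, \beta)\, C_i(\tau_{\text{gr}}, p_T, \mu) \otimes S_i^{\text{gr}}(\tau_{\text{gr}}, p_T, R, \mu; z_{\text{cut}}, \beta). \tag{3.13}$$
Here the hard matching functions $H_{c\to i}$ and the collinear functions $C_i(\tau, p_T, \mu)$ are the same as for the ungroomed case, see Eq. (2.9) above. However, the soft functions are different: $S_i^{\notin\text{gr}}$ takes into account soft radiation that fails the soft drop criterion, while $S_i^{\text{gr}}$ is associated with soft particles that pass the soft drop criterion and thus remain inside the groomed jet. We note that the $R$ dependence of the soft function $S_i^{\text{gr}}$ is only due to the soft drop constraint in Eq. (3.1) instead of the jet clustering constraint. We further illustrate the factorization in Eqs. (3.3) and (3.13) in Figs. 2 and 3.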
When we consider the kinematic region with the scaling $\tau_{\text{gr}}/R^2 \sim z_{\text{cut}} \ll 1$, there is a transition point [33,81] at $\tau_{\text{gr}}/R^2 = z_{\text{cut}}$ above which the groomed factorization theorem in Eq. (3.13) reduces to the ungroomed case, as outlined in section 4.4. We present the detailed derivation in Appendix A.2. The numerical results also show the existence of the transition point between the groomed and the ungroomed case at large values of the jet mass, as presented in section 4.4.
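The location of the transition point is easy to evaluate for typical kinematics. The sketch below computes the transition value $\tau_{\text{gr}} = z_{\text{cut}} R^2$ and the corresponding jet mass, and compares three soft scales there. It assumes scaling estimates with all order-one factors dropped: $\mu_S \sim p_T \tau/R$ for the ungroomed soft function, $z_{\text{cut}}\, p_T R$ for the soft radiation that fails the criterion, and a collinear-soft scale obtained by saturating the soft drop criterion for the radiation that passes it. The function names are hypothetical.

```python
import math


def transition_point(pt, R, z_cut):
    """Return (tau, m_J) at the groomed-to-ungroomed transition."""
    tau_star = z_cut * R**2
    return tau_star, pt * math.sqrt(tau_star)


def soft_scales(pt, R, tau, z_cut, beta):
    """Scaling estimates (mu_S, mu_notgr, mu_gr); prefactors are dropped."""
    mu_s = pt * tau / R
    mu_notgr = z_cut * pt * R
    mu_gr = (pt * tau ** ((1.0 + beta) / (2.0 + beta))
             * z_cut ** (1.0 / (2.0 + beta))
             * R ** (-beta / (2.0 + beta)))
    return mu_s, mu_notgr, mu_gr


if __name__ == "__main__":
    tau_star, m_star = transition_point(pt=500.0, R=0.8, z_cut=0.1)
    print(tau_star, m_star)
    print(soft_scales(500.0, 0.8, tau_star, 0.1, beta=0.0))
```

Under these assumptions all three scales coincide at $\tau_{\text{gr}}/R^2 = z_{\text{cut}}$ for any $\beta$, which is the scaling-level counterpart of the statement that the groomed factorization merges into the ungroomed one at the transition point.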

Soft functions at NLO
In this section, we present the explicit NLO expressions for both types of soft functions that appear in the factorization theorem in Eq. (3.13). We refer the interested reader to the Appendix for a more detailed derivation. The soft functions $S_i^{\notin\text{gr}}$ do not depend on the groomed jet mass $\tau_{\text{gr}}$. The reason is that they only take into account soft particles that fail the soft drop criterion and, hence, do not contribute to the groomed jet mass. Accordingly, up to NLO the renormalized soft functions $S_i^{\notin\text{gr}}$ for quarks and gluons, $i = q, g$, are independent of $\tau_{\text{gr}}$, as expected. From the NLO result, one can obtain the natural momentum scale for $S_i^{\notin\text{gr}}$, which is given by
$$\mu_S^{\notin\text{gr}} \sim z_{\text{cut}}\, p_T R.$$
The relevant anomalous dimensions $\gamma_{S_i}^{\notin\text{gr}}$ are given in Eq. (3.17). The other soft functions $S_i^{\text{gr}}$ describe the soft radiation that passes the soft drop criterion and therefore contributes to the groomed jet mass. From the perturbative NLO result for the renormalized soft functions $S_i^{\text{gr}}$, which involves a factor $A$, we find that the natural scale for $S_i^{\text{gr}}$ is
$$\mu_S^{\text{gr}} \sim p_T\, \tau_{\text{gr}}^{\frac{1+\beta}{2+\beta}}\, z_{\text{cut}}^{\frac{1}{2+\beta}}\, R^{-\frac{\beta}{2+\beta}}. \tag{3.20}$$
The associated RG equations have a convolution structure with respect to $\tau$ and are given by
$$\mu \frac{d}{d\mu} S_i^{\text{gr}}(\tau, p_T, R, \mu; z_{\text{cut}}, \beta) = \int d\tau'\, \gamma_{S_i}^{\text{gr}}(\tau - \tau', p_T, R, \mu; z_{\text{cut}}, \beta)\, S_i^{\text{gr}}(\tau', p_T, R, \mu; z_{\text{cut}}, \beta),$$
where the anomalous dimensions $\gamma_{S_i}^{\text{gr}}$ are given in Eq. (3.22).

Consistency between the groomed and ungroomed case
We now study the connection between the factorization formalism for the groomed and ungroomed jet mass distribution, which provides an important consistency check of the obtained factorization theorems. In the kinematic region of interest, $\tau/R^2 \ll z_{\rm cut} \ll 1$, we find that both the hard matching functions $H_{c\to i}$, which take into account out-of-jet radiation, and the collinear functions $C_i(\tau, p_T, \mu)$ are the same for both cases. As mentioned above, only the soft functions are different. From the consistency of the RG evolution equations, one expects that the anomalous dimensions $\gamma^{\notin{\rm gr}}_{S_i}$ and $\gamma^{\rm gr}_{S_i}$ for the groomed jet mass distribution should be related to the anomalous dimension $\gamma_{S_i}$ for the ungroomed case. In fact, since the hard matching and collinear functions are common to both cases, consistency requires the soft anomalous dimensions to satisfy the relation $\gamma^{\notin{\rm gr}}_{S_i}\, \delta(\tau) + \gamma^{\rm gr}_{S_i} = \gamma_{S_i}$. From the explicit expressions given in Eqs. (3.17), (3.22) and (2.23) above, we can directly verify that this equality indeed holds. When we take the limit $\beta \to \infty$, the soft drop criterion is always satisfied and we get back to the ungroomed jet mass distribution. This limit can be studied directly at the level of the perturbative NLO expressions of the soft functions presented above. One observes that the renormalized soft functions $S_i^{\rm gr}$ in Eq. , approach 1 and 0, respectively, in the limit $\beta \to \infty$. In section 4.4, we also study the transition between the groomed and ungroomed jet mass distributions by taking the limit $\beta \to \infty$ numerically.
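As an illustrative cross-check of the $\beta \to \infty$ limit, one can take the groomed soft scale quoted for Eq. (3.20) in section 4.2 and the anomalous dimension of Eq. (A.14) and expand the exponents. This is only a sketch: it treats the quoted scaling relations as equalities and holds $\mu$ fixed when taking the limit, both of which are assumptions made here.

```latex
\begin{align}
\mu_S^{\rm gr}
  &= p_T \left( \frac{z_{\rm cut}\,\tau^{1+\beta}}{R^{\beta}} \right)^{\frac{1}{2+\beta}}
   = p_T\, z_{\rm cut}^{\frac{1}{2+\beta}}\,
     \tau^{\frac{1+\beta}{2+\beta}}\, R^{-\frac{\beta}{2+\beta}}
   \;\xrightarrow{\;\beta\to\infty\;}\; \frac{p_T\,\tau}{R} = \mu_S \,, \\
\gamma^{\notin{\rm gr}}_{S_i}
  &= \frac{\alpha_s}{\pi}\,\frac{C_i}{1+\beta}\,
     \ln\frac{\mu^2}{z_{\rm cut}^2\, p_T^2 R^2}
   \;\xrightarrow{\;\beta\to\infty\;}\; 0 \,.
\end{align}
```

In this limit the dependence on $z_{\rm cut}$ drops out of the soft scale and the soft function for radiation failing soft drop no longer evolves, which is consistent with recovering the ungroomed case.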

Comment on non-global logarithms and comparison to the literature
Before presenting phenomenological results at the LHC, we would like to briefly comment on the role of NGLs for both the groomed and ungroomed jet mass distribution, which we do not take into account in our factorization theorems above. In addition, we address in more detail how our new factorization formalism compares to results available in the literature. Generally, NGLs arise from gluons outside the jet that radiate soft gluons into the jet [37,38]. This leads to single logarithmic contributions starting at NNLO. In order to do precision jet substructure calculations such contributions have to be taken into account even though the NGL contribution is often found to be rather small. In the past years a lot of progress has been made in order to better understand the complicated all order structure of NGLs, see for example [82][83][84][85][86][87][88]. The ungroomed jet mass distribution as discussed in section 2 receives single logarithmic non-global contributions of the form $\alpha_s^n \ln^k(\tau/R^2)$ with $k \leq n$. In this sense the NGLs directly affect the jet mass spectrum. For the groomed case, these logarithms of the jet mass are absent ($\beta = 0$) or power suppressed ($\beta > 0$). Note that for $\beta \to \infty$ the usual NGLs for the ungroomed case are reproduced. See [33] for a more detailed discussion. However, the groomed jet mass distribution also receives corrections from NGLs, which affect the absolute normalization of the cross section and also indirectly the groomed jet mass spectrum. For the groomed inclusive jet mass spectrum NGLs arise due to the angular correlation of emissions between the in-jet wide angle soft radiation in $S_i^{\notin{\rm gr}}$ and the hard collinear radiation outside the jet in $H_{c\to i}$. Therefore, there are NGLs of the form $\alpha_s^n \ln^k z_{\rm cut}$ with $k \leq n$ that will change the absolute normalization of the cross section.
In addition, since NGLs affect the quark and gluon contributions differently they will also indirectly affect the shape of the groomed jet mass distribution. Of course, for all practical purposes the numerical effect of NGLs is expected to be rather small for both the groomed and ungroomed jet mass distribution unless z cut is chosen to be very small.
Finally, we would like to compare our new approach to the jet mass distribution for inclusive jet production to results available in the literature. In particular, we compare to the results of [39]. See also [35,36,40,41] for example. In [39], the inclusive groomed jet mass distribution was considered in pp → Z + jet + X events. The event topology considered in this work is therefore different pp → jet + X but the general factorization structure is the same. Using the notation developed in this work, the factorized structure employed in [39] can be summarized as follows where the sum is taken over c = q,q, g. The hard functions defined here H c are independent of τ gr and have been extracted in [39] to NLO from MCFM [89] for pp → Z + jet + X. Instead, the factorization framework presented in this work now allows for a further separation of H c in terms of hard functions H c ab , hard matching functions H c→i and soft functions S / ∈gr i taking into account soft radiation that fails the soft drop criterion, see Eqs. (3.3) and (3.13). The additional factorization allows for the resummation of logarithms in the jet size parameter R and, more importantly, logarithms in the soft threshold parameter z cut which otherwise can only be determined numerically to fixed order. An important feature of our new formalism is that by resumming all logarithms in z cut we are able to reliably predict the absolute normalization of groomed jet observables, which was not achieved for pp collisions before, up to NGLs. In addition, our new formalism in principle allows us to also systematically include NGLs for groomed jet observables since they can be clearly associated with certain parts of our factorization theorem as discussed above. However, numerical studies of NGLs are beyond the scope of this work and will be addressed in the future.

Phenomenology at the LHC
In this section, we present numerical results for jet mass distribution at LHC energies, for both ungroomed and soft drop groomed jets in pp → jet + X. We first present details of our numerical studies and we then compare with the experimental data taken at the LHC.

RG evolution
For all the numerical studies, we closely follow the methods used in the jet angularity paper of [52]. We solve the respective evolution equations of the collinear and soft functions in position space, for which we define the Fourier transform of a generic function $F$ depending on $\tau$ as in Eq. (4.1). We then evolve the collinear and soft functions from their canonical scales to the jet scale $\mu_J \sim p_T R$ where they will be combined with the hard matching functions $H_{c\to i}$ in order to obtain the siJMFs $G_c$ in Eq. (2.9) or their groomed counterparts $G_c^{\rm gr}$ in Eq. (3.13). For more details, see [52,90]. The final expressions for the ungroomed siJMFs $G_c$ can be written in terms of the evolved collinear and soft functions, where the convolution over $\tau$ becomes a simple product in the position space variable $x$.
The coefficient functions $C_{c\to i}(z, p_T R, \mu)$ are related to $H_{c\to i}$ and their explicit expressions can be found in [58]. The perturbative results of the relevant functions and their anomalous dimensions in position space can be derived by taking the Fourier transform following the definition in Eq. (4.1). It is instructive to point out that the above RG running from $\mu_C \sim p_T \tau^{1/2}$ to $\mu_J$, as well as from $\mu_S \sim p_T \tau/R$ to $\mu_J$, both resum the logarithms of type $\alpha_s^n \ln^{2k}(\tau/R^2)$ with $k \leq n$ at next-to-leading logarithmic (NLL) accuracy. Similarly, we can obtain the final expressions for the groomed siJMFs $G_c^{\rm gr}$ in terms of the evolved functions in position space as $G_c^{\rm gr}(z, p_T, R, \tau_{\rm gr}, \mu; z_{\rm cut}, \beta) = \sum_i C_{c\to i}(z, p_T R, \mu) \cdots$. Here, the RG evolution of the collinear function between the scales $\mu_C$ and $\mu_J$ resums logarithms in $\tau/R^2$, which is the same as in the ungroomed case. In addition, logarithms in $z_{\rm cut}$ that are introduced by the grooming procedure are resummed through the RG running as can be seen explicitly here. The soft function $S_i^{\notin{\rm gr}}$ is evolved from its characteristic scale $\mu_S^{\notin{\rm gr}} \sim z_{\rm cut}\, p_T R$ to the jet scale $\mu_J \sim p_T R$ and similarly for $S_i^{\rm gr}$. The resummation of logarithms in $z_{\rm cut}$ is particularly important when $z_{\rm cut}$ is chosen to be very small. For our phenomenological results presented below we always choose $z_{\rm cut} = 0.1$. See for example [91], where the authors proposed to use $z_{\rm cut}$ values down to 0.001, which was termed "light grooming". The resummation of logarithms in $z_{\rm cut}$ is related to NGLs and is particularly relevant in order to determine the absolute normalization of the groomed jet cross section as discussed in more detail in section 3.6.
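To make explicit which logarithm each evolution step resums, one can form ratios of the canonical scales quoted in the text. The following is only a sketch, assuming the scaling relations ($\sim$) are taken as equalities:

```latex
\begin{align}
\frac{\mu_C}{\mu_J} &= \frac{p_T\,\tau^{1/2}}{p_T R}
   = \left(\frac{\tau}{R^2}\right)^{1/2} , &
\frac{\mu_S}{\mu_J} &= \frac{p_T\,\tau/R}{p_T R}
   = \frac{\tau}{R^2} , \\
\frac{\mu_S^{\notin{\rm gr}}}{\mu_J} &= \frac{z_{\rm cut}\,p_T R}{p_T R}
   = z_{\rm cut} , &
\frac{\mu_J}{\mu_H} &= \frac{p_T R}{p_T} = R .
\end{align}
```

The logarithms of these ratios are $\ln(\tau/R^2)$, $\ln z_{\rm cut}$ and $\ln R$, which matches the classes of logarithms that the respective RG evolutions are stated to resum.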
With the above results for G c and G gr c at the canonical scale µ J , we further evolve the ungroomed/groomed siJMFs through their DGLAP equations in Eqs. (2.6) and (3.6) from µ J ∼ p T R to the hard scale µ H ∼ p T . This second step of the RG evolution resums single logarithms in the jet size parameter R. By solving all relevant RG evolution equations we are thus able to resum three dominant classes of logarithmic corrections to all orders in the strong coupling constant for the groomed jet mass: logarithms in the jet mass τ gr /R 2 , the jet radius R and the soft threshold z cut . For the ungroomed case, there are no logarithms in z cut and the jet mass logarithms are given in terms of τ /R 2 .

Non-perturbative shape functions and scale variations
For small values of $\tau$, the soft scale $\mu_S \sim p_T \tau/R$ in Eq. (2.21) for ungroomed jets, and the corresponding soft scale for the groomed case $\mu_S^{\rm gr} \sim p_T \left( z_{\rm cut}\, \tau^{1+\beta}/R^{\beta} \right)^{\frac{1}{2+\beta}}$ in Eq. (3.20), can run into the non-perturbative regime. We use profile functions [92] to freeze $\mu_S$ and $\mu_S^{\rm gr}$ at 0.25 GeV in order to avoid the Landau pole. See [23,52] for more details. In order to capture non-perturbative effects we then introduce a shape function $F_i(k)$. We adopt a simple functional form for the non-perturbative shape function which only depends on a single parameter $\Omega$ [22], $F_i(k) = \frac{4k}{\Omega^2} \exp(-2k/\Omega)$. The shape function $F_i(k)$ is normalized to unity and its first moment is equal to the parameter $\Omega$: $\int_0^\infty dk\, k\, F_i(k) = \Omega$. The subscript $i = q, g$ indicates that in principle we could have different values of $\Omega$ for quark and gluon jets. However, for our numerical calculations below, we find that a single value for $\Omega$ is sufficient. Therefore, we drop the subscript and we simply write the shape function as $F(k)$ below. For both the groomed and ungroomed jet mass distribution, we then convolve the purely perturbative result with the non-perturbative shape function. For the groomed case the convolution is given in Eq. (4.6). Here $\tau_{\rm gr}$ as it is obtained from the purely perturbative result is shifted by the virtuality of the soft radiation that passes the soft drop criterion, as this mode has the smallest virtuality [39]. From Eq. (3.20), we identify $\mu_S^{\rm gr} \sim k$, which introduces the shift in that formula. Analogously, the ungroomed jet mass distribution is given in Eq. (4.7). Here the shift can be derived from Eq. (2.21) by identifying $\mu_S \sim k$. Note that also after taking into account the non-perturbative shape function, the ungroomed jet mass distribution is obtained from the groomed case by taking the limit $\beta \to \infty$. This can be seen directly from Eq. (4.6), which reduces to Eq. (4.7) for $\beta \to \infty$, i.e. when the groomer is removed.
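The properties of the shape function and the origin of the shift can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the integration is a plain Simpson rule, and the shift formulas are obtained by setting the quoted soft scales exactly equal to $k$ and solving for $\tau$, which assumes the $\sim$ relations hold as equalities.

```python
import math

def shape_function(k, omega):
    """Single-parameter shape function F(k) = (4k/Omega^2) exp(-2k/Omega)."""
    return 4.0 * k / omega**2 * math.exp(-2.0 * k / omega)

def integrate(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * f(a + j * h)
    return total * h / 3.0

def shift_groomed(k, pT, R, zcut, beta):
    """tau shift from solving pT*(zcut*tau^(1+beta)/R^beta)^(1/(2+beta)) = k.

    Assumption: the groomed soft scale relation is used as an equality.
    """
    return ((k / pT) ** ((2.0 + beta) / (1.0 + beta))
            * R ** (beta / (1.0 + beta))
            * zcut ** (-1.0 / (1.0 + beta)))

def shift_ungroomed(k, pT, R):
    """tau shift from solving pT*tau/R = k (same assumption as above)."""
    return k * R / pT

if __name__ == "__main__":
    omega = 8.0  # GeV, the value used in the text for ungroomed R = 1 jets
    upper = 40.0 * omega  # the exponential tail beyond this is negligible
    norm = integrate(lambda k: shape_function(k, omega), 0.0, upper)
    mean = integrate(lambda k: k * shape_function(k, omega), 0.0, upper)
    print("normalization:", norm, "first moment:", mean)
```

For large $\beta$ the groomed shift approaches the ungroomed one, mirroring the statement that the groomed convolution reduces to the ungroomed one when the groomer is removed.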
We note that Ω characterizes the mean shift of the jet mass spectrum due to the non-perturbative effects such as hadronization and the underlying event.
Next we discuss how we estimate theoretical uncertainties. In order to estimate QCD scale uncertainties, we vary our choices of scales for each function or mode in the factorization theorem by factors of 2 around their canonical values. For the ungroomed jet mass, we have $\mu_H$, $\mu_J$, $\mu_C$, $\mu_S$ with the canonical choices given in Eqs. (2.4), (2.5), (2.15) and (2.21), respectively. On the other hand, for groomed jets, besides $\mu_H$, $\mu_J$, $\mu_C$, we have two separate soft scales $\mu_S^{\notin{\rm gr}}$ and $\mu_S^{\rm gr}$, with the corresponding canonical choices given in Eqs. (3.15) and (3.20), respectively. We vary these scales while maintaining a relation between the varied scales and the canonical scales. Note that we choose to fix the collinear scale $\mu_C$ in terms of the soft scale $\mu_S$ for the ungroomed case and only vary them together. Similarly, for the groomed case, we relate the collinear scale $\mu_C$ to the soft scale $\mu_S^{\rm gr}$. In addition, we fix the soft scale $\mu_S^{\notin{\rm gr}}$ relative to the jet scale $\mu_J$ and, thus, we only vary the two sets of scales together.
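The grouped scale variation for the groomed case can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the results in this work: the function names are hypothetical, $\mu_H$ is assumed to be varied independently of the two tied sets, and the additional constraint among the varied scales mentioned above is not implemented.

```python
from itertools import product

# Variation factors around the canonical scale choices.
FACTORS = (0.5, 1.0, 2.0)

def groomed_scale_variations(mu_H, mu_J, mu_C, mu_S_notin_gr, mu_S_gr):
    """Yield dictionaries of varied scales for the groomed case.

    Assumptions (hypothetical, for illustration):
      * mu_H is varied on its own,
      * mu_J and mu_S_notin_gr share one factor,
      * mu_C and mu_S_gr share one factor.
    """
    for f_H, f_J, f_S in product(FACTORS, repeat=3):
        yield {
            "mu_H": f_H * mu_H,
            "mu_J": f_J * mu_J,
            "mu_S_notin_gr": f_J * mu_S_notin_gr,
            "mu_C": f_S * mu_C,
            "mu_S_gr": f_S * mu_S_gr,
        }

def envelope(predict, variations):
    """Return (min, max) of predict(scales) over all scale variations."""
    values = [predict(scales) for scales in variations]
    return min(values), max(values)
```

Here `predict` stands for any user-supplied function that evaluates the cross section for a given set of scales; the band is then the envelope over all variations.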

Numerical results: the ungroomed jet mass
For all the numerical results presented in this work we consider jets that are reclustered through the anti-k T algorithm [75] and we use the CT14NLO PDF set [93]. We start with ungroomed jet mass distribution for the single inclusive jet production pp → jet + X. In Fig. 4, we show the comparison of our theoretical calculations and the experimental data from the ATLAS collaboration which was taken at √ s = 7 TeV at the LHC [15]. The shown ungroomed jet mass distributions are plotted as a function of m J and they are normalized to the inclusive jet cross section, see Eq. (1.3). For the experimental analysis a jet radius parameter of R = 1 was chosen and the jets are taken into account in the rapidity range of |η| < 2. In addition, the observed jets are required to have a transverse momentum in the range of 200 < p T < 600 GeV. The allowed jet transverse momentum range is separated into four intervals 200 < p T < 300 GeV, 300 < p T < 400 GeV, 400 < p T < 500 GeV, 500 < p T < 600 GeV which corresponds to the four panels shown in Fig. 4. The plotted experimental errors include systematic and statistical uncertainties added in quadrature. For each jet transverse momentum interval we show two theory curves, along with the results from Pythia8 simulations [94]. First, the dashed black lines with the yellow uncertainty bands show our purely perturbative predictions at NLL accuracy, i.e. without the non-perturbative shape function. Second, the red lines and the corresponding hatched red error bands show the theory predictions including the non-perturbative shape function as discussed in section 4.2 above. For both cases the theoretical error bands are obtained by varying the scales as discussed in section 4.2 and by taking the envelope. For the parameter Ω in the non-perturbative shape function we choose Ω = 8 GeV which gives a very good description of the experimental data. 
The fact that we need such a large value for $\Omega$ reflects that, as expected, the ungroomed jet mass distribution is very sensitive to non-perturbative physics such as hadronization and the underlying event [22,29]. In fact, the position of the peak is shifted by a factor of 3 depending on the $p_T$ of the identified jets. On the other hand, the Pythia simulations that include both hadronization and underlying event contributions describe the data well, as indicated by the blue dashed curves. When grooming is taken into account the sensitivity to non-perturbative physics is expected to be significantly reduced, which we confirm in the section below. Note that we did not take into account NGLs which, however, are expected to give a relatively small contribution. Nevertheless, it is remarkable that by tuning a single parameter $\Omega$ in the rather simple non-perturbative model for the shape function, the developed factorization formalism can give a very good description of the ungroomed jet mass distribution in pp collisions at the LHC. One generally observes that the ungroomed jet mass distribution peaks at larger values as the $p_T$ of the identified jets is increased. This is consistent with the usual evolution picture [95], where the larger the $p_T$ is, the longer the evolution develops. The fact that our factorization formalism, originally derived for $R \ll 1$, works this well for a jet radius as large as $R = 1$ confirms the earlier observation: as emphasized in Sec. 2, the power corrections of the form $O(R^2)$ to our factorization formalism are quite small. To further test our factorization formalism and understand the non-perturbative physics, in Fig. 5, we plot the jet mass distributions for jets with a smaller radius $R = 0.4$ in the same kinematic regions as above. We find that the distributions are concentrated more in the small $m_J$ region compared with their larger $R$ counterparts. This is as expected: a smaller $R$ leads to more collimated jets and thus to a smaller jet invariant mass. At the same time, we find that our perturbative results convolved with the non-perturbative shape function, with a much smaller $\Omega = 3.5$ GeV than in the larger $R$ case, agree very well with the Pythia simulations. This suggests that while the hadronization effect always exists, the underlying event contributions seem to be smaller for jets with smaller $R$. This is consistent with the earlier analysis [22].
Figure 4. The dashed blue lines are the results from Pythia8 simulations [94]. The jet rapidity is integrated over $|\eta| < 2$, and the observed jet transverse momentum is separated into four different intervals $200 < p_T < 300$ GeV, $300 < p_T < 400$ GeV, $400 < p_T < 500$ GeV, $500 < p_T < 600$ GeV which correspond to the four different panels.

Numerical results: the groomed jet mass
In this section we present numerical results for the soft drop groomed jet mass distribution for single inclusive jet production pp → jet + X at the LHC. Unfortunately, there is currently no data available for inclusive jet production that would allow for a direct one-to-one comparison to our theoretical results. In [96], the CMS collaboration presented preliminary results for the groomed jet mass distribution for inclusive jet production for both Pb-Pb and pp collisions at $\sqrt{s} = 5.02$ TeV but the pp baseline is smeared to allow for a better comparison to the heavy-ion data. Nevertheless, we expect that such an analysis of LHC data is feasible and will become available in the near future. With this in mind, we present our predictions for $\sqrt{s} = 13$ TeV at the LHC. As an example, we assume that jets are reconstructed using the anti-$k_T$ algorithm with a jet radius parameter of $R = 0.8$. We choose the following jet transverse momentum and rapidity intervals for the inclusive jet sample: $|\eta| < 1.5$ and $p_T > 600$ GeV. For the soft threshold parameter of the soft drop grooming algorithm, we choose $z_{\rm cut} = 0.1$. In Fig. 6, we show the soft drop groomed jet mass distributions normalized by the corresponding inclusive jet cross sections as a function of $\log_{10}(m_{J,{\rm gr}}^2/p_T^2)$ for three different values of the angular exponent: $\beta = 0$ (left), $\beta = 1$ (middle), and $\beta = 2$ (right). Similar to Fig. 4, the dashed black lines and the corresponding yellow error bands show our purely perturbative results at NLL accuracy. The red lines and the red hatched bands show the result when the non-perturbative shape function is included, where we follow the prescription in Eq. (4.6). We choose the parameter of the non-perturbative shape function as $\Omega = 1$ GeV to illustrate the impact of non-perturbative physics effects. Finally, the dashed blue lines are from Pythia simulations. We find that the numerical results from our factorization formalism with $\Omega = 1$ GeV agree well with the Pythia results, for a relatively large jet radius $R = 0.8$. This much reduced parameter $\Omega$ compared to the ungroomed cases indicates that the groomed jet mass distributions have a much smaller sensitivity to non-perturbative physics. The fact that $\Omega = 1$ GeV is around the size of a typical hadron mass implies that the non-perturbative contributions come mainly from hadronization.
Figure 6. The theoretical predictions for the soft drop groomed jet mass distribution for single inclusive jet production pp → jet + X at $\sqrt{s} = 13$ TeV. The observed jets are reconstructed through the anti-$k_T$ algorithm with a jet radius parameter of $R = 0.8$. The rapidity and transverse momentum intervals for the inclusive jet samples are chosen as $|\eta| < 1.5$ and $p_T > 600$ GeV and the soft threshold parameter is $z_{\rm cut} = 0.1$. The soft drop groomed jet mass distribution is normalized to the corresponding inclusive jet cross section and plotted as a function of $\log_{10}(m_{J,{\rm gr}}^2/p_T^2)$ for $\beta = 0$ (left), $\beta = 1$ (middle), and $\beta = 2$ (right). The dashed black lines and yellow error bands show the purely perturbative NLL results, while the red lines and the red hatched bands are the NLL results but including the non-perturbative shape function according to Eq. (4.6). We choose $\Omega = 1$ GeV for the parameter of the non-perturbative shape function. The dashed blue lines are from Pythia simulations.
To further test our factorization formalism for groomed jet substructure and to understand the non-perturbative physics, in Fig. 7, we plot the groomed jet mass distributions for jets with a smaller radius $R = 0.4$ in the same kinematic regions as in Fig. 6. We find that the same parameter $\Omega = 1$ GeV leads to a good agreement between our numerical results and the Pythia simulations. This strongly suggests that the underlying event contributions are much reduced for the groomed jet mass distribution, and the main non-perturbative physics comes from hadronization.
Figure 7. Same as for Fig. 6, but for jets with $R = 0.4$. The parameter of the non-perturbative shape function is chosen as $\Omega = 1$ GeV, to agree better with the Pythia8 results.
As discussed above, the ungroomed jet mass distribution should be recovered from the groomed case by taking the limit $\beta \to \infty$. In section 3.5 we discussed this transition at the level of the analytical perturbative results. Here, we study the $\beta \to \infty$ limit numerically. In Fig. 8, we plot the groomed jet mass distribution for different values of the angular exponent in the range of $\beta = 0$ to 4 (dashed lines) as well as the ungroomed result (solid blue). Note that we only show the purely perturbative results here in order to better illustrate how the groomed results converge to the ungroomed jet mass distribution when $\beta$ is increased. If we had instead included the non-perturbative shape function, then $\Omega$ would have to be adjusted when taking the limit $\beta \to \infty$. Note that here we plot both the groomed and ungroomed results as a function of $\log_{10}(m_{J,{\rm gr}}^2/p_T^2)$ as in Fig. 6 instead of the $m_J$ used in Fig. 4. For a stronger grooming procedure (smaller values of $\beta$), the jet mass distribution gets flatter and is shifted toward smaller values. This is expected intuitively, as it becomes more likely to observe smaller values of the jet mass after the grooming procedure, which removes soft wide-angle radiation from the jet. As expected, a smooth transition between the groomed and ungroomed case can be observed for $\beta \to \infty$. This feature of the jet mass distributions can be particularly useful in order to understand the impact of grooming in heavy-ion collisions, see for example [16,96].
The groomed jet mass distributions for different values of $\beta$ all become very similar at the transition point $\tau_{\rm gr} = m_{J,{\rm gr}}^2/p_T^2 = z_{\rm cut} R^2$. This can also be seen from the values of the soft scales. By identifying $\tau_{\rm gr} = z_{\rm cut} R^2$, we find from Eqs. (3.15) and (3.20) that $\mu_S^{\notin{\rm gr}} = \mu_S^{\rm gr} = z_{\rm cut}\, p_T R$, which makes the scales of the soft functions in the groomed case identical to the scale of the soft function for the ungroomed case, see Eq. (2.21). This makes the evolution factors identical, independent of the value of $\beta$ and of whether grooming is applied or not, and the $\beta$ dependence only enters through the renormalized expressions of the soft functions at fixed order. Therefore, although the perturbative results do not all intersect exactly at $\tau = z_{\rm cut} R^2$, they become very similar at $\tau = z_{\rm cut} R^2$ as can be seen from Fig. 8. At larger values, the grooming does not play a role and the ungroomed jet mass distribution is recovered. See the discussion in section 3.3 and Appendix A.2.
Figure 8. Groomed jet mass distributions for $\beta = 0$ to 4 (dashed lines) and the ungroomed result (solid blue), cf. Fig. 6. We only show the purely perturbative results plotted as a function of $\log_{10}(m_{J,{\rm gr}}^2/p_T^2)$. In the limit $\beta \to \infty$, the ungroomed distribution is recovered from the groomed case.
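The coincidence of the soft scales at the transition point can be verified numerically. The following sketch uses hypothetical function names and treats the canonical scale relations quoted in the text as exact equalities, which is an assumption about their normalization.

```python
def mu_S_ungroomed(pT, tau, R):
    """Ungroomed soft scale, mu_S = pT * tau / R (cf. Eq. (2.21))."""
    return pT * tau / R

def mu_S_notin_gr(pT, R, zcut):
    """Scale of soft radiation failing soft drop, zcut * pT * R (cf. Eq. (3.15))."""
    return zcut * pT * R

def mu_S_gr(pT, tau, R, zcut, beta):
    """Scale of soft radiation passing soft drop (cf. Eq. (3.20))."""
    return pT * (zcut * tau ** (1.0 + beta) / R ** beta) ** (1.0 / (2.0 + beta))

def transition_tau(R, zcut):
    """Transition point tau = zcut * R^2 between groomed and ungroomed regimes."""
    return zcut * R ** 2

if __name__ == "__main__":
    pT, R, zcut = 600.0, 0.8, 0.1  # kinematics used for the predictions in the text
    tau = transition_tau(R, zcut)
    for beta in (0.0, 1.0, 2.0):
        print(beta,
              mu_S_ungroomed(pT, tau, R),
              mu_S_notin_gr(pT, R, zcut),
              mu_S_gr(pT, tau, R, zcut, beta))
```

At $\tau = z_{\rm cut} R^2$ all three functions return the same value for every $\beta$, so any remaining $\beta$ dependence at this point has to come from the fixed-order soft functions rather than from the evolution.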
Recently the ATLAS collaboration reported on a measurement of the soft drop groomed jet mass distribution in [48]. A similar analysis was performed by CMS in [47]. The measurement is performed in an inclusive way in the sense that no additional cuts are imposed on the hadronic activity outside the signal jets. However, additional cuts are imposed on the observed jet transverse momenta, which unfortunately hinders a direct one-to-one comparison with the inclusive jet production framework developed in this work. The details of the analysis are as follows. Events are taken into account that have at least two jets and the leading jet is required to have a transverse momentum of $p_{T,1} > 600$ GeV. In addition, the two leading $p_T$-ordered jets are required to satisfy $p_{T,1}/p_{T,2} < 1.5$. Since the two leading jets are required to have a similar transverse momentum, this additional requirement effectively enforces a di-jet configuration. Events with additional energetic jets are thus removed. The two leading jets are then included in the soft drop jet mass measurement. Furthermore, the $\eta$ of the thus obtained jet samples is restricted to $|\eta| < 1.5$.
The ATLAS results for the groomed jet mass distribution are then plotted normalized to the cross section $\sigma_{\rm resum}$ in the "resummation" region [48], which is defined as $\sigma_{\rm resum} = \int_{-3.7}^{-1.7} \frac{d\sigma}{d\log_{10}\tau_{\rm gr}}\, d\log_{10}\tau_{\rm gr}$ (4.14). The factorization formalism developed in this work is for single inclusive jet production, which is strictly speaking not compatible with the ATLAS measurement. However, most of the events are indeed di-jet configurations when the jet $p_T$ is very large [97]. The additional production of a third jet with a very large transverse momentum is suppressed by an additional power of $\alpha_s(p_T) \ll 1$. One can thus expect that the qualitative features of the soft drop jet mass distribution as measured by ATLAS are nevertheless correctly described by the factorization formalism presented in this work for pp → jet + X. However, we note that high precision jet substructure studies require a direct one-to-one correspondence between the experimental measurement and the theoretical calculations.
Instead of normalizing the soft drop jet mass distribution by the inclusive cross section as shown in Fig. 6, we now adopt the normalization used by ATLAS and divide by σ resum . We thus follow Eq. (4.14) and integrate our results over the range of −3.7 < log 10 τ gr < −1.7. In Fig. 9, the ATLAS data for the soft drop groomed jet mass distribution is shown where both systematic and statistical errors are included. The data is plotted as a function of log 10 (m 2 J,gr /p 2 T ) for three different values of the angular exponent used in the analysis: β = 0 (left), β = 1 (middle), and β = 2 (right). In addition, we show the theoretical results using the factorization formalism developed in this work for the groomed jet mass distribution. As in Fig. 6, the dashed line and the corresponding yellow error band are the purely perturbative results at NLL accuracy, while the red line and the red hatched band are NLL results convolved with the non-perturbative shape function. Again we choose Ω = 1 GeV which gives a very good description of the experimental data in the resummation region. In this region, the factorization formalism developed here is expected to work very well. By including the non-perturbative shape function, the agreement with the data can also be achieved in the very small jet mass region. In the very large jet mass region, we would have to include a matching to fixed order calculations. In general, it is possible to include such a matching in our formalism which, however, is beyond the scope of this work and will be addressed in the future.
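The normalization to the resummation region can be sketched for a binned spectrum as follows. This is an illustration with hypothetical function names and a simple bin-sum in place of a proper integration; only bins lying fully inside $-3.7 < \log_{10}\tau_{\rm gr} < -1.7$ are counted, which assumes that the bin edges align with the window.

```python
import math

def log10_tau(m_jet, pT):
    """Plotting variable log10(m_J^2 / pT^2)."""
    return math.log10(m_jet ** 2 / pT ** 2)

def sigma_resum(edges, values, lo=-3.7, hi=-1.7, tol=1e-9):
    """Sum dsigma/dlog10(tau) * bin width over bins fully inside [lo, hi].

    edges  : bin edges in log10(tau), length n + 1
    values : dsigma/dlog10(tau) per bin, length n
    """
    total = 0.0
    for left, right, v in zip(edges[:-1], edges[1:], values):
        if left >= lo - tol and right <= hi + tol:
            total += v * (right - left)
    return total

def normalize_to_resum(edges, values, lo=-3.7, hi=-1.7):
    """Divide the spectrum by sigma_resum so it integrates to 1 in the window."""
    norm = sigma_resum(edges, values, lo, hi)
    return [v / norm for v in values]

if __name__ == "__main__":
    edges = [-4.5 + 0.2 * i for i in range(21)]   # toy binning
    values = [1.0 + 0.1 * i for i in range(20)]   # toy spectrum, not a prediction
    normalized = normalize_to_resum(edges, values)
    print(sigma_resum(edges, normalized))
```

With this normalization the overall rate drops out, so the comparison tests the shape of the distribution in the resummation region.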
The soft drop grooming procedure is designed to eliminate the sensitivity to the underlying event contribution. This is confirmed by our numerical results for the different jet mass distributions. Note that we consistently treat non-perturbative effects for both the groomed and the ungroomed case by using the same shape functions. We find that the non-perturbative parameter Ω = 1 GeV is much smaller for the groomed case than for the ungroomed jet mass distribution where we had to use Ω = 8 GeV in order to find a good agreement with the data. Note that our calculations are performed at the parton level whereas the experimental results are unfolded at the hadron level. Therefore, a remaining non-perturbative correction needs to be taken into account also in the groomed case. However, this remaining hadronization correction is expected to be small since it should be at the order of Λ QCD . This expectation agrees with our observation that Ω = 1 GeV is sufficient in order to obtain a good agreement with the experimental data.

Summary and outlook
In this work, we studied the jet mass distribution for the single inclusive jet production at the LHC, fully differential in the kinematics of the signal jet. We considered both the ungroomed and soft drop groomed mass distributions.
We derived the corresponding factorization formalisms from first principles in perturbative QCD. We studied the connections and differences between the factorization theorems for the groomed and ungroomed case, and we computed all the necessary components to NLO. By solving the associated renormalization group equations, we are able to perform the joint resummation at the NLL accuracy, for logarithms in both the small jet radius parameter R and the small jet mass m J . For the soft drop groomed jet mass distribution, an additional resummation of the logarithms in the soft threshold parameter z cut has also been achieved. In this sense, we realized for the first time a complete description of the groomed inclusive jet mass distribution where all relevant logarithms have been resummed at NLL accuracy. The complete resummation of logarithms in z cut allows us to reliably determine the absolute normalization of groomed jet observables. In addition, the derived factorization theorem allows for systematically including NGLs in the future. Being able to completely resum logarithms in the soft threshold parameter z cut will enable a comparison of theory calculations and data where significantly smaller values are chosen for z cut which can be advantageous in some situations. In addition, the resummation of single logarithms in the jet size parameter is particularly useful for jets measured in heavy-ion collisions where typically a rather small jet radius parameter is chosen.
It is important to realize that the developed hard collinear factorization formalism established in this work enables us to compute the relative contribution of jets that are initiated by either quarks or gluons. Such a relative fraction of quark and gluon jets in the sample can be determined order by order in the perturbation theory through the computation of the hard functions H c ab . For the current study, we have used the NLO hard functions H c ab and thus the relative ratios are computed to NLO accuracy. This is apparently an advantage of our factorization formalism for single inclusive jet production. Our formalism allows for the extension to tagged jets observed in inclusive processes like pp → Z + jet + X which we are planning to address in forthcoming work.
We further presented numerical results for the ungroomed inclusive jet mass distributions at the LHC, with the experimental kinematic cuts fully taken into account. For the groomed jet mass spectrum, a direct one-to-one comparison with LHC data is currently not feasible as there are no soft drop groomed jet mass measurements available for inclusive jet production. Instead, we compared our predictions with the groomed jet mass distribution measured in high p T di-jet events, based on the observation that the inclusive cross section is dominated by di-jet configurations at large jet transverse momenta. In general, we found that our theoretical calculations lead to a very good description of the experimental data in the regions where the factorization theorems hold. Given the success of our formalism for inclusive jet production and the advantages in statistics, we suggest that the soft drop groomed jet mass measurement should also be performed using inclusive jet samples in the future.
To extend the region of validity of our formalism, the NLL results need to be matched to the full NLO calculation, which is left for future work. The full NLO calculation can be carried out using nlojet++ [98]. Computations beyond NLL accuracy for single inclusive jet samples are also possible, but they would require the hard functions $H_{ab}^c$ to NNLO, that is, the unrenormalized partonic cross sections for producing a single inclusive parton (not a jet) in the final state. This task is challenging, but recent studies of the single inclusive jet cross section at NNLO [99] make it very promising in the near future. We expect that the framework developed in this work can be directly generalized to study other groomed jet substructure observables.
in which the jet has no transverse momentum component relative to the jet direction. In this frame, the four-momentum of the jet can be written as $\ell^\mu = (\ell^- = \omega_J,\, \ell^+,\, 0_\perp)$ with jet energy $E_J \approx \omega_J/2$. In such a reference frame the jet energy is given by the observed jet transverse momentum in the center-of-mass frame, i.e. we have $E_J = p_T$. Thus, the soft drop criterion in Eq. (3.1) can be written for soft radiation as Eq. (A.1), where $k$ denotes the soft momentum. Here $\theta_{ij}$ is the angle between the soft particle and the jet axis, which can be determined from [80], see Eq. (A.2). We may thus rewrite Eq. (A.1) as Eq. (A.3), where we have used $\tan(\theta/2) \approx \theta/2$ for collimated jets ($R \ll 1$).
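As an illustrative sketch of the kinematics described above, and not a reproduction of the original Eqs. (A.1) to (A.3), the translation of the criterion into light-cone components can be worked out as follows. The explicit form of the criterion in the last line, $k^0/E_J > z_{\text{cut}}(\theta/R)^\beta$, and the convention $k^\pm = k^0 \mp k^z$ with the jet along the $+z$ direction are assumptions of this sketch.

```latex
% Sketch under stated assumptions; not the paper's own Eqs. (A.1)-(A.3).
% Assumed: massless soft momentum k, light-cone components k^\pm = k^0 \mp k^z,
% jet along +z so that k^- is the large component for collinear radiation.
\begin{align}
  k^- &= k^0 (1 + \cos\theta), \qquad k^+ = k^0 (1 - \cos\theta)
  \quad\Longrightarrow\quad \tan^2\frac{\theta}{2} = \frac{k^+}{k^-}, \\
  \frac{k^0}{E_J} &\approx \frac{k^+ + k^-}{\omega_J},
  \qquad \theta \approx 2\sqrt{\frac{k^+}{k^-}} \quad (R \ll 1), \\
  \frac{k^0}{E_J} &> z_{\text{cut}} \left(\frac{\theta}{R}\right)^{\beta}
  \quad\Longrightarrow\quad
  \frac{k^+ + k^-}{\omega_J} > z_{\text{cut}}
  \left(\frac{2}{R}\sqrt{\frac{k^+}{k^-}}\right)^{\beta}.
\end{align}
```

In the same approximation, requiring the soft particle to lie within the jet, $\theta < R$, would read $k^+/k^- < R^2/4$.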

A.1 Soft radiation that fails the soft drop criterion
First we provide the details of the calculation for the soft functions $S_i^{\notin\text{gr}}(p_T, R, \mu; z_{\text{cut}}, \beta)$, which describe soft radiation that fails the soft drop criterion. Since this radiation fails the soft drop criterion, it is removed from the jet and thus does not contribute to the observed groomed jet mass. In this case, the soft momentum $k$ within the jet satisfies the two constraints in Eqs. (A.4) and (A.5). The first inequality states that the soft radiation fails the soft drop criterion, while the second is the constraint on the soft momentum $k$ within the jet for the anti-$k_T$ algorithm. With that, the non-vanishing contribution to the NLO correction for the soft functions $S_i^{\notin\text{gr}}$ in the $\overline{\text{MS}}$ scheme is given by Eq. (A.6), where the number of space-time dimensions is $n = 4 - 2\epsilon$. We use the notation $C_i = C_{F,A}$ for the quark and gluon soft functions, respectively. Changing integration variables from $k^+, k^-$ to $x, y$, we find Eq. (A.8). Here we neglected power corrections of the form $\mathcal{O}(R^2)$. After expanding in powers of $\epsilon$, we obtain

$$S_i^{\notin\text{gr}}(p_T, R, \mu; z_{\text{cut}}, \beta) = \frac{\alpha_s}{2\pi}\,\cdots$$

We thus obtain the renormalized soft functions as

$$S_i^{\notin\text{gr}}(p_T, R, \mu; z_{\text{cut}}, \beta) = 1 + \frac{\alpha_s}{2\pi}\,\cdots$$

The anomalous dimensions $\gamma_{S_i}^{\notin\text{gr}}$ entering the associated RG equations are given by

$$\gamma_{S_i}^{\notin\text{gr}}(p_T, R, \mu; z_{\text{cut}}, \beta) = \frac{\alpha_s}{\pi}\,\frac{C_i}{1+\beta}\,\ln\frac{\mu^2}{z_{\text{cut}}^2\, p_T^2 R^2}\,. \tag{A.14}$$
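To make the scale dependence of Eq. (A.14) concrete, the following minimal Python sketch evaluates the anomalous dimension and the scale at which its logarithm vanishes, $\mu = z_{\text{cut}}\, p_T R$. The function names, the fixed value of $\alpha_s$ (no running coupling), and the kinematic values in the usage note are illustrative assumptions and are not taken from the paper.

```python
import math

# QCD Casimir factors: C_F = 4/3 for quark jets, C_A = 3 for gluon jets.
CASIMIR = {"quark": 4.0 / 3.0, "gluon": 3.0}


def gamma_not_groomed(mu, pt, R, z_cut, beta, flavor="quark", alpha_s=0.118):
    """Anomalous dimension of Eq. (A.14).

    Assumption of this sketch: alpha_s is held fixed at an illustrative
    value instead of being evaluated at the scale mu.
    """
    c_i = CASIMIR[flavor]
    log_term = math.log(mu**2 / (z_cut**2 * pt**2 * R**2))
    return alpha_s / math.pi * c_i / (1.0 + beta) * log_term


def natural_scale(pt, R, z_cut):
    """Scale at which the logarithm in Eq. (A.14) vanishes."""
    return z_cut * pt * R
```

For hypothetical values $p_T = 600$ GeV, $R = 0.8$ and $z_{\text{cut}} = 0.1$, `natural_scale` gives roughly 48 GeV. The anomalous dimension is positive above this scale and negative below it, and it is larger for gluon jets than for quark jets by the factor $C_A/C_F = 9/4$.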

A.2 Soft radiation that passes the soft drop criterion
We now provide the details of the calculation of the soft functions $S_i^{\text{gr}}(\tau, p_T, R, \mu; z_{\text{cut}}, \beta)$, which describe soft radiation that passes the soft drop criterion. Since the associated soft particles pass the soft drop criterion, they remain in the groomed jet and thus contribute to the groomed jet mass. Therefore, the soft functions here depend on $\tau$. The NLO corrections to the soft functions can be written as

$$S_i^{\text{gr}}(\tau, p_T, R, \mu; z_{\text{cut}}, \beta) = 32\pi^2 \alpha_s C_i\,\cdots \tag{A.15}$$

Note that the delta function $\delta(\tau - 4k^+/\omega_J)$ in the first line expresses the fact that the soft radiation here contributes to the jet mass via $m_s^2 = \omega_J k^+$ and $\tau = 4m_s^2/\omega_J^2$. The first $\Theta$-function in the second line is the soft drop criterion, and the second $\Theta$-function is again due to the jet algorithm constraint, see Eqs. (A.4) and (A.5) above. Both $\Theta$-functions constrain the integration variables $k^\pm$, and in the following we determine which one sets the more stringent constraint on the corresponding integration regions. To proceed, we first note that $k^+ \ll k^-$ for $k^+ = \omega_J \tau/4$. This holds true in the kinematic region we are interested in, $\tau/R^2 \ll z_{\text{cut}} \ll 1$. The soft drop criterion can thus be simplified to Eq. (A.16). For $\tau/R^2 < z_{\text{cut}}$, one finds that the soft drop criterion in Eq. (A.16) is always a stronger constraint on the soft radiation than the jet algorithm constraint. Therefore, as long as Eq. (A.16) is satisfied, we can remove the jet algorithm constraint. Making use of these considerations, we obtain the following result up to NLO:

$$S_i^{\text{gr}}(\tau, p_T, R, \mu; z_{\text{cut}}, \beta) = \delta(\tau) + \frac{\alpha_s}{\pi} C_i \left[ -\frac{2+\beta}{2(1+\beta)}\,\cdots \right]$$

where the factor $A$ is given by Eq. (A.18). The bare and renormalized soft functions are related through the convolution

$$S_{i,\text{bare}}^{\text{gr}}(\tau, p_T, R; z_{\text{cut}}, \beta) = \int d\tau'\, Z_{S_i}^{\text{gr}}(\tau - \tau', p_T, R, \mu; z_{\text{cut}}, \beta)\, S_i^{\text{gr}}(\tau', p_T, R, \mu; z_{\text{cut}}, \beta)\,.$$
We observe that the soft drop constraint in Eq. (A.24) is thus less restrictive than the jet algorithm constraint in Eq. (A.5) for $\tau/R^2 > z_{\text{cut}}$ and $\beta > 0$. In this kinematic region we can therefore remove the $\Theta$-function in Eq. (A.15) associated with the soft drop grooming algorithm, and we are left with the soft function for the ungroomed case. Below the transition point, the grooming procedure required us to replace the ungroomed soft mode with two modes, $S_i^{\text{gr}}$ and $S_i^{\notin\text{gr}}$. Above the transition point, we now find that $S_i^{\text{gr}}$ reduces to the ungroomed soft function, which also implies $S_i^{\notin\text{gr}} \to 1$. In this sense, the entire factorization theorem for the soft drop groomed jet mass in Eq. (3.13) reduces to the ungroomed case, see Eq. (2.9), where there is only one (ungroomed) soft function. The result for the ungroomed soft function at NLO was given in Eq. (2.20). The transition point at $\tau/R^2 = z_{\text{cut}}$ can be seen directly in the numerical studies of the groomed jet mass distribution in section 4.4.
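The location of the transition point can be illustrated numerically. The sketch below converts $\tau/R^2 = z_{\text{cut}}$ into a jet mass using $\tau = 4m^2/\omega_J^2$ with $\omega_J = 2p_T$, and compares the two upper bounds on $k^+/k^-$. The bound ratio $(\tau/(R^2 z_{\text{cut}}))^{2/(2+\beta)}$ is an assumption of this sketch: it follows if the simplified criterion takes the form $k^-/\omega_J > z_{\text{cut}}\,(2\sqrt{k^+/k^-}/R)^\beta$ and the jet algorithm constraint reads $k^+/k^- < R^2/4$, neither of which is quoted from the original equations. Applying $\tau = 4m^2/\omega_J^2$ to the full jet mass, all function names, and the numerical inputs are likewise illustrative.

```python
import math


def tau_from_mass(m_jet, pt):
    """tau = 4 m^2 / omega_J^2, with omega_J = 2 p_T since E_J = omega_J/2 = p_T."""
    omega_j = 2.0 * pt
    return 4.0 * m_jet**2 / omega_j**2


def transition_mass(pt, R, z_cut):
    """Jet mass at which tau / R^2 = z_cut."""
    return pt * R * math.sqrt(z_cut)


def bound_ratio(tau, R, z_cut, beta):
    """Soft drop bound on k+/k- divided by the jet algorithm bound.

    Assumed form (see lead-in): (tau / (R^2 z_cut))^(2 / (2 + beta)).
    A value above 1 means soft drop is the weaker constraint.
    """
    return (tau / (R**2 * z_cut)) ** (2.0 / (2.0 + beta))


def regime(tau, R, z_cut):
    """Label the side of the transition point on which tau lies."""
    return "ungroomed" if tau / R**2 > z_cut else "groomed"
```

For hypothetical values $p_T = 600$ GeV, $R = 0.8$ and $z_{\text{cut}} = 0.1$, the transition sits at a jet mass of roughly 150 GeV. Under the stated assumptions the ratio equals 1 exactly at the transition point and exceeds 1 above it, which is the region where the groomed soft function reduces to the ungroomed one.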