On the Cutoff Dependence of the Quark Mass Parameter in Angular Ordered Parton Showers

We show that the presence of an infrared cutoff $Q_0$ in the parton shower (PS) evolution for massive quarks implies that the generator quark mass corresponds to a $Q_0$-dependent short-distance mass scheme and is therefore not the pole mass. Our analysis considers an angular ordered parton shower based on the coherent branching formalism for quasi-collinear stable heavy quarks and splitting functions at next-to-leading logarithmic (NLL) order, and it is based on the analysis of the peak of hemisphere jet mass distributions. We show that NLL shower evolution is sufficient to describe the peak jet mass at full next-to-leading order (NLO). We determine the relation of this short-distance mass to the pole mass at NLO. We also show that the shower cut $Q_0$ affects soft radiation in a universal way for massless and quasi-collinear massive quark production. The basis of our analysis is (i) an analytic solution of the PS evolution based on the coherent branching formalism, (ii) an implementation of the infrared cut $Q_0$ of the angular ordered shower into factorized analytic calculations in the framework of Soft-Collinear-Effective-Theory (SCET) and (iii) the dependence of the peak of the jet mass distribution on the shower cut. Numerical comparisons to simulations with the HERWIG 7 event generator confirm our findings. Our analysis provides an important step towards a full understanding concerning the interpretation of top quark mass measurements based on direct reconstruction.


Contents
1 Introduction

Prelude and review
A precise determination of the top quark mass m t represents one of the most important measurements in the context of studies of the Standard Model (SM) as well as of new physics, in particular in the context of electroweak symmetry breaking. The most precise top mass measurements are obtained from template and matrix element fits which are based on the idea of accessing m t by directly reconstructing the kinematic properties of a top quark "particle". These types of measurements naturally yield a very high sensitivity to the top quark mass because they involve endpoints, thresholds or resonant structures in kinematic distributions which substantially reduces the impact of uncertainties that affect poperties such as their normalization. The most recent reconstruction measurements are m MC t = 172.44 (49) GeV (CMS) [1], m MC t = 172.84 (70) GeV (ATLAS) [2] and m MC t = 174.34 (64) GeV (Tevatron) [3].
The characteristic property of these measurements, however, is that the observables employed for the reconstruction analyses are too complicated to be calculated in a systematically improvable way and, in addition, involve sizeable perturbative and non-perturbative corrections due to soft gluon emission which, in the vicinity of kinematic endpoints or thresholds, are not power-suppressed. The theoretical computations used for these measurements are therefore based on multi-purpose Monte Carlo (MC) event generators since they can produce predictions for essentially any conceivable observable. As a consequence, in these direct mass measurements the top mass parameter m MC t of the MC generator employed in the analyses is determined. The experimental collaborations provide estimates of the theoretical uncertainty in the extracted value of m MC t concerning the quality of the modelling of non-perturbative effects, e.g. by using different tunes or MC generators, or concerning theoretical uncertainties, e.g. by variations of theory parameters. The improvement of the theoretical basis of MC event generators and of methods to estimate their uncertainties is an ongoing effort [4][5][6].
However, the intrinsic, i.e. quantum field theoretic meaning of m MC t has up to now not been rigorously specified. Since this matter goes beyond the task of properly estimating or reducing MC modelling uncertainties and is also tied to the constructive elements incorporated to the MC's perturbative and non-perturbative components, it is much harder to quantify. Issues one has to consider do not only involve the truncations of perturbative QCD expansions, but also MC specific implementations such as the cut on the PS evolution or even modifications that are formally subleading but play numerically important roles in reaching better agreement with data or are part of the implementation of the hadronization model. It should also be remembered that the level of theoretical rigor of MC event generators depends on the observable. Since the theoretical description of thresholds and endpoints in general involves the resummation of QCD radiation to all orders, the perturbative aspect of how to interpret m MC t thus significantly depends on the implementation of the parton showers that are used in the MC generators and to the extent that NLO fixedorder QCD corrections have been systematically implemented for the observables that are relevant for the reconstruction analyses. Apart from that, the interface between the perturbative components and the hadronization models, which involves the structure of the infrared cut of the shower evolution, Q 0 ∼ 1 GeV, or the treatment of the top quarks finite width, Γ t ∼ 1.4 GeV, and other finite lifetime effects can play essential roles. Finally, it should also be mentioned that m MC t may also be affected by non-perturbative MC modelling effects as a consequence of the tuning process partly compensating for approximations and model-like features implemented into the MC perturbative components.
So, although m MC t is by construction closely related to the concept of a kinematic top quark mass, the identification to a particular kinematic mass scheme is far from obvious -also because there are several options for kinematic masses including schemes such the pole mass m pole t or short-distance threshold masses as they are employed for the top pair threshold cross section at a future Linear Collider [7][8][9] or in the context of massive quark initiated jets [10,11]. As shown in Ref. [12], these kinematic mass schemes can differ by more than 1 GeV. Given that the reconstruction analyses have reached uncertainties at the level of 0.5 GeV it appears evident that systematic and quantitative examinations on the field theoretic meaning of the MC top mass m MC t are compulsory. This scrutiny may involve examinations of different MC generators, as well as the respective interplay of their perturbative and non-perturbative components.
So far, only a limited number of theoretical considerations dedicated to this issue exist in the literature. In Ref. [13], based on the analogy of the MC components to the QCD factorization for boosted top quark initiated jet masses in the peak region derived in the factorization framework of Soft-Collinear-Effective Theory (SCET) and boosted Heavy-Quark-Effective-Theory (bHQET) [14,15], it was conjectured that the relation between m MC t and the pole mass is given by m pole t − m MC t = R sc (α s /π), where the scale R sc should be closely related to the shower cut Q 0 . The conjecture was based on general considerations how an infrared cut affects perturbative calculations but did not provide a precise quantitative relation. It was, however, argued that the uncertainty in the relation is unlikely to exceed the level of 1 GeV. A similar conclusion was drawn in Ref. [16] where it was argued that m MC t , due to the effects of the hadronization models, may have properties analogous to the mass of a top (heavy-light) meson. Based on the concepts of heavy quark symmetry [17,18] the relation m MC t = m MSR t (R) + ∆ t,MC (R) was conjectured, where m MSR t is the MSR mass [19,20], the term ∆ t,MC contains perturbative as well as non-perturbative corrections and R = 1 GeV represents a factorization scale separating perturbative and non-perturbative effects. From a comparison of B meson and bottom quark masses, and using heavy quark symmetry, it was concluded that ∆ t,MC could in principle be at the level of 1 GeV. We also refer to Ref. [21] for a related discussion. In Ref. [11] the concrete numerical relation m MC t = m MSR t (1 GeV) + (0.18 ± 0.22) GeV was obtained from fitting NNLL (next-to-next-to-leading logarithmic) and O(α s ) matched factorized hadron level predictions for the 2-jettiness distribution in the peak region for boosted top production in e + e − annihilation [14,15] to corresponding pseudo-data samples obtained by PYTHIA 8.2 [22] with the default Monash tune [23] correctly accounting for the dominant top quark width effects in the factorized calculation. Here the quoted error is the theoretical uncertainty of the factorized NNLL+O(α s ) prediction and also includes an estimate for the intrinsic uncertainty of the PYTHIA 8.2 calculation. Using the pole mass scheme in the factorized NNLL+O(α s ) prediction, the corresponding analysis yielded m MC t = m pole t + (0.57 ± 0.28) GeV. While this analysis provided a concrete numerical result, it can only be generalized to LHC measurements if one makes the additional assumption that the MC top mass has a universal meaning covering in particular also the LHC environment and the substantially more complicated observables included in the direct mass measurements, for which currently no first principle calculations exist. In addition, systematic uncertainties in the modelling of non-perturbative effects at hadron colliders, such as multi parton interactions, or the description of the pile-up effects are much harder to control. An analogous analysis for the LHC environment was subsequently carried out in Ref. [24] using factorized NLL soft-drop groomed [25,26] hadron level jet mass distributions showing results that are compatible with, but less precise than those of [11]. We also refer to Ref. [27] for a related analysis.
Aside from the previously mentioned examinations, recently, a number of complementary studies were conducted focusing on various sources of uncertainties in the perturbative description of top production and decay and the non-perturbative modelling of final states involved in top mass measurements. While these studies mainly aimed at examining the potential size of uncertainties in top mass determinations from reconstruction as well as from alternative methods (see Refs. [28][29][30] and references therein), some of their findings may also be relevant for addressing the question how m MC t obtained from reconstruction should be interpreted field theoretically.
In Ref. [31] the sensitivity of m MC t determinations from exclusive hadronic variables such as the B-meson energy E B [32], the B-lepton invariant mass m B [33] or the transverse mass variables m T 2 [34][35][36][37] to variations of the parameters of the MC hadronization models in PYTHIA 8 and Herwig 6 was studied. They found that for top mass determinations based on these distributions to be competitive with direct reconstruction methods these hadronization parameters would have to be constrained significantly more precisely than what is possible from usual multi-purpose tuning. In addition, they made the observation that the top mass dependent endpoints of these distributions are, compared to the overall shape of the distributions, largely insensitive to variations of the hadronization parameters, indicating that these kinematic endpoints only depend on global and inclusive properties of the final state dynamics.
In Ref. [38] top mass determinations from distributions such as the b-jet and lepton invariant mass m b j and the variable m T 2 [34] were analyzed within fixed-order perturbation theory comparing the full NLO QCD result for W + W − bb production with different approximations in the narrow width approximation (NWA) concerning NLO QCD corrections in the production and the decay of the top quarks as well as using the parton shower from SHERPA [39] after top production. Using pseudo-data fits they found that the extracted top mass can depend significantly (at the level of 1 GeV or even more) on the approximation used, indicating that incomplete descriptions of finite-lifetime effects can lead to systematic deviations in the value of the extracted top mass of order Γ t .
In Ref. [40] the NLO-PS matched POWHEG [41,42] top production generators hvq [43], ttdec [44] and bb4 [45] interfaced to PYTHIA 8.2 [22] and Herwig 7.1 [46,47] were studied comparatively examining the peak position of the particle level b-jet and W invariant mass m b j W , the peak of the b-jet energy E b j [32] and moments of various lepton observables [48] in view of an extraction of the top quark mass. They found that the m b j W peak is largely insensitive to variations of the generators and the shower MC as well as to input quantities such as the strong coupling and the PDFs or the b-jet definition, and concluded that changes in the top mass due to these variations do not exceed 200 MeV in the absence of experimental resolution effects. They also indicated that the good agreement between the three POWHEG generators may imply that m b j W is not sensitive to additional finite lifetime effects. Once the smearing due to experimental resolution effects is accounted for, however, they found an increased sensitivity to the differences in the parton showers of PYTHIA 8.2 and Herwig 7.1 that correspond to variations in the extracted top mass at the level of 1 GeV or more. For E b j the dependence of the extracted top mass on the shower type and on the b-jet definition is in general at the level of 1 GeV. For the leptonic observables variations of this size arise from PDF uncertainties and from changing the shower type.

About this work
The aim of this work is to initiate dedicated individual examinations of the different components of MC event generators with the aim to gain insights concerning the field theoretic meaning and potential limitations of the MC top mass parameter m MC t from first principles. In this paper we start with an examination of the parton shower evolution with respect to the dependence on the infrared shower cut Q 0 .
Apart from the perturbative hard interaction matrix elements that encode the basic hard process that can be described by MC generators, the parton shower describes the parton branching for energies below the hard interaction scale and represents the perturbative component of MC generators responsible for the low energy dynamics in MC predictions. While common analytic calculations in perturbative QCD are carried out in the limit of a vanishing infrared regulator, event generators based on parton showers rely on the existence of an infrared cut in order to prevent infinite parton multiplicities and to ensure that the parton shower description does not leave the realm of perturbation theory.
From the field theoretic point of view, Q 0 represents a factorization scale that separates the perturbative components of MC event generators and their hadronization models. While it is generally accepted that a finite value for Q 0 restricts the amount of real radiation and multiplicity generated by the shower evolution, it is not per se obvious to which extend it may also affect the meaning of QCD parameters such as the MC quark mass parameters. Due to the unitarization property of the shower evolution which is responsible for the coherent summation of real as well as infrared virtual radiative corrections for scales above Q 0 , it is also plausible that the MC top mass parameter m MC t should acquire a dependence on the value of Q 0 unless one makes the additional assumption the Q 0 effects are negligible. In this work we examine this dependence and find that is is not negligible. We emphasize that in the discussions of this paper we ignore all issues related to (the shower cut dependence of) hadronization because the primary aim is to concentrate on the perturbative aspects of the relation between m MC t and field theoretic mass schemes. We are aware that the properties of the hadronization modelling in MC event generators may have a significant impact on the interpretation of m MC t , but we believe that examining perturbative and non-perturbative MC components separately in this respect is essential to gain full conceptual insight.
Because the top quark has color charge its mass is -following the principles of heavy quark symmetry -linearly affected by the momenta of ultra-collinear gluons [14,15], which are the gluons that are soft in the top quark rest frame. The role of these ultra-collinear gluons turns out to be essential for our conceptual considerations concerning the shower cut dependence of the top quark mass. Compared to the radiation pattern of massless quarks the additional effects coming from the ultra-collinear gluons is for example responsible for the dead cone effect [49,50] which is generally considered as coming from the top mass regulating the emergence of collinear singularities in the quasi-collinear limit. The radiation in the dead-cone region, however, is still non-zero and to the extent that it is unresolved becomes part of the energy (i.e. mass) of the measured top quark state. It is this quantum mechanical feature that goes beyond the classic picture of an unambiguous top quark "particle" whose total energy could be determined in the direct mass measurements. Since the parton showers in all state-of-the-art MC generators account for the dead cone effect [51], it appears obvious that the meaning of m MC t should naturally have a linear dependence on the shower cut Q 0 restricting the ultra-collinear radiation -unless there is a mechanism that leads to a power suppressed effect of order Q 2 0 or higher which we may then safely neglect for the case of the top quark. Therefore, to examine the intrinsic field theoretical meaning of the MC top quark mass parameter m MC t it is essential to start with a careful examination of the production of the top quarks and the ultra-collinear gluons. From this point of view, studies of the top decay and the treatment of the observable final states are important to quantify to which extent the ultra-collinear gluons are unresolved and how they enter a particular observable.
In this work we aim to focus primarily on the production aspect, and we are therefore studying an observable that is maximally insensitive on details of the final state dynamics and its theoretical modeling. This observable is the peak (i.e. resonance) position of hemisphere jet masses in e + e − annihilation, explained in more detail in Sec. 2. The basic outcome of our considerations concerning the field theoretic meaning of m MC t , however, should be general and shall be systematically extended to other types of observables and to the LHC environment in subsequent work. As a further simplification we consider the narrow width approximation (NWA), i.e. the case of quasi-stable top quarks which allows to rigorously factorize top production and decay, the case of boosted (i.e. large-p T ) top quarks and the coherent branching formalism which is related to angular ordered showers, see Refs. [52][53][54] for massless and Refs. [55] for massive quarks, and also Refs. [56,57]. Since the limit of stable and quasi-collinear heavy quarks is the theoretical basis of all parton shower formulations for top (and other heavy) quarks, it is natural to investigate the physics in this limit first to avoid that the conclusions are affected by the additional approximations that need to be made in the attempt to account for the effects of slow and unstable top quarks. Our current focus on angular ordered showers is, on the other hand, of purely practical nature: Our considerations here require explicit analytic solutions of the shower evolution, and angular ordered showers based on the coherent branching formalism can be more easily tackled by well known analytic methods [58] applicable to global event shapes. So our current results are directly relevant for the Herwig MC generator which employs an angular ordered PS. Generalizations to other MC generators shall be treated elsewhere.
In this context our paper is structured around the following three questions: (A) Can state-of-the-art partons showers in principle describe the single top resonance mass and related thresholds with NLO precision?
(B) What is the impact of the shower cut Q 0 on the resonance value of the jet masses?
(C) Does the shower cut imply that the MC top quark mass parameter m MC t is a low-scale threshold short-distance mass, and how can this be proven from first principles at the field theoretic level? Question A is relevant because, only if parton showers can describe the threshold or resonance mass with NLO precision, the question of which mass scheme is employed can be addressed systematically in a meaningful way. In the course of our examination we show that this is indeed the case as long as NLL order logarithmic terms are resummed, and we also show that the additional NLO corrections implemented by NLO matched parton showers do not further increase the precision. Question B concerns the dependence of the resonance value of the jet mass on Q 0 . We show that the jet mass at the resonance peak depends linearly on Q 0 which means that for the field theoretic meaning of m MC t the finite shower cut is essential and cannot be neglected. Finally, question C addresses to which extent the linear dependence on Q 0 must be interpreted as a Q 0 -dependence of the MC top quark mass. As we will show, only a part of the linear Q 0 -dependence of the peak jet mass is related to ultra-collinear radiation and thus to the top quark mass. Overall, the shower cut also restricts the radiation of large angle soft gluons unrelated to the top quark and the ultra-collinear radiation. Only the latter is related to the top quark mass, and its dominant linear Q 0 -dependence caused by the shower cut can be shown to automatically imply a mass redefinition which differs from the pole mass by a term proportional to α s (Q 0 ) Q 0 . This result implies that m MC t is equivalent to the top quark pole mass, in the limit Q 0 = 0 which is however practically inaccessible for parton showers. In the formal limit m t → 0 the effects of the ultra-collinear radiation and its Q 0 -dependence vanish and only the cutoff dependence on the soft radiation remains. This cutoff dependence represents the factorization interface between perturbative soft radiation and hadronization effects governed by the MC hadronization models. Since in this context the understanding of the shower cut dependence of the soft radiation is a prerequisite to the examination of the ultra-collinear radiation, we also analyze carefully the case of massless quark production in parallel to our discussions on the top quark.

Outline and reader instruction
The outline of our paper is as follows: In Sec. 2 we set up our theoretical framework by explaining the hemisphere mass observable τ and reviewing the corresponding NLL and hadron level factorized QCD predictions in the resonance region for massless as well as massive quark production. We also provide details on the hadronization model shape function which is important for the numerical analyses carried out in the subsequent sections. In this section we also prove, using the factorized predictions, that NLL resummation of logarithms is sufficient to achieve NLO precision for the position of the peak in the τ distribution.
In Sec. 3 we review the coherent branching formalism, provide the analytic evolution equation for the jet mass distribution for massless and massive quark production at NLL order and give some details on the practical implementation of the angular order parton shower based on the coherent branching formalism in the Herwig 7 event generator.
In Sec. 4 we show -in the absence of any infrared cutoff and in the context of strict perturbative computations -that the NLL predictions for the hemisphere mass τ distribution in the resonance region obtained from the coherent branching formalism are fully equivalent to the NLL factorized QCD predictions for massless quark production as well as for massive quark production in the pole mass scheme. This result proves, that in the context of strict perturbative computations for massive quarks in the limit Q 0 = 0 the MC generator mass is equivalent to the pole mass. This conclusion, however, does not apply for MC event generators because their parton shower algorithm strictly requires a finite shower cut Q 0 in order to terminate and to avoid infinite multiplicities.
The impact of the shower cut Q 0 is then analyzed in detail in Sec. 5, which represents the core of this work. Here we analyze the power counting of the relevant modes entering the hemisphere mass in the resonance region in the massless and massive quark case and we focus on a coherent view of the factorized QCD and the coherent branching approach. We calculate analytically the NLO corrections caused by the shower cut Q 0 in comparison with the results without any cut in the coherent branching formalism and the factorized QCD approach focusing on the dominant effects linear in Q 0 . We show that the results obtained for the linear Q 0 contributions in coherent branching and factorized QCD are compatible, and we use the direct connection of the factorized QCD computation to field theory to unambiguously distinguish shower cut effects related to soft hadronization corrections and the quark mass parameter. By coherently examining massless and massive quark production we prove that using a finite shower cut Q 0 in the coherent branching formalism -and thus also in angular ordered parton showers -automatically implies that one employs a shortdistance mass scheme different from the pole mass, called the coherent branching (CB) mass, m CB (Q 0 ). We explicitly calculate the relation of the coherent branching mass to the pole mass at NLO, i.e. O(α s ).
In Sec. 6 the conceptual results obtained in the previous sections are summarized coherently to set up the numerical examinations we carry out in Sec. 7. In Sec. 7 we compare the results obtained in Sec. 5 with analytic methods and conceptual considerations with numerical results running simulations for the hemisphere mass variable τ in Herwig 7 using different values of the shower cut Q 0 . Focusing mostly on the peak position of τ we show that the simulations are in full agreement with our conceptual results. We also show explicitly that NLO corrections added in the context of NLO matched parton showers have extremely small effects in the resonance location and do not modify any of the previous results, confirming that NLL accurate parton showers are already NLO accurate as far as the resonance region is concerned. Furthermore, we also demonstrate that the results we have obtained in the context the hemisphere mass variable τ are also compatible with numerical simulations for the more exclusive kinematic variables m b j and m b j W supporting the view that our results are universal.
Finally, Sec. 8 contains our conclusions and an outlook for some of the remaining questions that should be addressed in the future. There we also provide a brief numerical analysis how m CB (Q 0 ) is related to other mass renormalization schemes. The paper also contains four appendices containing some supplemental material relevant for our work. In App. A we collect all parton level results for NLL+O(α s ) factorized QCD predictions of the τ distribution in the resonance region for massless and massive quark production. In App. B we provide details on the computations of the effects of the shower cut Q 0 in the context of the factorized QCD predictions and in App. C we collect results for loop integrals in the presence of the shower cut Q 0 . Finally, in App. D we give information on the Herwig 7 settings we have employed for our simulation studies.
To the reader mainly interested in the phenomenological implications of our discussions in the context of the Monte-Carlo top mass problem: We recommend to go through our paper by starting with Sec. 1 and Sec. 2 for all basic information concerning our examinations and in particular for important elementary knowledge concerning the hemisphere mass variable τ in the resonance region and its theoretical description. One may then jump directly to Sec. 6, where all of our conceptual results are summarized and continue with our simulations studies in Sec. 7 and the conclusions in Sec. 8.

The observable: squared hemisphere mass sum
The observable we consider in this work is the sum of the squared hemisphere masses defined with respect to the thrust axis in e + e − -collisions normalized to the square of the c.m. energy Q, In the lower endpoint region the τ distribution has a resonance peak which is dominated by back-to-back 2-jet configurations which arise from LO quark-antiquark production, and it is the location of the resonance, τ peak , which we focus on mostly in our study. For massless quarks this resonance region is located close to τ = 0 and represents the threshold region for dijet production. Non-perturbative effects shift the observable peak towards positive τ values by an mount of O(Λ/Q), where Λ is a scale of around 1 GeV. For massive quark production the resonance region and the peak are located close to τ = 2m 2 Q /Q 2 , and for the case of the top quark for Q m t is dominated by boosted back-to-back top quark initiated jets. As for the case of massless quark production non-perturbative effects shift the observable peak towards positive τ values by an mount of O(Λ/Q). The scale of Λ ≈ 1 GeV is generated from non-perturbative effects, but its value is numerically larger than Λ QCD because it accounts for the cumulative hadronization effect from both hemispheres [59]. In the peak region, τ is closely related to the classic thrust variable [60] in the case of massless quark production [58], and to 2-Jettiness [61] for massive quarks [14]. To be concrete, concerning the structure of large logarithms and of terms singular in the τ → τ min limit, which dominate the shape and position of the peak, the hemisphere mass variable τ , thrust and 2-jettiness are equivalent for large Q. We therefore frequently refer to τ simply as "thrust" in this paper.
For our examinations for top quarks we also consider the rescaled thrust variable The variable M τ is peaked close to M τ = m Q and allows for a more transparent interpretation of the shower cut Q 0 -dependence from the point of view of the top quark mass than τ . Note that the scheme dependence of the quark mass parameter m Q appearing in the definition (2.2) represents an effect that is O(α 2 s )-suppressed in the context of our examinations and therefore irrelevant at the order we are working.
An essential aspect of the examinations in this work is that for boosted top quarks events related to top decay products being radiated outside the parent top quark's hemisphere are (m t /Q) 2 power suppressed [14]. So, because thrust depends on the sum of momenta in each hemisphere, effects of the top quark decay in the thrust distribution are power suppressed as well, and the situation of a finite top quark width is smoothly connected to the NWA and the stable top quark limit. This is compatible with the factorized treatment of top production and decay used in contemporary parton showers and also allows us to carry out analytic QCD calculations for stable top quarks which are essential for the chain of arguments we use. In this way thrust is an ideal observable for the examinations made in this work since it allows to study the mass of the top quark accounting in particular for the contribution of the unresolved ultra-collinear gluon cloud around it.
However, in thrust the effects of large angle soft radiation are maximized, and the impact of the shower cut Q 0 on the meaning of the top quark mass parameter interferes with that Q 0 has on large angle soft radiation. Since the latter is not related to the top quark mass, but represents the interface to hadronization effects [58,62], it is important that both effects are disentangled unambiguously. As we will show, for thrust in the peak region this can be carried out in a straightforward way owing to soft-collinear factorization [63,64]. Since the structure of large angle soft radiation is equivalent for the production of massless quarks and boosted massive quarks [14,15], we discuss the case of massless quark production before we examine boosted top quarks.
Since our discussion requires the analytic comparison of the thrust distribution determined from the parton shower evolution based on the coherent branching formalism at NLL order (where we follow the approach of [58,65]) and of corresponding resummed QCD calculations based on soft-collinear factorization, we briefly review the latter in the following two subsections for massless and massive quark production.

Factorized QCD cross section: massless quarks
Resummed calculations for the thrust distribution in the peak region require the summation of terms that are logarithmically enhanced and singular in the limit τ → τ min = 0, where the partonic threshold is located. In the context of conventional perturbative QCD, factorized calculations for massless quarks have been carried out in Ref. [64] at NLL order. In the context of SCET the corresponding results have been obtained at NLL+O(α s ) in Ref. [66] and were extended to N 3 LL+O(α 3 s ) in Ref. [59,67]. Using the notations from Ref. [59] the observable hadron level thrust distribution in the peak region can be written in the form where dσ s /dτ contains the factorized resummed singular partonic QCD corrections (containing δ-function terms of the form α n s δ(τ ) and plus-distributions of the form α n s [ln k (τ )/τ ] + ) and S mod ( ) is the soft model shape function that describes the non-perturbative effects. It has support for positive values of , exhibits a peaked behavior for values around 1 GeV and is strongly falling for larger values. We further assume that it vanishs at zero momentum, S mod (0) = 0. 1 Due to the smearing caused by the non-perturbative function the visible peak of the thrust distribution is shifted to positive values by an amount of order (1 GeV)/Q. The dominant perturbative corrections to the factorized cross section in Eq. (2.3) are coming from so-called non-singular contributions containing terms of the form α n s ln k (τ ). For our considerations in the resonance region these corrections are powersuppressed by a additional factor of order (1 GeV)/Q, i.e. they cause a shift in the peak position by an amount (1 GeV) 2 /Q 2 which we can safely neglect. 1 The typical scale of the non-perturbative function S mod is about twice the typical hadronization scale ΛQCD < ∼ 0.5 GeV as it accounts for non-perturbative from both hemispheres [59]. The property S mod (0) = 0 is assumed for all shape functions treated in the literature and physically motivated from the hadronization gap.
The resummed factorized singular partonic QCD cross section has the form where σ 0 is the total partonic e + e − tree-level cross section. The term H Q is the hard function describing effects at the production scale Q, J (τ ) is the jet function describing the distribution of the squared invariant mass s due to collinear radiation coming from both jets and S (τ ) is the soft function containing the effects of large angle soft radiation. They depend on the renormalizations scales µ H ∼ Q, µ J ∼ Q √ τ and µ S ∼ Qτ , which are chosen such that no large logs appear in hard, jet and soft functions respectively. Large logarithmic contributions are resummed in the different U factors which are evolved from the corresponding renormalization scale µ H , µ J or µ S to a common renormalization scale. Since it most closely resembles the analytic form of the resummation formulae obtained in the coherent branching formalism, we have set in Eq. (2.4) the common renormalization scale equal to the hard scale µ H , so that there is no evolution factor U H for the hard function. So, U J sums logarithms between the jet scale µ J and the hard scale µ H , and U S sum logarithms between the soft scale µ S and the hard scale. For our discussions we need the expressions for the U factors at NLL and the hard, soft and jet function at O(α s ). These formulae (and also the renormalization group equations for the U factors) are provided for convenience in App. A.1.
Expanded to first order in the strong coupling and setting µ H = µ J = µ S = µ in Eq. (2.4) we obtain the well-known O(α s ) singular fixed-order thrust distribution (2.5) Transforming the partonic massless quark thrust distribution of Eq. (2.4) to Laplace space with the conventionσ the NLL thrust distribution can be written in the condensed form where the evolution functions K and ω have the form The QCD beta function is defined as with β 0 = 11− 2 3 n f and β 1 = 102− 38 3 n f , and the cusp and non-cusp anomalous dimensions read The scales µ H,ν , µ J,ν and µ S,ν are given by 3) that are logarithmic or plus-distributions. Dropping a π 2 term arising in the Laplace transform of the (ln τ /τ ) + distributions, in this combination the dependence on the renormalization scales µ H , µ J and µ S cancels and the result shown in Eq. (2.7) with the physical scales given in Eqs. (2.13) emerges. Since the structure of these O(α s ) corrections is already unambiguously known from the NLL renormalization properties, we consider them part of the NLL logarithmic contributions. (We refer to Ref. [68] for an extensive discussion on this issue.) Using in Eq. (2.7) the renormalization scales µ i instead of the scales µ i,ν (i = H, J, S) one recovers the renormalization scale dependent results coming from the U evolution factors alone.
As we show in Sec. 4.1 all terms displayed in Eq. (2.7) are also precisely obtained by the coherent branching formalism at NLL order.

Factorized QCD cross section: massive quarks
In the case of boosted massive quark production the thrust distribution has been determined at NNLL+O(α s ) in Refs. [11,14,15]. Adopting the pole mass scheme, the τ distribution as defined in Eq. (2.1) has its partonic threshold at (2.14) The observable thrust distribution in the resonance region for τ > ∼ τ pole min , can be written in a form analogous to the case of massless quarks and has the form where dσ s /dτ is the resummed singular massive quark partonic QCD cross section, which contains terms of the form α n s δ(τ − τ pole min ) and α n s [ln k (τ − τ pole min )/(τ − τ pole min )] + ). The nonsingular corrections to the factorized cross section in Eq. (2.15) are coming from terms of the form α n s ln k (τ − τ pole min ). In the resonance region these corrections are power-suppressed by a additional factor of order (1 GeV)/Q or (1 GeV)/m and can, in analogy to the case of massless quark production, be safely neglected for top quark production. In an arbitrary mass scheme m with δm = m pole − m we can write the observable thrust distribution in the form The generalization of Eqs. (2.17) and (2.18) to an arbitrary mass scheme is straightforward.
The singular partonic cross section in the resonance region can be written in the factorized form where σ 0 is again the total partonic e + e − tree-level cross section. due to ultra-collinear gluon radiation in the region whereŝ is much smaller than the mass, s m. It depends on the renormalization scale µ B ∼ Qµ S /m ∼ Q 2 (τ − τ min )/m, and its expression at O(α s ) in an arbitrary mass scheme m, J (τ ) B (ŝ, m, δm, µ B ), with δm = m pole − m = 0 is shown in Eq. (A.8). At NLL+O(α s ) the bHQET jet function completely controls the quark mass scheme dependence of the singular partonic cross section. So at this order the singular partonic cross section in an arbitrary mass scheme, dσs dτ (τ, Q, m, δm), is obtained from Eq. (2.19), by employing the bHQET jet function J (τ ) B (ŝ, m, δm, µ B ) and setting m pole → m everywhere else. This is because J (τ ) B has mass sensitivity already at tree level through the dependence on τ min , see Eq. (2.14). Physically the ultra-collinear radiation is, owing to heavy quark symmetry, related to the soft radiation governing the mass of heavy-light mesons. The mass mode factor H m contains fluctuations at the scale of the quark mass µ m ∼ m coming from the massive quark field fluctuations that are off-shell in the resonance region and integrated out. Its expression at O(α s ) is shown in Eq. (A.7) and a detailed discussion on its definition and properties can be found in Ref. [15]. The factor U J B sums logarithms between the ultra-collinear jet scale µ B and the hard scale µ H , U S sums logarithms between the soft scale µ S and the hard scale, and U m sum logarithms between the quark mass scale µ m and the hard scale. Their formulae are for convenience also provided in App. A.2.
From a physical point of view it appears more appropriate to evolve the factors U J B , U S and U m to the quark mass scale µ m (at which point the factor U m could be dropped) rather than the hard scale. This is because the logarithms resummed in U J B and U m physically arise from scales below the quark mass. The form we have adopted here is equivalent due to renormalization group consistency conditions [15] and matches better to the form of the log resummations obtained from the coherent branching formalism as discussed in Sec. 4.2. For our examinations we need the expressions for the U factors at NLL and the hard, mass matching, soft and the bHQET jet functions at O(α s ). Expanding to first order in the strong coupling and setting µ H = µ B = µ S = µ we obtain the O(α s ) singular fixed-order massive quark thrust distribution in the pole mass scheme (L m = ln (m pole ) 2 In Eq. (2.21), changing to another mass scheme m leads to the additional term δ (τ − τ pole min ) 4m δm/Q 2 on the RHS, and this term has to be counted as a NLL contributions as well.
We note that in Eq. (2.21) the dead cone effect [49,50] is manifest as a τ → τ min behavior that is less singular than the τ → 0 limit for massless quark production displayed in Eq. (2.5). However, one can see from the form of the bHQET jet function in Eq. (A.8), that ultra-collinear radiation still involves soft-collinear double-logarithmic singularities which arise from the coherent effect of ultra-collinear gluons physically originating from the associated top quark and its opposite hemisphere [14,15]. So, in the context of QCD factorization based on SCET and bHQET the deadcone effect arises from a cancellation of double logarithmic singularities between the ultra-collinear and the large-angle soft radiation (radiated in the collinear direction and called collinear-soft radiation in the following). This can be seen from the expression for the partonic soft function S (τ ) given in Eq. (A.3) which exhibits the same double-logarithmic singularity as the bHQET jet function, but with the opposite sign. So the origin of the deadcone effect from the perspective of QCD factorization, which is manifestly gauge invariant, is due to a cancellation of ultracollinear and collinear-soft radiation. This is somewhat different (but not contradictory) to the conventional and gauge-dependent view that the deadcone originates from the suppression of collinear radiation off the boosted top quarks due to the finite top quark mass. The relation between these two views is subtle because in the canonical SCET/bHQET approach (ultra-)collinear jet functions are defined with a zero-bin subtraction [69] to avoid a double counting between (ultra-)collinear and collinear-soft radiation.
Transforming the partonic massive quark thrust distribution to Laplace space with the the NLL thrust distribution can be written in the condensed form where the evolution functions K and ω have been given in Eqs. (2.8) and the cusp and non-cusp anomalous dimensions not already displayed in Eqs. (2.11) and (2.12) read and the scales µ H , µ m , µ B,ν and µ S,ν are given by We finally note that all functions and U factors that appear in Eqs. (2.4) and (2.19) have been determined using dimensional regularization to regularize infrared and ultraviolet divergences and the MS renormalization scheme. At this point the partonic soft function S (τ ) (k) does not contain any gap subtraction [70] to remove its O(Λ QCD ) renormalon ambiguity related to the partonic threshold at k = 0.

Importance of the shape function
The soft model shape function S mod appearing in Eqs. (2.3) and (2.15) represents an essential part of the thrust factorization theorems since it accounts for the hadronization effects that affect the observable thrust distribution. The shape function leads to a smearing of the parton level contributions and an additional shift of the peak position since the hadronization effects increase the hemisphere masses by non-perturbative contributions. It is also essential as far as the shape of the distribution in the resonance region is concerned where the thrust distribution is peaked.
Since in this work we are mainly interested in the Q 0 -dependence of the partonic contributions, one may conclude that one should better drop the effects of the shape function S mod in our analysis such that it does not interfere with the perturbative effects. However, this is not possible since analyzing the singular partonic corrections of the thrust distribution (and their Q 0 dependence) alone without any smearing does not allow for a correct interpretation of their contributions to the observable distribution. This can be easily seen for example from the O(α s ) fixed-order parton level results for the massless and massive quark thrust distributions shown in Eqs. (2.5) and (2.21). Here the partonic contributions to the observable distribution contained in the δ-functions and in the regularized singularity structures of the plus distributions at the partonic thresholds at τ = 0 and τ = τ min , respectively, remain invisible if one simply studies the partonic contributions at a function of τ . One may in particular conclude wrongly, that the observable peak position is independent of Q 0 simply because the partonic threshold always remains at τ = 0 and τ = τ min for massless and massive quarks, respectively. The essential point is that the complete set of singular structures in the (infinitesimal) vicinity of the threshold contributes in the resonance region and non-trivially affect the observable peak location. Thus, the partonic thresholds alone do not govern the observable peak position and some smearing is crucial to fully resolve the effects of all parton level contributions.
As a consequence, in our analysis of the partonic effects coming from the shower cut Q 0 , it is still important that we account for the hadronic smearing of the shape function S mod . For the analysis of the partonic effects coming from the shower cut Q 0 we therefore include a shape function that is Q 0 -independent. It has the simple form  where we consider Λ m values between 1 and 5 GeV for our conceptual discussions. (See also our comment after Eq. (2.3).) We use this shape function for our analytic calculations as well as for the parton level numerical results we obtain from the Herwig event generator. This way we can ensure that the smearing is precisely equivalent for both types of results. We note that the exact form of S mod and the size of the smearing scale Λ m affect the form and the absolute value of peak location of the distribution in the resonance region. However, for our analysis only the relative dependence of the peak position on the cut value Q 0 is essential, for which the exact form of the shape function turns out to be irrelevant. We further note that for our numerical studies for top quark production we use the smearing due to S mod to also mimic effects of the top quark width even though the form of S mod does not provide a fully consistent description. As we show in Sec. 5, for making physical predictions the soft model function has to compensate for the dependence of the parton level large angle soft radiation on the Q 0 cut. This is because for large angle soft radiation the shower cut represents a factorization scale that separates the parton level and non-perturbative regions. The point of our examination, however, is not to make physical predictions, but to conceptually quantify the dependence on Q 0 with the aim to disentangle it unambiguously from the effect Q 0 has on the mass parameter. Along the same lines, we also do not account for the possible effects of a finite experimental resolution. The latter results in an additional smearing of the resonance distribution that, particularly in the context of hadron colliders, may by far exceed the smearing caused by the hadronization effects. While the overall norm still remains irrelevant for the peak position, properties of the theoretical distribution far away from the resonance region could then affect the experimentally observed peak position in a non-negligible way. In such a case the non-singular corrections may have to be included for a reliable description. This is straightforward, but beyond the scope of this work.

NLO precision for the resonance location
Within quantum field theory a consistent discussion of a quark mass (renormalization) scheme is only meaningful if the theoretical description of the observable of interest has all or at least the dominant O(α s ) corrections implemented. In the factorization theorems of Eqs. (2.3) and (2.15) we can neglect the nonsingular corrections since they are powersuppressed in the resonance region. To be concrete, they lead to negligible shifts in the peak position of order (1 GeV) 2 /Q 2 and (1 GeV) 2 /m 2 , respectively, upon including the smearing effects coming from the soft model shape function S mod . It is now obvious to ask the question if, apart from the summation of logarithms at NLL order, also the full set of O(α s ) non-logarithmic fixed-order corrections contained in the hard, mass mode, jet and soft functions are needed to achieve O(α s ) precision in the resonance region. These  [59] where NLL order refers to the resummation of logarithms at NLL order combined with all additional fixed-order corrections at O(α s ).
However, the mass sensitivity of the thrust distribution in the peak region mainly comes from the location of the resonance peak, τ peak , and properties such as the overall norm of the distribution are less important. For most practical considerations of such kinematic distributions, the norm is even eliminated on purpose by considering distributions that are normalized to a restricted interval in the kinematic variable. Therefore, in our analysis we mainly focus on the resonance peak position of the thrust distribution and do not consider the overall norm. Interestingly, as we show in the following, when discussing the peak position with NLO (i.e. O(α s )) precision, we only have to account for the NLL resummed cross section, and we can neglect the O(α s ) non-logarithmic corrections. The reason why these non-logarithmic O(α s ) corrections do not contribute to the peak position τ peak at NLO is that they are represent corrections proportional to the LL cross section.
To see this more explicitly let us rewrite the NLL+O(α s ) thrust distributions of Eqs. (2.3) and (2.15) in the generic form where f andf stand for the hadron and parton level thrust distributions, respectively, andS mod for the hadronization shape function after variable rescaling. The NLL+O(α s ) partonic thrust distribution can then be written in the form wheref LL represents the LL cross section (which provides the complete leading order approximation), the term α s ∆f NLL contains all NLL corrections in the NLL resummed cross section, and α s c stands for the non-logarithmic O(α s ) corrections mentioned above. The latter corrections are related to the LL tower of logarithms associated to the term (2π 2 /3 − 2)δ(τ ) in Eq. (2.5) and the term 2π 2 δ(τ − τ pole min ) in Eq. (2.21). Note that corrections arising from a change in the quark mass scheme are proportional to derivatives of δ(τ − τ pole min ) and therefore always contained in the term α s ∆f NLL . The LL peak position τ 0 peak is determined from the equality At the NLL level, writing the O(α s ) correction to the peak position as δτ peak , the corre-sponding equality reads where in the third line we have dropped terms of O(α 2 s ) and in the fourth we used the LL constraint of Eq. (2.30) for the non-logarithmic O(α s ) fixed-order corrections with are proportional to the LL cross section.
The outcome is that the non-logarithmic O(α s ) fixed-order corrections contained in the hard, jet and soft function are not relevant for discussing the peak position τ peak as far as O(α s ) precision is concerned and would only enter when O(α 2 s ) corrections are considered. Since the peak position represents the dominant characteristics of the thrust distribution entering the mass determination, we can therefore conclude that the resummation of logarithmic correction at the NLL level is sufficient to achieve O(α s ) precision for a mass determination based on the resonance peak position. Going along the line of arguments we use in the subsequent sections this important result also means that to the extent that parton showers systematically and correctly sum all NLL logarithmic terms, the peak position of the thrust distribution generated by their evolution is already O(α s ) precise, even without including any additional NLO fixed-order corrections by an NLO matching prescription.

Coherent branching formalism
The coherent branching formalism has proven to be a very powerful tool for analytic resummation of a large number of observables. Besides the analytic use, it forms the core rationale behind coherent parton shower algorithms, notably the angular ordered algorithms of the Herwig family [46,47,71] of event generators. Following earlier work of Refs. [58,62] we use this framework to calculate the parton level jet mass distributions J(s, Q 2 ) for massless quarks and J(s, Q 2 , m 2 ) for massive quarks originating from successive gluon radiation off the progenitor quark and anti-quark pair generated by the hard interaction at c.m. energy Q. Here the variable s = M 2 jet stands for the resulting squared jet invariant mass. This determines the parton level thrust distribution in the peak region as defined in Eq. (2.1) as for massless and massive quark cases, respectively. The jet mass distributions obtained in the context of coherent branching incorporate coherently the dynamic effects of soft as well as (ultra-)collinear radiation and are UV-finite quantities. Thus they differ from the jet functions J (τ ) and J (τ ) B in the QCD factorization approach which describe the factorized collinear and ultra-collinear gluon effects, respectively, and are determined from UVdivergent effective theory matrix elements that need to be renormalized. In order to obtain the observable hadron level thrust distribution, the contributions of the non-perturbative effects are accounted for in exactly the same way as for the QCD factorization approach by an additional convolution with a soft model shape function, as shown in Eqs. (2.3) and (2.15), see Refs. [58,62,72].
We note that in Eqs. (3.1) we have used the superscript 'cb' to indicate the cross sections obtained in the coherent branching formalism. We use this notation throughout this paper, when suitable, to distinguish results based on the coherent branching formalism from those obtained in the factorization approach.
While an analytic treatment of the coherent branching formalism in the strict context of perturbation theory does not rely on the presence of any infrared cutoff, 2 it is, however, required within the realm of an event generator for several reasons. These include the Landau pole singularity of the strong coupling, which emerges because its renormalization scale is tied to shower evolution variables, and that the particle multiplicities diverge when the shower evolves to infrared scales. In addition, in the limit of small scales the perturbative treatment of the parton splitting breaks down anyway, and it is therefore mandatory to terminate the shower at a low scale where the perturbative description is still valid and hand over the partonic ensemble generated through the shower emissions to a phenomenological model of hadronization.
The variables we consider in the following of this section are used both to derive analytic results, but we also stress that they precisely correspond to the variables employed in the angular ordered parton shower of the Herwig 7 event generator. The results obtained from the Herwig 7 event generator only differ from the analytic framework by the implementation of exact momentum conservation with respect to the momenta of all final state particles that emerge when the shower has terminated at its infrared cutoff Q 0 . This implementation of momentum conservation shall not change the jet mass distribution and is explained in more detail in Sec. 3.3. There we also briefly discuss some Herwig 7 (version 7.1.2) specific implementations in its default setting that go beyond the coherent branching p ⊥ z 1 − z Figure 1: A gluon branching off a back-to-back quark/anti-quark system. The radiated gluon is assumed to carry a fraction 1 − z of the parent's momentum and is emitted at a transverse momentum which equals the one acquired by, in this case, the anti-quark after the emission.
formalism and that we do not use in the context of the conceptual studies carried out in this work.

Massless case
Starting from an initial, color-connected qq-pair with momenta p andp, the momenta of the partons emerging from the shower evolution of the quark carrying the momentum p are parametrized based on where k i is the quarks momentum after the i-th emission. In the massless case we usen =p as the reference direction to specify the collinear limit, with k i,⊥ · p = k i,⊥ · n = 0, k 2 i,⊥ < 0 and β i being determined by the virtualities k i · k i = k µ i k iµ = k 2 i as The radiation off the anti-quark with momentump is described similarly with a reference direction n = p. Expressing k µ i in terms of the momentum of the emitter before the i-th branching we find where the physical splitting variables relative to the quark's momentum k i−1 before the i-th emission relate to the global light-cone decomposition Eq. (3.2) as where α 0 = 1 as well as q µ 0,⊥ = 0 are understood. This means that for the first emission the physical branching variables coincide with the global parametrization. We have depicted the variables of one branching in Fig. 1. Soft gluon coherence is encoded through ordering emissions in an angular variable [54],q where p 2 i,⊥ = −q 2 i,⊥ is the magnitude of the transverse momentum, which is purely spacelike and perpendicular to the emitter axis in the centre-of-mass system of the momenta k i and n. The explicit restrictions of decreasing opening angle of subsequent emissions following a branching at scaleq i from the evolving quark or anti-quark at scaleq 2 i+1 , and the radiated gluon at scalek 2 i are imposed by the conditions In the context of these variables, the Altarelli-Parisi splitting functions explicitly show the full Eikonal radiation pattern and the correct collinear limit, see e.g. Ref. [73] for an overview and comparison to dipole-type parton showers. The formalism is appropriate to resum higher order logarithmic corrections for observables that are inclusive concerning the collinear radiation in the same jet and in the sense that the information that large-angle soft gluon radiation originates from a particular collinear parton is unresolved and can hence be described to originate from the net collinear color charge of the whole jet. Momentum conservation in the branching i − 1 → i implies where q 2 i is the virtuality of the emitted gluon, the momentum of which is parametrized in a decomposition similar to Eq. (3.2).
We follow Ref. [58] and start with an analytic approach for which the evolution equation for the jet mass distribution starting at a hard scaleq 2 = Q 2 has the form J(s, Q 2 ) = δ(s) + where J g (s, Q 2 ) is the gluon jet mass distribution defined in analogy to the jet mass distribution J(s, Q 2 ) for the quarks. We have illustrated the evolution schematically in Fig. 2. The splitting function is given by where the second equality makes the cusp and non-cusp terms explicit, which stem from soft (z → 1) and hard collinear emissions, respectively. Figure 2: Graphical representation of the evolution equation Eq. 3.10: Grey blobs denote the quark and gluon jet function at a given jet mass, a single line implies a δ-function at mass zero, while the black dot represents a factor of one and implies an unconstrained integration over the gluon's emission scale and momentum fraction.
We note that the evolution equation for the jet mass distribution shown in Eq. (3.10) can be rendered NLL precise by correctly implementing the analytic form of the two-loop cusp term in quark splitting function P qq . By using the relative transverse momentum of the splitting, as the renormalization scale for the strong coupling the leading ln(1 − z)/(1 − z) behavior of the cusp term in the two-loop splitting function is reproduced exactly. The remaining non-logarithmic term from the two-loop cusp anomalous dimension and can be incorporated by either scaling α s → α s 1 + K g α s 2π , (3.13) or, equivalently, (up to terms of O(α 3 s )) by adopting a change in renormalization scheme through the rescaling The constant K g commonly used in this context relates to the two-loop cusp anomalous dimension as Γ cusp 1 = 8C F K g shown in Eqs. (2.12). This approach to implement NLL precision in parton showers is called the CMW ("Catani-Marchesini-Webber") or Monte Carlo scheme [54]. We note that in the Herwig event generator, the transverse momentum argument (3.12) is used as the scale of the strong coupling, but that in the default settings the CMW scheme of Eqs. (3.13) and (3.14) is not used explicitly. Instead the precise value of α s is obtained from tuning to LEP data along with the parameters of the hadronization model and the shower cut Q 0 . The result, however, numerically resembles the CMW factor in the relation between Λ MS and Λ MC . Indeed, for example for a one-loop running the CMW correction implies that and the larger value is exactly is the tuned value, with a similar converted value for α MS s (M Z ) for the two loop running actually employed in the Herwig shower. For our numerical analyses in Secs. 7.4 and 7.5, where we compare analytic calculations and Herwig results concerning the shower cut Q 0 dependence of the thrust peak position, we therefore use the strong coupling as implemented in Herwig.
The evolution equation for the jet mass distribution given in Eq. (3.10) is an explicit representation of the coherent branching algorithm. Consider the distribution of the first emitter's virtuality k 2 0 ≡ k 2 and one iteration of the branching algorithm, where one choses q 2 ≡q 2 1 , z ≡ z 1 , as well as k 2 ≡ k 2 1 and the gluon's virtuality is denoted by q 2 ≡ q 2 1 as displayed in Fig. 2. There is a contribution without any branching or virtual effects, encoded in the first δ-function term in Eq. (3.10). It describes a vanishing jet mass that corresponds to the tree-level contribution and also constitutes the initial condition for the shower evolution atq 2 = Q 2 . In addition, we need to take into account a resolvable branching at a scaleq 2 below the hard scale Q 2 , which gives rise to a subsequent evolution of the quark and gluon jet mass distributions at the scales imposed by the angular ordering criterion of Eq. (3.8). This contribution is itself constrained by the momentum conservation criterion of Eq. (3.9). The last contribution originates from an unresolved emission, which gives rise to an evolution of the quark mass distribution starting at scaleq 2 but being unconstrained otherwise. Notice that the momentum conservation constraint links the evolution scale to the specific kinematics that is considered. No further constraints to the integration over the momenta involved in the emission are present. As already mentioned, in the context of an event generator the evolution has to be terminated by imposing infrared cutoff Q 0 . This is typically done by a requiring a minimum transverse momentum for the emissions with respect to the momentum direction of the emitter. This restricts the integral overq 2 and z to a region where We note that also other choices are in principle possible and have been discussed in the context of radiation within the 'dead cone' for massive quarks [55]. In principle any prescription that simultaneously cuts off both the collinearq → 0 and soft z → 1 (as well as z → 0 for a gluon branching) limits, and also avoids low transverse momenta appearing in the argument of the strong coupling, is appropriate. We also note that an analogous evolution equation holds for the gluon jet mass distribution J g (s, Q 2 ). The evolution of the gluon jet is governed by the gluon splitting function, and also describes gluon branching into a quark/anti-quark pair. However, as far as the jet mass distributions in the resonance region is are concerned, the contribution of the gluon jet mass to the quark jet mass is at least at NLL precision suppressed due to the angular ordering constraint, see e.g. Ref. [58]. Therefore, at NLL several simplifying approximations are in principle possible to solve the evolution equation for the quark jet mass distribution, which are particularly useful for analytic calculations of the jet mass distribution: (i) we can neglect the contribution to the jet mass due to the branching of emitted gluons by the replacement J g ((1 − z)q 2 ) → δ(q 2 ) for the gluon jet mass distribution and (ii) we can can take the limit z → 1 for some terms that do not acquire an enhancement in the soft limit. Interestingly, this also includes that, once prescription (i) is applied, we can remove the remaining, strict angular ordering constraint in the quark jet mass distribution through modifying the starting scale of the subsequent emission contained in the quark jet mass distribution by the replacement J(k 2 , z 2q2 ) → J(k 2 ,q 2 ). In Sec. 7.3 we explicitly verify these simplifications from numerical simulations using the Herwig 7 event generator.

Massive case
Moving on to radiation off massive quarks, we consider the generalizations of coherent branching developed in Ref. [55], based on splitting functions and factorization in the quasicollinear limit for which the emitted parton's transverse momenta is restricted from above by the mass of the emitting quark and furthermore small compared to the scale of the previous emission, In this case we consider a system of a massive quark and anti-quark, p 2 =p 2 = m 2 . However we still use light-like backward directionsn and n in the momentum parametrization such as (3.2), with three-momenta pointing along the direction of the massive momenta, i.e.n = (| p|, − p) and n = (| p|, p). This modifies the form of the β i variables to take into account the mass effect, , (3.17) while the parametrization of the momenta from the massless case given in Eq. (3.2) and the relation to the branching variables in Eqs. (3.5) and (3.6) remain unchanged. Following Ref. [55] the evolution variable is generalized to the expressioñ Consequently, the generalization of Eq. (3.9) also adopts a mass term and reads The arguments we discussed for the massless quark case concerning the mass of the gluon jet apply in the analogous way in the massive quark case. Therefore we do not have to consider the fully general formalism for our analytic calculations at NLL order and can restrict ourselves to the case of gluon emission from a massive quark. We note that gluon splitting into massive quarks is also a negligible effect for the jet mass distribution in the resonance region since the corresponding splitting function is suppressed with respect to the gluon emission case due to a lack of soft enhancement (even in the absence of angular ordering). The variables considered here are precisely those used in the Herwig 7 angular ordered shower, which, in its current version is not relying on a finite Q g parameter as quoted in [55], but is instead using a cutoff on the transverse momentum.
The evolution equation of the massive quark jet mass distribution then has the form The initial hard scale of the evolution inq 2 is chosen as which amounts to the 'symmetric' phase space choice for the QQ system as suggested in Sec. 3.2 of Ref. [55], so that the shower evolution off the progenitors Q andQ only cover physically distinct phase space regions. For the situation of boosted quarks (m 2 /Q 2 1) we consider in this paper, however, we can safely replaceQ 2 → Q 2 for all analytic calculations. The shower cutoff condition in the massive quark case reads 22) and the splitting function in the quasi-collinear limit generalizes to (3.23) In contrast to the massless quark case where the coherent branching formalism has a solid conceptual basis related to the different kinematics of soft and collinear phase space regions, the corresponding formalism for massive quarks has in its present form higher order ambiguities, which makes e.g. the determination O(α 2 s ) corrections to the quasi-collinear splitting functions ambiguous. This is related to the more complicated structure of collinear, ultra-collinear, mass mode and soft dynamics and phase space regions that emerge in the presence of the quark mass and which (as we show explicitly in Sec. 4.2) depends in addition on the relation between the jet invariant mass √ s and the quark mass m. This is manifest in the fact that, in contrast to the massless quark case, there is no unique choice of the renormalization scale of α s as a function of z,q and the quark mass m. As such, different choices for µ 2 R (q 2 , z) which reduce to Eq. (3.12) in the massless limit may be considered. The default choice is the generalized transverse momentum, µ 2 R (q 2 , z) =q 2 z 2 (1 − z) 2 , which adds an additional mass-dependent contribution relative to the physical transverse momentum given in Eq. (3.22). We demonstrate in Sec. 4.2 that this choice is fully consistent with the QCD factorization approach for massive quarks at NLL order. (See also the power counting shown in Tab. 2: In the soft gluon region the m 2 term is suppressed and irrelevant, and in the ultra-collinear region theq 2 and the m 2 terms are of the same order.)

Coherent branching in the Herwig 7 event generator
The coherent branching formalis and its variables outlined in the previous two subsections form the core of the angular ordered parton shower in the Herwig 7 event generator [46,47,71], covering the massless and the massive quark cases as discussed in Secs. 3.1 and 3.2, respectively. In the Herwig 7 parton shower algorithm, a sequence of random values for the variablesq and z is generated, distributed according to the Sudakov form factor that depends on the splitting function. This provides a solution to the evolution of the jet mass distribution accounting for the branching and no-branching probabilities in terms of explicit events.
A major difference to a purely analytic computation of the jet mass distributions encoded in the evolution equations (3.10) and (3.20), however, is related to the virtualities, i.e. the off-shell invariant masses of the branching partons. While an analytic calculation of the jet mass distribution just focuses on the description of the overall invariant mass of the final state particles produced by the emissions from the progenitor parton originating from the hard process, event generators have to face an additional constraint: they have to evolve the progenitor parton to a final state consisting of partons on their physical mass shell consistent with overall energy-momentum conservation at the point when the shower terminates. This procedure is called 'kinematic reconstruction'. It is the kinematic reconstruction procedure that fixes the virtualities to the partons before showering (which are, however, approximated as on-shell in the splitting function). The kinematic reconstruction is based on the information of the entire evolution tree, the momentum decomposition based on Eq. (3.2), four-momentum conservation at each vertex, and the knowledge of thẽ q and z values of each branching to determine explicit particle momenta and to relate the kinematics of the subsequent emissions to the associated off-shell invariant masses.
In this context an additional important issue the kinematic reconstruction procedure has to deal with is that the sizes of the physical virtualities are kinematically limited by the available phase space. However, this phase space constraint is not imposed by the parton shower evolution itself, such that physically inaccessible (i.e. too large) invariant masses can be generated. Given the decomposition of the momenta based on Eq. (3.2), and a sequence ofq and z values, the kinematic reconstruction algorithms are designed such that one single solution for the final state momenta is obtained. However, physically, the final state momenta cannot be determined uniquely such that ambiguities arise in the way how overall energy-momentum conservation is restored in the event.
To illustrate the kinematic reconstruction procedure more concretely, consider the production of a quark/anti-quark progenitor pair produced in e + e − annihilation carrying onshell momenta respectively, with the initial tree-level process constraint Q = 2 p 2 + m 2 at the starting point of the parton shower evolution. At the end of the parton shower evolution their showered counterparts will have gained virtualities M 2 ≥ m 2 andM 2 ≥ m 2 with momenta and an overall restoration of energy-momentum conservation is mandatory. The strategy in this case (and similarly its generalizations to more final and initial state partons) is to transform the reconstructed momenta of the children coming from the now off-massshell shower progenitors into their common centre-of-mass frame where three-momentum conservation is guaranteed. Their spatial momentum components will then be re-scaled by a common parameter such that the overall invariant mass is consistent with energymomentum conservation, (P +P ) 2 = Q 2 . This procedure is equivalent to specific boosts along the P and theP directions, respectively, for the progenitor quark and anti-quark sides.
In cases that the shower evolution, which -as we have mentioned before has no notion of global energy-momentum conservation -has generated virtualities which are inconsistent with the available centre-of-mass energy Q, the procedure just outlined is not possible. Different choices for re-interpreting the branching variables when setting up the full kinematics, with the aim of reducing the occurrence of unphysically large virtualities have been implemented in Herwig 7. The default setting in the released version of Herwig 7, set ShowerHandler:ReconstructionOption OffShell5, imposes an additional constraint in the intermediate evolution by explicitly altering the intermediate splitting variablesq and z (which are originally obtained in the approximation the partons after the splitting are onshell). This scheme absorbs the invariant mass of the children of the branching parton [74] into a redefinition of the splitting variables to preserve the originally generated virtuality of the splitting parton. This approach, however, intrinsically changes the original form of the coherent branching algorithm as outlined in the previous two subsections, and we therefore do not consider this default option in the numerical analyses carried out in Sec. 7. Instead, the setting set ShowerHandler:ReconstructionOption CutOff is used. It directly uses the variables generated for the splittings, and does not redefine the variables used to set up the full kinematics. Events with unphysically large virtualities are discarded.
An additional difference of the Herwig 7 parton shower to the analytic computation of the jet mass distributions encoded in Eqs. (3.10) and (3.20), is that its default (cluster-type) hadronization model [75], imposes, in addition, constituent mass on-shell conditions for all partons that emerge when the shower is switched off. This includes in particular a constituent mass for the gluons of around 1 GeV. These parton constituent masses represent tunable parameters of the hadronization model and are thus part of the hadronization model even though they enter the Herwig 7 parton level output. In particular, the constituent mass allows for a splitting into quark/anti-quark pairs such that the primary non-perturbative clusters can be formed. Within our parton level examination concerning the dependence on the shower cut Q 0 , parton constituent masses would represent additional infrared cutoff scales that non-trivially interfere with Q 0 and in addition may cause gauge-invariance issues in higher order perturbative QCD calculations. Since we anyway do not use the Herwig 7 hadronization model in our numerical analyses of Sec. 7, as already explained in Sec. 2.3, we do not account for these constituent masses in our analytic calculations and when generating parton level results from Herwig 7. We set all quark constituent masses to m c q = 0.01 MeV, and the gluon mass parameter to m c g = 2m c q , which is the lower bound dictated by constraints from the cluster hadronization model. This effectively eliminates any effect coming from the constituent masses. We note that string hadronization models do not require to assign a mass to the gluons produced by the shower.

Hemisphere mass distribution from coherent branching without cut
In this section we show that -in the context of strict perturbative computations -the coherent branching formalism and the factorized QCD predictions provide identical results concerning the NLL resummation of logarithmic corrections for the thrust distribution in the absence of any infrared cut, i.e. for Q 0 = 0. In the context of our discussions in Sec. 2.4, this equivalence means that for the thrust distribution the coherent branching formalism with NLL log resummation is already O(α s ) precise as far as the peak position is concerned. For the thrust distribution for massive quarks this allows us to identify at O(α s ) the coherent branching (CB) mass parameter and the pole mass m pole as long as we consider the resonance peak location as the observable. We phrase this restricted equivalence by the relation We stress that an exact solution for the jet mass distributions in Eqs. (3.10) and (3.20) (i.e. a solution that does not rely on any perturbative expansion or rearrangement of the expressions) is impossible without imposing any infrared cut because of the singularities in the soft and collinear regions of the (z,q) plane caused by the Landau pole of the strong coupling. So applying the coherent branching formalism without any infrared cut implies (and requires) that the running of the strong coupling is treated strictly perturbatively (see also footnote 2). The equivalence relation (4.1) must therefore be understood strictly in the perturbative sense. From the point of view of an exact solution of the coherent branching formalism the limit Q 0 → 0 is impossible to reach. This illustrates the well-known problem of the pole mass being a purely perturbative concept that, however, cannot be associated directly to any physical process at the exact, non-perturbative level. In the following two subsections we calculate the jet mass distribution in Eqs. (3.10) and (3.20) obtained from the coherent branching formalism analytically at NLL order for massless and massive quark, respectively, and show that the results agree identically with those obtained from the factorized QCD calculations for thrust reviewed in Secs. 2.1 and 2.2. For the case of massless quarks this equivalence is well known and has already been studied thoroughly in the literature, see e.g. Refs. [68,76]. We nevertheless lay out the analysis for massless quarks in some detail because it sets the stage for the more complicated discussion for massive quarks in the resonance region, where -to the best of our knowledge -such a study has never been carried out before. Moreover, the manipulations are setting the stage for Sec. 5.2 where we examine the impact of the infrared shower cut Q 0 on the resonance location τ peak . The reader not interested in these computational details may safely skip these two subsection and continue reading with Sec. 5.
For simplicity we carry out the bulk of the calculations in Laplace space and define the Laplace transform of the jet mass distributions as such that the Laplace space thrust distributions as defined in Eqs. (2.6) and (2.22) adopt the simple formσ To keep our notation simple we write the heavy quark mass paramter simply as m instead of m CB (Q 0 = 0) in the rest of Sec. 4.

NLL resummation for massless quarks
To analytically determine the NLL jet mass distribution for massless quarks in the peak region from Eq. (3.10) we follow Ref. [58] and replace z by 1 in all functions that are slowly varying in the limit z → 1, except in the splitting function. As already discussed at the end of Sec. 3.1, this means that the angular ordering constraint can be dropped in the peak region, giving for the Lapace space integral equation for the jet mass distribution. From this we find the differential equation which gives the solution With the substitutionsq and using the explicit form of the NLL splitting function in terms of the cusp anomalous dimension of Eq. (2.12) and a subleading non-cusp term, we arrive at (4.9) For the second non-cusp term we rewrite α s (q ) in terms of α s (q) and powers of ln(q 2 /q 2 ) and notice that at NLL precision we only have to keep terms that are proportional to α n+1 s (q) ln n (q 2 /Q 2 ) after the q integration. Here in turns out that only a single term for n = 0 has to be kept, which we have, anticipating the form of the final result, identified with the non-cusp anomalous dimension of the jet function in the factorized QCD cross section. Rewriting in the remaining integral α s (q) in terms of α s (Q) and powers of ln(q 2 /Q 2 ), we can further simplify the integral by noticing, that for obtaining all NLL logarithmic terms correctly, we can use the replacement This replacement technically acts like an infrared cutoff for the q integration. It is, however, not a physical cutoff because it is derived in the context of a strict perturbative expansion (where no infrared Landau Pole singularity arises) and is furthermore not correct beyond NLL order. One should therefore better think of the replacement simply as an algebraic relation that simplifies the perturbative analytic NLL resummation calculation. For the remaining double integral with the cusp-anomalous dimension we can now switch the order of integration, and reshuffle the q integrations, (4.14) Noticing the scale identifications according to Eqs. (2.13) and (4.11), we see that at this point we have separated the collinear and soft evolution to the hard scale. The non-cusp term on the other hand, describes only a collinear evolution to the hard scale consistent with our assignment in Eq. (4.10). Accounting for Eq. (2.10) for the QCD beta function and Eq. (4.3), we then arrive at the following form of the NLL Laplace space thrust distribution, In terms of the K and ω evolution factors defined in Eq. (2.8) this can be rewritten as where in the last line we have used the cusp and non-cusp identities of Eqs. (2.11). This agrees identically with the factorized QCD cross section for massless quarks of Eq. (2.7).

NLL resummation for massive quarks
To analytically determine the NLL jet mass distribution for massive quarks in the peak region from Eq. (3.20) we initially proceed in the same way as for the massless quark case. Taking the large z approximation of Ref. [58] and in addition the limit of large boost (Q 2 m 2 ), the Laplace space integral equation for the jet mass distribution adopts the form where we have dropped a factor 1/z from the mass correction term for a consistent expansion in the z → 1 limit, see also Sec. 3. Its solution reads and with the substitutions of Eq. (4.7) we arrive at for the NLL resummed Lapace space thrust distribution. We have already implemented the NLL relation of Eq.(4.11) to simplify the q 2 integration, since it is also valid in the context of massive quarks.
It it easy to see that the massive quark constraint q 2 > q m is irrelevant for w = µ 2 J,ν > m 2 , which refers to the situation where the hemisphere jet masses are larger than the mass of the quark. In this kinematic situation the mass correction in the splitting function represents the only modification due to the quark mass, but the struture of the log resummation is otherwise in complete analogy to the massless quark case. In the context of the factorized QCD calculations, one can then employ usual SCET factorization where the collinear sector of the effective Lagrangian is extended trivially by just accounting for the finite quark mass [14,15]. In this work, however, we are interested in the peak region where the hemisphere jet masses are close to the quark mass, i.e. where w < m 2 . Here the ultracollinear sector emerges and the QCD factorization requires that the off-shell fluctuations of the massive quark field are integrated out [14,15]. So the quark mass effects are much more complicated and lead to a substantial rearrangement of the structure of the resummed logarithms. The physical meaning of w is also modified and the scale identifications read Let us now have a closer look at the calculation for the cusp term. Reversing the order of integration we have to distinguish three integration regions and find (4.24) The q integrations can be reshuffled using such that we get For the non-cusp term, rewriting the constraint q 2 > q m in terms of integration limits, we can use the considerations already applied in the massless quark case and find that only the second integration contributes at NLL order. This gives where we have written the expression in terms of the non-cusp anomalous dimensions of the hard, the mass mode and the bHQET jet functions, anticipating already the form of the final result.
For the mass correction term in the splitting function we reverse integration order in analogy to our manipulation for the cusp term in Eq. (4.24), In the limit of a boosted massive quark (Q 2 m 2 ) only the second term can contribute NLL logarithms. Using the contribution from the mass correction term at NLL accuracy then reads Taking the sum of the NLL contributions from the non-cusp term in Eq. (4.30) and the mass corrections term in Eq. (4.33) we obtain where in the second line we have rewritten the result using the evolution function ω of Eq  5 Hemisphere mass distribution with shower cut Q 0 In this section we study analytically the impact of the shower evolution cut Q 0 on the thrust distribution in the resonance regions for massless and massive quark production. The main focus is on the effects that cause a dependence of the hemisphere masses that is linear on Q 0 . The Q 0 cut is defined as the restriction p 2 ⊥ > Q 2 0 on the transverse momentum of the emission with respect to the momentum of the emitter. In the context of the coherent branching formalism the dependence of the transverse momentum on the shower evolution variablesq and z for the massless and the massive quark cases are given in Eqs. (3.7) and (3.18), respectively, leading to the constraints in Eqs. (3.16) and (3.22).
Since in the framework of strict perturbative calculations the Q 0 cut represents an artificial restriction of the radiation generated by the shower, we call the emissions that are allowed by the Q 0 cut released and the emissions that is not allowed by the Q 0 cut unreleased. As we will show, the dominant (linear in Q 0 ) effect of removing the unreleased 0 Λ Δ ℓ S mod (ℓ) radiation from the calculation in the resonance region must be reinterpreted as a redefinition of parameters in a perturbative calculation without Q 0 cut.
To elucidate this we compare the effects of the unreleased radiation in the context of the coherent branching formalism for the jet mass distributions as described in Secs. 3.1 and 3.2, and in the context of QCD factorization using the SCET approach for the thrust distribution as described in Secs. 2.1 and 2.2. This comparison, together with the facts that in the absence of the Q 0 cut coherent branching and QCD factorization provide equivalent results at NLL order and both are O(α s ) precise for the resonance peak mass, allows us to relate the quark mass parameter of the coherent branching formalism with Q 0 cut (and thus of angular ordered parton showers) to an explicit field theoretic mass renormalization scheme at O(α s ).
In subsection 5.1 we outline the collinear and soft phase space regions and QCD modes relevant for the thrust distribution in the resonance region in the context of coherent branching and QCD factorization, respectively, and we show where a linear Q 0 dependence can possibly emerge. In subsections 5.3 to 5.2 we calculate the effects of the unreleased radiation each for massless and massive quarks for QCD factorization and coherent branching and analyze in detail the effects linear in Q 0 .

Phase space regions with and without Q 0 cut
To initiate the analytic examinations it is illustrative to first discuss the structure of the phase space and the QCD modes relevant for the resonance region. To define our counting variable we start from the hadron level thrust distributions given in Eqs. (2.3) and (2.15), where the partonic thrust distribution is convoluted with the soft model shape function S mod ( ). The parameters of the shape function may be either determined from fits to experimental data or from non-perturbative methods. The shape function has support for positive values and exhibits a peak for ≈ Λ, where Λ parametrizes the overall energy the non-perturbative effects add to the parton level hemisphere masses. For larger values the shape function falls quickly and one usually assumes an exponential behavior. A typical generic form for S mod is displayed in Fig. 3. The effect of the shape function on the hadron level prediction is twofold: it smears out the distributive and singular structures of the partonic cross section, and it leads to a shift of the observable resonance peak position in the thrust distribution towards larger values with respect to the partonic thresholds, τ min = 0 for massless quarks and τ min = 2m 2 /Q 2 for massive quarks: It is therefore natural to adopt Λ/Q as the counting parameter in the resonance region.
In Fig. 4a we show the generic form of the (z,q) phase space populated by coherent branching for the jet mass distribution for massless quarks, see Eq. (3.10). The gray area represents the phase space without Q 0 cut and the hatched area the phase space with Q 0 cut. In the peak region the thrust distribution is dominated by soft and collinear gluon radiation, which are also indicated. In QCD factorization the soft and the collinear modes are separated at the operator level by imposing powercounting contraints on the momentum fluctuations these operators can generate. These constraints are most efficiently formulated in the light cone basis where where n andn are back-to-back light-like vectors than can be directed along the momenta of the progenitor quark-antiquark pair produced by the primary hard scattering. The momentum components in this basis are then denoted by p µ = (p + , p − , p ⊥ ) = (n·p,n·p, p ⊥ ) where the momentum square reads p 2 = p + p − − p 2 ⊥ , see also Secs. 3.1 and 3.2. The Λ counting of the collinear and soft regions formulated in the coherent branching and in the QCD factorization approaches can be connected by the relation for soft and n-collinear modes. For then collinear modes, the plus and the minus components on the RHS have to be swapped. The momentum power counting for both approaches for massless quark production is summarized in Tab. 1. Imposing the Q 0 cut, one has to note that it represents a cut on the transverse momentum of the emission with respect to the momentum of the emitter and not with respect to the momenta of the progenitor quarks. In the coherent branching approach this is automatically taken care of in the definition of the transverse momentum variable q µ i,⊥ of Eq. (3.6) which parametrizes the i-th branching. In QCD factorization, on the other hand, the constraint has a more complicated structure, because the momenta of all radiated partons are usually formulated in one common frame based on Eq. (5.2). Fortunately at NLL+O(α s ) precision, the order we consider in this work, only the first emission has to be calculated in the QCD factorization approach to determine the effects linear in Q 0 . At this level the transverse momentum variable in coherent branching defined in Eq. (3.6) and the transverse momentum in QCD factorization defined in Eq. (5.2) agree and can be identified. So at NLL+O(α s ) precision we can implement the shower cut constraint in the factorized calculation by simply imposing a cut on the transverse momentum in Eq. (5.2) without further complications. For the rest of this paper we therefore identify the transverse momenta in both apporaches to keep the presentation simple, and we frequently refer to the shower cut Q 0 also as the cut on the transverse momentum p ⊥ without further specification.
From a conceptual point of view the numerical value for Q 0 should be chosen such that it is unresolved, i.e. it should be smaller than the typical values p ⊥ can adopt for the observable of interest. From Tab. 1 we can see that soft radiation imposes the strongest constraint and requires that Q 0 should in principle be smaller than Λ ∼ 1 GeV. This is the hierarchy we assume for some of the arguments presented below. For practical parton showers, however, this constraint cannot be satisfied in terms of a strong hierarchy (if at all) due to computational reasons and the proximity to the Landau pole of the strong coupling. As we show in our numerical analysis in Sec. 7 using the approximation Q 0 Λ in our analytic calculations does very well, even for cases where the both scales are similar in size. In any case, since Q 0 represents the smallest scale for the strong coupling, integrations over its Landau pole are prevented as long as Q 0 is chosen larger than Λ QCD , and, moreover, for finite Q 0 there are no renormalon ambiguities in perturbative calculations.
In the context of QCD factorization we can see already at the level of the factorization theorem (2.4) that a linear dependence on the p ⊥ cut Q 0 can only arise in the partonic soft function S because it is linear in the light cone momentum . In the jet function J, however, we expect a quadratic behavior for simple dimensional reasons. This consideration can be confirmed explicitly applying the soft and collinear (z,q) counting shown Tab. 1 to the quark jet mass distribution defined in Eq. (3.10): In the collinear region z ∼ (1 − z) ∼ 1 and the cut-dependence arises whereq 2 ∼ Q 2 0 . This leads to changes proportional to Q 2 0 on the invariant mass s due to the δ function constraint. In the soft region we haveq ∼ Q and z ∼ 1, and the cut dependence arises where (1 − z) ∼ Q 0 /Q, and . This then leads to changes in s proportional to QQ 0 due to the δ function constraint. This simple counting is confirmed by the explicit calculations carried out in Secs. 5.2 and 5.3.
In Fig. 4b we show the generic form of the (z,q) phase space populated by coherent branching for the jet mass distribution in the resonance region for a massive quark, see Eq. (3.20), where the coloring is the same as for the massless quark case. Again the gray region represents the allowed phase space without Q 0 cut and the hatched region when the Q 0 cut is imposed. We see that the allowed phase space is considerably different from the massless quark case and overall confined to the region of large z. This is particular to the resonance region, where s − m 2 m 2 . Here the massive quark thrust distribution is dominated by soft and ultra-collinear gluon radiation, which are also indicated. While the soft region is equivalent to the case of massless quarks, the ultra-collinear region differs substantially from the collinear region for massless quarks because it is related to gluon radiation that is soft in the massive quark rest frame and only becomes collinear due to the massive quark boost. As such the ultra-collinear radiation originating from a boosted massive quark with a given energy is substantially less energetic than the typical collinear radiation originating from a massless quark with the same energy. The resulting power counting is shown in Tab. 2, where we see e.g. that ultra-collinear gluons have a typical energy of order Q 2 Λ/m 2 , compared to collinear gluons which have a typical energy of order Q. Note that if we would consider the situation s − m 2 m 2 the allowed phase space would look similar to the massless case and we would recover the collinear counting. It is a remarkable fact that, despite its limitations, the coherent branching formalism for massive quarks is capable of describing both limits correctly and provides a smooth connection between them. We also note that, since (p 2 ⊥ + (1 − z) 2 m 2 ) 1/2 is the renormalization scale of the strong coupling, integrations over its Landau pole are strictly prevented as long as Q 0 is chosen larger than Λ QCD . Therefore there are no renormalon ambiguities in perturbative calculations in the presence of the p ⊥ cut Q 0 . To conclude this section let us also discuss in which sectors we should expect a linear dependence on Q 0 for the case of massive quark production. In the context of QCD factorization, inspecting the factorization theorem (2.19), we see that a linear dependence on the p ⊥ cut Q 0 can arise not only in the partonic soft function S but also in the bHQET jet function J B because it has, in contrast to the massless quark jet function, a linear kinematic dependence on the reduced invariant mass variableŝ, see Eq (2.20). This simple dimensional argument can again be confirmed applying the ultra-collinear momentum counting shown in Tab. 2 to the quark jet mass distribution defined in Eq. (3.20): We have z ∼ 1, q ∼ m and the cut-dependence arises where (1 − z) ∼ Q 0 /m. This leads to changes in the squared invariant mass relative to the threshold of order s − m 2 ∼ mQ 0 . This simple counting is confirmed by the explicit calculations carried out in Secs. 5.2 and 5.4.

Unreleased radiation: coherent branching
To calculate the effects of the p ⊥ cut Q 0 on the thrust distribution in the peak region in the context of the coherent branching formalism we can start from the corresponding Lapace space expressions given in Eq. (4.6) for massless quarks and Eq. (4.21) for massive quarks. In contrast to the calculations we carried out for our examinations concerning the summation of logarithms in Secs. 4.1 and 4.2, where the finite quark mass represented a non-trivial modification, we can treat the massless and the massive quark case within the same computation because Q 0 < m. We can therefore begin from the Laplace space thrust where we have implemented the p ⊥ cut Q 0 according to Eqs. (3.16) and (3.22). In the second line we changed the integration variable fromq to q ⊥ and used that m 2 /Q 2 1. From the second line one can see that we can write the Laplace space thrust distribution with Q 0 cut asσ whereσ cb (ν, Q, m) is the distribution without Q 0 cut and the function describes the contributions of the unreleased radiation. Since we are interested in the dominant contribution linear in Q 0 we can expand to linear order in ν and change variables to where we have dropped terms which are down by additional powers of Q 0 /m and m/Q. In addition, to linear order in Q 0 we can extend the upper limit of the second integral up to infinity. From this we obtain at O(α s ) the final result for the unreleased radiation, where we can fix the scale of the strong coupling to Q 0 because it represents the only scale the integral depends. For the case of massless quark production the term proportional to m is zero. A similar calculation for the massless quark case (relevant for an analysis in the effective coupling model) was carried out in Ref. [65]. For the thrust distributions obtained from the coherent branching formalism the relations (5.5) and (5.8) in connection with Laplace space identities imply that up to terms quadratic in Q 0 , the strong coupling and m/Q the effect of the p ⊥ cut is a simple shift in τ with respect to the thrust distribution without p ⊥ cut: Within the coherent branching formalism it is, however, not possible to systematically address the question if these shifts should be interpreted as modifications of QCD parameters such as the mass. This is because the coherent branching formalism provides a convenient computational method to sum cross section contributions that are singular in the soft and collinear limits, but does not provide a field theoretic background where this issue can be discussed conceptually from first principles. We will therefore examine the effects of p ⊥ cut Q 0 again in the next two subsections in the framework of the factorization theorems (2.4) and (2.19) for massless and massive quarks, respectively.

Unreleased radiation for massless quarks: QCD factorization
In the context of QCD factorization the hard, soft and collinear effects are separated at the operator level and the modifications caused the by the p ⊥ cut Q 0 can be determined in each sector individually. Possible cross terms and exponentiation effects are automatically taken care of by the multiplicative structure of the factorization theorem (2.4). It is then straightforward to see that there is no change in the U factors which sum the large logarithms, since the p ⊥ cut acts in the infrared and does not lead to any new types of UV-divergences. As far as the hard function H Q is concerned, the p ⊥ cut contributes only through terms of order Q 2 0 /Q 2 , which are strongly power-suppresed and negligible at the order we are working. So we only have to analyze the jet function J (τ ) and the soft function S (τ ) as they describe radiation where the p ⊥ cut Q 0 can leave a non-trivial impact.
We write the jet function J (τ ) and the soft function S (τ ) in the presence of the p ⊥ cut Q 0 in the form where J (τ ) (s, µ J ) and S (τ ) (k, µ S , Q 0 ) are the renormalization scale dependent jet and soft functions from Eq. (2.4) determined using dimensional regularization for the momentum ur (k, Q 0 ) represent the unreleased radiation coming from regions below the p ⊥ cut Q 0 , i.e. they describes the perturbative radiation that is prevented if Q 0 is finite. Since the p ⊥ cut does not lead to any genuine UV divergences, J In Fig. 5a the O(α s ) corrections for jet function without p ⊥ cut, J (τ ) (s, µ J ) (solid red line), the unreleased jet function J (τ ) ur (s, Q 0 ) (dotted green), and the full jet function with p ⊥ cut, J (τ ) (s, µ J , Q 0 ) (dashed blue line) are shown for µ J = Q 0 using arbitrary units. For this scale choice the p ⊥ cut completely eliminates the plus distributions for s < 4Q 2 0 in J (τ ) (s, µ J , Q 0 ) and slightly reduces the collinear jet mass distribution for s larger than 4Q 2 0 . As already argued in Sec. 5.1, the unreleased radiation in the collinear sector depends quadratially on Q 0 except for the δ-function term, which is however, not affecting the peak location τ peak at O(α s ), see the discussion of Sec. 2.4. The contributions from s < 4Q 2 0 as well as from s > 4Q 2 0 lead to effects of order Q 2 0 in the observable thrust distribution upon integration over the soft model shape function, which corresponds to a smearing in s over an interval of order QΛ which is much larger than Q 2 0 , see Eq. (2.3). Since we are interested in effects that are linear in Q 0 , the unreleased radiation in the collinear sector can thus be ignored in our discussion. The result for the unreleased soft function reads [k = k/Q 0 ] In Fig. 5b the O(α s ) corrections to the soft function without p ⊥ cut, S (τ ) (k, µ S ) (solid red line), the unreleased jet function S (τ ) ur (k, Q 0 ) (dotted green), and the full jet function with p ⊥ cut, S (τ ) (s, µ S , Q 0 ) (dashed blue line) are shown for µ S = Q 0 for arbitrary units. Similar to the case of the jet function, for this scale choice, the p ⊥ cut just eliminates the plus distributions for k < Q 0 in S (τ ) (k, µ S , Q 0 ), and but has no effects for k > Q 0 . As already anticipated on general grounds in Sec. 5.1, the p ⊥ cut indeed leads to a linear dependence on Q 0 .
As can be seen from the factorization formula (2.3), the soft model shape function causes a smearing in k over an interval of order Λ which we assume to be larger than Q 0 . Since the unreleased soft function has support only for light cone momenta k < Q 0 , we can therefore quantify its effect more transparently in terms of a multipole expansion, where the term ∆ soft (Q 0 ) is the first moment of the unreleased soft function, Mathematically, this multipole term appears to cause a shift of the partonic soft function threshold by −∆ soft (Q 0 ) since it can absorbed into the tree level soft function, In the context of the thrust factorization theorem (2.3) we thus see that this shift agrees identically with the result which we determined from the coherent branching formalism given in Eq. (5.9). However, as we have already mentioned before, in the coherent branching formalism there was no rigorous field theoretical background that strictly enforced this view in the context of perturbation theory because a perturbative modification of the threshold of a kinematic variable can only be implemented by a renormalization scheme change of a dimensionful parameter. Such a parameter does also not exists for the soft function because is arises from the dynamics of massless gluons and only depends on a light-cone momentum. In the context of the factorization formula (2.3) the correct view is that the linear effect caused by the p ⊥ cut Q 0 can be reinterpreted as a shift of the soft model shape function S mod [58,62,72], called "gap" in Ref. [70]. Following the presentation of Ref. [70] we can write the convolution of the partonic soft function and the non-perturbative shape function as This relation shows that the dominant effect of the p ⊥ cut Q 0 is to modify the interface between perturbation theory and non-perturbative physics and -from the point of view of a partonic computation carried out without Q 0 cut -acts as a modification of the hadronization contribution from S mod ( ) to S mod + ∆ soft (Q 0 ) as shown in the last line of Eq. (5.17). As long as the scale Q 0 is in the perturbative regime, this scheme change can be described perturbatively. This shows that the correct way to deal with a change in Q 0 when making physical predictions -from the point of view of a partonic computation with a Q 0 cut -is to modify the non-perturbative effects by a corresponding change of the shape function gap in order to leave the physical prediction unchanged.
One of the motivations of discussing "gapped" soft functions in Ref. [70] was to devise a way consistent with QCD factorization and field theory to eliminate the O(Λ QCD ) renormalon from the partonic soft function. This O(Λ QCD ) renormalon arises from large factorially growing coefficients in its perturbation series and renders, from the non-perturbative, i.e. beyond perturbation theory point of view, the partonic threshold ambiguous to an amount of order of Λ QCD . While for a massive particle threshold this can be achieved by a modification of the mass scheme, there is no such parameter for gluonic thresholds. Our argumentation that the effects linear in the p ⊥ cut Q 0 should be interpreted as a soft function gap are therefore further supported, if the p ⊥ cut eliminates the O(Λ QCD ) renormalon behavior of the partonic soft function. To examine this we restrict our discussion to the effects of dressing the gluon propagator with massless fermion bubble chains using the replacement [77] 1 to compute the Borel transform, using the convention β[α s ] = −2β 0 (α 2 s /4π)+. . . for defining the coefficients of the β-function (see also Eq. (2.10)), and focusing on poles in the Borel variable u located a u = 1/2. The term e 5/3 is related to using the usual MS renormalization scheme for the strong coupling. In passing we note that using the bubble chain method does not represent a strict all order proof that the p ⊥ cut eliminates the O(Λ QCD ) renormalon. However, it is sufficient for our discussion that focuses on angular ordered showers which have NLL order precision.
As was shown in Ref. [70], the Laurent expansion of the Borel transform of the partonic soft function S (τ ) (k, µ S ) around u = 1/2 reads The O(Λ QCD ) renormalon is canceled by the p ⊥ cut, if the unreleased soft function S  20) and is identical to Eq. (5.19) when, consistently, the same scale choice is adopted for the strong coupling. The agreement shows that in the presence of the p ⊥ cut Q 0 the O(Λ QCD ) renormalon is indeed removed from the partonic soft function due to Eq. (5.12). Thus the p ⊥ cut eliminates the O(Λ QCD ) renormalon and leads to a more convergent large-order behavior of the partonic soft function. This analysis also reconfirms the view that soft gluon radiation in (at least angular ordered) parton showers used in MC event generators does not suffer from O(Λ QCD ) renormalon ambiguities, in contrast to perturbative calculations without finite infrared cuts.

Unreleased radiation for massive quarks: QCD factorization
For the massive quark thrust distribution factorization theorem (2.4) we proceed in a way analogous to the massless quark case. The p ⊥ cut does not lead to any modifications for the U factors that sum large logarithms since it does not lead to any new types of UV-divergences. The hard function H Q is the same as for massless quarks, and the p ⊥ cut contributes terms of order Q 2 0 /Q 2 . The mass mode factor H m , which arises from offshell massive quark fluctuations, obtains modifications of order Q 2 0 /m 2 . Both effects are strongly power-suppressed and negligible at the order we are working. Since the massive and massless quark factorization theorems contain the same partonic soft function S (τ ) and the same non-perturbative model shape function S mod , the effects of the p ⊥ cut we have discussed for them in the massless quark case also apply for massive quarks: the p ⊥ cut leads to a linear sensitivity to Q 0 can can be associated to a gapped soft function, as shown in Eqs. (5.16) and (5.17). This takes care of the m-independent shift contribution shown in Eq. (5.10).
What remains to be examined is the bHQET jet function J (τ ) B which contains the dynamics of the ultra-collinear radiation and which, as we have argued in Sec. 5.1, can also have a linear sensitivity to the p ⊥ cut Q 0 . The aim is to show from the field theory perspective that we can associate the m-dependent term in Eq. (5.10) to a modification of the quark mass scheme different from the pole mass. This examination of the bHQET jet function represents the central part of our discussion because at NLL+O(α s ) order the bHQET jet function completely controls the quark mass scheme. We note that the bHQET jet function dominates the mass dependence also at higher orders, while the mass dependence coming from other parts of the factorization formula is subleading.
We write the bHQET jet function J where J In Fig. 6 the O(α s ) corrections to the bHQET jet function without p ⊥ cut, J We see that the effect of the p ⊥ cut has features common to the massless quark jet function: the p ⊥ cut eliminates the plus distributions forŝ < 2Q 0 and slightly reduces the ultra-collinear jet mass distributions forŝ larger than 2Q 0 , compare to Fig. 5a. However, the difference is that the overall dependence on Q 0 is linear, as anticipated in Sec. 5.1, and the singular structure atŝ = 0 is more complicated due to the appearance of the term proportional to the derivative of the delta function, δ (ŝ). This term arises from the on-shell cuts of the self-energy diagram of the heavy quark with the p ⊥ cut Q 0 , see App. B.3 for details.
To understand the result for the unreleased bHQET jet function in Eq. (5.22), it is important to recall that for the soft function the interpretation of the effects of the p ⊥ cut is related to the interface between partonic cross section and the non-perturbative shape function that describes hadronization effects and that there is no partonic parameter involved in the argumentation. This differs from the bHQET jet function which contains the quark mass as a partonic parameter that depends on an explicit decision about its renormalization condition. In the expression for the O(α s ) corrections to the bHQET jet function in Eq. (A.8) this dependence is manifest in the term − 4δm m pole δ (ŝ), where δm = m pole − m is the difference of the employed mass renormalization scheme to the pole mass.
From the structure of the convolutions in the factorization formulae (2.15) and (2.19), due to the combinationŝ m/Q − k appearing in the partonic soft function S (τ ) , it is also evident that the effects linear in Q 0 contained in Eq. (5.22) cannot be associated to a universal (i.e. m/Q-independent) change of the soft function model gap. It is therefore mandatory to interpret these contributions from the point of a perturbative mass change alone.
In the absence of the p ⊥ cut, i.e. when only dimensional regularization is used to regularize infrared and ultraviolet divergences, the bHQET on-shell heavy quark self energy is a scaleless integral and vanishes to all orders. So in bHQET the quark mass renormalization scheme is automatically the pole mass when we set δm = 0. A change to another scheme is realized by explicitly adopting a finite expressions for δm (which is a series that starts at O(α s )). In the presence of the p ⊥ cut Q 0 , however, the on-shell self-energy depends on the scale Q 0 and does not vanish any more, see App. B.3 for details of this calculation. This is the origin of the δ (ŝ) term in Eq. (5.22), and it means that in the presence of the p ⊥ cut Q 0 the pole mass m pole , as defined in perturbation theory without any infrared cut, does not any more represent the pole position of the heavy quark propagator. 3 Rather, the pole is located at the Q 0 -dependent mass We stress that this means that the pole of the heavy quark propagator is not physical and implicitly depends on the infrared regularization scheme employed. The pole of the heavy quark propagatpr is unique only in the limit of vanishing infrared regulators. We call m CB (Q 0 ) the scale-dependent coherent branching (CB) mass. It is possible to absorb the δ (ŝ) correction term into the mass scheme (of the tree-level bHQET jet function) which changes it from m pole to the coherent branching mass m CB (Q 0 ). The essential point is that this scheme change is implicitly carried out within the coherent branching formalism (and in angular ordered parton showers) because there the δ (ŝ) term never arises. This means that the mass parameter in the coherent branching formalism in the presence of the p ⊥ cut Q 0 agrees with the pole of the heavy quark propagator which is the CB mass m CB (Q 0 ). As we show in the following, only within this context we find that the result of Eq. 3 At this point one may object that in the calculation of the unreleased bHQET jet function one can decide whether one applies the p ⊥ cut Q0 in the on-shell self-energy diagram or not. However, this corresponds to using different infrared regulators for virtual and real radiation corrections which is inconsistent.
In fact, dropping the p ⊥ cut Q0 in the on-shell self-energy diagram only and keeping it in the rest of the calculation is just equivalent to switching from the pole mass scheme to m CB (Q0).
The subtle issue to fully understand (and appreciate) our conclusion is that all the terms shown in Eq. (5.22) are required to allow the interpretation that the effects of the p ⊥ cut that are linear in Q 0 represent a modification of the mass scheme. The crucial consistency requirement for this interpretation is that the sum of all modifications due to the contributions linear in the cutoff scale Q 0 given in Eq. (5.22) vanish. This is because a change of the quark mass scheme (and of the renormalization scheme of any QCD parameter) leaves the theoretical prediction invariant and essentially represents a mutual exchange of perturbative corrections between the mass parameter and the dynamical matrix elements. It is therefore mandatory that the contributions linear in Q 0 of the remaining corrections (other than the δ (ŝ) term) in the unreleased bHQET jet function given in Eq. (5.22) have the same magnitude but the opposite sign as the contribution coming from the δ (ŝ) term. Since the soft model in the factorization theorem (2.15) causes a smearing inŝ of order QΛ/m Q 0 , we can -in analogy to our discussion on the unreleased soft function in Sec. 5.3 -use again the multipole expansion to proceed. In contrast to our discussion on the soft function, we do not have to argue about the validity of the multipole expansion because for boosted top quarks we have Q/m 1 so that the multipole expansion is well applicable even if Q 0 and Λ are similar in size. The outcome is that we need to show that the total integral (i.e. the zeroth moment) as well as the first moment of the unreleased bHQET jet function vanish identially. If these conditions are satisfied, we can interpret all effects of the p ⊥ cut that are linear in Q 0 as a change in the quark mass renormalization scheme.
It is straightforward to check from the result in Eq. (5.22) that these properties are indeed satisfied: where for the first moment we have indicated by subscripts the contributions from the δ (ŝ) term and the rest. Given the complicated structure of the result for the unreleased bHQET jet function in Eq. (5.22), the results appear highly non-trivial. From the physical point of view, however, the vanishing zeroth moment is related to the fact that the total (e + e − hadronic) cross section is not linearly sensitive to infrared momenta, which is well known. The vanishing of the first moment expresses that, physically, the mass-dependent kinematics threshold generated by the ultra-collinear radiation is not linearly sensitive to infrared momenta either. Linear sensitivity to infrared moments is only introduced by hand when one imposes the pole scheme for the heavy quark mass (defined in the common way by the one-particle irreducible on-shell self energy diagrams in the absence of any infrared regulator) 4 . This feature is well known since a long time see e.g. Ref. [78]. We can therefore 4 In this work we define the pole mass scheme m pole strictly in the generally accepted canonical way, namely in the context of perturbation theory in the limit of vanishing infrared regularization.
expect that the zeroth and the first moments of the unreleased bHQET jet function vanish to all orders in perturbation theory. At this point our prove is complete and we have field theoretically shown that -if one always employs a mass scheme that agrees with the pole of the perturbative heavy quark propagator -all effects of the p ⊥ cut that are linear in Q 0 not only can, but rather must be interpreted as a change of the quark mass scheme from the pole mass to the coherent branching mass: It is now natural to ask if the change from the pole mass to the scale-dependent CB mass cures the O(Λ QCD ) renormalon problem of the thrust distribution in the pole mass scheme. We address this question using again the dressed gluon propagator approach of Eq. (5.18) to determine the Borel transform in the region around u = 1/2. As was shown in Refs. [79,80], the Laurent expansion of the Borel transform of the perturbative series in α s (µ) for the pole mass in terms of the MS mass around u = 1/2 reads (5.28) The corresponding result for the perturbative series in α s (µ) for the pole mass in terms of the CB mass is calculated in App. B.3 and reads We see that the result is identical to Eq. (5.28). This shows that the scale-dependent CB mass m CB (Q 0 ) is a low-scale short-distance mass. This is not unexpected, of course, because the CB mass is defined from the bHQET on-shell massive quark self-energy with a transverse momentum infrared cut which prevents the low-virtuality contributions from the evolution of the strong coupling that are responsible for the emergence of infrared renormalons. It is also straightforward to check that the Borel ambiguities coming from the δ (ŝ) self-energy term and the other contributions in the unreleased jet function (calculated from the perturbative series in α s (µ)) cancel exactly: This reconfirms the relation (5.27) also beyond the NLO precision level (at least in the large-β 0 approximation). As a consequence, imposing the p ⊥ cut Q 0 in the massive quark thrust distributions implies that one uses the CB mass scheme of Eq. (5.23) and that all O(Λ QCD ) infrared renormalon issues are removed.

Summary of all theoretical considerations
In this section we summarize all theoretical and conceptual results we have obtained in the previous sections in the context of the massless and massive quark thrust distributions (see Eq. (2.1) and (2.2) in Sec. 2) obtained in the coherent branching formalism and the QCD factorization approach. These findings provide the basis of the field theoretic reinterpretation of the effects of the p ⊥ cut Q 0 that are linear in Q 0 as a modification of hadronization contributions and a redefinition of the heavy quark mass scheme, valid for boosted massive quarks in the narrow width approximation. We also discuss the meaning of these results in the context of angular ordered parton showers, which are based on the coherent branching formalism and for which a p ⊥ cut on the parton shower evolution is mandatory. These considerations set the stage for the numerical studies we carry out in Sec. 7 using the Herwig 7 event generator [46,47,71].
Since the QCD factorization approach provides the closest relation to field theory and allows to systematically address issues concerning the interpretation of partonic and nonperturbative parameters, the examinations in the previous sections were built around establishing a one-to-one correspondence between the factorized cross sections for thrust and the corresponding results obtained from the coherent branching formalism. For massive quarks the latter is known to be valid for quasi-collinear and the former for boosted massive quarks, which here correspond to equivalent kinematic situations. Because the peak resonance region of the thrust distribution, and in particular the peak position, provide the strongest and cleanest top mass sensitivity we have focused our considerations on the thrust resonance peak position.
In Sec. 2.4 we have shown that, for the factorized predictions, resummed results at full NLL order (where the dynamical logarithmic terms in the fixed-order matrix elements of the factorized predictions are understood to be part of full NLL) are sufficient to describe the peak position with NLO precision, i.e. up to higher order terms that enter only at O(α 2 s ) and beyond. In Secs. 4.1 and 4.2 we then established for massless and massive quarks, respectively, that in the absence of any infrared cut the NLL resummed results provided by the coherent branching formalism and by the usual factorized approach are equivalent. Since the massive quark results in the factorized approach we were using for the comparison were determined in the strict pole mass scheme m pole , we could prove that in the coherent branching formalism with NLL resummation of logarithms and in the absence of an infrared cut (i.e. for Q 0 = 0) the quark mass parameter is equivalent to the pole mass m pole at O(α s ): where m CB is the quark mass parameter in the coherent branching formalism and called the coherent branching (CB) mass. This relation, however, is only valid in the context of strict QCD perturbation theory, i.e. in calculations based on expanding in α s at a constant renormalization scale such that evolution effects are encoded entirely in powers of logarithms and virtual loop and real radiation phase space integrals can be carried out down to zero momenta. Such a strict perturbative approach, however, cannot be applied for angular ordered parton shower algorithms implemented in state-of-the-art MC event generators, so that it is not possible to use them without an infrared cut on the parton shower evolution. There are two main reasons for that. The first is related to the fact that for the parton showers implemented in multi-purpose MC event generators the renormalization scale of the strong coupling is a function of kinematic variables that decrease in the course of the shower evolution. In this way parton showers can account for important subleading NLL information. Without an infrared cut the strong coupling would therefore run into its Landau pole once the evolution reached virtualities and momenta close to Λ QCD . The second reason is that in the absence of the infrared cut the particle multiplicities generated by the shower became infinite and made event generation impossible for pure computational reasons. Thus, relation (6.1) does not apply for parton showers that are used in MC event generators.
In Sec. 5 we then analyzed the impact of the transverse momentum p ⊥ cut Q 0 that is imposed on angular ordered parton showers. In the evolution described by the coherent branching formalism this cut is paraphrased in the conditions (3.16) and (3.22) for the massless and massive quark case, respectively. In the factorization approach it represents, at NLL+O(α s ), a simple cut on the transverse momentum of (virtual or real) gluons with respect to the thrust axis in the hard, soft and jet functions. In the presence of the cut Q 0 the descriptions provided by angular ordered parton showers, based on the coherent branching formalism and the one provided by the factorized approach are all equivalent, and we were thus able to unambiguously track the field theoretic meaning and interpretation of the dominant contributions linear in the p ⊥ cut Q 0 through the results obtained in the factorized approach. At this point we emphasize that our conclusions related to the meaning and reinterpretation of QCD parameters in the context of computations with the finite p ⊥ cut are made from the perspective of computations without any infrared cut, since the canonical way how perturbative calculations and the renormalization procedure are carried out in collider physics applications is in the limit of zero infrared cutoff. Based on our examinations in Secs. 5.2, 5.3 and 5.4 we proved the following two statements valid in the peak region of the thrust distribution: (1) For massless quark production the dominant linear effects of the shower cut Q 0 repre-sent a factorization scale 5 at the interface of perturbative and non-perturbative large angle soft radiation, and changes in Q 0 can be reinterpreted as a modification of the non-perturbative contributions in the resonance peak region. In the coherent branching formalism and in the QCD factorization approach this modification is related to a shift in the non-perturbative model shape function, called "gap" in Ref. [70] 6 , that can be computed perturbatively. For the thrust distribution in the peak region obtained in QCD factorization this is expressed quantitatively by the relation where dσ/dτ stands for the partonic and dσ/dτ for the hadron level distribution, S mod is the soft model shape function incorporating the hadronization effects (see Sec. 2.3), and having Q 0 in the argument of a function refers to a calculation with the Q 0 cut imposed. Here, ∆ soft (Q 0 ) is the Q 0 -dependent gap that has the form The gap function ∆ soft (Q 0 ) satisfies the renormalization group equation which, due to the appearance of the scale R on the RHS, describes evolution that is linear in the renormalization scale and is called R-evolution [19,20,81,82]. R-evolution differs from usual renormalization group equations such as for the strong coupling, which describe logarithmic evolution. In the context of multi-purpose MC event generators, where an angular ordered parton shower is combined with a hadronization model, the relation means that a change of the shower cut Q 0 needs to be compensated by a retuning of the hadronization model parameters in order to keep physical predictions effectively unchanged. At the level of the hadron level thrust factorization theorem valid in the peak region, which involves the convolution of the partonic distribution dσ dτ with the soft model shape function, this feature is quantitatively encoded in the relation ) , (6.5) 5 We adopt the canonical approach of factorization where the factorization scale that separates perturbative and non-perturbative effects is chosen small, but also sufficiently large such that the interface can be described within perturbative QCD. 6 The name "gap" is motivated by the hadronization gap of the hadron mass spectrum.
where the difference of the gap functions at the scales Q 0 and Q 0 is which is manifestly infrared insensitive. Relation (6.5) states that the dominant linear effects of a change of the shower cut form Q 0 to Q 0 can be compensated, to keep the prediction unchanged, by a modification of the soft model shape function of the form We note that relations (6.2) and (6.5) also have the important implication that the size of hadronization corrections for event-shape distributions that are encoded in MC event generators (i.e. the difference between parton and hadron level output) depends the value of the shower cut. A discussion of the feature is, however, beyond the scope of this work. We also remark that in practice a change in the shower cut Q 0 may not be entirely compensated by a modification of the gap function alone because of additional non-linear dependence on the shower cut.
(2) For massive quark production, the dominant linear effects of the shower cut Q 0 on the thrust distribution at the resonance peak can be interpreted, from the perspective of a computation in QCD factorization without infrared cutoff in the pole mass scheme m pole , as a modification of the non-perturbative contribution from large angle soft radiation and a change of the quark mass scheme from m pole to the scale-dependent coherent branching (CB) mass scheme m CB (Q 0 ). The modification concerning the non-perturbative effects from large angle soft radiation is universal and the same as for massless quark production. The modification concerning the quark mass scheme originates from the restriction the shower cut Q 0 imposes on the ultra-collinear radiation, which corresponds to soft radiation in the massive quark rest frame and which has to be partly considered as an unresolved contribution to the observable top quark state. The shower cut Q 0 changes the position of the pole of the massive quark propagator to m CB (Q 0 ) and also provides the associated scheme change corrections.
Starting from a QCD factorization computation of the thrust distribution in the pole mass scheme, this is expressed quantitatively by the relation where dσ s /dτ stands for the parton level and dσ/dτ for the hadron level distribution, S mod is the soft model shape function incorporating the hadronization effects, having Q 0 in the argument of a function refers to a calculation with the Q 0 cut imposed coherently in virtual and real radiation calculations, and the argument δm CB (Q 0 ) in dσ s /dτ indicates the modification of the perturbative series due to the scheme change from m pole to m CB (Q 0 ). The soft function gap ∆ soft (Q 0 ) is given in Eq. (6.3) and the scale-dependent CB (coherent branching) mass scheme is defined by The scale-dependent CB mass m CB (Q 0 ) is a short-distance mass and thus does not suffer from the O(Λ QCD ) renormalon ambiguity inherent to the pole mass m pole . It satisfies the R-evolution equation [19,20,82] and evolves linearly in R in the same way as the soft function gap ∆ soft (R). The difference of the CB masses for the cutoff scales Q 0 and Q 0 can then be expressed by solving the R-evolution equation which is manifestly infrared insensitive. In the context of angular ordered partons showers with a transverse momentum cut Q 0 the result implies -because the parton shower quark mass parameter is implicitly identified with the pole of the quark propagator -that the parton shower quark mass parameter is the scale-dependent CB mass m CB (Q 0 ). In the context of multi-purpose Monte-Carlo event generators, where an angular ordered parton shower is combined with a hadronization model this means that a change of the shower cut from Q 0 to Q 0 needs to be compensated by a retuning of the hadronization model parameters compatible with Eq. (6.7) and a change of the value of the CB mass from m CB (Q 0 ) to m CB (Q 0 ) according to Eq. (6.12) in order to keep physical predictions unchanged. This puts a stringent field theoretic constraint on properties of the hadronization models, since it is forbidden that they modify by themselves the mass scheme through the retuning procedure.
Statements (1) and (2) can be cross checked numerically from the side of MC event generators by the analysis of the thrust peak position τ peak as a function of the shower cut Q 0 when leaving the hadronization model as well as the numerical value of the generator mass unchanged. In that case the sizable linear effects in the p ⊥ cut Q 0 remain uncompensated and are directly visible in a characteristic dependence of the thrust peak position, τ peak (Q 0 ), on Q 0 . The resulting Q 0 -dependence of τ peak (Q 0 ) can be directly read off Eqs. (6.2), (6.3), (6.8) and (6.10) giving the relation where m is the generator mass and Q 0 is some reference cutoff scale. Here it is understood that only cutoff values Q 0 m are employed, and we also remind the reader that the results have been derived in the limit of boosted massive quarks where m Q. For the rescaled thrust variable M τ , see Eq. (2.2), which is suitable for an analysis for top quarks, the analogous relation reads (6.14) We note that in relations (6.13) and (6.14) the cutoff dependence coming from the large angle soft and the ultra-collinear radiation have an opposite sign. This is a characteristic property of these two different types of effects, which may be used to differentiate between them in the context of quark mass sensitive observables which are more exclusive concerning the soft radiation. In the next section we confront these relations numerically with partonlevel simulations carried out with the Herwig 7 event generator [46,47,71].

Event generation with Herwig 7
In this section we confront the conceptual and theoretical considerations summarized in Sec. 6 and in particular our predictions for the shower cutoff dependence of the peak position of the thrust distributions given in Eqs. (6.13) and (6.14) and our main conclusion that the presence of a shower cutoff Q 0 implies that the top quark mass parameter used in an angular ordered parton shower is the scale-dependent CB mass given in Eq.(6.9) with numerical simulations for e + e − collisions using the Herwig 7 event generator [46,47,71] in version 7.1.2. The angular ordered parton shower algorithm of Herwig 7 implements the coherent branching algorithm outlined in Secs. 3.1 and 3.2, for massless and massive quarks, respectively. Since the treatment of the top quark decay goes beyond the coherent branching formalism outlined in these sections, we provide some more details of event generation in Herwig 7 for top quarks in Sec. 7.1. In Sec. 7.2 we explain a number of special setting we use for our Herwig 7 simulations such that they are precisely in accordance to the coherent branching formalism. In Sec. 7.3, using simulations results obtained with Herwig 7, we reconfirm some approximations used in our analytic calculations in Secs. 4.1, 4.2 and 5.2 within the coherent branching formalism, and the insensitivity of thrust to the cut governing the parton shower evolution of the top decay products. Our predictions for the shower cutoff dependence of the thrust peak position for the massless quark and top quark case are then confronted with Herwig 7 in Secs. 7.4 and 7.5, respectively. Here we demonstrate that our conceptual predictions for the shower cut dependence of the peak position of the thrust distributions given in Eqs. (6.13) and (6.14) are indeed reproduced by the Herwig 7 simulations. In Sec. 7.6 we address the universality of our findings for thrust by discussing the reconstructed (b-jet and W boson) top quark invariant mass m b j W and the endpoint region of the b-jet and lepton invariant mass m b j . Finally, in Sec. 7.7 we comment on the (ir)relevance of NLO-matched simulations with respect to the cutoff dependence of the thrust distribution in the resonance region and the kinematic mass sensitivity of the reconstructed observables m b j W and m b j .

Event generation for top quark production
Within Herwig 7 events with top quarks account for the top quark decay in a factorized narrow width approach: The top quarks are considered stable at the stage of their production, with momenta p µ which satisfy the on-shell condition p 2 = m 2 t , where m t is the Herwig top mass parameter. In the default setting for the LEP-Matchbox.in simulation setup, no smearing with any Breit-Wigner-type distribution is applied, so that off-shell effects coming from the finite top quark width are absent. This default setting is mainly rooted in considerations related to NLO matched predictions, where the smearing disrupts the cancellation of virtual and real infrared cancellations. The angular ordered parton shower then attaches radiation to the production process terminated by the p ⊥ cutoff Q 0 , including radiation off the top quarks (and possibly other colored partons involved in the hard scattering). After the kinematic reconstruction following the production stage parton shower, the final state top quarks have definite momenta p µ which satisfy the on-shell condition p 2 = m 2 t , and the progenitor top quarks, which initiated the showering, have acquired a virtual mass, see the discussion in Sec. 3.3. At this point the top quarks decay, where we for simplicity only consider leptonic decays of the W bosons coming from the top decays assuming perfect neutrino identification. This is not a restriction for the thrust distributions we examine, but simplifies their numerical analyses. The partons originating from the top decays, t → b W + andt →b W − , then radiate according to the decay parton shower algorithm from Refs. [55] and [83] which is terminated by the p ⊥ cutoff Q 0,b . The radiation from the decay stage parton shower exactly preserves the 4-momenta of the decaying top and antitop quarks, respectively, and hence their mass shell condition, in a separate kinematic reconstruction procedure. Within this procedure the b-quark shower progenitor that initiates the b-jet is allowed to acquire a virtuality according to the decay stage parton shower.
In the conceptual considerations of the preceding sections we were discussing the effects of the production stage parton shower cutoff Q 0 . The thrust variable is by construction independent of details of the top decay and therefore also insensitive to the value of the decay state parton shower cutoff Q 0,b . In Herwig 7 the values of Q 0 and Q 0,b can be chosen independently, which allows us to explicitly check the insensitivity of thrust to variations of Q 0,b . This check is carried out in Sec. 7.3.

Settings for MC simulations
To compare the predictions obtained from analytic examinations of the preceding sections with the Herwig predictions, which are based on the previously described algorithms and methods, we use a number of special settings. These settings are used to eliminate default features in Herwig 7 which go beyond the coherent branching formalism as described in Sec. 3 or interfere with the Q 0 dependence of the parton level predictions we aim to analyze. We emphasize that the purpose of these settings is to allow for a direct comparison of Herwig simulations with our analytic results at the parton level in a conceptually clean and controlled setup. So these special settings may serve as the starting point of further examinations, also accounting for the effects and properties of hadronization models, where the impact of default settings used in Herwig (or other MC event generators) can be studied in more detail, or for upcoming releases. We emphasize, however, that these special Herwig 7 settings should be taken with some care since they are not appropriate to carry out full hadron level simulations.
As already explained in Sec. 3.3 we set the (constituent) masses of all quarks and the mass of the gluons that have emerged after the parton showers have terminated to very small values to effectively remove their effects in the parton level results. 7 Zero constituent quark and gluon masses are required to allow a comparison with our analytic QCD calculations; they are, however, not compatible with the default Herwig 7 cluster hadronization model. Furthermore, in our Herwig simulations we do not include any QED radiation or any matrix element corrections, except in our discussion of NLO matching carried out in Sec. 7.7. As already discussed in Sec. 3.3 we also choose the CutOff option for the kinematic reconstruction as this does not alter the correspondence to the underlying coherent branching algorithm as described in in Secs. 3.1 and 3.2. Finally we note that most analyses we have developed are based on Rivet [84], except those focusing on particle multiplicities for which an entirely in-house analysis code is used. In App. D we give the complete set of input file changes required to reproduce the parton level results within our special settings, both for the massless and massive case.

Monte Carlo tests of approximations for analytic thrust calculations
In our analytic calculations of the parton level massless and massive quark jet mass distributions at NLL order in Secs. 4.1, 4.2 and 5.2 within the coherent branching formalism we used two approximations which were crucial to allow for an analytic all order exponentiation of the computation, see e.g. Eqs. (4.4) to (4.6). In the integral equations for the jet mass distributions shown in Eqs. (3.10) and (3.20) these approximations involve (i) neglecting the parton branching of the gluon (i.e. switching off the g → gg and g → qq branchings) and (ii) using the z → 1 limit in the parts which are slowly varying in the soft limit. These approximations were already discussed (and used for analytic calculations) in the seminal coherent branching papers for massless quarks, see e.g. Ref. [58,62]. The former approximation is -for the thrust distribution in the peak region -related to the fact that due to angular ordering the showered gluons originating from the progenitor quarks can themselves not radiate to pick up any significant virtuality. The latter approximation 7 We note that in Herwig all light quarks (i.e. up, down and strange quarks) and gluons are treated as exactly massless during the shower evolution and that constituent quark mass and gluon mass conditions are only imposed kinematically for the partons that emerge after shower terminations. The constituent quark and gluon masses have to be considered as part of the hadronization model. implies -again for the thrust distribution in the peak region -that once gluon splitting is turned off, also strict angular ordering can be dropped from the calculations. For simplicity reasons we therefore refer to the latter approximation as "angular ordering switched off" in the following discussion.
Adopting the settings discussed in Sec. 7.2, these two approximations can be explicitly verified numerically using Herwig 7 to generate the parton level thrust distribution for massless and massive quark production. In Fig. 7a the parton level thrust distribution, defined in Eq. (2.1), obtained from Herwig 7 for massless quarks at c.m. energy Q = 91 GeV is displayed for shower cuts Q 0 = 1 GeV (right set of curves), Q 0 = 1.5 GeV (middle set of curves) and Q 0 = 2 GeV (left set of curves) with gluon splitting and angular ordering both turned on (solid red curves), with gluon splitting turned off, but angular ordering turned on (dashed blue curves) and with gluon splitting and angular ordering both turned off (dotted green curves). All curves are normalized such that at their respective maximum they evaluate to unity, which is particularly suitable to discuss the peak region. We also remind the reader that all curves are produced by convolution of the Herwig 7 parton level results with the soft model shape function of Eq. (2.26) for Λ m = Λ with Λ = 1 GeV according to Eq. (2.3). As discussed in Sec. 2.3, this is essential to obtain a smooth distribution in the peak region that can be interpreted properly. In Fig. 7b the parton level rescaled thrust distribution, as defined in Eq. (2.2), obtained from Herwig 7 for top quarks at c.m. energy Q = 700 GeV and with the generator mass set to m t = 173 GeV is displayed in the same way and for the same choices for the shower cut Q 0 and concerning gluon splitting and angular ordering. For the top quark case we employed a convolution over the same shape function according to Eq. (2.17) with Λ m = Λ + 4m t Γ t /Q and Γ t = 1.5 GeV. For the top quark case the smearing parameter is larger than for the massless quarks in order to simulate the additional smearing effects of the top quark width. Note, however, that this does not represent a systematic treatment of width effects for the top quark.
From the results in Figs. 7a and 7b we clearly see that the impact of the gluon splitting is very small in the peak region and, furthermore, that once gluon splitting is turned off the numerical effects of angular ordering are very small as well. For the rescaled thrust distribution in the case of top production these three settings lead to variations in the peak position of less than ∆τ peak ∼ 10 −3 in the massless case and less than ∆M τ,peak ∼ 100 MeV in the massive case for Q 0 between 1 and 2 GeV. In any case, these variation are considerably smaller than the variations caused by changes in the shower cut Q 0 which we focus on in our subsequent examinations. While the validity of the two approximations concerning gluon splitting and angular ordering for thrust for massless quarks has already been known since Ref. [58], our analysis shows that they are also applicable for the massive quark case, which is new. We note that in our analysis of the dependence of the thrust peak position on the shower cut Q 0 in Secs. 7.4 and 7.5, we consider Herwig 7 simulations using all three options: (i) full simulation, (i) simulations with gluon branchings switched off and (iii) simulations with gluon branchings and angular ordering both switched off. The differences of the Herwig 7 results obtained from these three options should be viewed as an illustration of possible subleading effects even though they should not be overinterpreted as a systematic error estimate.
In the context of these results an obvious question to ask is whether the suppression of effects coming from the gluon branching in the thrust peak region is only a cumulative effect visible in the distribution upon accounting for the sum of all emissions, or whether the suppression takes place literally at the level of the individual parton multiplicities. To answer this question we can analyze the parton level massless quark thrust distribution for a fixed number of final state parton multiplicity, where we define the multiplicity n as the total number of partons emitted from the progenitor quark-antiquark pair. Interestingly, for the Laplace space parton level distribution (2.6) for massless quarks the contribution for a given multiplicity n can be determined analytically, in the approximation that gluon splitting and angular ordering are switched off, simply from Eq. (4.18) by taking the n-th term in the Taylor expansion of the exponential function. In Fig. 8 the Laplace space parton level thrust distribution for massless quarks at Q = 91 GeV with shower cut Q 0 = 1.25 GeV is shown as a function of 1/ν in the peak region 1/ν ∼ τ peak 1 for multiplicities n = 1, 2, 3, 4. Shown are the Herwig 7 full simulation results with gluon splitting and angular ordering both turned on (solid red curves), with gluon splitting turned off, but angular ordering turned on (solid blue curves) and with gluon splitting and angular ordering both turned off (solid green curves) and the analytic result from Eq. (4.18), which is calculated in the approximation with gluon splitting and angular ordering both turned off (dashed black curves). The curves do not include any smearing effects from the shape function because the Laplace integral of Eq. (2.6) already provides a sufficient amount of smearing. We see Displayed are the analytic results (dashed black curves) and simulation results with gluon splitting and angular ordering both turned on (red curves), with gluon splitting turned off, but angular ordering turned on (blue curves) and with gluon splitting and angular ordering both turned off (green curves). that Herwig 7 and the analytic results in the various approximations agree very well. The outcome shows that that the approximations we used in our analytic calculations are also appropriate at the level of fixed parton multiplicities and may therefore have a more general validity.
At this point we emphasize that the examination of the effects of gluon splitting and angular ordering we have just carried out solely serves as a cross check for the approximations we used in our analytic calculation for thrust using the coherent branching formalism in Sec. 4 and 5.2 and that these approximation are not a viable option for general phenomenological studies. These approximations do also not in any way constitute conceptual guidelines for predictions based on QCD factorization (or SCET). In addition, the consistent use of these approximations for thrust involves that the effects angular ordering are only small once the gluon branchings are already switched off. Indeed, the converse, a simulation with gluon branchings but strict angular ordering switched off leads to a dramatic increase of parton radiation and multiplicities and to physically meaningless outcomes.
As already elaborated in Sec. 7.1, for event generation involving top quarks Herwig 7 uses a factorized treatment of production and decay stage parton shower evolution. As we argued in Sec. 2 the thrust variable is by construction independent of details of the top decay and should therefore be insensitive to the value of the decay state parton shower cutoff Q 0,b . In Fig. 9a the parton level distribution of the rescaled thrust M τ in the peak region obtained from Herwig 7 is shown for the c.m. energy Q = 700 GeV and generator mass m t = 173 GeV with production stage shower cuts Q 0 = 1 GeV (right set of curves), Q 0 = 1.5 GeV (middle set of curves) and Q 0 = 2 GeV (left set of curves), and decay state shower cuts of Q 0,b = 1 GeV (solid red curves), Q 0,b = 1.5 GeV (dashed blue curves) and Q 0,b = 2 GeV (dotted green curves). In Fig 9b a ratio plot for the curves for the three choices of Q 0,b is shown for Q 0 = 1 GeV. For all curves a shape function smearing with Λ = 1 GeV has been included following the prescription given above. The results confirm that the dependence of the thrust distribution on the decay stage shower cut Q 0,b is extremely weak and in particular significantly smaller than the corresponding dependence on the production stage shower cut Q 0 . In the resonance region variations due to changes of Q 0,b are at the percent level and negligible as far as the peak position is concerned. The results confirm that the thrust variable is ideal to study the production stage shower cutoff dependence and essentially insensitive to differential details of the top quark decay. For our studies of the shower cutoff dependence of the thrust peak position in Secs. 7.4 and 7.5 we set Q 0,b = Q 0 , which is the default Herwig 7 setting.

Thrust peak position for massless quarks
In this section we confront our analytic parton level prediction for the Q 0 dependence of the thrust peak position for massless quarks, with parton level simulations in Herwig 7 using the specific settings discussed in Sec. 7.2.
To determine the distribution for a given c.m. energy Q and shower cut Q 0 we generated 10 9  events. The resulting binned distribution (with bin size ∆τ = 2 × 10 −4 ) was numerically convoluted using a discretized version of Eq. (2.3) with the soft model shape function S mod given in Eq. (2.26) for a given smearing parameter Λ m . The peak position was then determined from fitting a quadratic function to the bin values in the peak region with heights that differ from the maximum by at most 1 per mille. This leads to statistical uncertainties in the peak position well below 10 −3 which is an order of magnitude smaller than the size of the Q 0 variations we obtain in our analysis. The results can thus be considered exact for all practical purposes and we refrain from quoting any statistical uncertainties in the results we obtain in the simulations. In Fig. 10 the peak position τ peak obtained from Herwig 7 is shown as a function of the shower cut Q 0 for Q = 91 GeV (upper panels) and Q = 300 GeV (lower panels) for the smearing parameter Λ m = 1 GeV (left panels) and Λ m = 3 GeV (right panels). The (center of the) colored squares show the corresponding results from the full simulation, i.e. with gluon splitting and angular ordering both turned on (red squares), with gluon splitting turned off, but angular ordering turned on (blue squares) and with gluon splitting and angular ordering both turned off (green squares). The solid blue line represents the analytic prediction of Eq. (7.1) with Q 0 = 1.25 GeV as the reference peak position taken form the Herwig 7 simulation and using the strong coupling employed by the Herwig 7 parton shower to calculate τ peak for Q 0 different from Q 0 . We have shown the results for shower cut values in the range between (0.5 GeV) < Q 0 < (2.0 GeV) even though the perturbative treatment is expected to break down for scales below 1 GeV. Nevertheless, Herwig 7 can carry out simulations for values of Q 0 even below 0.5 GeV since for scales below 1 GeV the strong coupling used in its parton shower is frozen to the value at 1 GeV. The choice of Q 0 in the simulations is in practice limited from below only by computation time since the parton mulplicities strongly increase for decreasing shower cut. For theoretical considerations, however, only shower cut values of 1 GeV and larger can be considered seriously, because Q 0 conceptually represents a factorization scale and should be located well within the regime of perturbation theory. In fact, indications of a breakdown of the perturbative description for Q 0 < 1 GeV are visible in Figs. 10 (and also in Figs. 11 and the corresponding results for top quarks in the following subsection) as the increased spread in the Herwig 7 results arising from the different choices concerning gluon branching and angular ordering.
Overall we see quite good agreement between the Herwig 7 simulations and the analytic prediction for τ peak for Λ m = 1 GeV and excellent agreement for Λ m = 3 GeV. While Λ m = 1 GeV corresponds to the actual size of hadronization effects compatible with experimental data, the choice Λ m = 3 GeV is motivated by the possible size of additional experimental resolution effects. That we find a much better agreement for larger smearing scale is related to the fact that the evolution equation (7.1) only accounts for the dominant linear dependence on Q 0 which was in our analytic calculations in Secs. 5.2, 5.3 and 5.4 derived by employing the multipole expansion for the contributions of the unreleased radiation, i.e. the radiation originating from below the shower cut. This expansion is formally an expansion in Q 0 /Λ m , and we see from the agreement between simulation and analytic prediction in Fig. 10 that this expansion works already well for Q 0 ∼ Λ m and even better for Λ m > Q 0 . Since in realistic simulations and actual experimental measurements there are additional resolution effects that always lead to a smearing scale that is effectively larger than 1 GeV, we can conclude that the linear dependence on the shower cut Q 0 expressed by Eq. (7.1) represents the dominant effects in all cases and that effects quadratic in Q 0 or of even higher power are small in practice.
At this point it is also illustrative to explicitly show the quality of relation (6.5) which states that the observed peak position can be rendered shower cut independent, if the gap of the soft model function used in the convolution is modified as described in Eq. (6.7). In Fig. 11 the thrust peak positions obtained from the Herwig 7 simulations already shown in Fig. 10 are displayed once again as a function of Q 0 , however, with the corresponding modification of the soft function gap for the reference shower cut value Q 0 = 1.25 GeV. As expected, we see that the shower cut dependence is substantially reduced for the smearing scale Λ m = 1 GeV and almost vanishes for the smearing scale Λ m = 3 GeV in the region   Figure 11: Peak position τ peak at the parton level obtained from Herwig 7 for the parameters used in Fig. 10, but including the soft function gap calculated analytically to achieve results that eliminate the linear dependence on the shower cut Q 0 taking Q 0 = 1.25 GeV as the reference. The blue solid line represents the corresponding analytic prediction.

Thrust peak position for top quarks
In this section we finally confront our analytic prediction for the Q 0 dependence of the rescaled thrust peak position for top quarks with simulations in Herwig 7. We again used the specific settings discussed in Sec. 7.2 and generated 10 9 events for a given c.m. energy Q and shower cut Q 0 . For the convolution with the soft model shape function S mod given in Eq. (2.26) we employed the discretized version of Eq. (2.17) with Λ m = Λ + 4m t Γ t /Q with Γ t = 1.5 GeV for the soft function smearing parameter. It effectively accounts for the additional smearing caused by the top quark width. Since the resonance region in τ is substantially more narrow compared to the   Q=1000 GeV, Λ=3.0 GeV (d) Figure 12: Peak position M τ,peak at the parton level obtained from Herwig 7 for the top quark generator mass m t = 173 GeV as a function of the shower cut Q 0 for Q = 700 GeV (upper panels) and Q = 1 TeV (lower panels) for smearing Λ = 1 GeV (left panels) and Λ = 3 GeV (right panels). Displayed are the results from the full simulation (red squares), with gluon splitting turned off, but angular ordering turned on (blue squares) and with gluon splitting and angular ordering both turned off (green squares). The blue solid line is the analytic prediction of Eq. (7.2) taking the Herwig 7 result for Q 0 = 1.25 GeV as the reference. The dashed blue line is the analytic prediction of Eq. (7.2), but only accounting for the large angle soft radiation contributions which are multiplied with the Q/m t factor. massless case we used a bin size that corresponds to ∆τ = 8 × 10 −6 and used the same method to determine the peak position as for the massless quark case.
In Fig. 12 the peak position M τ, peak obtained from Herwig 7 with the top quark generator mass m t = 173 GeV is shown as a function of the shower cut Q 0 for Q = 700 GeV (upper panel) and Q = 1 TeV (lower panel) for the smearing parameter Λ = 1 GeV (left column) and Λ = 3 GeV (right column). The (center of the) colored squares show the corresponding results from the full simulation, i.e. with gluon splitting and angular ordering both turned on (red squares), with gluon splitting turned off, but angular ordering turned on (blue squares) and with gluon splitting and angular ordering both turned off (green squares). The solid blue line represents the analytic prediction of Eq. (7.2) with Q 0 = 1.25 GeV as the reference peak position taken form the Herwig 7 simulation and using the strong coupling employed by Herwig 7 parton shower to calculate M τ, peak for Q 0 different from Q 0 . The dashed blue line represents the analytic prediction of Eq. (7.2), but only accounting for the large angle soft radiation contributions which are multiplied with the Q/m t factor, in order to visualize the size of the Q 0 dependence coming from the ultra-collinear radiation that affects the interpretation of the mass scheme alone. As in the massless quark case we have shown the results for shower cut values in the range between (0.5 GeV) < Q 0 < (2.0 GeV) and remind the reader that the results for Q 0 below 1 GeV are only shown for illustration, as already explained in Sec. 7.4.
We observe that the agreement between the Herwig 7 simulations and the analytic prediction for M τ, peak is very good for Λ m = 1 GeV as well as for Λ m = 3 GeV. This shows that for the top quark case, where the width provides an additional irreducible smearing effect, the linear dependence on the shower cut Q 0 , which we have determined in our analytic calculations, fully captures the complete Q 0 dependence and that contributions proportional to higher powers of Q 0 are negligible for all practical purposes.
It is now illustrative to explicitly demonstrate that the observed peak position can be rendered shower cut independent, if -taking Q 0 = 1.25 GeV as the reference -the gap of the soft model function used in the convolution is modified according to Eqs. (6.6) and (6.7) and if the generator mass m t is modified by according to Eq. (6.12). In Fig. 13 the rescaled thrust peak positions obtained from the Herwig 7 simulations for Q = 700 GeV (upper panel) and Q = 1 TeV (lower panel) are displayed once again as a function of Q 0 for Λ = 3 GeV. In the left panels we have in addition to the corresponding curves shown in Fig. 12 included the corresponding modification of the soft function gap for the reference shower cut value Q 0 = 1.25 GeV. This removes the shower cut dependence coming from the large angle soft radiation such that the remaining Q 0 dependence explicitly illustrates the shower cut dependence of the generator mass alone. 8 Compared to the results shown in Fig. 12 we see that the slope in Q 0 has an opposite sign which means that the Q 0 dependent CB mass scheme that has to be employed to keep the physical prediction unchanged is decreasing with Q 0 as expressed by the renormalization group equation (6.11). In the right panels we have then also modified, in addition to the figures in the left column, the generator mass according to Eq. (7.3) and taking m CB (Q 0 = 1.25 GeV) = 173 GeV as the reference top quark mass. We see that once both modifications are implemented, the shower cut dependence has essentially disappeared.
It is interesting to also analyze to which extent the shower cut dependent modifications of the soft function gap and the generator mass we have just discussed for the thrust peak position also holds for the whole distribution function in the resonance region. This is shown in Figs. 14 were the rescaled thrust distributions in the peak region are shown for Q = 700 GeV for Q 0 = 1 GeV (dotted green curves), Q 0 = 1.   Figure 13: Peak position M τ,peak at the parton level obtained from Herwig 7 for the top quark generator mass m t = 173 GeV as a function of the shower cut Q 0 for Q = 700 GeV (upper panels) and Q = 1 TeV (lower panels) for smearing with Λ = 3 GeV. Left panels: In addition to the results shown in Fig. 12 we have included the soft function gap calculated analytically to remove the shower cut dependence due to the large angle soft radiation. Right panels: In addition to the results of the left panels we have set the generator mass to m CB (Q 0 ) such that the peak position becomes independent of the shower cut Q 0 . The blue solid line represents the corresponding analytic prediction for the remaining cutoff scale dependence. For all results we used Q 0 = 1.25 GeV as the reference scale. and Q 0 = 2 GeV (dashed blue curves) obtained from the full simulation. The left panels show the distributions in the peak region for fixed generators mass m t = 173 GeV with smearing parameter Λ = 1 GeV (upper left panel) and Λ = 3 GeV (lower left panel). The corresponding right panels show, using Q 0 = 1.5 GeV as the reference scale, the distributions including the Q 0 dependent soft function gap according to Eq (6.7) and the Q 0 dependent generator mass according to Eq. (7.3) to keep the peak position cutoff independent. We see that the resonance distribution tends to be narrower for increasing cutoff Q 0 , but that this effect is weaker for a larger smearing. This behavior can be explained from the fact that for increasing cutoff Q 0 the no-branching probability (which describes production stage multiplicity n = 0 events and contributes to the coefficient of the tree-level δ-function Q=700 GeV, Λ=3 GeV (d) Figure 14: Parton level rescaled thrust distribution in the peak region obtained from Herwig 7 full simulations for Q = 700 GeV and smearing with Λ = 1 GeV (upper panels) and Λ = 3 GeV (lower panels) for shower cut values Q 0 = 1 GeV (dotted green curves), Q 0 = 1.5 GeV (solid red curves) and Q 0 = 2 GeV (dashed blue curves). Left panels: Simulations with generator input mass m t = 173 GeV and using the same soft model shape function for all shower cut values. Right panels: Same distributions, but using a Q 0dependent soft function gap to eliminate the shower cut dependence due to large angle soft radiation and using m CB (Q 0 = 1.0) = 173.22 GeV (green), m CB (Q 0 = 1.5) = 173 GeV (red) and m CB (Q 0 = 2) = 172.86 GeV (blue) as the generator masses, according to Eq. (6.12), to render the peak location independent of the shower cut Q 0 . located at the partonic threshold) is becoming bigger and, correspondingly, the weight of events with branching (which correspond to production stage multiplicities n > 0 and lead to jet masses above the partonic threshold) is becoming smaller. For a larger smearing this width effects is washed out and therefore less pronounced for Λ = 3 GeV. Thus depending on the size of the experimental resolution the effects that a variation of the shower cut Q 0 has on the whole peak distribution may be more complicated than a simple modification of the soft function gap and the generator mass. Since the contributions from ultra-collinear radiation in this context are m t /Q-suppressed, see Eq. (7.2), these width effects mostly originate from large angle soft radiation. One can therefore conclude that these effects may be properly taken into account during the retuning procedure which has to be carried out upon a change of the shower cut Q 0 in MC event generators used for experimental analyses, and which is substantially more involved than an a simple modification of the soft function gap.

Reconstructed observables and universality
After our analysis of the Q 0 shower cut dependence of the MC generator top mass for angular ordered parton showers using the thrust distribution in the resonance region there is one obvious question to be asked: Is our main conclusion concerning the equivalence the MC generator top quark mass and the shower cut dependent CB mass defined in Eq. (6.9) only valid for thrust (or very similar event shape type observables), or is it universal? Clearly, examinations at the same level of depth as we carried out for thrust, where we employed analytic calculations within the coherent branching formalism and the QCD factorization approach together with numerical MC simulations, will be difficult for most other observables with strong kinematic quark mass dependence -most notably because hadron level first principles and factorized predictions (which would allow directly for conclusions at the field theoretic level) are not available for them. The question of universality is also made difficult by the fact that the shower cut dependence not only affects the meaning of the generator mass for heavy quarks (or potentially other QCD parameters), but also modifies the description of non-perturbative effects through its effects on large angle soft radiation (or other types of long-range gluon effects), so that the issue may not be resolved completely restricting the considerations only to partonic cross sections.
At this point one may also have to define general criteria to prove universality systematically. Although we hope to address this issue in forthcoming work, at this time such a systematic and universal approach is lacking. However -if universality applies -the dependence of MC parton level predictions on the shower cut Q 0 , which was one of the main instrument of our thrust examinations, should be visible in a predictable, simple and universal way also for other observables and furthermore allow for non-trivial consistency checks. While consistency concerning the Q 0 dependence among thrust and other kinematic observables represents only a necessary condition for claiming universality, it still provides some evidence that universality indeed applies. Furthermore, computing the shower cut Q 0 dependence analytically for general observables and carrying out the corresponding MC simulations as a cross check is a relatively straightforward and easy task and may even be testable in consistency checks confronting MC generators with experimental data or in the context of pseudo-data analyses. In this section we therefore examine exemplarily two completely different observables with very strong kinematic top mass dependence and which are based on a jet clustering procedure acting on the full set of partons after production and decay stage parton showers have terminated. In this work we restrict our examinations to a numerical analysis of the shower cut dependence of these observables, and we demonstrate that it can be easily predicted and interpreted. Interestingly, we find that the results are compatible with our examinations for the thrust distribution. A more coherent test of consistency in the context of pseudo-data analyses which specifically addresses the shower cut dependence of the generator mass shall be addressed elsewhere. The first observable is the b-jet and lepton invariant mass m b j and the second the reconstructed b-jet and W invariant mass m b j W . Both types of observables have been studied intensely in the context of top quark mass measurements at the LHC. The kinematic sensitivity of m b j to the top quark mass m t arises from the upper endpoint of its distribution, which is, for stable W bosons and at tree-level, located at (m 2 t − m 2 W ) 1/2 neglecting the mass effects of the b-jet. But also the bulk of the m b j distribution has kinematic top mass sensitivity because the region where m b j is maximal depends on the boost of the W boson in the top rest frame which depends kinematically on the top quark mass. The direct kinematic sensitivity of m b j W to the top mass arises simply from the kinematic location of the resonance which is tied to m t in a way very similar to thrust, see Eqs. (2.1). In the following we refer to the top mass sentivities of the endpoint location for m b j and the peak location for m b j W simply as 'the kinematic top mass dependence' of these two variables. Typical results for the m b j and m b j W distributions using the b-jet clustering described below and generated with Herwig 7 are displayed in Fig. 15 for Q = 700 GeV and top quark masses 172, 173 and 174 GeV. Overall, we see that m b j W has a somewhat stronger top mass dependence than m b j .
We consider the production of boosted top quarks at Q = 700 GeV in e + e − annihilation and use Herwig 7.1.2 with the same settings as for the thrust analyses discussed in the previous sections (see Sec. 7.2). For simplicity we again generate only leptonically decaying W bosons and assume perfect neutrino identification. Furthermore we neglect any combinatorial background, i.e. we assume perfect b-jet lepton pairing and perfectly reconstructed top or antitop quarks. While these simplications are not fully realistic, they are, however, fully adequate for our examination of the shower cut Q 0 dependence. For the b-jet clustering we use the FastJet package [85] and employ the generalized k t algorithm for e + e − collisions in the inclusive mode with the inter-particle and inclusive jet distance measures where E i refers to energy, R is the jet radius 9 , and θ ij is the relative angle between two momenta. The exponent p = 1 corresponds to the k t -type generalized clustering algorithm, p = 0 to the Cambridge-Aachen, and p = −1 to the anti-k t -type variant, and we consider all three types of algorithms in our analysis. In Fig. 15a and 15b we show the m b j distribution and the m b j W distribution in the peak region, respectively, generated by Herwig 7 at the parton level with jet radius R = 0.5 and Cambridge-Aachen-type jet clustering for generator masses m t = 172 GeV (green dashed curves), m t = 173 GeV (solid red curves) and m t = 174 GeV (dashed blue curves). For m b j W we have smeared the distribution according to Eq. (2.17) using smearing parameter Λ = 1 GeV as described in Sec. 7.5. Since the m b j distribution is already smooth by itself at the parton level we did not account for any additional smearing. For both distributions we see that the top mass dependence is essentially linear and particularly strong in the endpoint region for m b j and in the peak region for m b j W . The interesting conceptual aspect of the reconstructed observables m b j and m b j W is that, due to the b-jet clustering, they are more exclusive than the hemisphere masses entering the thrust variable of Eq. (2.1). In particular, m b j and m b j W depend on the b-jet radius R. For large R ∼ π/2 we can expect their shower cut dependence to be very similar to the one for thrust since the ultra-collinear as well as major portions of large angle soft radiation are clustered into the b-jet. On the other hand, due to the boosted top kinematics which confines the top decay products as well as the ultra-collinear radiation inside a cone with angle ∼ m t /Q with respect to the top momentum direction, the clustering should always retain most of the ultra-collinear radiation that is soft in the top rest frame and thus inherently tied to the physical top quark state. Thus for small R ∼ m t /Q we can expect that the majority of the ultra-collinear radiation is still clustered into the b-jet while the majority of the large-angle soft radiation is removed. As a consequence we can expect that the shower cut dependence coming from the large-angle soft radiation is reduced when R is lowered, while the one from ultra-collinear radiation is kept.
To quantify the dependence of m b j and m b j W generated from Herwig 7 on the shower cut we use the following procedure: For a given jet radius R and clustering algorithm (as well as matching scheme for the analysis in Sec. 7.7) we take the results for Q 0 = Q 0,b = 1.5 GeV as the default and generate m b j and m b j W distributions for different generator masses m t in the range between 172 and 174 GeV, which we subsequently use to fit the top quark mass from the distributions generated for m t = 173 GeV but with different choices of resonance region are shown as colored triangles and have been obtained from fits using the highest 20% of the distribution around to the peak. To allow for an easier visual identification we have slightly displaced the stars and the triangles horizontally. We have carried out the analyses for all three jet clustering algorithm where we use green color for the k t -type algorithm (p = 1), blue color for the Cambridge-Aachen-type algorithm (p = 0) and red color for the anti-k t -type algorithm (p = −1). We see that for large hemisphere-type b-jet cones the fitted top mass decreases with the shower cut. This means that the mass of the reconstructed top quark state, to which m b j and m b j W are kinematically sensitive (and which for hemisphere-type b-jets includes the effects of large angle soft radiation), decreases when Q 0 is increased. So the behavior indeed follows the one of thrust we have observed in Sec. 7.5. Analytically, the expected Q 0 dependence for ideal hemisphere-type b-jets has the form m R=π/2 t, fit and is ploted in the lower right panel of Fig. 16 as the blue solid line using Q 0 = 1.5 GeV as the reference scale. The RHS of Eq. (7.5) is a factor two smaller than the one for the rescaled thrust M τ, peak in Eq. (6.14) since the reconstructed top mass is linear in the top mass while the rescaled thrust variable M τ, peak is quadratic in the top mass, see Eqs. (2.1) and (2.2). We see that the expected behavior agrees very well with the results obtained from the fit. The actual fit results for all clustering algorithms except for anti-k t tend to have a slightly smaller slope than Eq. (7.5), which is mainly due to the fact that even for R = π/2 the b-jets are typically not full hemisphere jets because they are in general not exactly back-to-back and compete with each other in the clustering process. For decreasing jet radius R, on the other hand, we see that the slope in Q 0 of the fitted top mass increases continuously and becomes positive for R < ∼ 0.5. This confirms the expectation that the shower cut dependence originating from large-angle soft radiation (which is the contribution proportional to Q/m in Eq. (7.5) becomes suppressed when R is reduced, while the shower cut-dependence associated to the ultra-collinear radiation is kept. For visualization we have plotted in upper left panel Fig. 16 the relation m R∼mt/Q t, fit with Q 0 = 1.5 GeV as the reference scale as the blue solid line. This is just Eq. (7.5) but with the Q/m t term dropped, that originates from large angle soft radiation. Again we see excellent agreement between the expected shower cut dependence and the actual fit results. It is also conspicuous that the shower cut dependence of the fitted top quark masses we obtain from m b j and m b j W for the different jet radii and jet algorithms are essentially equivalent and do not exhibit any systematic difference. This analysis thus fully supports  universality concerning the equivalence of the MC generator top quark mass and the shower cut dependent CB mass defined in Eq. (6.9). However, in the absence of a systematic factorized analytic approach to the kinematic top mass dependence of the m b j and m b j W this universality cannot be strictly proven because, in contrast to thrust, m b j and m b j W are affected substantially by the MC modelling and the dynamics of the final state and, in particular, by the choice of the decay stage shower cut Q 0,b . This makes the conceptual background to be examined more involved. In particular, for our parton-level studies a strict proof would require that we could analytically track the role played by the decay stage shower cut Q 0,b for the interpretation of the generator top quark mass in a systematic manner.
To visualize the relevance of the decay stage shower cut Q 0,b for small b-jet radii we show in Figs. 17 again the dependence of the fitted top mass on the production stage shower cut Q 0 for the same cases displayed in Figs. 16, but using a fixed decay stage shower cut Q 0,b = 1.5 GeV. We see that for a large hemisphere-type b-jet radius R = 1.5 the results are equivalent to the corresponding ones for Q 0,b = Q 0 shown in lower right panel of Figs. 16. For decreasing jet radii we see that the dependence of the fitted top mass on Q 0 decreases continuously remains essentially flat for R < 0.5 in contrast to Figs. 16 where a positive slope in Q 0 was emerging. This shows that for small jet radii the shower cut dependence of the kinematic top mass sensitivity of m b j and m b j W arises from the decay stage shower cut Q 0,b . Even though it appears hard to believe that the good agreement we observed for small b-jet radii and Q 0,b = Q 0 between the fit results and the naive expectations is purely accidental, the case of small jet radii is strictly speaking not covered by the conceptual considerations we have carried out for thrust.
To conclude the question of universality, at the present stage, we can say that the shower cut dependence we observe for the kinematic top mass dependence of m b j and m b j W is compatible with the one we have proven for thrust and thus supports universality. This is quite encouraging and motivates further systematic and more general consistency studies that may be carried out with MC simulations and relatively simple analytical computations alone. However, a strict conceptual proof would also involve a precise quantification of the role of the decay stage shower cut Q 0,b (and maybe other issues relevant for exclusive observables with strong kinematic top sensitivity), preferably in the context of a factorized approach where the types of radiation relevant for the interpretation of the top quark mass can be ambiguously separated from other types of radiation and discussed at the field theoretic level. This strongly motivates the development of factorized predictions for reconstructed and exclusive observables such as m b j and m b j W .
At this point we would also like to remind the reader that all our examinations above have been carried out for boosted top quarks. The direct reconstruction top mass measurements at the LHC are, on the other hand, based on top quarks with p T values in the range of 50 to 100 GeV, which corresponds predominantly to unboosted top quarks. We stress that for unboosted top quarks a classification of the radiation modes relevant for a systematic discussion of the meaning of the generator mass is currently lacking and that, in particular, the concepts of large angle soft and ultra-collinear radiation do not apply. Therefore, none of the above considerations or argumentations are applicable for the reconstructed observables m b j and m b j W for unboosted top quarks.

Impact of NLO matching
A crucial precondition of our examinations on the shower cut dependence was that NLL precise angular ordered parton showers, based on the coherent branching formalism described in Secs. 3.1 and 3.2, are already NLO precise as far as the dominant linear shower cut dependence of the thrust peak position is concerned, see Sec. 2.4. This means in turn that the O(α s ) QCD corrections added to simulations in NLO matched MC setups should show very small or even negligible effects in the numerical studies that we have carried out in Secs. 7.3, 7.4 and 7.5. It is the purpose of this section to demonstrate this explicitly by comparing Herwig 7 simulations with and without NLO matching. Furthermore we show by general considerations that NLO matched MC simulation can a priori not modify the shower cut dependence present in MC simulations at NLL for which NLO matching is not accounted for. Since we believe that this discussion may add to a more refined understanding of NLO matching for the general reader, we explain some generic features of NLO matching in the following with a special focus on the role of the shower cut. The reader familiar with the details of NLO matched MC simulations may skip this conceptual discussion and jump directly to the numerical discussion starting after Eq. (7.29).
Matching parton showers to NLO QCD corrections for the hard process has by now become a default requirement for event generation in LHC data analysis. In the following we review the prototypical structure of a parton shower NLO matching algorithm with particular focus on the effects related to the parton shower cutoff. The core of general matching algorithms for parton showers is based on a careful analysis of a single shower emission in order to avoid double counting with the corresponding NLO cross section prediction. At this point, one has to keep in mind that the main aim is to improve the hardest emission, which in general may not be the first one to occur, particularly in the case of angular ordered parton showers. However, to first subleading order in α s , the first and the hardest emission are the same. At this level, global recoil schemes like the kinematic reconstruction of the angular ordered shower tree discussed in Sec. 3.3, as well as local recoil schemes which restore the kinematics after each successive emission can be discussed in a unified fashion. Given the n-parton phase space point associated to the initial hard process, a phase space point with one additional emission off the progenitor leg i can be parametrized in terms of the momentum scaleq of the emission and the momentum fraction z, where for simplicity throughout our discussion we suppress an additional azimuthal angle variable required to set up a complete momentum for the emission: 1 (φ n ,q, z), ..., q (i) n (φ n ,q, z), q (i) n+1 (φ n ,q, z)} . (7.8) At this point it is useful to also introduce the inverse mapping from the (n + 1)-parton phase space point to the n-parton phase space Φ (i) n (φ n+1 ) and associated evolution variablesQ (i) (φ n+1 ), Defining the general infinitesimal m-parton phase space volume element for a total momentum p e + + p e − as and using abbreviations dφ n ≡ dPSP n (p 1 , ..., p n ) , dφ n+1 ≡ dPSP n+1 (q 1 , ..., q n+1 ) , (7.12) the kinematic mapping implies a factorization of the (n + 1)-parton phase space volume element of the form where the term J (i) is the associated Jacobian factor. At the cross section level the parton shower splitting rate is then given by combining the (n + 1)-parton phase space element with the propagator factor and the corresponding splitting function P i : 14) and the arguments of the mapped momenta q i = q i (φ n ,q, z) are understood implicitly. The exponent of the Sudakov form factor, which quantifies the no-branching probability for possible emissions between the scaleQ andq, is then simply given by 15) with a cut on the transverse momentum P n+1 (φ n ,q, z)) , (7.16) where u(φ m ) = δ(η − f η,m (φ m )) is the measurement function. Starting from a given nparton (LO) cross section the one-emission action of the parton shower can then be expressed as Expanding the one-emission action to first subleading order in α s we find (7.19) which, with the help of the inverse mapping (7.10), can be cast into the form , and where the starting point of the evo-lutionQ and the infrared shower cut Q 0 are now made explicit in terms of θ-functions at the level of the squared matrix element, and we have introduced the parton shower approximation to the NLO fixed-order correction to the cross section: Expression (7.20) is particularly useful to formulate NLO matching since it is very close to the generic form of the corresponding NLO fixed-order cross section obtained in full QCD using the subtraction approach: 10 Here the terms dσ (i) A are the subtraction cross sections to cancel the infrared divergences of the real emission cross section dσ R coming from progenitor leg i, and dσ V +I denotes the combination of the NLO virtual corrections and the subtraction cross sections integrated over the emission phase spaces, which is free of poles in in dimensional regularization. This concludes our notation to discuss the generic formalism of NLO matched parton showers.
NLO matching, see [41,[86][87][88][89] for initial development concerning multi-purpose event generators as well as a general review, including the modified hardest emission approach as employed in the POWHEG formalism [41], is then performed by subtracting the O(α s ) contribution of Eq. (7.20) from the NLO fixed-order cross section. To be specific, one sets up a subtracted (or 'matched') NLO cross section σ NLO−PS , such that having the important condition in mind that the total NLO inclusive cross section precisely agrees with the NLO fixed-order calculation: where the first equality arises from the unitarity property of the parton shower. This can be achieved by where σ LO and σ V +I are directly taken from Eq. (7.22) and the matching subtraction term is of the form where we have introduced the shorthand notationsQ The important point of Eq. (7.26) is that within the NLO matched parton shower evolution algorithm the expression in the first line constitutes together with σ LO [u] + σ V +I [u] the tree-level n-parton cross section with n progenitor partons, while the second line represents a new (n + 1)-parton tree-level configuration with n + 1 progenitors. Both types of configuration are therefore showered separately and need to be independently infrared safe and numerically stable without relying on any cross talk between both contributions. However, there are by construction infrared divergences that only cancel in the combination of dσ R and the dσ (i) A because of the presence of the infrared cuts contained in the parton shower contributions. As a consequence the coefficients of the individual u(Φ (i) n (φ n+1 )) and u(φ n+1 ) diverge for emissions at scales below the cutoff, and σ R−A−PS cannot be used in this form for event generation.
Therefore, at this point, all NLO matching algorithms supplement the matched cross section of Eq. (7.26) by an additional contribution which avoids the divergences individually contained in the parton shower evolution starting from the n-and the (n + 1)-parton configurations. This modified version of the matching subtraction term has the generic form The term dσ is an additional subtraction (auxiliary) cross section that is designed to reproduce locally in phase phase the singularities of the subtraction terms dσ (i) A , and the real emission contribution dσ R . We note that the formalism could also be implemented in a different way by simply removing the shower cut θ-functions in Eq. (7.26) if care is taken that the parton shower approximation precisely reproduces, locally in the phase space, all singularities in the fixed-order real radiation and subtraction cross sections. 11 This approach, however, just corresponds a particular choice of dσ X in the matching subtraction term already shown in Eq. (7.27).
Only after the modified subtracted (or 'matched') NLO cross section is constructed, events can be generated with finite weights and leading to finite cross section. We stress that depending on the construction of the additional subtraction in the modified matched NLO cross section it is in principle possible that the parton shower allows emissions below the cutoff Q 0 . However, the weights in these regions of phase space are without any logarithmic enhancement. Their contributions are typically very small, but depend on the form of the auxiliary cross section dσ X . An important consequence is that the consistency relation for the total inclusive cross section of Eq. (7.24) is still satisfied, but that for differential cross sections in dynamical kinematic variables η such as thrust, where η refers to the difference to a threshold or an endpoint where linear sensitivity to the shower cut can arise, the relation between parton shower approximation to the NLO cross section and full fixed-order NLO cross section reads 28) and the cutoff dependence is linear and can be significant compared to achievable experimental precision. It is these contributions which were the focus in the preceedings parts of this paper, and the bottom line is that in NLO matched partons showers they are still present and do not modify the principle precison with respect to the unmatched NLL parton showers.
Within the Herwig 7 event generator's Matchbox module [87] subtractive (which call MC@NLO-type [86]) as well as multiplicative (which we call POWHEG-type [41]) matching can be performed, which both are particular incarnations of the matching principles just described above. In the latter case, however, an additional matrix element correction is employed in Eq. (7.26) to change the hardest emission to be described by a splitting function given by the ratio of exact real emission and Born matrix elements, The terms w (i) are weight functions that partition the phase space into different emitter regions, for which in practice we choose dipole-type factors, each of which has a collinear divergence only if the emission becomes collinear to the emitter i. Both types of matching A . This provides a more transparent and stable implementation of the matched cross section. Furthermore, for the Powheg-type matching, the first emission off the n-parton Born configuration is generated using the splitting kernel and Sudakov form factor determined with Eq. (7.29) and the transverse momentum of all subsequent emissions (with respect to the parent parton momentum) is vetoed not to exceed the transverse momentum of the first. At this point emissions with larger angles but transverse momenta smaller than the emission generated according to Eq. (7.29) are included using in addition a so-called vetoed, truncated shower [41,94].
Let us now compare numerical results obtained with Herwig 7 without NLO matching -referred to as 'LO' ('leading-order') for the rest of this section -(which is the setup we have used for our simulation studies in Secs. 7.3, 7.4, 7.5 and 7.6) and with NLO matching using the MC@NLO-type and the POWHEG-type matching. In Figs. 18 we show the thrust distribution for massless quark production at Q = 91 GeV (left panel) and the rescaled thrust distribution for top quark production with m t = 173 GeV at Q = 700 GeV (right panel) for Q 0 = Q 0,b = 1.0 GeV (right set of curves), 1.5 GeV (middle set of curves) and 2.0 GeV (left set of curves) at LO (solid red), with MC@NLO-type matching (dashed blue curves) and with POWHEG-type matching (dotted green curves). All curves are normalized to unity at the peak position. We hardly see any difference between the LO and NLO matched simulations in the resonance regions. Visible effects arise only in the tails away from the resonances, which can be understood from the fact that the hardest gluon emission, which is improved to full NLO precision by the matching procedure, only obtains sizable NLO corrections away from the singular resonance region. In the resonance region, however, the NLL splitting function approach already provides a fully adequate description and the genuine non-singular NLO corrections are very small. For the cases shown in Figs. 18 the peak shifts due to NLO effects are typically less than ∆τ peak ∼ 10 −4 in the massless case and less than ∆M τ,peak ∼ 100 MeV in the massive case which is considerably smaller than the effects of the shower cut dependence we consider. We have checked that this property is generic and valid for all energies and shower cut values we have examined in our earlier studies. The results confirm that NLO matched parton shower simulations do not add more precision in the thrust resonance region and, in particular, do not modify the shower cut dependence of the simulations without NLO matching.
At this point it is also instructive to examine the impact of NLO matching to the reconstructed observables m b j and m b j W , which we have already examined at LO in Sec. 7.6. Within Herwig 7, concerning the description of top quarks, the MC@NLO-type matching provides only NLO improved simulations concerning the production of the top quarks while the POWHEG-type matching provides NLO improved simulations concerning the production and the decay of the top quarks, where we refer to Ref. [83] for more details. In our LO examination in Sec. 7.6 we have already seen that m b j and m b j W are quite sensitive to the modeling of the decay for b-jet clustering for small jet radii as they are used in experimental reconstruction analyses. At the same time, for small jet radii m b j and m b j W are by construction insensitive to details of the top quark production. We can therefore expect that the LO and MC@NLO-type simulation results are very similar, while the POWHEG-type results may receive notable NLO corrections. This is shown in Figs. 19 where the m b j (left panel) and the m b j W distributions are displayed for Q = 700 GeV, m t = 173 GeV, R = 0.5 and Q 0 = Q 0,b = 1.5 GeV at LO (solid red curve), with MC@NLO-type matching (dashed blue curve) and POWHEG-type matching (dotted green curve). As expected, we see that the MC@NLO-type matching for top production has essentially no impact, while we find visible effects in the distribution for POWHEG-type matching. However, in the top mass sensitive regions these are substantially smaller for m b j W than for m b j , which is particularly conspicuous when comparing the curves in Figs. 19 to the corresponding ones shown in Figs. 15, where the dependence on the top quark mass was illustrated.
Focusing on the shower cut dependence of the kinematic top mass sensitivity of m b j and m b j W we again use the approach of Sec. 7.6 by fits of the top quark mass with respect to the default shower cut setting Q 0 = Q 0,b = 1.5 GeV (see the paragraph prior to Eq. (7.5) in Sec. 7.6 for the description of the fitting approach). In Figs. 20 and Figs. 21 the dependence of the fitted top mass obtained from the m b j endpoint region (stars) and from the m b j W resonance region (triangles), respectively, is displayed at LO and with NLO matching using the same settings as in Figs. 16 where we only displayed the LO results. We again show the results for shower cuts Q 0,b = Q 0 = 1.0, 1.5 and 2.0 GeV for jet radii R = 0.25 (upper left panels), R = 0.5 (upper right panels), R = 1.0 (lower left panels) and R = 1.5 (lower right panels), and we have carried out the analyses for b-jet clustering using the k t -type algorithm (green symbols), the Cambridge-Aachen-type algorithm (blue symbols) and the anti-k t -type algorithm (red symbols). To allow for an easier visual identification of the We see that the NLO matching has essentially no impact on the fitted top mass for large jet radii and the cutoff dependence agrees again very well with Eq. (7.5), which is displayed in the lower right panel (R = 1.5) as the solid blue line with Q 0 = 1.5 GeV as the reference scale. This is expected since m b j and m b j W with large b-jet clustering radius are by construction neither sensitive to the top production mechanism and nor to details of the top quark decay. It is conspicuous, however, that there is also very good agreement between the LO and NLO fitted top masses for small jet radii. For comparison we have displayed again Eq. (7.6) with Q 0 = 1.5 GeV as the reference scale in the upper left panel (R = 0.25). We recall that Eq. (7.6) describes the expected shower cut dependence for R ∼ m t /Q with the contributions coming from large angle soft radiation being removed while those from the ultra-collinear radiation being kept. So we see that, even though the POWHEG-type matching has sizable nominal effects on the distributions for the reconstructed observables, particularly for m b j , the relative shower cut dependence itself it essentially unchanged.
This outcome again fully supports the idea of universality of the shower cut dependence and its independence concerning NLO matched predictions, and it is precisely what is to be expected if the equivalence of the MC generator top mass and the shower cut dependent CB mass of Eq. (6.9) is universal. However, as already noted in Sec. 7.6, a strict proof would require a a thorough quantitative (and preferably analytic) understanding of the b-jet clustering for exclusive observables such as m b j and m b j W to unambiguously track the shower cut dependence. We emphasize again, that such quantitative understanding should at best be achieved in the context of a QCD factorization approach as it allows for a direct, clean and unambiguous field theoretical association of the different types of radiation concerning dynamical physical effects and contributions affecting the interpretation of QCD parameters such as the top quark mass.

Conclusions
The emergence of infrared divergences and their proper treatment to achieve meaningful physics predictions represents one of the major conceptual and technical issues in modern applications of perturbative QCD in the context of collider physics. These divergences   emerge in partonic computations in the (unphysical) limit of infinitesimally small resolution concerning infrared energies and momenta and are resolved by treating partonic configurations below the resolution scale as contributions to the same observable configuration. Within this approach, infrared cuts used to regulate the infrared divergences at the intermediate steps of the perturbative calculations, can then be sent to zero, where the limit of this procedure typically defines what is commonly perceived as the perturbative component of cross section predictions. In the context of multi-purpose MC event generators, where the parton showers are responsible for the description of the parton dynamics below the hard interaction scale, the same principles are applied. However, the infrared shower cut Q 0 which terminates the parton shower evolution is finite, typically in the range of 1 GeV, and leads to a power-like dependence of the parton-level predictions on Q 0 depending on the mass dimension and the infrared sensitivity of the observable. As we have discussed in this work for observables with kinematic top mass sensitivity this dependence on the shower cut Q 0 turns out to be linear and non-negligible given that the current experimental precision in top quark mass determinations based on direct reconstruction methods already reached the level of 0.5 GeV.
In this work we analyzed in detail the role of the shower cut Q 0 in angular ordered parton showers based on the coherent branching formalism for quasi-collinear, i.e. boosted, massive quarks at NLL. We have demonstrated, using an eventshape-type observable based on hemisphere masses and closely related to thrust (see Eqs. (2.1) and (2.2)) in the resonance region where the highest kinematic top mass sensitivity is located, that the finite shower cut automatically implies that the generator top quark mass is the Q 0 -dependent coherent branching (CB) mass, m MC t = m CB t (Q 0 ), even though the underlying analytic expressions that go into the formulation of the parton shower are derived in the pole mass scheme. The CB mass is a low-scale short-distance mass and free of an O(Λ QCD ) renormalon ambiguity. At O(α s ) its relation to the pole mass m pole The inclusion of NLO corrections in the context of NLO-matched parton showers does not add more precision to this relation as the additional NLO information improves the perturbative description of configurations that are located outside the resonance region, i.e. outside the region where the main kinematic top mass sensitivity arises. In Sec. 6 we have provided a detailed summary of all our theoretical findings and in Sec. 7 we have confronted them with parton-level simulations carried out using the Herwig 7 event generator. The simulation results fully confirm our conceptual conclusions concerning the equivalence of the generator top mass and the shower-cut-dependent CB mass, and we also gathered evidence that the equivalence is universal and also applies for other more exclusive observables such as the b-jet and lepton invariant mass m b j and the reconstructed top invariant mass m b j W in the limit of boosted top quarks. In the course of our examinations we also analyzed in detail the shower cut dependence coming from large-angle soft radiation which is universal for the production of massless quarks and boosted top quarks and which represents an interface to the hadronization model used in the MC event generator. These results have implications for the hadronization corrections in event-shape distribution and the extraction of α s which we have, however, not addressed in this work and will be discussed elsewhere.
To conclude this work we address two important questions which have not been addressed in the main body of the paper. The first is about the remaining conceptual issues that have to be resolved to universally explore the meaning of the MC generator top quark mass in the context of state-of-the-art MC event generators that are used in the experimental analyses. The second is about how to best convert m CB t (Q 0 ) to other top quark mass schemes. This issue gains particular importance if one assumes that the MC top quark mass m MC t determined in direct reconstruction methods at hadron colliders is indeed equal to the coherent branching mass m CB t (Q 0 ).
Before we address these issues we would like to emphasize that the proper field theoretic specification of the generator top quark mass m MC t as a particular mass renormalization scheme does not touch in any way the important questions how MC modeling uncertainties such as for the description of multi parton interactions or relevant for the event selection and the description of hadronization effects in the final states such as color reconnection affect the top mass measurements. These uncertainties are and shall continue to be under scrutiny, and their study may lead to improved MC generators in the future. The focus of the present work, on the other hand, is that the principle field theoretic meaning of the cutoffdependence of the generator top quark mass can be studied and resolved independently of these issues and thus deserves particular attention by itself. Associated dedicated studies cover subtle effects that are, however, already relevant in view of the current experimental uncertainties in top quark mass determinations and may in a complementary way contribute to improved MC generators.
Let us now address the first issue. The basic simplifications for the examinations carried out in this work were that we used (i) parton level studies, (ii) the narrow width approximation, (iii) boosted (quasi-collinear) top quark kinematics and (iv) hemiphere masses in e + e − collisions closely related to the thrust/2-jettiness event-shape. In the context of hemisphere mass studies, the extension to MC hadron level studies is straightforward and shall be carried out in forthcoming works. Here the main question to be addressed is how well the MC hadronization models are compatible with the parton-hadron level factorization of Eqs. (2.3) and (2.15), which is an intrinsic property of QCD. The main point then to be clarified is, whether MC hadronization models have the capability to retune the top quark mass -a property that would make the MC top quark generator mass a hadronization parameter und mean that there are additional MC dependent non-perturbative contributions that have to be accounted for in the relation between the MC generator top mass m MC t and the coherent branching mass m CB t (Q 0 ). Concerning the narrow width approximation, we note that state-of-the-art parton showers for massive quarks do not have the capability to describe unstable particle effects from first principles. These unstable particle effects include the top quark intrinsic Breit-Wigner smearing of its invariant mass as well as interference effects connecting top and non-top processes through equivalent top decay final states. Accounting for unstable particle effects may be possible in the context of MC generators matched to calculations including the full top production and decay process or in the context of new MC generators which incorporate unstable particle effects in terms of systematic expansions that are more general than the narrow width approximation. Such studies are in reach and may be addressed in the near future. Concerning the approximation of boosted top quark kinematics, imagining systematic studies for slow top quarks comparable to the examinations carried out in this work, the prospects are far less clear. This is because the existing parton shower formalisms for massive quarks are by construction designed to be valid in the quasi-collinear limit -even though the bulk of the top quarks entering current experimental analyses have relatively low transverse momenta between 50 and 100 GeV and cannot be considered to be quasi-collinear. So progressing into this direction involves general studies of the MC modeling for top quarks that in principle go beyond the problem of the MC top quark generator mass. Finally, concerning the extension of examinations at the level of those carried out in this work to other types of observables covering also hadronic collisions at the LHC, such studies require the development of new types of factorization theorems. For groomed fat jet masses for boosted top quark production at the LHC a factorization approach was recently developed [24], but factorization theorems for more exclusive variables such as m b j or m b j W , which are currently absent, are desirable as well. Furthermore, pushing the existing factorization approach for thrust and the description of the shower cut dependence to one higher order would be useful as well since it would allow for an explicit check of the O(α 2 s ) corrections to relation (8.1). Let us now address the second issue. Assuming that the currently most precise top quark mass measurements of m MC t can be identified with a measurement of the CB mass m CB t (Q 0 ) defined in relation (8.1), how well can it be converted to other mass schemes? Given that most theoretical predictions for top quark physics at the LHC are carried out in the pole mass scheme, one may simply convert the CB mass to the pole mass using Eq. s ) corrections and indicates that the convergence is not particularly good. This is, however, expected since the pole mass has an O(Λ QCD ) renormalon ambiguity. From the analysis of Ref. [82], where a determination of the pole mass from a short-distance mass at the scale 1.3 GeV was studied in detail, we can expect that O(α 2 s ) and O(α 3 s ) corrections are also needed to determine the pole mass value and that at O(α 3 s ) there is a remaining irreducible uncertainty of around 250 MeV due to the O(Λ QCD ) renormalon ambiguity of the pole mass (see also [95] for an alternative view on the size of the renormalon ambiguity of the pole mass). Thus the determination of the currently unknown O(α 2 s ) and O(α 3 s ) corrections to Eq. (8.1) is important to reliably determine the pole mass. To determine the O(α 2 s ) corrections in the factorization approach the effects of the shower cut need to be implemented into the bHQET jet function at O(α 2 s ). To determine the O(α 2 s ) corrections in the context of coherent branching formalism (or angular ordered parton showers) the effects of the shower cut have to be analyzed in the context of a fully consistent next-tonext-to-leading order evolution. The overall conclusion is that the difference between the pole mass and the CB mass, m pole t − m CB t (Q 0 = 1.25 GeV), is likely at least as large as the current uncertainties in top quark mass measurements from direct reconstruction of around 500 MeV (see Sec. 1.1) and requires the determination of two and three-loop corrections. Even when these corrections become available, there will be is an irreducible uncertainty of 250 MeV. So, for a reliable determination of the pole mass the unknown higher order corrections to Eq. (8.1) are very important, and the ultimate uncertainty in the pole mass is at the same level as the precision of 200 MeV that may be achieved for measurements of m MC t in the future high-luminosity run of the LHC [96,97]. Alternatively, since physical observables are not tied conceptually to the pole mass scheme in any way and its O(Λ QCD ) ambiguity is a pure artefact of the pole mass definition, one can as well parametrize calculations using suitable short-distance top quark mass schemes. In this approach the sizeable corrections and the renormalon ambiguity associated to the pole mass scheme -as well as any controversial discussion on the actual size of this ambiguity -can be avoided entirely. To illustrate this approach let us consider a determination of the MSR mass m MSR t (Q 0 ) from a given value of the CB mass m CB t (Q 0 ). At O(α s ) the relation between the scale-dependent MSR mass [19,20]  As expected from the fact that the difference of MSR and CB masses does not contain any O(Λ QCD ) renormalon ambiguity, the scheme corrections to obtain the MSR mass are small, and one can also expect that they exhibit good convergence because MSR and CB masses are both short-distance mass schemes. The difference of 70 MeV can be viewed as an illustration of the currently unknown O(α 2 s ) corrections and indicates that the knowledge of the two-loop corrections in Eq. (8.1) may be sufficient to convert the CB mass to the MSR with a precision of better than 50 MeV. Compared to the current uncertainties in top quark mass measurements from direct reconstruction of around 500 MeV these corrections are small and the knowledge of these two-loop corrections is not required. Furthermore, as was shown in Ref. [20], one can convert the MSR mass to all other commonly used short-distance mass schemes, such as the 1S [98][99][100], the PS [101] or the MS schemes, with a precision of 10 MeV. The overall conclusion is that, when using only short-distance mass schemes, the achievable precision in converting the MC/CB mass to other mass schemes is already at this stage substantially higher than the current experimental uncertainties and also than extrapolations concerning the future high-luminosity run of the LHC which indicate that a precision of 200 MeV [96,97] for a determination of the top quark mass can be reached.  the bHQET jet functions read [15] H m (m, µ) = 1 + α s (µ)C F 4π 2 ln 2 m 2 µ 2 − 2 ln m 2 µ 2 + 8 + where the coeffients at NLL precision are given in Eqs. (2.24), see also Eqs. (2.11) and (2.12). These results have been obtained using dimensional regularization to regulate ultraviolet as well as infrared divergences and do not account for any other infrared cutoff. Ultraviolet renormalization has been carried out in the MS scheme.
B Jet and soft functions in SCET and bHQET with a p ⊥ cut at O(α s )

B.1 Unreleased soft function for thrust
In this section we provide details on the calculation of the unreleased thrust soft function S (τ ) ur at O(α s ). The unreleased soft function describes large angle soft radiation originating from below the p ⊥ cut Q 0 . We carry out the calculation using the dressed gluon propagator of Eq. (5.18) which is suitable to obtain the soft function in Borel space (accounting for fermion bubble resummation to all orders). From this we can easily identify the O(Λ QCD ) renormalon pole located at u = 1/2. To obtain the usual one-loop result one can take the limit u → 0 in the end and multiply back the factor (α s β 0 )/(4π) effectively removed by the dressed gluon propagator in this limit. We note that at O(α s ) all integrations can be readily carried out in 4 dimensions because the unreleased radiation does not result in any ultraviolet divergences. However, in contrast to the calculations without any p ⊥ cut we also have to consider the contributions from the virtual diagrams, because the scale Q 0 constitutes an additional scale such that the virtual diagrams may lead to finite contributions. Interestingly, as we show below, the virtual diagrams lead to vanishing results even for finite Q 0 .
The Borel space contribution from the real radiation diagrams (including the mirror diagram) shown in Fig. 22a reads with c = 5/3 in the MS renormalization scheme for the strong coupling. In Eq. (B.1) we introduced the rapidity regulator α on the q − light-cone component. This is regulator is useful because the upper bound for the transverse momentum integration leads to intermediate 1/α rapidity divergences, which, however, cancel when summing the contributions from the two hemispheres (defined by the contributions associated to the two θ step functions). So, overall there are no rapidity divergences in the O(α s ) thrust soft function with an upper p ⊥ cutoff. Doing the trivial delta function integrations gives Next we can calculate the q ⊥ integral and take the imaginary part employing the relation 1 π Im where in the last line we have used the fact that qk > 0. This leaves us with the sum of three integrals where we have already taken the limit α → 0 in the first two terms since they are finite for α → 0, and B[z; a, b] is the incomplete beta function. For the third term the α → 0 limit has to be taken more carefully, using With this we finally arive at where H n = ψ(n + 1) + γ E is the harmonic number function and B z; a, b the incomplete Beta function. As already discussed before, there are no rapidity divergences in the thrust soft function and all 1 α poles cancel in the final result. Since the virtual diagrams turn out to vanish (see below), this represents already the full result for the unreleased thrust soft function.
To identify the leading renormalon pole we Laurent expand Eq. (B.6) around u = 1/2. Using the relation we find the pole contribution To obtain the O(α s ) unreleased soft function one has to take the limit u → 0 of Eq. (B.6) and include back again the factor (α s β 0 )/(4π). The result is We will now show that the virtual contributions to unreleased soft function vanish even in the presence of a p ⊥ cut. The Borel space contribution from the sum of the virtual diagrams shown in Fig. 22b reads B S (τ,virt) ur (k, Q 0 ) = i 64 C F π 2 β 0 µ 2 e −c u δ(k) d 4 q (2π) 4 θ(Q 0 − q ⊥ ) (−q 2 ) 1+u (n · q)(n · q) 1+α , (B.10) where we have again introduced the α regulator and used n 2 =n 2 = 0 and n ·n = 2. This integral is most conveniently solved by using Feynman parameters of the form such that one finds The q integral is solved by using Eq. (C.6) and leads to The integral is scaleless and thus vanishes. Thus virtual diagrams do not contribute to the unreleased soft function at O(α s ).

B.2 Unreleased soft function for angularities and C-parameter
It is straightforward to determine soft functions for other event shape variables using the method described in Sec. B.1. In the following we provide the corresponding results for the angularities τ α and the C-parameter for future use. For all results we have c = 5/3 in the MS scheme andk ≡ k/Q 0 .

B.3 Unreleased bHQET jet function
In this section we calculate the unreleased bHQET jet function at O(α s ). The unreleased bHQET jet function arises from ultra-collinear radiation off the massive quark coming from below the p ⊥ cut Q 0 . As for the unreleased soft function we carry out the calculation using the dressed gluon propagator of Eq. (5.18), which is suitable to obtain the Borel transform accounting for fermion bubble resummation to all orders and to obtain the usual O(α s ) result from the limit u → 0 and accounting for the additional factor (α s β 0 )/(4π). All integrals can again be carried out in 4 dimensions since in the presence of the p ⊥ cut they are ultraviolet finite. In contrast to the calculation for the soft function, there are no rapidity divergences at intermediate steps of the calculation. We note that in the following we determine the O(α s ) corrections to the unreleased bHQET jet function matrix element J (τ ) B,ur (ŝ, Q 0 ) with andŝ =ŝ + i0 following the conventions from Ref. [15]. The actual unreleased bHQET jet function J B,ur (ŝ, Q 0 ) appearing in the factorization theorem of Eq. (2.19) is then obtained by taking the imaginary part: , (B.23) with v 2 = 1 and c = 5/3 in the MS renormalization scheme. The integral is evaluated in 4 dimensions because in the unreleased contributions there are no divergences that need to be regularized by dimensional regularization. It can be calculated by using Feynman paramters of the form ) .
The diagram in Fig. 23b in Borel space reads (including a factor two to account for both hemispheres and a factor two for the mirror diagram) , with v 2 = 1, n 2 =n 2 = 0, n ·n = 2 and n · v = Q/m. Again the prescriptionŝ =ŝ + i0 is implied. The second term under the integral is the 0-bin that needs to be subratcted to avoid a double counting between the soft and the collinear regions. We can again use The q integral is solved by using Eq. (C.6) and leads to (after doing the additional substitution λ 1 → 2Q 0 λ 1 , λ 2 → Q 0 mλ 2 /Q ands =ŝ/Q 0 ) We note that the integral does not lead to a pole at u = 1/2, because the λ integral is finite. This can be seen by investigating the behavior of the integrand for the small and larger λ: for the unreleased contribution.

D Simulation settings
In this appendix we document the changes relative to the default settings in Herwig version 7.1.2 [102] with which these studies have been carried out. All of the results are parton level simulations, with special settings to make contact with the analytic approach and are not advocated to be used in a full simulation. All simulation is based on the default LEP-Matchbox.in input file, which is prepared to generate both leading order and next-to-leading order matched simulations.

D.1 Common settings
In all of the simulations we consider, we use light quarks u, d, s, c, b by setting their nominal mass to zero, and their consitutent masses, as well as the gluon's constituent mass to be effectively zero, No other special settings are applied.