NNLL resummation of Sudakov shoulder logarithms in the heavy jet mass distribution

The heavy jet mass event shape has large perturbative logarithms near the leading order kinematic threshold at ρ=13\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \rho =\frac{1}{3} $$\end{document}. Catani and Webber named these logarithms Sudakov shoulders and resummed them at double-logarithmic level. A resummation to next-to-leading logarithmic level was achieved recently. Here, we extend the resummation using an effective field theory framework to next-to-next-to-leading logarithmic order and show how to combine it with the resummation of dijet logarithms. We also solve the open problem of an unphysical singularity in the resummed momentum space distribution, in a way similar to how it is resolved in the Drell-Yan qT spectrum: through a careful analysis of the kinematics and scale-setting in position space. The heavy jet mass Sudakov shoulder is the first observable that does not involve transverse momentum for which position space resummation is critical. These advances may lead to a more precise extraction of the strong coupling constant from e+e− data.


Introduction
Event shapes in electron-positron collisions have long been valuable testing grounds for quantum field theory and quantum chromodynamics. High quality event shape data predominantly from the Large Electron-Positron Collider (LEP) have allowed for a detailed comparison between theory and experiment and engendered improvements on both fronts. On the theory side, inspiration from data has lead to higher-precision fixed-order computations, up to the current state of the art of NNLO (α 3 s ) [1][2][3][4][5][6]. Data has also inspired exponential progress in resummation, up to N 3 LL precision using Soft-Collinear Effective Theory (SCET) [7][8][9][10][11] in the dijet region for thrust, heavy jet mass, and the C parameter [12][13][14][15], and NNLL precision on other event shapes [16]. The comparison between theory and data has also motivated improvements in the treatment of non-perturbative corrections from hadronization, including using matrix elements defined in quantum field theory [17][18][19], an improved understanding of universality relations [20,21], and the inclusion of hadron mass corrections [22,23].
Until recently, however, there has been very little understanding of the trijet region, with 3 hard jets, in e + e − events. This region has traditionally been assumed to be wellmodeled in fixed-order perturbation theory, since it is well-separated from the dijet region where large threshold logarithms spoil the convergence of the perturbation expansion in α s . However, fixed-order perturbation theory fails spectacularly right at the 3-jet threshold where it predicts kinks and discontinuities. The failure of perturbation theory in the trijet region was understood by Catani and Webber [24], and also Seymour [25], as due to the appearance of Sudakov shoulder logarithms. Although the resummation of the Sudakov shoulder logarithms at the double-logarithmic level was achieved analytically for the C parameter, little additional progress was made in the last 25 years. Recently, Sudakov shoulder resummation was revisited using SCET where a factorization theorem was derived and next-to-leading logarithmic (NLL) resummation was achieved for thrust and heavy jet mass [26]. In addition, there has been recent progress in understanding power corrections near the 3-jet region using renormalon methods [27,28] and the dispersive approach [29]. In this paper, we focus on understanding and improving perturbative Sudakov shoulder resummation for heavy jet mass. Heavy jet mass is special in that it has a Sudakov shoulder on the left of the trijet threshold, so the shoulder logarithms propagate into the region usually used for fits to α s . In contrast, thrust only has a right shoulder, so that large shoulder logarithms do not impact the region used for such fits.
Thrust is defined in the center-of-mass frame of an e + e − collision as where the sum is over all particles in the event and the maximum is over 3-vectors n of unit norm. The vector n that maximizes thrust is known as the thrust axis. The thrust axis splits the event into two hemispheres. One can then add all the momenta in each hemisphere and compute the hemisphere invariant masses M L and M R . Heavy jet mass is Conventionally one also defines τ = 1 − T . So at zeroth order with two massless partons τ = ρ = 0. We call the region near τ = ρ = 0 the dijet region. At leading order in α s the thrust and heavy jet mass distributions are the same (for massless particles) with a kinematic maximum at τ = ρ = 1 3 . Both thrust and heavy jet mass have non-analytic behavior near τ = ρ = 1 3 . To be concrete, the leading non-analytic behavior of thrust near τ = 1 3 at NLO was computed in [26] and found to be in where t ≡ τ − 1 3 , with σ 0 denoting the Born cross section. Thrust only has a right shoulder: its behavior for τ 1 3 is smooth. To get the full leading order thrust distribution in this region one must add to Eq. (1.3) the analytic linear function − αs 4π 144C F t which effectively turns tθ(t) into (−t)θ(−t) with support only for τ < 1 3 . For heavy jet mass near ρ = 1 3 we define r = 1 3 − ρ. The leading non-analytic behavior of its distribution near r = 0 is Heavy jet mass has large logarithms on both sides of ρ = 1 3 , giving it a left (r > 0) and right (r < 0) Sudakov shoulder.
Physically, the origin of the large logarithms can be understood as follows. At leading order (three partons) there is only a single phase space point with ρ = 1 3 , namely the "trijet configuration" where the partons have energy Q 3 and are spaced 2π 3 apart; two partons are clustered into one hemisphere and one parton remains in the other. The thrust axis can be aligned with any of the partons. As these three partons emit soft and collinear gluons, the cross section becomes logarithmically enhanced. These soft and collinear logarithms in the cross section in the trijet region become Sudakov shoulder logarithms after the thrust or heavy jet mass measurement function is applied.
A careful analysis of soft and collinear kinematics near the trijet region leads to a factorization formula for Sudakov shoulders. We can treat collinear emissions as transforming the three massless partons into jets with masses M 1,2,3 . The thrust axis will align with one of the jets, but which one it aligns with depends on the exact kinematics. If M 1 is the mass of the jet in the light hemisphere, and M 2 and M 3 are the masses of the jets in the heavy hemisphere, then, defining m i = M i Q , the bound is (1.6) The double differential distribution d 2 σ dm 2 dm 2 h can then be resummed using standard techniques, see [30].
A key feature of the heavy jet mass Sudakov shoulder factorization formula in Eq. (1.6) that distinguishes it from typical event shape factorization formulas is the relative minus sign between m 2 h and m 2 . In particular, for small r, the region where m 2 1 and m 2 h 1 contributes, even though this region is not described by soft and collinear physics. Such contributions could generically lead to non-global logarithms, or worse, a failure of the factorization theorem to describe the threshold behavior. However, in this case, as explained in [26], we still expect all the shoulder logs to be described correctly. The reason is that when m h and m are large, the point r = 0 is not special: contributions to this region should go smoothly from r < 0 to r > 0. The kinks and large logarithms in the shoulder region emerge only when m h and m are small so that the lower bounds m h > 0 and m > 0 become relevant. Despite this, when one uses standard resummation methods with canonical scale choices, spurious "Sudakov Landau pole" singularities result. Similar poles have been seen also in the small q T Drell-Yan spectrum [31][32][33][34][35][36]. A main result of this paper is a demonstration that these Sudakov Landau poles are avoided by carrying out the resummation with canonical scales in position space. Additional results are the extension of the Sudakov shoulder resummation to NNLL, and smoothly combining the dijet resummation, shoulder resummation, and fixed-order corrections for the heavy jet mass.
An outline of the paper is as follows. In Section 2, we review the factorization theorem for the Sudakov shoulder region, and describe the necessary ingredients for NNLL resummation. Following that, in Section 3, we review the appearance of spurious Sudakov Landau poles when setting scales in momentum (r) space, and address how these can be carefully removed by setting scales in position (Fourier) space. In Section 4, we describe our resummation procedure for Sudakov shoulder logs, and set up a matching procedure to account for dijet logs as well, thereby forming a consistent picture of resummation across both the dijet (ρ 0) and trijet region (ρ ∼ 1/3) in heavy jet mass. In Section 5, we provide our numerical results which combine NNLL shoulder resummation and NNLL dijet resummation matched to NLO. Finally, we conclude in Section 6.

Factorization formula
In this section we review the factorization formula from [26]. In the trijet limit, there are three independent channels associated with a gluon jet in the light hemisphere, a quark jet in the light hemisphere or an anti-quark jet in the light hemisphere. The contribution of -4 -these channels to the total cross section add incoherently, leading to a form where we have used symmetry under charge-conjugation to identify the quark and antiquark channels. In our notation, when σ has a g or q subscript it automatically identifies it as a leading power factorized contribution near the shoulder.

Factorization formula
We first focus on the gluon channel for concreteness. In [26], it was shown that the factorization formula for heavy jet mass in the shoulder region has the form where σ 0 is the Born cross-section for e + e − → qq. Here m 2 h and m 2 are the (dimensionless) mass-shift variables, representing the change in the heavy or light jet masses-squared due to soft and collinear emissions relative to Q 2 ; the evolution kernel is, the fixed-order matching operator is and after taking derivatives, a and b are set to Here we have allowed for different jet and soft factorization scales in the heavy hemisphere (µ jh , µ sh ) and light hemisphere (µ j , µ s ), slightly generalizing the expression in [26]. As we discuss in Section 3.4 below, this separation is not sensible beyond NLL, and ultimately we will set µ s and µ sh to a common scale µ s . For the quark channel the evolution kernel Π g is replaced by -5 -and sj g is replaced by For both channels, the resummation is expressed in terms of [37] S Γ (ν, µ) = − αs(µ) and (2.10) The heavy jet mass distribution near the shoulder has the form dσ dρ = c n,m α n s r ln m r. Because the integrals over m and m h in Eq. (2.2) are UV divergent (as discussed in [26]), we consider first the third derivative d 3 σ dρ 3 . Taking an extra two derivatives turns the measure- . Using additional boundary conditions obtained from fixed-order matching, this third derivative can be integrated back up to the physical spectrum, as we will demonstrate in Section 4.1. The expansion of d 3 σ dρ 3 near the shoulder yields a series in distributions of the form ln n r r + , analogous to how the expansion of dσ dρ in the dijet limit yields a series of distributions of the form ln n ρ ρ + . The resummed form of d 3 σ dρ 3 is for the gluon channel, and similarly for the quark channel.

NNLL resummation
For NNLL resummation, we need the three-loop cusp anomalous dimension, the three-loop β-function coefficients, the two-loop regular (non-cusp) anomalous dimensions for the hard, jet and soft functions, and the one-loop finite parts of the hard, jet and soft functions. The two-loop gluon and quark jet functions [38,39] are the same as for other processes, see e.g. [40,41], and we summarize all perturbative ingredients in Appendix B. The Sudakov shoulder hard function is obtained by matching the QCD vector current onto the corresponding SCET current operator. The virtual diagrams in SCET are all scaleless in dimensional regularization, so the hard function can be extracted from the virtual contribution in QCD. For the case of matching onto three hard partons in the final state, the QCD results can be found in [42]. The result is summarized in Appendix B. The hard function is the same up to crossing as for any qqgγ process, such as direct photon production [40] or 3-jettiness [41], restricted to the trijet configuration. We have checked the hard function against these results as well.
The soft function for the heavy jet mass Sudakov shoulder is defined as a matrix element of Wilson lines in the trijet configuration. The soft function involves collecting emissions into 6 separate sextant regions shaped like wedges of a 6-slice orange. Then the soft gluons in the sextants within the light or heavy hemispheres are combined separately with appropriate projections. This leads to a two-argument soft function, which in momentum space is defined as Here, Y n denotes a path-ordered exponential of gauge fields (a Wilson line) extending from x = 0 to x = ∞ along the n µ direction. As written, all of these Wilson lines are in the fundamental representation of SU (3). We conventionally choose n µ 1 to point along the thrust axis towards the light hemisphere so that Eq. (2.12) corresponds to a soft function with a gluon jet in the light hemisphere. The other channels arise from permuting the Wilson line indices. For heavy jet mass, the trijet measurement functionM (k , k h ) is defined aŝ where the sum runs over the momenta of all particles in the state |X s . The notation θ 1 (k) refers to a projection onto the sextant wedge centered on the direction n µ 1 (see [26]), i.e. θ 1 (k) = θ(n 2 · k −n 2 · k)θ(n 3 · k −n 3 · k), and analogously for θ 2 (k) and θ 3 (k). Similarly, θ1 refers to a projection onto the wedge in the directionn µ 1 , which is opposite to the one in the direction n µ 1 , θ 1 (k) = θ(n 2 · k − n 2 · k)θ(n 3 · k − n 3 · k), and similarly for θ2(k) and θ3(k).
In each region only one component of the momentum k i is retained. Among the six projections, four of them are 2 3 n · k with n µ a lightlike vector pointing to the center of the sextant. However, for the two sextants adjacent to the light-hemisphere direction n µ 1 , the projections involve two spacelike directions (2.14) Ref. [26] computed the one-loop soft anomalous dimension needed for NLL resummation. The complete one-loop soft function including the soft constant is calculated in -7 -Appendix A. For NNLL resummation, we also need the 2-loop regular soft anomalous dimension γ s . This can be fixed by RG invariance and is expressible in terms of the hard and jet anomalous dimensions. The result is also summarized in Appendix A.

Scale setting
In this Section, we show that by choosing scales in position space, the Sudakov Landau pole is avoided. We also explore reasons why this should be true.

Momentum space vs. position space
One can do the integrals over m 2 h and m 2 exactly before setting scales. The relevant integral has the form where we shorten notation by denoting x ≡ m 2 , y ≡ m 2 h . This expression has poles at a + b ∈ Z. The pole at a + b = 1 is the Sudakov Landau pole frustrating the numerical resummation of the Sudakov shoulder logarithms. It comes from the region where x, y 1 and x − y = r.
Although the function f (r) in Eq.(3.1) seems complicated, its Fourier transform is not. It is easiest to compute the Fourier transform before integrating over x and y, doing the integrals in a different order in the r > 0 and r < 0 region: So the Fourier transform is remarkably simple. The simple form of the Fourier transform belies the complicated non-analytic behavior of (−iz) −a (iz) −b . In taking the inverse Fourier transform, one must carefully adjust the contour to avoid the singularity at z = 0. For example, to take the Fourier transform of a single power z −c we must adjust the contour above or below the branch point. Then closing the contour either encircles the branch point or gives zero, depending on the sign of the argument. The result is In a similar manner, the branch point in the position space function (−iz) −a (iz) −b is ultimately responsible for the complicated piecewise form in Eq. (3.1). 1 The pole at a + b = 1 comes from the large y region: This is the region of large jet mass shifts m 2 = x 1 and m 2 h = y 1 where their difference r = x − y is small. If one sets scales in momentum space, so a ∼ b ∼ α s ln r then the pole at a + b = 1 corresponds to one value of r. Large x and y in momentum space corresponds to small |z| in position space. For small |z| the integral is approximately This would indeed be singular for a + b = 1 at small |z|. However, when scales are set in position space, so a ∼ b ∼ α s ln |z|, then a and b change over the integration region, in contrast to momentum space scale setting where a and b are fixed. Moreover, the region where a + b = 1 does not correspond to small z. In this way, the singular region is simply avoided if one sets scales in position space.

Scale setting
Above we showed how setting scales in position space should avoid the Sudakov Landau poles. Now we will explore the differences between momentum and position space-setting in more detail. The idea behind resummation is to include systematically terms in the perturbation expansion of the form c n,m α n L m for some infinite series of c n,m where L is a relevant logarithm and α the relevant coupling. It can be helpful to distinguish logarithmic counting schemes. All these schemes agree to some order in resummed perturbation theory and differ only by higher-order terms. However, these higher-order terms can be important, and in this case, are the source of the spurious Sudakov Landau poles as we will see.
A common scheme counts logs in the log of the cumulant [43] R(α, L) = exp[Lg 1 (α s L) + g 2 (α s L) + α s g 3 (α s L) + · · · ] . (3.7) Here R(α, L) is the cumulant of some observable. This is in contrast to the differential distribution, which is the derivative of R(α, L) with respect to the observable and which generally has an expansion in terms of plus functions and other distributions. With this 1 As an aside, it is worth noting that the result of applying the commonly used one-sided Laplace transform to f (r) is not so simple. The Laplace transform is defined with a semi-infinite integration ∞ 0 dre −νr and so for fixed ν cannot be convergent for both positive and negative r. The best one can do is flip the sign of the exponent by hand between the r > 0 and r < 0 regions leading to This form is not simple at all, and moreover still has the Sudakov Landau pole singularities at a + b ∈ Z.
-9 -organization, g 1 (αL) is the leading-logarithmic series, g 2 (αL) is the next-to-leading logarithmic series, and so on. An alternative set of schemes is based on the perturbative expansion of anomalous dimensions. Here, the leading cusp anomalous dimension Γ 0 and the one-loop β function are included for LL resummation, Γ 1 , the two-loop β function, and the one-loop regular anomalous dimensions γ 0 are included for NLL, and one higher order for each to go to NNLL, and so on. In addition, the finite parts of the hard, jet, and soft functions are required to one-loop (two-loop) order to achieve NNLL (N 3 LL) resummation. Another commonly used combination is that of NNLL RG evolution (i.e., two-loop regular anomalous dimensions) with two-loop fixed-order finite parts, which we refer to as NNLL and which typically features improved perturbative uncertainties over NNLL. Within this set of schemes one still has the freedom to set scales in distribution space, cumulant space, or position space. Ultimately, we will be setting scales in position space for our final results.
In [44] different log-counting schemes are compared for a number of dijet-observables (we review the case of dijet thrust below) and it is shown that results in all spaces coincide up to subleading logarithmic terms. However, it is precisely these subleading terms which are responsible for the spurious divergences in the Sudakov shoulder case. What we will show is that there is an exact correspondence between the terms associated with the leading cusp anomalous dimension and the double logarithmic series (α s L 2 ) n only if scales are set in position space. In contrast, when scales are set in momentum space, problematic subleading terms are produced.

Dijet thrust
Before discussing the shoulder case, let us review scale setting in thrust in the dijet region. Let us work in the simplest possible setting, the "Γ 0 approximation" where we take the leading cusp anomalous dimension Γ 0 = 0 but we set β j = 0 and γ j = 0 for j ≥ 0 and set Γ j = 0 for j > 0. We also take H = 1 and set µ h = Q throughout as the hard scale is not of particular interest. This Γ 0 -approximation suffices to show how one obtains different results depending on in which space scales are set. All choices correctly predict the result at the fixed-scale double-logarithmic order where R ∼ exp(α s L 2 ), but give different results for higher order terms. It is exactly this issue of how to treat terms beyond the logarithmic accuracy one it targeting which we wish to address, since for some observables the higher order terms can induce spurious singularities. In this section, we defineα = αs 4π Γ 0 . In the Γ 0 -approximation, the thrust cumulant is µs . From this, we can read off the canonical scales µ s = Qτ µ j = √ Qµ s . Using these values and settingj =s = 1 the resummed cumulant in momentum space takes the form -10 -with η = −2C Fα ln τ . R(τ ) has the form in Eq. (3.7) with g 1 = −αC F ln τ and g 2 = ln e −γ E η Γ(1+η) . On the other hand if we Laplace transform before setting scales, then (3.10) Here we can read off canonical scales as µ s = Qe −γ E ν and µ j = √ Qµ s . With the Laplacespace scale choice which is a purely LL expansion. The inverse Laplace transform gives This agrees with the momentum space form up to terms formally of order NNLL and higher, so the results of setting scales in either space are equivalent. (In practice, setting scales in Laplace space requires some care because one has to numerically integrate over the Landau pole in QCD [45,46].) In summary, we saw from the example of thrust in the dijet limit that setting scales in position space allows for exact agreement between the logarithmic counting and the solution of the RG equations. In momentum space, some subleading terms are generated even if only Γ 0 is kept as in this simplified example. It is these subleading terms which are problematic in the Sudakov shoulder case, as we will see next.

Heavy jet mass shoulder
Now we return to the Sudakov shoulder of heavy jet mass. We work again in the Γ 0 approximation with µ h = Q, for which the factorization formula in Eq. (2.11) for the gluon channel becomes Note that |r| must be used to obtain canonical scales for both the r < 0 and r > 0 regions, and hence the scales are continuous but not smooth functions. With these choices, the momentum space distribution becomes with a = −C Aα Γ 0 ln |r| and b = −2C Fα Γ 0 ln |r|. This has the form of a NLL distribution exp[Lg 1 (αL) + g 2 (αL)]. The Sudakov Landau pole is in the NLL contribution, g 2 (αL). Now let us look at the distribution in position space. Before setting scales, it has the form We can read off that the canonical scales are With these scale choices, Thus in position space this is exactly an LL expansion with no higher terms. Although we cannot Fourier transform back to momentum space analytically, we can do so numerically. A comparison between the inverse Fourier transform of σ g (z) with µ Can (z), and with µ Can (r), is shown in Fig. 1. We see that when canonical scales are chosen in position space, µ Can (z), the Sudakov Landau pole is gone. When specifying the canonical scales, we are always free to make alternate choices that agree up to O(1) factors, since such choices still minimize large logs. An alternative is to -12 -use real scales, such as Choosing real scales like this in position space will still remove the Sudakov Landau pole. Real scales are additionally easier to implement since they avoid analytically continuing α s (µ) to complex scales, as e.g. done for the resummation of so-called timelike logarithms in gluon-fusion Higgs production [47][48][49][50][51][52]. Moreover, real scales are also preferable since they guarantee that the distribution will be real when Fourier transformed back into momentum space. For generic complex scales the higher order terms, beyond the logarithmic order one is working, can in general be complex. With the scales in Eq. (3.19) the position space distribution is .
(3.20) The numerical inverse Fourier transform of this function is also shown in Fig. 1.
When setting scales in position space and transforming back, one might be concerned that an entirely new expression is generated, with no relation to the one with scales set in momentum space. First of all, we know that if the computations and matching are done to common fixed order α n s , then the momentum and position space expressions must both be scale-independent to that order. Thus when we Fourier transform the position space expression back to momentum space it must agree to that order with the momentum space expression. Although it is harder to show directly, the series of large logarithms must also agree. In the Γ 0 approximation, we can derive an analytic relation between the resummed expressions that also gives insight into how the Sudakov Landau pole is removed. To Fourier transform the distribution using canonical position space scales, we can write formally that (3.21) To get to the last line, we used f (∂ a )r a = r a f (∂ a + ln r) and then expanded (∂ a + L) 2 and used e x∂a f (a) = f (a + x). In this final form, we must set a = −C Aα Γ 0 ln |r| and b = −2C Fα Γ 0 ln |r| after taking the derivatives as in Eq. (3.16). Since a ∼ b ∼α ln r each application ofα∂ 2 a orα∂ 2 b gives a contribution suppressed byα which makes the position and momentum space distributions differ at the NNLL level and beyond. Nevertheless, these derivatives cure the Sudakov Landau pole: near the pole ∂ 2 a ∼ ∂ 2 r → ∞ and the divergence gets exponentially suppressed.

Comparison to q T resummation
It has been known for a long time that color-singlet transverse-momentum (q T ) spectra in hadron collisions feature poles at finite values of q T when strictly truncating the logarithmic expansion in momentum space [31], or equivalently when naively setting scales [36] in momentum space. 2 In both q T and the Sudakov shoulder, these spurious poles are due to cancellations between energetic emissions that contribute to the observable: through momenta with opposite direction in the transverse plane for q T = | q T |, and through mass shifts with opposite sign for the Sudakov shoulder, as in Eq. (1.6). For the q T distribution, it is in fact standard practice to perform the resummation by setting scales (or truncating the logarithmic expansion) as a function of the Fourier-conjugate variable 46,55], and then performing a numerical [56] or approximate analytic [57] inverse Fourier integral. This procedure is well known to lift the spurious poles by retaining an appropriate set of terms that would be subleading when counting logarithms directly in momentum space, in close analogy to our discussion in Section 3. It is thus interesting to ask in what ways our results here differ from the q T case. Two intriguing differences are • In the q T case, each individual beam and soft function (or TMD PDF) is an azimuthally symmetric function of the magnitude k T = k T of the total transverse momentum of soft or collinear radiation. Accordingly, the resulting spectrum is azimuthally symmetric and only a function of | q T |. This is distinct from the case of the heavy jet mass shoulder, where individual jet functions always contribute with a given fixed sign and the soft function itself is not symmetric under m h ↔ m . In position space, this physical asymmetry between hemispheres is encoded in an imaginary part of the Fourier transform of the cross section, cf. Fig. 4, which is an interesting novel feature of the Sudakov shoulder case.
• To our knowledge, the heavy-jet mass distribution in the shoulder region is the first invariant mass-like observable which features these spurious poles, and for which position-space resummation is required to arrive at physical results. Examples of transverse momentum-like observables exist whose resummation can be carried out in momentum space, either because they are strictly additive scalar quantities (transverse energy E T ) or because they involve an ordering that vetoes harder emissions (leading jet p T ). Our analysis shows that an observable being transverse is neither sufficient nor necessary for the appearance of spurious poles, and that position-space resummation is a crucial tool beyond the case of q T . In other language [58], we can say that this is the first SCET I observable where position space scale-setting is critical, while earlier examples are in SCET II .

Complex scales and non-global logarithms
In the Γ 0 approximation, the HJM shoulder distribution receives distinct contributions from the light and heavy hemispheres. These separate contributions are well-defined since the leading Sudakov double logarithms come from the soft-collinear limit where the soft emissions are always near a jet and therefore uniquely associated with a hemisphere. This means we can choose distinct complex soft scales as in Eq. (3.17).
Beyond the leading-log approximation, when single logs come into play, it is no longer possible to cleanly separate left and right hemisphere contributions. In the dijet limit, the hemisphere soft function factorizes as [13,14,30,[59][60][61] where ν 1 and ν 2 are the Laplace conjugates of the left and right hemisphere masses. This factorization is non-trivial. It follows from RG invariance and symmetry of the hemisphere soft function under the interchange of the two hemisphere masses. It is also not uniquely defined -one can shuffle constants among the three factors. We conventionally definẽ s µ (ν, µ) to be all the terms determined by RG invariance. The factors f (ν 1 , ν 2 ) describes the non-global structure of the hemisphere soft function. This function is known to 2loops [60,61] and contains the leading non-global logarithm [62].
The position-space trijet hemisphere soft function S(z , z h , µ) relevant for the heavy jet mass Sudakov shoulder resummation is superficially similar. We might like to writẽ Unfortunately, although we can still use RG invariance to determine its µ-dependence we no longer have a symmetry in the interchange of the two hemispheres to give a precise meaning to two factorss (z , µ) ands h (z h , µ) (beyond the cusp anomalous dimension). To be explicit, in the gluon channel, the coefficient of the overall soft anomalous dimension at O(α s ) can be written as γ s This separation is motivated by the contributions to the soft function from radiation into the light hemisphere containing the single gluon jet, or the heavy hemisphere containing the quark and antiquark jets. This suggests that we might identify γ 0 = γ sg 0 as the anomalous dimension governings (z , µ) and γ h 0 = γ sqq 0 as the anomalous dimension governing s h (z h , µ). Also, there is no guarantee that resummation achieved by assuming separate anomalous dimensions will predict any correct logarithms at higher orders.
If we cannot define single-scale objectss (z, µ) ands h (z, µ) there is no sense to choosing different scales for the heavy and light hemispheres. Moreover, it can be problematic to choose different complex scales for the two hemispheres as in Eq. (3.17). In Fig. 2 we show the prediction (using the full NNLL setup we describe below) with canonical complex scales for different choices of how the two-loop soft anomalous dimension splits into heavy and light components: Curves for λ = − 1 2 . . . 2 are shown in Fig. 2 and compared with the real scale choice in Eq. (3.19). We see that artificially separating the anomalous dimension can generate large distortions in the spectrum. It effectively induces large logarithms of the ratio of these scales ln µ µ h ∼ π. These logarithms are formally of the same order as non-global logarithms of k /k h in thes ng part of the trijet soft function in Eq. (3.23), which we necessarily treat at fixed order when evaluatings ng at any given individual scale. We thus conclude that there is no sensible way to split the soft anomalous dimension into light and heavy components, and that a common scale should be used for all parts of the soft function.

Summary
In this section, we examined scale setting in position, momentum, and Laplace space for the Sudakov shoulder region of heavy jet mass in the Γ 0 approximation. From the momentumspace point of view, the regions with r < 0 and r > 0 are distinct, and the canonical scale choice µ s ∼ |r|Q is not smooth across r = 0. However, r = 0 is not special from the point of QCD: the shoulder at r = 0 is an artifact of perturbation theory. In position space, the natural complex scales are µ s ∼ − i z Q and µ sh ∼ i z Q. Choosing scales in position space makes the distribution smooth across r = 0 and removes the spurious Sudakov Landau singularity caused by momentum-space scale setting.
To choose canonical complex scales requires a separation into contributions from heavy or light hemispheres, which is a sensible criterion only in the LL approximation. Beyond that, artificially separating scales in the heavy and light hemispheres is problematic. Instead, one should choose equal soft scales such as µ s = µ sh ∼ Q |z| . The Sudakov Landau pole is still removed as long as scales are chosen in position rather than momentum space.
-16 -With the factorization formula and the insights from Section 2 and 3 in hand, we can now turn to full NNLL resummation. As discussed, choosing scales in momentum space is problematic: it generates spurious Sudakov Landau poles from the region where the heavy and light jet mass shifts are both large, but there difference is still small.
In position space this region corresponds to small z. When we set scales in position space the distribution is heuristically z −α ln z = e −α ln 2 z so the small z region becomes Sudakov suppressed. However, despite this suppression, one must still be careful when performing the inverse Fourier transform back to momentum space in actual QCD because of the Landau pole in α s (µ). In addition, the factorization formula we have been using so far holds for d 3 σ dρ 3 , so one must integrate over ρ twice to get back the physical spectrum dσ dρ . This integration requires boundary conditions generating a linear function c 0 + c 1 ρ. These two constants c 0 and c 1 correspond to finite remainders after adding counterterms to remove the logarithmic and linear UV divergences in the original dσ dρ distribution in Eq. (2.2). More physically, they correspond to the fact that the offset and slope of the distribution at r = 0 are not predicted by the shoulder factorization formula alone: there can be contributions from four-parton and higher-order configurations which have ρ ∼ 1 3 but do not arise from configurations close to the trijet configuration. The integration constants c 0 and c 1 must be determined by some boundary conditions for the double integral from d 3 σ dρ 3 → dσ dρ as we discuss in this section.

Integration
In terms of the position space distribution, the factorization formula has the form where r = 1 3 − ρ and with Π g the same as in Eq. (2.3), but now sj g is a function rather than an operator (4.3) For NLL or NNLL resummation we expand sj g to order α s . The quark channel is analogous.
To obtain the physical spectrum, we need to inverse Fourier transform and integrate over r twice. A sufficient condition for σ(ρ) to be real is that σ (−z) = σ(z). This condition on σ holds automatically for real scale choices, since (i(−z)) = iz. We can thus write the inverse Fourier integral as Integrating twice with respect to r generates a factor of 1 z 2 in the integrand along with two integration constants. We can then write the integral as 3 with a kernel function where K 0 and K 1 serve to fix the constants in the boundary conditions. For example, if we performed the integrals from a lower cutoff r L so that then we would find Unlike for dijet observables, where the limit of integration ρ = 0 or τ = 0 is natural, there is no natural choice for r L for the Sudakov shoulder. Thus, this boundary condition forcing the distribution to vanish at r = r L is not ideal. A more natural boundary condition is to ask that the unmatched distribution when expanded to fixed order reproduces the leading behavior of the full theory distribution near r = 0. It does this automatically for all the r ln m r terms, which are true predictions of the effective theory factorization formula. It also reproduces the difference in slopes in the left and right shoulders. This difference comes from integrating over the δ-function terms in the triple-derivative. For example, The first term on the right is independent of r L and therefore a prediction of the theory. The value for the second term is determined by one of the boundary conditions. More generally, consider the unmatched resummed distribution, expanded in α s . The distribution g(r) = 1 σ 0 dσ dρ has the generic form with c 1 , b, d ± m all functions of α s , but only c 0 and c 1 depend on the choice of boundary conditions. That is, d ± and b are predicted by the factorized expression while c 0 and c 1 are not. We can then demand that the offset at r = 0 and the sum of the slopes on the left and right shoulders at r = 0 agree with a fixed-order calculation to fix c 0 and c 1 . Moreover, using perturbation theory to determine the boundary conditions implies that the uncertainty on these constants will be included in the uncertainty associated with the fixed-order contribution. Thus we will not have to account for it separately.
At leading order, the distribution in the two mass shifts is so the integrals over m 2 and m 2 h in Eq. (1.6) are finite and the unmatched distribution is We read off d ± m = 0, b = c 1 = 1 2 and c 0 = 0. Note that at leading order there is no UV divergence in the integrals over m 2 and m 2 h so the integration constants are calculable without using the full theory. Since g (r) = δ(r) we can implement these integration constants with an integral kernel This kernel when integrated againstσ(z) = 1 gives rθ(r) as desired.
At higher orders we need to modify the boundary conditions by the inclusion of α s corrections from fixed-order perturbation theory. Furthermore, we also want to add any fixed order contributions that are not enhanced by large logarithms, and hence not part of the resummed cross-section (so called "non-singular" terms). This can be accomplished by writing where µ res stands for the scale choices leading to resummation and µ FO refers to setting all the scales equal to a common fixed-order scale. When writing the matched cross section as on the first line, the integration constants will contribute only beyond the order to which the resummed cross section is matched, since the latter is scale-invariant to that order. Thus the uncertainty on the constants will be within the fixed-order (hard scale) uncertainty. We therefore use the leading-order kernel in Eq. (4.13) for simplicity. On the second line, we have rearranged terms and performed the analytic z integral over the fixedorder singular cross section to isolate the "non-singular" term in square brackets, which is analytic near the shoulder.

Profile functions
Although choosing canonical real scales resums the correct logarithms to NNLL order, in practice it is important to choose scales which avoid the Landau pole at µ = Λ QCD , and which allow for resummation to turn off in regions that are more accurately described by an alternate description. These features can be achieved with profile functions [19,63]. In our case, we want to turn off the resummation for the shoulder region as we approach the dijet region, where conventional dijet resummation applies, and to switch to fixed-order perturbation theory when we are far enough above the shoulder threshold.
To construct profile functions, we follow the procedure used for q T resummation in [64]. We start with the canonical (real) scales, as justified in Section 2. To ensure that the soft and jet scales do not hit the Landau pole of QCD, we adjust these canonical scales to be and take µ min s = 2 GeV. We would like the canonical scales to be used for a certain range of ρ and then smoothly turn off away from the resummation region. To do so, we choose central jet and soft profile scales based on the canonical scales as where 0 ≤ g(ρ) ≤ 1 in the exponent determines whether the resummation is fully on (g = 1) or turned off to recover the fixed-order singular result (g = 0), and smoothly interpolates between the two cases through a weighted geometric mean. Since the RG evolution resums logarithms of µ s /µ j and µ j /µ h , this effectively amounts to having g(ρ) multiply logarithms of z in the resummation exponent.
To achieve a matched result, the shoulder resummation needs to turn off outside of the resummation region. To implement this, we take g(ρ) to be a product of profile functions governing the turn-off towards the left and the right of the shoulder, (4.17) We use two different functional forms for the individual profile functions, each with their own advantages. For the left profile, we use a sigmoid function: The advantage of using an infinitely differentiable function like the sigmoid is that it does not introduce unphysical kinks in the matched distribution which may bias comparisons to data. A disadvantage is that the sigmoid function turns the resummation exactly on or off only in the asymptotic limits ρ → ±∞. (In practice, g L (ρ) ≥ 0.9 for ρ > ρ L + 2σ L , which leads to scales that are equivalent to canonical ones within the variations we consider.
-20 - Similarly, we find that g L (ρ) ≤ 0.1 for ρ < ρ L − 2σ L is close enough to the strict fixed-order limit to the left of the shoulder for our purposes.) For the right profile, we use a piecewise quadratic function [64]: This form has the advantage that the resummation is exactly off at ρ ≥ ρ R2 . Since the cross section on the right shoulder falls rapidly, one must turn off resummation rather quickly to avoid unphysical negative values from the resummed distribution [19]. This is easiest to achieve with a quadratic profile. The left panel of Fig. 3 shows the combined interpolating function g(ρ) we use for the full shoulder region. The right panel of this figure shows the scales as a function of z for a point in the canonical region where g = 1, and a point with g = 0.3 in the transition region, illustrating the weighted geometric mean of scales. Although the region with |z| 0.56 where the jet and soft scales are larger than Q is not excluded from the Fourier integral, this region contributes negligibly. 4 We can understand this from looking directly atσ(z) as shown in Fig. 4. 4 In the |z| 0.56 region, since r is small, we can approximate e izr = 1 in the inverse Fourier transform.
Thus not only does the region |z| 0.56 not contribute much overall (because the integrand is bounded), but the amount it contributes is also independent of r and therefore its leading contribution is fixed by the c0 + c1r boundary condition.
-21 - To determine the profile midpoint ρ L and the transition width σ L on the left shoulder, we compare the size of the singular cross section predicted by the shoulder factorization at fixed order to the total cross section in the top left panel of Fig. 5. The non-singular remainder, which is given by the difference of the two, has the same size as the singular cross section around ρ ≈ 0.15, suggesting that the resummation should be turned off around this point. Taking ρ 0.15 to be the region where the resummation is (nearly) off, this suggests ρ L − 2σ L = 0.15. On the other hand, the singular cross section dominates over the remainder by a factor ≥ 5 down to ρ ≈ 0.25, which motivates keeping the resummation on until ρ L + 2σ L ≈ 0.25. Thus the g L (ρ) sigmoid profile is configured to make the bulk of the transition between ρ = 0.25 and ρ = 0.15, as evident in Fig. 3.
For the right shoulder, a comparison of singular and non-singular contributions is not a useful tool to determine the resummation region. The problem is that the cross section dies off very fast for ρ 1 3 and moreover there is a second shoulder determined by the 4-parton threshold at ρ = 0.42. In fact, the resummed distribution with canonical scale choices, although smooth at ρ = 1 3 , falls rapidly for larger ρ and becomes negative by ρ = 0.345 (see Fig. 6 below). To obtain a physical, positive cross section, the resummation must be turned off at least by then, so we take ρ R2 = 0.342. We take ρ R1 = 1 3 to at least extend the right shoulder resummation as far as possible. Part of the reason for using a quadratic profile in this region, which has g R (ρ) = 0 exactly for ρ ≥ ρ R2 is that even the very small difference between the sigmoid and quadratic profiles has big effects relative to the value of the cross section. When we match the resummed distribution to fixed order, we add the singular and nonsingular contributions (see Eq. (4.14)). In the fully fixed order region the singular and nonsingular terms are individually large with opposite sign, and must sum to give a cross section which is close to zero. Even a small amount of leftover resummation from not turning off higher order logarithms can spoil this sensitive cancellation. The cancellation in the fixed order region is guaranteed by the quadratic profile in Eq. (4.19), which imposes g(ρ ≥ ρ R2 ) = 0 exactly, as can be seen in Fig. 3. The same delicate cancellations do not take place in the region controlled by g L (ρ), allowing us to use the smoother sigmoid for the left shoulder transition.

Dijet matching
To obtain the best possible prediction valid across phase space, we should match the resummed shoulder distribution to both the fixed-order result and the resummed dijet distribution. The resummation for heavy jet mass in the dijet limit has been done up to the N 3 LL level [14]. The general form for the resummed distribution in the dijet region is where after taking derivatives one sets As discussed in Section 3, for the dijet region it is not necessary to set scales in position space (although one could). For simplicity, for the canonical scales we use conventional canonical scales in momentum space, Before we present the complete matched result including shoulder resummation, we first discuss how to match the resummed dijet distribution to fixed order using standard procedures [19] in a setup where the shoulder is not accounted for as a special region. Here one adds a non-singular remainder with a common scale µ FO to the resummed dijet cross section evaluated with soft and jet profile scales (denoted together by µ pro dij ) To ensure that the plain fixed-order result is recovered in the tail, it is common to turn off the dijet resummation with profile functions that interpolate between canonical scales in Eq. (4.22) and equal scales µ s = µ j = µ h directly as a function of ρ. When matching to the plain fixed-order result, appropriate parameters for the dijet profile can be inferred by the top right panel of Fig. 5 in analogy to the discussion for the shoulder. When the LO cross section vanishes at ρ = 1 3 , the dijet singular cross section exactly cancels the non-singular. Any small mismatch between the singular and non-singular, resulting for example from scales that are close but not equal in the profiles, can lead to large relative errors on the prediction. This suggests that the dijet resummation should be turned off before reaching this point. In practice, when comparing to partially matched results based on Eq. (4.23), we will make use of the profile functional form developed in ref. [15] for the C parameter distribution, adjusting the point t s (defined in that reference) where the resummation is exactly turned off to be t s = 1 3 . We now turn to the complete matched result incorporating both dijet and shoulder resummation. The matched cross section takes the following form, This matched distribution has the formal properties that far from the shoulder, when the shoulder scales are equal to the fixed-order scale, it reduces to the dijet distribution matched to fixed order as in Eq. (4.23). Similarly, parametrically far from the dijet region, where the dijet scales reduce to fixed order scales, the dijet resummation turns off and it reduces to a matched distribution of shoulder and fixed order only. We note that matching different regions of a one-dimensional distribution is in a sense easier than the joint resummation required for double-differential distributions as in [64], where generically there is a region where two separate types of logarithms can be simultaneously large. For Eq. (4.24) to have the right limits, we moderate the dijet resummation through profile functions similar to Eq. (4.16). We define the profile scales with a function g dij (ρ) playing the role that g(ρ) plays for shoulder resummation: Note that for, dijet resummation, the profile scales only depend on ρ, in contrast to Eq. (4.16) where they also depend on z, because we perform the resummation directly in momentum space. The weight function g dij (ρ) should interpolate between the region where the dijet resummation is fully on (g dij = 1) and the region where it is off (g dij = 0). To determine the parameters for g dij , we return to the comparison of the singular and non-singular contributions, but now look at the non-singular with both singular shoulder and singular dijet subtracted. This overall non-singular contribution is given by the expression in square brackets in Eq. (4.24). At leading order it is shown (multiplied by −1) in the bottom-left panel of Fig. 5. From this plot, we conclude there is no reason to adjust the shoulder profiles when matching to dijet -the shoulder singular still accounts for nearly the complete LO distribution for ρ 0.25 and the shoulder singular is the same size as the overall nonsingular at ρ ≈ 0.15. However, for the dijet, when matching to the shoulder resummation it is no longer sensible to maintain dijet resummation up to ρ ≈ 0.33. If dijet resummation is allowed past the point where the dijet singular and the full nonsingular curves are nearly equal in magnitude, ρ ≈ 0.25, the resummation would upset the large cancellations between them, biasing the result. If dijet resummation is turned off at ρ 0.25, these cancellations are guaranteed.
A profile function for the matched dijet resummation consistent with these constraints is simply the same profile as for the shoulder, but growing in the opposite direction. That is, we take with ρ dij = ρ L and σ dij = σ L equal to those of g L in Eq. (4.18). The additional factor of g R (ρ) in Eq. (4.26) ensures that all resummation (dijet and shoulder) is turned off in the far tail ρ 0.342, maintaining consistency with our choice of profiles for the right shoulder.

Uncertainties
A common way to assess theoretical uncertainties due to missing higher orders is to vary the factorization scale. We consider three scale variations in the shoulder region. A hard -25 -variation involves changing Q → 2 v h Q in all the scales in Eq. (4.15). A soft variation involves varying the soft scale in the same was as for the hard variation, but not varying the hard and maintaining µ j = √ µ s µ h . For the jet scale variation, we take µ j = (µ h µ s ) v j and allow v j to differ from 1 2 . That is, these variations correspond to i.e., we vary the dijet scales in a fashion that is similar to the shoulder scales. We use a single hard scale variation v h , which is the same for both dijet and shoulder logs and appears in the nonsingular cross section in Eq. (4.24) as well, where we take µ FO = µ h . We independently vary the profile parameters ρ L , σ L , ρ dij , and σ dij . We do not consider variations of the profile functions in the right shoulder since resummation must turn off quite quickly on the right shoulder, so any reasonable variations which do not make the cross section negative are in any case smaller than the hard scale variation.
A summary of the profile parameters along with their variation ranges is shown in Table 1. As a preliminary estimate of the overall perturbative uncertainty, we consider an envelope of all the scale variations and the profile parameter variations.

Numerical results
With all the ingredients in place, we can now look at numerical predictions. All our results are given for e + e − hadron events at Q = 91.2 GeV with α s = 0.1179 [65].
We first examine the prediction using Sudakov shoulder resummation without dijet matching in Fig. 6. We compare canonical scales choices, as given in Eq. (4.15), to profile scales, as given in Eq. (4.16), and compare to fixed order distributions at LO and NLO [66,67]. One can also see from this figure that the profile functions smoothly transition out of the region where shoulder logarithms are large (ρ ∼ 1 3 ) into the fixed order region at smaller ρ. This figure also confirms that setting scales in position space eliminates the Sudakov Landau poles encountered in Ref. [26]. The right panel in Fig. 6 Table 1: Parameter ranges used for the uncertainty estimates in both the shoulder profile function and dijet profile function. We fix the profile function for the right shoulder to have ρ R1 = 1 3 and ρ R2 = 0.342, and fix µ min s = 2 GeV. Varying these parameters has negligible effect on the distribution. For the dijet resummation (when matching to the fixed-order result only), we use the profile function from [15], with the following set of parameters defined in that reference: n 0 = 2, n 1 = 10, t 2 = 0.2 ± 0.025, t s = 1 3 ± 0.025, r s = 1 ×2 ÷2 , e j = 0 ± 1, e h = 1 ×2 ÷2 .  the region around ρ = 1 3 . It shows that in the matched curves, the non-analytic behavior (kink) of the fixed-order distributions is completely resolved by the resummation of shoulder logarithms, leaving behind a distribution with continuous first derivative. Although the NLL+LO matched curve goes negative for ρ > 1 3 , this is due to our boundary conditions on integrating the third derivative which makes it match exactly to LO at ρ = 1 3 . For the NNLL+NLO curve, the offset is positive and the profile functions of the right shoulder are chosen to turn off the shoulder resummation quickly enough to avoid a negative cross section. Comparing the NLO curve to those at NNLL+NLO, we see that the shoulder resummation has a considerable impact for values of ρ 0. 25  dρ . This is done to highlight and expand the transition region without distorting the curves. For the green (blue) curve, we vary the shoulder (dijet) profile only, and for the red curve, we present the envelope of both profiles. Fig. 7 how the profile functions interpolate between the regions where shoulder resummation and dijet resummation are important. As discussed in Section 4.3, we use the same profile functions for both dijet resummation and shoulder resummation, such that the dijet resummation is active in the range 0 ≤ ρ 0.2, and the shoulder resummation is active in the range ρ 0.25, with a smooth transition between them. The parameters used and their variations are given in Table 1. We use NNLL resummation for the shoulder, NNLL resummation for the dijet, and match to the fixed NLO (O(α 2 s )) distribution. These form a consistent set of resummed orders since they both contain the complete O(α 2 s ) information in the respective singular limit. We see that the matched resummed distribution smoothly merges into the dijet resummation near ρ ∼ 0.2, and into the shoulder resummation near ρ ∼ 0.25, as desired. In the right panel of Fig. 7, we highlight the transition region by dividing the distribution by r = ρ − 1 3 . In this panel we include profile uncertainty estimates obtained by varying the profile parameters within ranges given in Table 1. In all cases the band shown is the envelope of the relevant individual variations.
We next plot the various uncertainties from individual hard, jet, and soft scale variations and profile variations in Fig. 8. The first panel shows the fixed-order predictions. We use Event2 [66,67] for NLO and CoLoRFulNNLO [5,6] for NNLO. The hard scale variation for the shoulder and dijet resummation is correlated, since there is only a single common hard scale for both factorizations. We vary the hard scale by a factor of 2 above and below its central value at 91.2 GeV. The soft scales (µ s,sh , µ s,dij ) are both independently varied for the shoulder and the dijet. The range of variation is approximately a factor of 2 relative to their pointwise central value, while simultaneously varying the jet scale using the relation µ j = √ µ h µ s . The jet scale variation is done as per the prescription in Eqs. (4.27) and (4.28). Note that all variations maintain the monotonicity condition µ s < µ j < µ h (up to the numerically negligible region where |z| < 0.56), and are thus sensibly defined. Finally, the profile parameter variations are performed as summarized in -28 -   Table 1. We recall that the shoulder profile variation is performed while fixing the profile for dijet scales and varying only the shoulder profiles, and vice-versa for the dijet profile variation.
Our final prediction, with uncertainties given by an overall envelope of all variations, is shown in Fig. 9. Increasing the resummation order leads to a visible decrease in perturbative uncertainty. At the same time, the NNLL result is fully covered by the larger uncertainty estimate at NLL. The dominant source of perturbative uncertainty in both the shoulder and dijet region is the variation of the respective soft scale.
Finally, we compare our prediction combining shoulder and dijet resummation to the previous approach where only dijet resummation was included [14] in Fig. 10. For reference, we also plot the NLO and NNLO distributions in the range 0.10 ρ 0.4. Fig. 10 also shows the associated perturbative uncertainty with each resummation, which we obtain by considering an envelope of all scale and profile variations. (Note that the resummed results with and without shoulder resummation use different profile functions, as described in Section 4.2.) The uncertainty associated with the FO curves is given by a scale variation by a standard factor of 2 around the central value µ FO = m Z . The sharp spikes in the dijet resummation and FO distributions are indicative of the presence of large shoulder logs, which are exponentiated by shoulder resummation, leaving behind a smooth result. The addition of shoulder resummation has a numerically significant impact on the value of the cross section in the region ρ 0.2, which includes both the transition region between dijet and shoulder resummation 0.2 ρ 0.25 and the region where shoulder resummation dominates, for ρ 0.25. These differences will be relevant for α s (m Z ) fits using the heavy jet mass distribution. Next, turning to the region of very large ρ > 1 3 to the right of the shoulder, we can see that the fully matched prediction merges onto the fixed NLO result past ρ 0.34, as desired for both the central value and uncertainty band. Since the NLO is -30 - Relative deviation Figure 10: Plots comparing our dijet + shoulder resummation to both "full" dijet resummation (matched to NLO, but no shoulder) and fixed order results at NLO and NNLO.
In the left panels we show results over a wider range of ρ where both shoulder and dijet resummation are active, while in the right panels we zoom in on values of ρ near the Sudakov shoulder. In the upper panels we show absolute cross sections, while in the lower panels we show variations relative to the central curve for our highest order shoulder+dijet resummed result, namely ∆(resum) ≡ resum NNLL pro dij +NNLL pro sh +NLO − 1.
the first non-zero order to contribute in this region, its O(20%) uncertainty is substantially larger than any of the predictions below the shoulder. Finally, we observe that the fully matched prediction with shoulder resummation provides a realistic uncertainty estimate across the shoulder, smoothly connecting the NNLL uncertainties on the left with the NLO (effectively LO) uncertainties on the right. In contrast, both fixed order and dijetresummation-only predictions are discontinuous at the shoulder and their uncertainty bands can be seen to be unreliable, since they do not capture the impact of the large logarithmic contributions. From the NNLO curve in Fig 10 we see that this higher order contribution does not improve the ability of fixed order results to capture the large resummation effects in the 0.1 < ρ < 0.34 region. Including NNLO results can be expected to reduce the perturbative uncertainty for ρ > 0.34, where we note that the NNLO central value is within the NLO uncertainties.

Conclusions
In this paper we have considered the resummation of Sudakov shoulder logarithms in the heavy jet mass distribution in e + e − events. These logarithms were identified by Catani and Webber [24] in 1997, and a systematically improvable method to compute and resum these logarithms was developed in 2022 [26]. When the factorized distribution is resummed in momentum space, that is the space where the heavy jet mass itself lives, spurious "Sudakov Landau poles" result. These poles are divergences in the distribution arising from scale setting and have no physical significance. Similar poles have been found in factorized expressions at hadron colliders, such as the q T spectrum in Drell-Yan. This is the first example we know of for Sudakov Landau poles in a non-transverse momentum distribution. The Sudakov Landau poles in the heavy jet mass distribution were found in [26] and prohibited the quantitative application of the resummation formula. In this paper we have identified the origin of these poles as arising from an inconsistent solution of the evolution equations: when scales are set in momentum space, subleading terms in logarithmic power counting are included which give rise to these singularities. In contrast, when the resummation is performed in position space and the momentum-space distribution is only generated by a final Fourier transform after scales have been set, then the Sudakov Landau poles are removed. We verified this analytically and numerically.
In addition to the Sudakov Landau pole, another complication in the HJM shoulder is the existence of a linear UV divergence in dσ dρ associated with the region where the light and jet mass are both very large, but their difference is finite. We regulate this divergence by considering the third-derivative d 3 σ dρ 3 , and specifying appropriate boundary conditions to carry out the double integral, which fix terms of the form c 0 + c 1 ρ. In particular we demanded that the resummed and matched distribution and its first derivative agree with the fixed order distribution at r = 1 3 − ρ = 0, where perturbative uncertainties provide natural variations in this constraint.
The trijet hemisphere soft function for the Sudakov shoulder is a two-scale function, like the one for heavy jet mass in the dijet region. Thus it has non-global structure which is worth exploring. Unlike the hemisphere soft function, the trijet hemisphere soft function is not symmetric in its arguments, and there is not a natural way to separate its global and non-global parts. While one might like to try to choose different complex scales µ s and µ sh for the light and heavy hemispheres, we find such complex scales are problematic, since it is difficult to ensure logs of the form ln µ s µ sh do not induce spurious effects. If instead, one uses a single real soft scale, then the only missing logarithms are non-global in nature and integrate to a fixed-order contribution to the trijet soft function starting at order α 3 s , similar to the leading non-global logarithmic contribution to the dijet heavy jet mass distribution. Such contributions are formally N 3 LL, the same order as the unknown 2-loop soft function constant. Thus our NNLL calculation is self-consistent with real scales and dropping the non-global contributions.
To make a quantitative prediction, we needed to match between the dijet and shoulder regions. This requires us to modulate the hard, jet and soft scales with profile functions, where as ρ grows we turn off the dijet resummation and turn on the shoulder resummation.
-32 -We included such profiles and provided the first quantitative predictions that include both types of resummation. We find that at the NNLL level, the Sudakov shoulder resummation provides as much as a 20% increase over the NLO cross section for values of ρ 0.25, and captures effects that may be numerically important for α s (m Z ) fits for ρ 0.2.
For a full phenomenological analysis and comparison to data, the main remaining ingredient needed is the incorporation of non-perturbative power corrections due to hadronization. In the dijet region, much is understood about how the power corrections affect the distribution. In the trijet region, much less is known. It will be important to understand how power corrections coming from factorization formula in the trijet region can be combined with power corrections from the dijet region in a manner that goes beyond hadronization models. Combining the NNLL resummation of Sudakov shoulder logarithms that we have developed in this paper with a rigorous field theoretic treatment of power corrections could be critical in resolving long-standing discrepancies between the values of α s extracted from the heavy jet mass distribution and the world average.

A One-loop trijet hemisphere soft function
In this section, we review the calculation of trijet hemisphere soft function needed for both thrust and hemisphere. Based on the analysis of trijet kinematics in [26], we define the six-directional differential soft function via with the conventional factor ι = e γ E 4π and the eikonal matrix elements , , -33 -and the soft measurement function (M ) at one-loop Here n 1 , n 2 and n 3 are three lightlike vectors that describe the trijet configuration. For concreteness, we align n 1 = (1, 0, 0, 1) , n 2 = 1, 0, For thrust, the last two projections N 2,3 are lightlike while for HJM, it is spacelike. Explicitly, Thrust : N 2 =n 2 , N 3 =n 3 HJM : N 2 = 1, 0, The short-hand notation δ(q j =1 ) stands for δ(q 2 )δ(q 3 )δ(q1)δ(q2)δ(q3). In the following, we will focus on the HJM case. To simplify the calculation, we perform the standard topology identification. First of all, let us introduce the notation (A.6) then our soft function can be written as a linear combination of I na,n b ,nc,n d ,N . For example, the first term in Eq. (A.3) can be mapped to ,n 3 ,n 2 ,n 3 ,n 1 + 1 2 C A I n 1 ,n 2 ,n 2 ,n 3 ,n 1 + 1 2 C A I n 1 ,n 3 ,n 2 ,n 3 ,n 1 . (A.7) Using rotations and reflection, we can rename the projection vectors and reduce the soft function into 7 master integrals: I 0 = I n 1 ,n 2 ,n 2 ,n 3 ,n 1 , I 1 = I n 2 ,n 3 ,n 2 ,n 3 ,n 1 , I 2 = I n 1 ,n 3 ,n 2 ,n 3 ,n 1 , I 3 = I n 2 ,n 3 ,n 2 ,n 3 ,n 1 , I 4 = I n 2 ,n 3 ,n 1 ,n 2 ,N 3 , I 5 = I n 1 ,n 2 ,n 1 ,n 2 ,N 3 , I 6 = I n 1 ,n 3 ,n 1 ,n 2 ,N 3 . (A.8) To evaluate the integrals, we use the lightcone parameterization, appropriately expand to the first order in d = 4 − 2 and compute each piece numerically. It turns out some of the numbers can be reconstructed with the pentagon table [68] via PSLQ [69,70]. I 0 is the only divergent integral, which reads For others, we get π ln 2 + c 3 , I 2 = N − 2κ + 2π ln 2 + c 4 , π ln 2 + c 5 , I 4 = N − 2κ + 2π ln 2 + c 6 , where the constants κ and c i are and the overall normalization is Putting back the color factors and renormalizing in MS, we obtain the differential soft function defined in Eq. (A.1). To get the trijet hemisphere soft function, we need to integrate the soft momenta over the hemispheres, i.e.
Note that for convenience, we split the soft constant into S iL (q L , µ) and S iH (q H , µ) and leave only non-global logarithmic structure in S f (q L /q H ). Since non-global logarithms are the same order as the finite part of the 2-loop soft function, which is relevant only for N 3 LL -35 -resummation, we simply set S f (q L /q H ) = 1. The final result is with the non-cusp anomalous dimensions .15) and the soft constants For completeness, we can also extract the exact 1-loop soft function for thrust in the trijet region. The only difference between the thrust and HJM trijet soft functions is that the soft constants in the light hemisphere are different:

B Hard, jet and soft functions
In this Appendix, we summarize the hard and jet functions. We write the cusp anomalous dimension as where [71,72] Γ 0 = 4 ,

Hard Functions
Hard functions for event shape factorizations in SCET are obtained by matching QCD form factors onto corresponding SCET operators. In general, the hard function RGE takes the form where Γ H is the anomalous dimension proportional to the cusp anomalous dimension, and γ H depends on the relevant factorization and kinematics. The solution to the above RGE is given by is the fixed order hard function at the scale µ h . Perturbatively, we have The constants c H i encode the fixed order, non-log pieces in the hard function. At our working order, the hard anomalous dimension has the form [73] where T a j denotes the color charge of the j th external hard parton, written using color space formalism [74]. Here q i is the momentum of the outgoing i th hard parton. For the shoulder resummation, the hard kinematics is a trijet one, equally spaced in angle given by with the relevant partons being q,q, g. The relevant color sums are The non-cusp hard anomalous dimension γ H is also color diagonal up to two loop order, and is thus a sum of quark and gluon pieces, (i.e) These are extracted from the expressions in [47,75] and are given by where [76][77][78][79][80] γ q 0 = −6C F , (B.10) Using Eqns. (B.6), (B.7), (B.8) in (B.5), we find that the anomalous dimension Γ H for the shoulder trijet hard function is For NNLL resummation, we also need the c H i constants to one-loop, namely c H 1 . It can be extracted from the real-virtual correction of γ → qq squared matrix elements [42,81], ln 3 − 10 ln 2 ln 3 + 3 ln 2 3 + 10Li 2 1 3 The hard anomalous dimensions are also common to other hard functions that have appeared earlier in the SCET literature [40,41], and we have verified ours against the same.

Jet functions
The jet function RGE in the Laplace space is We will need the O(α s ) piece for NNLL shoulder resummation and O(α 2 s ) piece for NNLL dijet resummation. For quark jets [38,82,83], the anomalous dimension and constant are For gluon jets, we have [39,40] Γ

Soft functions
The RGE for the trijet hemisphere soft function is and its solution is The soft anomalous dimension for each hemisphere is The two-loop non-cusp anomalous dimension can be obtained from RG invariance The soft constants were computed in Appendix A. In Laplace space and in the above notation they translate to

C Running coupling and Sudakov kernels
We expand the QCD beta-function as where the coefficient up to three loops are given by [84][85][86][87] It is also useful sometimes to have an expression for the coupling at one scale in terms of the coupling at a different scale. For two-loop running, such an expression is and at three loops (C.4) For reference, we also provide explicit expressions [37,88] for the Sudakov RG kernels appearing in the evolution factors. Their definitions are: To obtain the singular expression in momentum (r) space, we need to take the inverse Fourier transform Eq. (D.2) and integrate over r twice. To achieve this, we derive the map between logs in the momentum z space and position r space. Specifically, we consider the dictionary between plus distributions of the form where the digamma function is defined as ψ(z) ≡ Γ (z) Γ(z) and ψ (n) (z) ≡ d n ψ(z) dz n . From the RHS, we can see that the first log basis ln m (|z|e γ E ) is mapped to the sum of both left and right shoulder r space logs, which is symmetric. Furthermore, we can also expand I [f 1 (z, )] in and read off the rules between ln m (|z|e γ E ) and spectrum logs I ln 2 (|z|e γ E ) = 1 + π 2 24 |r| (1 + sgn(r)) + 1 2 |r| ln 2 |r| − |r| ln |r| = θ(r) 2r + π 2 12 r − r ln r + 1 2 r ln 2 r + θ(−r) r ln(−r) − 1 2 r ln 2 (−r) .
For the second group of terms in the basis, ln m (|z|e γ E ) sgn(z), we consider the generating functional f 2 (z, ) = iπe 2 γ E |z| 2 sgn(z) , (D. 12) and its inverse Fourier transformation At O(σ LO α s ) = O(α 2 s ), the difference of linear slopes to the left and the right of the shoulder threshold, which amounts to a non-analytic term, is uniquely predicted by the effective field theory, Computing the difference a (2,1) L − a (2,1) R would require knowledge of the non-global structure of the two-loop trijet hemisphere soft function. This is analogous to how the 2-loop soft constant for the dijet heavy jet mass distribution depends on the non-global structure of the hemisphere soft function. The difference is that for the shoulder case the constant is different for the left shoulder r > 0 and the right shoulder r < 0. In Fourier space, the -46 -terms multiplied by a In contrast, for heavy jet mass in the dijet limit we get a Laplace transform of the form L(ν) ∼ α s 4π 2 b (2,1) ln ν + b (2,0) , (E.9) with b (2,0) = c 2ρ in the notation of [14]. For the dijet limit, the b (2,1) term contributes α 2 s L terms in the heavy jet mass cumulant, while the non-global b (2,0) piece only α 2 s times a constant. In the shoulder case, both terms from the second line of Eq. (E.8) are "constants", in that their derivative with respect to z vanishes almost everywhere. A full non-global structure calculation would be required to determine them. In this way, the non-global structure is needed to predict the difference between the α 2 s L terms on the left and right shoulder, while it is only need to predict the α 2 s L 0 term in the dijet limit. We stress that despite the scaling ∼ α 2 s L of this difference term in momentum space, the logarithmic counting for the Sudakov shoulder region must be done in position space for consistency.
In position space, the extra terms in consideration are constants and thus formally N 3 LL.