The role of the threshold variable in soft-gluon resummation of the $t\bar{t}h$ production process

We study the role of the threshold variable in soft-gluon threshold resummation. We focus on the computation of the resummed total cross section, the final-state invariant-mass distribution, and transverse-momentum distribution of the Higgs boson when produced in association with a top-anti-top quark pair for the Large Hadron Collider operating at 13 TeV. We show that different choices for the threshold variable result in differences at next-to-leading power, i.e. contributions that are down by one power of the threshold variable. These contributions are noticeable numerically, although their effect on the resummed observables lies within the scale uncertainty of those observables. The average central results, obtained after combining several central-scale choices, agree remarkably well for different choices of the threshold variable. However, different threshold choices do effect the resulting scale uncertainty. To compute our results, we introduce a novel numerical method that we call the deformation method, which aids the stabilization of the inverse Mellin transform in cases where the analytical Mellin transform of the partonic cross section is unknown. We show that this method leads to a factor of 10 less function evaluations, while gaining a factor of 4-5 in numerical precision when compared to the standard method.


Introduction
The top-quark Yukawa coupling is the largest one in the Standard Model (SM). A precise measurement of its properties is one of the prime goals of the experiments at the Large Hadron Collider (LHC), as its measurement can place constraints on beyond-the-SM (BSM) scenarios that modify the SM Yukawa coupling. Therefore, it is important to have a precise and accurate prediction for the tth cross section. At present, the experimental uncertainties dominate the theoretical ones. However, the experimental uncertainties rapidly decrease with the collection of more data. The top-quark Yukawa coupling has recently been measured by the ATLAS [1] and CMS [2] experiments, constraining its value to lie within 20 − 30% of the SM value. The total cross section for the tth production process as reported by ATLAS is 670 ± 90 (stat.) +110 −100 (syst.) fb at the hadronic center-of-mass (CM) energy of √ S = 13 TeV, while CMS reports 639 +157 −132 fb. The prediction of the SM for the tth cross section (given below) lies within the 2σ-interval of these measurements. The next-to-leading order (NLO) QCD corrections to the tth process were calculated already some time ago [3][4][5]. Electroweak corrections were also studied [6][7][8], and the effect of off-shell decaying top-quarks was studied in Ref. [9]. As summarized in Ref. [10], the best estimate for the NLO QCD (+electroweak) cross section with the Higgs mass at m h = 125 GeV and the top mass m t = 172.5 GeV is 498.7(507.1) +5.8% −9.2% fb for √ S = 13 TeV. Since the tth process is already quite involved at NLO, it is unlikely that higher-order corrections will become available soon. However, the impact of soft-gluon corrections can be studied by resummation. Resummation up to nextto-leading logarithmic (NLL) accuracy of the total cross section is performed in Ref. [11]. This is extended to next-to-next-to-leading logarithmic (NNLL) order in Ref. [12], where the invariant-mass distribution of the tth process is computed. Furthermore, electroweak corrections are included in Ref. [13], where also other kinematic distributions and the effect of scale variations are studied. Using the orthogonal Soft-Collinear-Effective Theory (SCET) approach, the total cross section and several kinematic distributions have been computed to NNLL in Refs. [14][15][16][17]. The tth process is an interesting playground for threshold resummation, as it has three particles in the final state already at leading order (LO). This makes for a complicated phase space. Firstly, the Mellin transform of the partonic cross section cannot be performed trivially, creating a potentially severe numerical issue. Secondly, as there are many ways to share the kinematics amongst the final state particles, multiple threshold definitions may be used. In this work we explore the impact of choosing different threshold variables on the resummation of the total cross section, the invariantmass distribution of the final state, and the transverse-momentum distribution of the Higgs boson. We present our results at NLL accuracy, as many of the important features are already present at NLL. The outline is as follows: in Section 2 we set up our notation using the tth process at LO. In Section 3 we consider the resummation of the tth process and introduce the threshold variables we consider in this work. Resummation in QCD is traditionally performed in Mellin space, and the resummed result needs to be transformed back to physical space to be compared against experimental results. This inverse Mellin transform needs to be performed numerically and is notoriously unstable. In Section 4 we introduce a new numerical method that brings better behavior to the computation of the inverse Mellin transform. This method is applied to compute the results, and the improvement in speed and the obtained numerical accuracy is shown in Section 5, along with the results for the LHC operated at √ S = 13 TeV. As we will see, the threshold definitions differ in the way they include next-to-leading power (NLP) corrections, which are contributions to the resummed result that are down by one power in the threshold variable. Their role is discussed in Section 6. We conclude in Section 7.

Notation and conventions
To set up our notation, we first introduce the tth process at LO. The final state may be created by two partonic processes: gg → tth and qq → tth. The initial-state momenta are indicated by p 1 and p 2 , while the final-state momenta are labeled p t , pt and p h . We define the invariants of which only 5 are independent. The four-momentum of the Higgs boson p h is parameterized as = (E h , p T , p z,h ) = (m T,h cosh η, p T , m T,h sinh η) .
The former parameterization is useful for the invariant-mass distribution, while the latter is more useful for the transverse-momentum one. The variable m T,h is the transverse mass of the Higgs boson and defined through with m h the mass of the Higgs boson and p T ≡ | p T,h | the transverse momentum. The pseudorapidity is indicated by η. Note that the variables η, m T,h and p T are not mutually independent at LO. One needs to parameterize the three-body phase space in a suitable way to compute the invariantmass or transverse-momentum distribution. As is common, we separate the three-body phase space into two two-body ones dΦ 3 (p 1 + p 2 ; p t , pt, p h ) = 1 2π ds tt dΦ 2 (p 1 + p 2 ; p tt , p h ) dΦ 2 (p tt ; p t , pt) , with p 2 tt = s tt . The phase space of the tt-system dΦ 2 (p tt ; p t , pt) may be computed in the CM system of tt. This system is denoted by the starred notation ( * ), and the direction of travel of the tt-system in the CM system of the incoming particles is used as the z-axis with respect to which the angle θ * is defined. The phase space of the tt-system then results in For this to have a solution, we need that E * t ≥ m t and therefore √ s tt ≥ 2m t , which puts a lower boundary on the s tt integration. The angular integrations are bounded as usual by θ * t ∈ [0, π] and φ * with λ 1/2 (x, y, z) = x 2 + y 2 + z 2 − 2xy − 2xz − 2yz .
The connection between the tt CM kinematics and the partonic CM frame is detailed in Appendix B for the convenience of the reader. The phase space integration of the (tt)h-system dΦ 2 (p 1 + p 2 ; p tt , p h ) is evaluated in the partonic CM system, where the four-momenta of the initial-state particles read (1, 0, 0, 1) , p 2 = √ s 2 (1, 0, 0, −1) .
With this, the phase space of the (tt)h-system becomes Using these definitions, the LO fully differential partonic cross section is written as K ij spin,color |M ij | 2 ds tt dΦ 2 (p 1 + p 2 ; p tt , p h ) dΦ 2 (p tt ; p t , pt) .
Here, the M ij are the matrix elements for the partonic initial states ij = qq and gg, and K ij is an averaging factor for the initial state spins and colors. The matrix elements can be found in Ref. [5], given in terms of the partonic Mandelstam invariants of Eq. (1). From this, the hadronic differential cross section follows as (11) where S is the hadronic CM energy squared, f i/j (x 1/2 , µ 2 F ) denote the parton distribution functions at the factorization scale µ F , and x 1 and x 2 are the parton-momentum fractions. Having set up our notation, we are now ready to turn our attention to the resummation of the tth production process.
After the transformation to N -space is performed, we may replace the partonic matrix element with its resummed version. By doing this, the resummed differential cross section in N space reads [11,12] dσ res tth (N ) = i,j f i (N + 1, µ 2 F )f j (N + 1, µ 2 F ) Tr H ij→tth (N )S ij→tth (N + 1) (12) The moments of the parton distribution functions are defined in the standard way The trace of Eq. (12) acts in color-tensor space, and implicitly includes the averaging over the initial-state spins and colors (1/(4N 2 c ) for qq channel or 1/(4(N 2 c − 1) 2 ) for the gg channel). The matrix H ij→tth , decomposed in color-tensor space, contains the hard-scattering contributions. The soft-collinear enhancements are captured by the functions ∆ i (N, , and the soft wide-angle contributions are included via S ij→tth (written in the same basis as H ij→tth ). The matrices H and H implicitly depend on the factorization scale µ F and the renomalization scale µ R . Note the mismatch between the moments of the hard function, and the soft and collinear functions. This is caused by the form of the δ-function in Eq. (11). To separate the PDFs from the partonic cross section, one uses a partonic threshold variable reading is the fraction of s that is radiated away by soft gluon emissions (i.e. z 1). This fraction (1 − z) can be further decomposed into one fraction being radiated by the wide-angle emissions, and one by the initialstate jet functions (see e.g. [18]). Note that z is different from ρ in general, unless one sets M 2 = Q 2 . The soft-collinear enhancements are generated by the integral The coefficient A i is a power series in the coupling α s (q 2 ). Note that we set the upper limit of the integral to M 2 (1 − z) 2 , which is an approximation. We will come back to this point in Section 3.1.
To NLL accuracy, the integral results in and b 0 the first-order coefficient of the QCD β-function (see Appendix A). The g (i) functions are collected in Appendix A. In N -space and up to NLL, the resummed soft function S is given by a solution of the renormalization group equation (RGE) [19] and takes the form The matrix S ij→tth is the boundary condition for the RGE. The logarithmic enhancements are captured by the evolution matrix U: Here, P denotes the path ordering in the variable q, and Γ ij→tth (α s (q 2 )) is the soft anomalous dimension that has the perturbative expansion The explicit form of the soft anomalous dimension is collected in Appendix A. The path ordering is not needed if the soft anomalous dimension matrix is diagonal. In such cases, we can simply solve the integral up to NLL to obtain The resummed soft function at NLL can then be written as However, this simple form can only be obtained if indeed the soft anomalous dimension becomes diagonal in the threshold limit, which will not be the case for every threshold variable that we consider. If it is not the case, we need to diagonalize it in order to reduce the path ordered exponential to an ordinary exponential. We make use of the method outlined in [19], and introduce a matrix R such that with Γ R,ij→tth,lm = λ l δ lm and λ l the l th eigenvalue of Γ (1) ij→tth . The other two matrices S and H then also need to be written in this basis. This procedure is summarized in Appendix A. The hard function can also be written as a perturbative series in α s . In order to perform NLL resummation one only needs to know the lowest-order contribution H (0) ij→tth , which can be obtained after decomposing the LO matrix element for tth production into s-channel color bases, one for each channel [20]. This is a standard technique, and the resulting expressions for H where we leave the dependence of dσ NLL ij→tth on M 2 , µ 2 R and µ 2 F implicit. The resummation-improved distributions can be obtained by matching the resummed result to the fixed-order result as follows with c a real number.
In what follows, we demonstrate how different parameterizations of the threshold variable may be chosen, depending on the distribution one is interested in. We start by examining the threshold behavior of the invariant-mass distribution, followed by that of the transverse-momentum distribution. Of course, one may integrate the resulting resummed distributions to obtain the full cross section.

Threshold definitions for the invariant-mass distribution
For the invariant-mass distribution, we parameterize the two-body phase space of Eq. (9) using the first expression of Eq. (2) for the momentum of the Higgs boson. Using the azimuthal symmetry of the matrix element, we can write with We aim to achieve factorization of the partonic coefficient function and the PDFs after Mellin transforming Eq. (11) by choosing a suitable threshold variable. The first option is to define with Q 2 the invariant mass of the final state, i.e. Q 2 = (p t + pt + p h ) 2 (= s at LO). We are interested in the invariant-mass distribution. To this end, we Mellin transform Eq. (11) with respect to τ abs , and rewrite the argument of the δ-function as We also make a variable transform from s to ρ abs , where the factor of 1/s stems from Eq. (29). After these two steps, we may interchange the order of the τ abs and ρ abs integrations, and obtain dσ LO,abs By now introducing an integral over Q 2 using Q 2 s = (2m t + m h ) 2 /ρ abs (valid at LO and in the limit of z → 1) we rewrite Solving the δ-function for ρ abs leads us to the invariant mass distribution The integration bounds on s tt become ρ abs -dependent and read As noted before, ρ abs was used in Ref. [11] to compute the resummed total cross section to NLL accuracy. Now we come back to a point that was raised before, namely the approximation applied to the upper limit in the integral of Eq. (14). One can show using phase-space arguments [21] that the upper limit of this integral actually reads s(1 − z) 2 /z. Two approximations are made to end up with Eq. (14) that are valid up to leading power in the threshold variable: 1/z 1 and s M 2 . It is trivial to see that the first replacement may be made for the z → 1 limit valid in the limit where all gluons are soft, which is the only region where one can guarantee the validity of Eq. (14). The z → 1 limit is isolated by the N → ∞ limit. By taking N → ∞, the largest contribution of Eq. (33) comes from ρ abs 1. Therefore, we replace s → M 2 . Corrections of this replacement are of O(1/N ), i.e. of next-to-leading power. Therefore, by choosing different threshold variables, one is actually probing the importance of partial sub-leading power corrections. Since we are after the invariant-mass distribution, we have to integrate over the invariant mass of the tt pair. Upon taking a closer look at its integration limits, we may find our second threshold variable as M 2 = √ s tt + m h 2 such that It is easy to see how this threshold definition arises if one switches around the order of the s tt and s integrations. That is, we may write where indeed the lower limit of the s-integration indicates an edge of phase space that depends on the value of s tt . Secondly, we make a variable transform from s to ρ s tt to obtain We then perform the Mellin transform of Eq. (38) with respect to τ s tt , which uses the δ-function in Eq. (38). The invariant-mass distribution is obtained by again introducing the integral This δ-function may be used to solve the integral over ρ s tt such that the invariant-mass distribution becomes As one can see, the integration limits on s tt are now threshold-variable independent, and directly depend on Q instead (which guarantees that with ρ s tt ≤ 1, Eq. (39) has a solution). The third and final option is obtained by setting M 2 = Q 2 , with which was used in Ref. [12] to compute the resummed invariant-mass distribution to NNLL accuracy. With this definition and using the same steps as before, the invariant-mass distribution is easily obtained as Obviously, the resulting LO invariant mass distributions are independent of the parameterization of the threshold variable. However, the resummed result is impacted by this choice because of the upper limit in Eq. (14) and Eq. (17). Using ρ = ρ abs , the soft-anomalous-dimension matrices (Eq. (93) and (94)) become diagonal in the threshold limit where ρ abs → 1. The off-diagonal components contribute at O(1/N ). We will explicitly include these in our numerical analysis in Section 5 to analyze the numerical impact of this approximation. The soft-anomalous-dimension matrices are not diagonalized by the threshold limit for ρ s tt and ρ Q 2 .

Threshold definitions for the Higgs transverse-momentum distribution
To compute the transverse-momentum spectrum, it is more convenient to use p h as given in the second expression of Eq. (2) to write down the phase space of the (tt)h-system. We obtain To find solutions to the δ + -distributions, we need that m T,h ≥ m h and to multiply the final result by 2 to also take into account the p z,h < 0 configuration (for which θ h > π/2). We have included this factor in the equation above. We may then turn around the integration order of the s tt and s integrations as in Eq. (37), resulting in with m T,4m 2 t = p 2 T + 4m 2 t , in analogy to the definition of m T,h (Eq. (2)). There now arise two natural partonic threshold variables for a resummed transverse-momentum distribution: 1.
x 2 T,s tt = m T,h + m T,s tt 2 s , with m T,s tt = s tt + p 2 tt,T = s tt + p 2 T , 2.
The first option sets M 2 = m T,h + m T,s tt 2 , while for the second option we have the phase-space- with X 2 T,s tt = (mT,h+mT,s tt ) 2 S the hadronic threshold variable. For the second option, we have to use an upper boundary of the s tt integration that depends on the threshold variable. As shown in Appendix A.1, the soft-anomalous-dimension matrices are diagonal in the threshold limit for x 2 T,4mt 2 but not for x 2 T,s tt .
We have now set up our notation for the resummation of the tth process in N -space, and formulated different threshold definitions for the invariant-mass and transverse-momentum distribution. In what follows, we will present our numerical results obtained using the various threshold definitions for the invariant-mass distribution, transverse-momentum distribution, and the total cross section. An overview of these definitions (and scale choices) will be given in Section 5 ( Table 1 on page 17). First, however, we discuss a technical issue: performing the inverse Mellin transform.

The inverse Mellin transform
Stabilizing the numerical evaluation of the inverse Mellin transform is a notorious problem in direct-QCD resummation. Most commonly, the so-called Minimal Prescription (MP) method is used to handle the integral in Eq. (25), which consists of bending the contour towards the negative real axis with a large angle. This method works well if one has access to both the analytical Mellin transforms of the PDFs and that of the partonic cross section. For a complicated cross section already at LO like that of tth, the latter is not achievable. This presents numerical stability issues. The method that we introduce here reduces the oscillations of the integrand in the complex plane, and thereby helps to stabilize the numerical evaluation of the inverse Mellin transform.
Despite it often being mentioned in this context, the original MP as introduced in Ref. [22] is not a numerical prescription that tells one how to numerically compute the inverse Mellin transform. Instead, the MP is a definition for the existence of the inverse Mellin transform of the resummed formula. Before we discuss the issues of performing the inverse Mellin transform numerically, we briefly recall why the MP is needed to obtain an analytically viable result in the first place, as it is often confused in the literature with the numerical prescription.

Analytical considerations of the inverse Mellin transform
Let us first consider the definition of a Mellin transform [23,24]. Given a function g(t), assume that its integral up to a finite (real) a is bounded and that |g(t)| ≤ K e c 1 t for t → ∞ with K and c 1 real constants and K > 0. For such functions, a one-sided Laplace transform exists and is defined by The resulting function g(N ) is analytic for Re[N ] > c 1 . From this, we may derive the Mellin transform of a function f (x) for x ∈ [0, 1]. To this end, we set t = − ln(x) and f (x) ≡ g(− ln(x)), such that This is the form of the Mellin transform that we have encountered above. However, we may also get the Mellin transform from a two-sided Laplace transform by combining it with the one-sided Laplace transform of g(t) for t ∈ (−∞, 0]. The latter object reads with the requirement The Laplace transform of g(t) with t ∈ (−∞, 0] exists and is analytic for in Eq. (52), and combining it with Eq. (50), we obtain which holds for c 1 < Re[N ] < c 2 . We have used the subscript '∞' to distinguish this Mellin transform from the one in Eq. (51).
In order for g(N ) to represent a Laplace transform of the function g(t) with t ∈ [0, ∞), we need to require that The inverse Mellin transform must be taken over a straight line that runs from c − i∞ to c + i∞, as already shown in Eq. (25). The value c can be chosen arbitrarily, as long as it lies within the strip of definition, which is simply a consequence of Cauchy's theorem. If we pick a value of c that lies outside the strip of definition, there is no guarantee that the inverse will return the original function. We now turn our attention to the resummed cross section. The large-N -approximated resummed exponent introduces two branch cuts. These may already be observed in g Therefore, the strip of definition is given by 0 < Re[N ] < N L . Note that, besides the resummed exponent, the N -space PDFs and partonic cross section could also introduce additional poles in the Re[N ] > 0 domain, and place additional constraints on the values that c is allowed to take. However, now a problem appears, as such a bounded strip of definition only holds for a Mellin transform that is performed on a function f (x) with x ∈ [0, ∞), rather then x ∈ [0, 1]. This is caused by the fact that we cannot find a finite value for K for which Eq. (55) holds for any resummed partonic cross sectionσ(N ). Therefore,σ(N ) cannot represent the Mellin transform of any resummed partonic cross sectionσ(ρ) with 0 ≤ ρ ≤ 1. Instead, the resummed partonic cross section receives contributions from the non-physical domain where ρ > 1, which might be problematic if these terms grow large. The original MP [22] then consists of defining the hadronic resummed cross section in physical space through Eq. (25). This is validated in Ref. [22] by proving that the contribution of the domain where ρ > 1 is suppressed by a factor e −HQ(1−τ )/Λ QCD , with H ∼ ln(Q/Λ QCD ) a slowly varying positive function. Since Λ QCD = O(1) GeV, τ = 1, and Q = O(100) GeV represents the hard scale of the process, the contribution of the domain with ρ > 1 is indeed negligible. The original MP tackles the problem of the formally non-existent inverse Mellin transform. However, there is a second problem to worry about: that of the numerical convergence of the inverse Mellin transform.

Numerical stability issues of the inverse Mellin transform
To implement threshold resummation numerically, one needs to numerically perform the integration on the domain Im[N ] ∈ (−∞, ∞). This leads to large oscillations (as will be shown later), which are difficult to stabilize. One option, often referred to as the MP parameterization in literature, is to double the contribution of the integral in Eq. (25) in the Im[N ] > 0 plane, with N being parameterized as Note that this parameterization in principle has no relation to the original MP, which is the method that proves that the contributions of Eq. (25) outside the physical domain with ρ > 1 are exponentially suppressed, as summarized above. However, the reason why it is known in the literature as the MP parameterization is that it uses the same principle: it consists of bending the integration contour towards the negative real axis. However, for the orginal MP, the bending towards the negative axis is done infinitesimally (φ MP = O( )), while φ MP = O(1) in the MP parameterization above. The integral can formally not depend on the value of C MP and φ MP . However, choosing these parameters with care can be helpful for an efficient numerical evaluation. By choosing φ MP > π 2 (for y > 0), one introduces an exponential damping to the factor τ −N of Eq. (25). This means that the integrand is exponentially suppressed for τ < 1 and Re[N ] → −∞. The value of C MP has to be chosen inside the strip of definition, and in practice, it is usually chosen to be around 2.
If the Mellin transforms of either the PDFs or the partonic cross section needs to be evaluated numerically, the MP parameterization presents a problem. Firstly, as we have no handle on the analytical continuation of the Mellin transform, we do not know the strip of definition and therefore we do not know a good value for C MP , nor can we be certain that points outside the strip of definition will not be sampled by bending the contour towards the negative real axis of N . The more practical issue however is that the numerical Mellin transform of the PDFs (or partonic cross section) does not converge along the path of integration for Re[N ] < 0, as then x N with x < 1 will become increasingly large for x → 0. Secondly, x N oscillates heavily for Im[N ] → ∞ if no exponential damping is introduced. It follows that in cases where we do not have access to the analytic forms of f i (N, µ 2 F ) andσ(N ), there is an overall factor that reads The argument of the logarithm can be larger or smaller than 1 for fixed τ . When the argument becomes larger than 1, we would like to have a contour where Re[N ] < 0, as then the integral can converge numerically. On the other hand, if the argument is smaller than 1, we need Re[N ] > 0. These two requirements cannot hold simultaneously, although both domains, for constant τ , are probed in Eq. (25). One way in which the numerical stability issue may be circumvented is by setting (x 1 x 2 ρ)/τ > 1 to define the integration domain for ρ, x 1 and x 2 , and confirm that the contribution that originates from (x 1 x 2 ρ)/τ ≤ 1 is negligible by checking this explicitly. For the PDFs, one may employ the derivative method [25] to introduce an O(1/N ) (or higher, depending on how many derivatives are taken) suppression to the numerical oscillations. For resummation it is assumed that Re[N ] → ∞, hence additional factors of O(1/N ) will reduce the absolute size of the oscillations. However, there are worries whether one may trust the result of this method for general processes, especially when employing higher derivatives (see Appendix C for a justification of this statement).
In this work, we will make an analytical fit to the PDFs, using the functional form We demand that the fitted function lies within the 1σ error as given by the LHAPDF [26] grid implementation of the PDFs in the entire domain. By fitting the PDFs, we remove the numerical factor of x N 1,2 that could result in exponential growth. We now also know the strip of definition for the Mellin transform of the PDFs. More specifically, we require where the first set of conditions is respected in the fitting procedure. The last three conditions define the strip of definition for the Mellin transform of the PDFs.
To reduce the numerical stability issues of the Mellin transform of the partonic cross section, we invented a novel deformation method, which is explored in the next sub-section.

The deformation method
This sub-section covers a novel method to perform the Mellin transform numerically, which results in an additional suppression factor of 1/N , but also removes part of the numerical oscillations present in Eq. (58). Consider the Mellin transform of the partonic cross section We make the following change of variables  The new variable is w, which is complex-valued and N -dependent. The Jacobian leads to a 1/N suppression Then, for each value for N = C MP + iy, we integrate over a straight line in the lower left part of the complex w plane. This means that Eq. (61) now becomeŝ In Fig. 1, this integration path is indicated by the dashed blue lines for different values of N . These lines extend to Re[w] = −∞. This integration path does not converge numerically due to large oscillations that are induced by the imaginary part of w. However, by Cauchy's theorem, we may deform the integral to the real negative axis, where we can compute the integral. The total contour integral equals 0, as there are no poles enclosed, thus where f (w) = e w Nσ exp w N (see Eq. (64)). The contour integral consists of three segments, represented in Fig. 1 by the labels 1, 2, and 3. For N = C MP + iy, these paths can respectively be parameterized by when y > 0 with θ max < 3π 2 , and Segment 1: when y < 0 with θ min > π 2 (not shown in Fig. 1). The radius of the arc of integration path 2 is parameterized by R, which has to be taken to ∞ to obtain the original path. For y > 0, the contribution of segment 2 is The absolute upper bound of this integral may be estimated.
Since θ ∈ {π, θ max }, where θ max < 3π/2, the bounds on cos θ are −1 ≤ cos θ < 0. Therefore, we arrive at with 0 < α < 1. We assume that the function h(R, θ) is bounded, asσ represents a cross section. Hence, we observe that the integral of Eq. (68) is exponentially suppressed. We consequently find that the contribution of segment 2 vanishes for R → ∞ for θ max < 3π/2. Similar arguments hold for y < 0. It follows that the contribution from the real axis is precisely opposite to that of the original integration path. Therefore, we can deform the contour away from segment 1, and remove the oscillatory behavior that is present in the factor e w of Eq. (62) where w is parameterized by t e iθmax (see Eq. (66)). The new integration path is indicated by segment 3 in Fig. 1, where w is now integrated from −∞ to 0. The resulting Mellin transform of the partonic cross section readŝ In this procedure, we have removed the increasingly large oscillatory behavior from the Mellin transform for Re[N ] → −∞ and Im[N ] → ±∞, as now all N -dependence is captured in the variable ρ = exp w N , whose absolute value cannot grow bigger than 1. Moreover, we have introduced an exponential damping via the factor e w with w ≤ 0, and introduced an extra suppression factor of 1 N . One important improvement over the traditional methods is that we can parameterize N = C MP ±iy with y ≥ 0 to perform the inverse Mellin transform, as the tilted contour is not needed to obtain numerical convergence. Therefore, we do not have to worry about the validity of the Mellin transform outside the strip of definition, as this domain is never reached in the numerical integration. Secondly, we do not need to artificially introduce a cut-off on x 1 x 2 ρ to stabilize the numerical integration. However, we should note that by deforming the contour, w becomes real, but ρ becomes complex. Therefore, this method can only be used if we have access to the analytical ρ-behavior of the matrix element. In particular, this method is not suited if the matrix element is obtained numerically.

Numerical results
In this section we explore the numerical results one obtains by setting different threshold variables as defined in Section 3.1 and 3.2. As shown above, these variables result in different values for M that parameterize the upper limit of Eq. (14) and Eq. (17), and their impact on the results via terms proportional to ln(M 2 /µ 2 ) in the resummed functions can be partially undone by choosing suitable values for the renormalization and factorization scales. The recommendations of Ref. [27] specify to use µ R = µ F = µ ≡ m t +m h /2 for the computation of the total cross section. In Ref. [10], where a comparison of the tth cross section is performed using different parton showers, a central scale choice is used of either Following these recommendations, and considering that we integrate over the phase space of the tt-system, we examine the results for the following scale choices: The explicit scale logarithms in the NLL resummation function g (2) cancel if we use µ M with ρ Q 2 or x 2 T,4m 2 t , and if we use µ high with ρ abs . Since we probe the effect of different central scale choices, we refrain from varying µ R and µ F independently. In our presentation of the numerical LHC results, we use the fitted form (Eq. (59)) of the central member of the NNLO PDF4LHC15 PDF set [28], and take the CM energy equal to 13 TeV. We use MadGraph5_aMC@NLO [29] to obtain the NLO fixed-order result. Before we discuss the matched result, we first compare the numerical methods introduced in the previous section for the unmatched total NLL resummed cross section with ρ = ρ s tt .

Comparison of numerical methods
Here we compare the deformation method (Section 4.3) and the derivative method (Appendix C). We use the unmatched resummed NLL cross section for this comparison, set ρ = ρ s tt and µ = µ low . As covered in Appendix C, the derivative method may be used to obtain a suppression factor of 1/N 4 , which helps the inverse-Mellin transform to converge numerically. Three values of φ MP are probed (π/2, 5π/8 and 3π/4) for a range of C MP values between 0.75 and 3.0. For the deformation method we fix φ MP = π/2, and have the same range of C MP values. The resulting cross sections are shown in Fig. 2, where for the derivative method we only show the result for each C MP value with the minimal error obtained after varying over φ MP . One may immediately observe that the deformation method leads to a more stable result over a wider range of C MP values. We also compute a weighted average and weighted error via the standard method   Table 2: Results and number of function evaluations for the NLL tth production cross section using the deformation or derivative numerical integration method. The top two rows indicate the average results, and the bottom two the best results.
is σ deform. pp→tth = 0.36015 ± 0.00071 pb, which has a similar average value as the former method, but the numerical error is smaller by roughly a factor of 5. For the derivative method, the lower bounds on x 1 , x 2 and ρ need to be chosen such that x 1 x 2 ρ/τ > 1, otherwise the numerical integration is not stable. In Appendix C we expressed a worry that such a lower bound may create a boundary contribution. This contribution can be computed explicitly, as we know it must match the contribution in the domain where x 1 x 2 ρ/τ < 1. To compute it, we use values of φ MP ≤ π/2, as for φ MP > π/2 the integral does not converge numerically due to the x 1 x 2 ρ/τ < 1 requirement. The result can be observed on the right-hand side of Fig. 2, where one sees that the contribution of the boundary term is negligible, as the average result is consistent with 0, hence we can safely discard it. We summarize the results for the computation of the NLL tth cross section in Table 2, where we also indicate the number of function evaluations needed to obtain the indicated result and precision. For the average results, we quote the average number of function evaluations. We see that the deformation method leads to an error that is a factor of 4 − 5 smaller when compared to the derivative method, while it needs at least a factor of 10 less function evaluations. We conclude that the deformation method leads to a substantially more stable result than the derivative method, and does so by needing less function evaluations. Therefore, in the presentation of our results below, we employ the deformation method to evaluate all threshold variables, except for ρ Q 2 , as this choice directly eliminates the Mellin-space integral over the partonic threshold variable by virtue of the δ-function. Note that not only the threshold variable becomes complex for ρ abs and x 2 T,4m 2 t when using the deformation method: as these variables feature in the upper limit of the s tt integration, the s tt variable becomes complex too. We compute each matched result for the different values of C MP as indicated in Fig. 2, and use the average as obtained via Eq. (72) as our final result.

The invariant-mass distribution
We begin by discussing the invariant-mass distributions (Fig. 3). One may observe that the scale uncertainty of the NLO cross section is large. At the peak of the distribution around dimension matrices) lie between dσ tth /dQ = 1.18 − 1.31 fb/GeV, where higher values are typically found for µ = µ low . It is remarkable that different threshold choices affect the resulting scale uncertainty. The matched distribution obtained with ρ = ρ abs is most robust under scale variations, while that obtained with ρ = ρ Q 2 shows the largest dependence on the scale. Especially in the tail of the distribution this effect is visible: for ρ = ρ abs , the scale uncertainty nearly vanishes at Q = 1 TeV. By comparing the two plots in the middle panel of Fig. 3, one may directly observe a difference between the distribution obtained using an approximated diagonal form of the soft-anomalous dimension matrices (labeled with ρ abs (diag)) and that without approximating these matrices (labeled with ρ abs ). Although the difference lies within the scale uncertainty of the results, we feel that it is worth to investigate this further, which we do in Section 6.
In Fig. 4  The numerical value of the scale logarithm obtained with ρ = ρ Q 2 is higher than that with ρ = ρ s tt , therefore the deviation from the ρ = ρ abs result is largest in the former case. A similar effect is observed for µ = µ M : there, the distribution with ρ Q 2 shows a +4.3% correction at Q = 1 TeV. The distribution with ρ s tt lies above that with a +6.9% correction, and that obtained with ρ abs gives a very large +17.9% correction to the NLO distribution.
In contrast to what the plots in Fig. 4 may suggest, the numerical difference of using different threshold variables is actually small at the peak of the distribution: we obtain a central value of dσ tth /dQ = 1.18 − 1.23 fb/GeV for all three definitions at µ = µ M and µ = µ high , and around dσ tth /dQ = 1.31 fb/GeV for µ = µ low . We therefore see that the scale choice is of bigger importance than the threshold choice. Slightly larger discrepancies between using different threshold variables are found in the tail of the distributions. Especially the difference between using ρ abs and ρ s tt or ρ Q 2 grows for the scale choices µ = µ high and µ µ M : at Q = 910 GeV we find that dσ tth /dQ 0.39 fb/GeV for ρ s tt and ρ Q 2 , while dσ tth /dQ 0.44 fb/GeV for ρ abs . Interestingly, we do obtain a consistent result between all three threshold definitions at µ = µ low with dσ tth /dQ 0.44 fb/GeV. In Fig. 5 while the central value is obtained by simply averaging the maximum and minimum values. On the left-hand side of Fig. 5 we observe that a near-perfect agreement is found on the central value between choosing different threshold parameterizations. The scale uncertainties do differ, and the smallest scale uncertainty is found for ρ = ρ abs . By approximating the soft-anomalous dimension matrices we do find a noticeably different result of around +3% for all Q values. However, this difference lies well within the scale uncertainty of the results obtained without this approximation. On the right-hand side of Fig. 5, one sees that the correction from NLL resummation obtained with respect to the averaged NLO distribution is around +4.7% at the peak of the distribution, which grows to about +12.1% in the tail of the distribution. In Section 6 we will further discuss the role of O(1/N ) corrections, but first we turn our attention to the transverse-momentum distributions.

The transverse-momentum distribution
We now consider the impact of resummation on the transverse-momentum of the Higgs boson. The results are shown in Fig. 6. As before, let us first take a look at the NLO distributions. At the peak of the distribution around p T = 70 GeV we observe (lower-right panel of Fig. 6) that the results are similar for µ = µ high and µ = µ M , and those setting µ = µ low and µ = µ H T are similar too. The reason for this is clear: when p T is small, m T,4m 2 t 2m t and m T,h m h . Therefore, µ M µ high for small p T , and the NLO distributions at small p T for these scale choices lie close in value. The discrepancy steadily grows with higher p T values. Similarly, µ H T µ low at low p T values, where the best agreement between the two scales is found for p T 100 GeV. This is directly reflected in the results. The NLO distribution shows a substantial scale uncertainty at the peak of the distribution, and at p T = 70 GeV the central results vary between dσ tth /dp T = 2.91 fb/GeV and dσ tth /dp T = 3.24 fb/GeV. The smallest scale uncertainty on the interval µ ∈ [µ ref /2, 2µ ref ] at p T = 70 GeV is found by setting µ ref = µ low where we obtain dσ tth /dp T = 3.23 +5.6% −8.7% fb/GeV. The scale uncertainty of the distribution is again reduced by matching the NLL resummation to the NLO fixed-order result. Using dσ tth /dp T = 3.04 − 3.17 fb/GeV. The smallest dependence on varying µ ref is found by setting µ ref = µ M , in which case we obtain dσ tth /dp T = 3.04 +3.6% −3.1% fb/GeV at p T = 70 GeV. The scale uncertainties grow when using x 2 T,s tt as our threshold variable (lower-left panel). There, the smallest scale uncertainty at p T = 70 GeV is again found for µ = µ M , resulting in dσ tth /dp T = 2.94 +6.5% −4.7% fb/GeV. The upper scale uncertainty therefore has increased a bit with respect to the NLO distribution, while the lower scale uncertainty has roughly halved. Again we find a noticeable impact by approximating the kinematics of the soft-anomalous dimension matrices (upper-left panel), although the difference with the result obtained by using the unapproximated form of these matrices again lies within the scale uncertainty. In Fig. 7   . The correction with respect to the NLO distribution is then between 4.5 − 5.5%, while upon approximating the soft-anomalous dimension matrices the correction is around 6.0 − 5.0% from low to high p T . Interestingly, and in contrast to the invariant-mass case, we now find that this approximation does not only affect the overall size of the resummed correction, but also the shape of the distribution, as it does not lead to a constant difference between the curves labeled by the dominant contribution to the transverse-momentum distribution is picked up by a value of s tt that is roughly constant between different p T values. We will come back to this point in Section 6. The results obtained with µ high (upper-right panel) are similar to the µ M results at small p T , but quickly deviate from those results when p T is increased. This is due to the ratio (m T,t + m T,h )/µ high (with m T,t either m T,4m 2 t or m T,s tt ), which increases as p T grows. The most stable results are found for µ = µ H T (lower-right panel), where we find a correction with respect to the NLO result at small p T values of −2.1%, while for larger p T values we find a correction between −1.5% and −2.5%. The correction obtained by approximating the soft-anomalous dimension matrices is constant at about +0.5%. We observe little dependence on the choice of threshold variable for the scale choices µ = µ low and µ H T . Especially at low values of p T , we find that the two threshold choices return a similar correction of around −2%. For larger values of the transverse momentum, we find that the correction changes sign for the threshold choice x 2 T,s tt for µ = µ low (upper-left panel, red dashdotted line). Interestingly, a similar upward trend is also observed for the result obtained after approximating the kinematics of the soft-anomalous dimension matrices in the threshold limit (for the same µ choice), but does not happen for the other threshold choices. As for the invariant-mass distribution, the ratio plots in Fig. 7 are again slightly misleading: the resummed results show a consistent picture when averaged over all scale choices. This may be observed in Fig. 8, where we see that the use of either x T,4m 2 t or x 2 T,s tt results in the same average value of the transverse-momentum distribution when considering all scale choices, although the former choice leads to smaller scale uncertainties. As for the invariant-mass distribution, the effect of approximating the soft anomalous dimension matrices is noticeable, but lies within the scale uncertainty. On the right-hand side of Fig. 8, one can see that the scale uncertainty of the resummed result is slightly smaller than that of the NLO result. The central result at p T = 70 GeV has increased with respect to the NLO one with +5.2%, whereas the correction in the tails of the distribution grows to +10.5%. The correction obtained with the approximated soft-anomalous dimension matrices is +8.1% at the peak and around 15 − 18% in the tails. These large corrections are not observed in Fig. 7, as there the ratio plots are weighted by the NLO distribution with the same scale choice.

Total cross section
To conclude these results, and before moving on to the discussion of the O(1/N ) effects, we briefly comment on the total cross section obtained using various scale choices and threshold variables. These values are obtained by numerically integrating the resummed expressions for either Q or p T with the various threshold-variable definitions. At NLO, the smallest scale uncertainties are indeed found by setting µ = µ low , for which we obtain σ NLO tth = 0.499 +5.8% −9.2% pb (in accordance with the NLO cross section reported in Table 229 of Ref. [10]). If instead we again vary the scales around all values of µ that are shown in Table 1, we find σ NLO,average tth = 0.449 +17.6% −17.6% pb. The NLL resummed and matched result, averaged over all µ vales and threshold choices is σ NLL,average tth = 0.492 +12.9% −12.9% pb. This lies higher than the averaged NLO result (+9.6%), but is close to the NLO result obtained for µ = µ low . The scale uncertainty is reduced slightly. The option that shows the smallest scale uncertainties is obtained by setting ρ abs with µ high , which results in σ NLL,ρ abs tth,µ=µ high = 0.476 +3.4% −2.7% pb, followed by that obtained setting µ = µ low : σ NLL,ρ abs tth,µ=µ low = 0.491 +5.6% −3.3% pb 1 . The average value obtained by approximating the kinematics of the soft anomalous dimension matrices is σ NLL,diag tth = 0.506 +7.3% −7.3% pb. Therefore, as for the two distributions that we have considered, the effect of approximating the soft anomalous dimension matrices is noticeable, but the resulting value for the total cross section lies well within the scale uncertainty band of the resummed result when considering all scales and threshold definitions.

Role of NLP corrections
In the previous section, it became clear that the parameterically subleading O(1/N ) contributions of the soft-anomalous dimension matrices show a noticeable impact on the resulting distributions. In this section we further study the role of these O(1/N ) contributions. The partonic cross section is evaluated as a function of the partonic center of mass energy squared s, with s = x 1 x 2 S. The partonic threshold region is the region in which the partonic threshold parameter ρ is close to 1. All values of x 1 and x 2 between τ and 1 are however accessible, so whether or not resummation is actually relevant depends on which values of x 1 and x 2 give the dominant contribution to the total hadronic cross section. The region where this dominant contribution originates from in N -space may be estimated with a saddle-point argument (as was done in Refs. [30][31][32] for the single Higgs and DY production processes), which we review here and apply to the production of tth. We first clarify that the large-N limit only does not apply when arguing the validity of approximations made for the threshold variable ρ. The validity of the large-N approximation that one employs to compute the effects of soft-gluon contributions, leading for example to the resummation functions g (i) , is unchanged. By using the large-N limit to compute the integral in e.g. Eq. (14), one essentially isolates the z → 1 limit, which is the only region where the results of this integral can be trusted. The integral needs to be adjusted with subleading-power contributions away from the z → 1 limit if one also wants to assess the O(1/N ) contribution of this integral. However, since we have no control over all-order O(1/N ) effects, it is safest to truncate the result of this integral at O(1). This line of reasoning does not apply to the inverse Mellin transform, as we have no reason to assume that the final resummed result is dominated by the partonic threshold limit ρ → 1 for all variables that we formulated above. Indeed: we will find that small (O(1)) values of N dominate the final result, so in general we expect that effects originating from regions away from the ρ → 1 (and similarly x i → 1) limit do affect the final result. Denoting the resummed hadronic distribution of interest as dσ NLL pp→tth we have with ρ and τ the generic partonic and hadronic threshold variables, and we have defined . We now consider the resummation up to LL, which means that our expression for the resummed partonic distribution becomes simply with i = j = g or i =j = q. We may then rewrite Eq. (74) to For τ < ρ, which is the physical domain where s < S, we have that ln τ ρ < 0, and thus the exponent will grow as N → ∞. Note that the slope of the (linear) function −N ln τ ρ gets smaller for values of ρ closer to τ (i.e. close to the hadronic threshold). For small N , the parton luminosity function has a singularity. Combining these two facts, we understand that the function E ij→tth (N ) has a minimum (N 0 ), which may also be seen explicitly by plotting E ij→tth (N ) as we do in Fig. 9 (left-hand side, solid lines), and therefore that the dominant contribution to the N -integral may be estimated using a saddle-point argument. To this end, we take the derivative with respect to N such that The saddle-point approximation of Eq. (76) then becomes , where we have set N = N 0 + it and c = N 0 . This technique is not only useful to estimate the integral, but for our purposes it can be used to determine which value of N 0 gives rise to the bulk of the full result. However, we first need to verify the approximation of setting N → N 0 by comparing Eq. (78) to the full result, and see that the numerical difference is small, which is what we do in what follows. We first confirm that the resummation functions minimally influence the location of the minimum of E ij→tth . This means that instead of using Eq. (77), we may use to determine N 0 N 0 . The validity of this approximation may be estimated from Fig. 10 for the gg (top row) and qq (bottom row) initiated processes, where we have used µ = µ low to set the factorization scale in the luminosity function and the renormalization scale for α s . One can observe that the values of N 0 (where the solid lines cross 0) and N 0 (where the dashed lines cross 0) are close together for small values of τ /ρ. The difference between the values of N 0 and N 0 does increase as τ /ρ → 1. However, since the minimum of E ij→tth gets less deep for τ /ρ → 1 (see Fig. 9), the numerical difference between E ij→tth (N 0 ) and E ij→tth (N 0 ) remains small. Secondly, from Fig. 9 it can be seen that the approximation of ignoring the g (n) functions to determine the minimum of E ij→tth works better when the ratio M 2 /µ 2 grows large. On the right-hand side of Fig. 9, we show for ij = gg the difference between the full E gg→tth function and the E gg→tth function without the g (n) functions for µ = µ low and M = (m T,h + m T,4m 2 t ) 2 for an extreme value for p T (p T = 490 GeV), where the scale logarithm grows to ln(M 2 /µ 2 ) 3.1. One may observe that the effect of including this scale logarithm is negative, and brings the full result closer to the approximated result where the g (n) functions are not considered for τ /ρ → 1. The value of E ij→tth in the minimum becomes slightly higher for smaller τ /ρ ratios, and better agreement is found for slightly lower values of ln(M 2 /µ 2 ). Keeping these observations in mind, we now proceed and ignore the contribution of the resummation function, thereby determining the minumum value of N 0 via Eq. (79). As a general remark, by comparing the top and bottom panels of Fig. 10, we infer that the qq luminosity function results in slightly lower values for N 0 than the gg luminosity function for small τ /ρ. The difference between using the qq and gg luminosity function vanishes for τ /ρ → 1. We have verified that the resulting N 0 value is minimally influenced by the scale choice, with vanishing dependence as τ /ρ → 1.
We have now established that the behavior of the exponent E ij→tth is largely determined by the interplay between the luminosity function and ln(τ /ρ). Next to dropping the dependence on the resummation function to determine the value of N 0 , we may make one further approximation: we assume that we may take ρ close to 1. This approximation is definitely validated for ρ = ρ Q 2 , but might fail for the other threshold definitions. Next to easing the computation of N 0 , this Ratio to full result Ratio to full result approximation has as a further goal that it will estimate the numerical importance of the ρ → 1 approximation of the soft-anomalous dimension matrices, and the correctness of changing the upper limit of the integration in e.g. Eq. (14). Therefore, by using − ln (τ ) + d ln(L ij (N )) dN to determine the value of N 0 for the qq and gg-initiated channels, and comparing the full resummed result with the approximated one, we may check the validity of setting ρ = 1. We then use these values of N 0 to set λ 0 = α s b 0 ln(N 0 ) in the resummation functions. We keep the full N -dependence of the PDFs and still perform the inverse Mellin transform numerically. Doing this, we may fully assess the role of the resummation itself, and determine for which value of N 0 the resummation functions pick up the correct contribution. The result of this may be seen in the left panel of Fig. 11 for ρ Q 2 and ρ abs (for the invariant-mass distribution) and in the right panel for x 2 T,4m 2 t (for the transverse-momentum distribution), using all scale choices. We do not consider the ρ s tt and x 2 T,s tt options, since our goal here is to assess the validity of the ρ abs → 1 and x 2 T,4m 2 t → 1 approximations.
We first consider the invariant-mass distribution. The values ofN 0 that we have found are:N 0,gg = 2.4 andN 0,qq = 1.7 for τ = (2m t + m h ) 2 /S, which increase to aboutN 0,gg = 3.1 andN 0,qq = 2.2 at τ = Q 2 /S with Q = 1 TeV. Given that these values of N 0 are small, we expect that the threshold approximation of the soft-anomalous dimension matrices indeed shows a noticeable effect when compared against the result that uses the full form of the soft-anomalous dimension matrices.
With using ρ abs = (2m t + m h ) 2 /s, we have a generically lower value for τ than with using ρ Q 2 . The highest value for τ is obtained when using ρ Q 2 . By choosing ρ = ρ Q 2 and µ = µ low , we introduce a moderate scale logarithm in the g (2) coefficient of about ln(M 2 /µ 2 low ) = 1.4 at M = Q 2µ low and of ln(M 2 /µ 2 low ) = 2.9 at M = Q = 1 TeV. As observed in Fig. 9, a non-zero positive value for this scale logarithm will improve the agreement between E ij→tth with the g (n) contributions, and that without it. Since we use Eq. (80) to compute the value of N 0 , we therefore expect that the agreement between the full result and the N 0 -approximated result improves as the scale logarithm grows larger. This is precisely the behaviour observed in Fig. 11: the best agreement for ρ = ρ Q 2 is found by setting µ = µ low . We now assess the validity of the ρ → 1 approximation. Since Q 2 s in the z → 1 limit, the validity of the approximation of ρ Q 2 → 1 is directly guaranteed. The story is more subtle for ρ abs . For ρ abs and µ = µ low , the role of the luminosities and the value of τ are unchanged for increasing Q, therefore the value of N 0 is fixed to the same number for all values of Q. The value of the scale logarithm ln(M 2 /µ 2 low ) is fixed around 1.4, so the scale suppression originating from the resummation across all Q values is constant too. If the ρ abs → 1 limit would be valid across all Q values, we would expect to see a roughly flat line for the ratio between the N 0 -approximated result and the full result. This is what is observed for the ρ = ρ Q 2 and µ = µ M option, where the ratio to the full result is around −9% for low Q values, and −7.5% for high Q values. However, we see that the agreement with a roughly flat line gets worse for ρ abs and µ = µ low at higher Q values. This is directly caused by the approximation of setting ρ abs → 1 to calculate N 0 , which is no longer true at large values of Q, as there Q is allowed to deviate significantly from 2m t + m h . Indeed: if one adjusts the value of τ accordingly by letting it scale as τ abs → τ abs /ρ abs Q 2 /S = τ Q 2 to calculate N 0 via Eq. (80), one obtains a deviation from the full result at low Q values of around −4%, while that at large Q values is around −2.5%, following a similar trend as the ρ Q 2 , µ M curve. The same behaviour is reflected in the curve obtained for ρ = ρ abs and µ high . There, the scale logarithm vanishes for all values of Q, so we get a worse estimate of N 0 by ignoring the g (n) functions. In turn, this results in a generic 10% difference between the full result and the N 0 -approximated result, as can be observed for low values of Q. At large Q-values, the N 0 -approximated result deviates significantly from the full result. If the ρ abs → 1 limit would be valid at large Q-values, one would expect a similar behavior of the ρ = ρ abs , µ high and the ρ = ρ Q 2 , µ = µ M results, as the scale logarithm vanishes in both cases. However, we see that at large Q, the two curves deviate significantly, which again shows us the failure of the ρ abs → 1 approximation in the large-Q limit. Note that the difference between the ρ abs , µ = µ low and ρ abs , µ = µ high results is not constant for different values of Q. This is caused by the Q-dependence of the full NLL (not-matched) result for these two choices, which has different behavior in the tail. Unsurprisingly, the worst agreement between the full result and the N 0 -approximated result is found with ρ = ρ abs and µ = µ M at large Q values. Firstly, for this choice, the sign of the scale logarithm in g (2) is reversed, therefore resulting in a larger discrepancy between the full form of E ij→tth , and the one where the resummation functions are neglected. Secondly, we have now established that the ρ abs → 1 limit is not valid at large values of Q. These two facts combined lead to an underestimation of N 0 , from which a large deviation of the N 0 -approximated result from the full result follows. Before concluding, we briefly turn our attention to the transverse-momentum distribution (righthand side of Fig. 11). The values ofN 0 we find are:N 0,gg = 2.45 andN 0,qq = 1.78 at p T = 10 GeV, which grows toN 0,gg = 3.22 andN 0,qq = 2.35 at p T = 410 GeV. Firstly we note that the scale logarithm in g (2) cancels by setting µ = µ M . The option µ = µ H T leads to a roughly constant value of ln(M 2 /µ 2 ) 1.4 − 1.6. Larger values of ln(M 2 /µ 2 ) are found for smaller values of p T , therefore we expect that the N 0 -approximation with µ = µ H T works better for small p T , which is indeed what we observe. For the fixed-scale options of µ = µ high and µ low , we find that the N 0 approximation is better for higher p T values, which is a direct result of the growing value of ln(M 2 /µ 2 ) for those two options. Focusing again on the case where µ = µ M , which is the choice where the scale logarithm cancels, we see that the approximation works slightly less well in this case than for the invariant-mass distributions for the option ρ Q 2 , µ M . This is directly caused by the invalidity of the x 2 T,4m 2 t → 1 approximation. As we have seen in Section 3.2, another option for the threshold parameter that parameterizes the edge of phase space of the p T distribution is x 2 T,st t . By considering where the d 2 σ tth /dp T ds tt distributions peak (not shown here), one infers that this happens for the generic st t value st t 1.5 × 4m 2 t . This generic s tt value does not depend on the value of p T : the d 2 σ tth /dp T ds tt distributions peak roughly at the same value of s tt for different p T values. More than 90% of the dσ tth /dp T distribution is contained within s tt < 4 × 4m 2 t for small p T , and s tt < 10 × 4m 2 t for large p T . This shows that x 2 T,4m 2 t indeed deviates from 1, and this gives a non-negligible contribution to the soft-anomalous dimension matrices (as was the case for the ρ abs threshold parameter). To conclude this section, we briefly comment on the consequences of the failure of the ρ → 1 approximation for either the invariant-mass or transverse-momentum distribution. Firstly, the upper limit of the integral in e.g. Eq. (14) may formally not be set to M 2 , but should remain fixed at Q 2 instead. Therefore, it becomes dependent on the partonic threshold variable in the cases of ρ abs , ρ s tt and x 2 T,4m 2 t . Secondly, the kinematics of the soft-anomalous dimension matrices may not be approximated, since the ρ → 1 limit is simply not obeyed for either large values of Q, or for the entire p T distribution.

Discussion
We have examined the impact of using different threshold variables in NLL threshold resummation for the tth invariant-mass distribution, the transverse-momentum distribution of the Higgs boson when produced in association with a tt-pair, and the total tth cross section. These threshold variables differ in the way they include NLP corrections, which in N -space show up as O(1/N ) contributions. We also have assessed the role of scale variations. An overview of all options that are considered in this work is given in Table 1. To compute our results, we have introduced a novel numerical method that we call the deformation method. This method stabilizes the computation of the notoriously difficult-to-perform inverse Mellin transform. We show that by using this new method, O(10)-times less computation time is needed to compute resummed observables, while gaining a factor of 4 − 5 in numerical accuracy. We believe that our method is helpful to compute resummed distributions in cases where the Mellin transform of the partonic cross section is too involved to obtain analytically. We show that the resummed distributions and the total cross section are stable under the choice of threshold variable, but that the obtained scale uncertainties do vary with this choice. The smallest scale uncertainties are found by parameterizing the threshold boundary in an absolute sense (i.e. by setting ρ = ρ abs ), and not let it depend on the observable of interest (which is what happens for e.g. ρ = ρ Q 2 ). For the invariant-mass distribution, at the scale choice of µ = µ low (see Table 1, page 17 for the definitions), we see that the numerical difference across the different threshold variables is very small. For all definitions we find a decrease of the NLO central contribution for µ = µ low between −1.5% and −2.5%. The resulting distributions for the other two scale choices µ = µ high and µ M show a large dependence on the choice of threshold variable: corrections to the tail of the NLO invariant-mass distribution vary from −1.6% to +17.9%. After averaging over all scale choices, we find that the correction with respect to the averaged NLO invariant-mass distribution is around +4.7% at the peak of the distribution and +12.1% in the tails. For the transverse-momentum distribution we find that the scale choice µ = µ H T leads to a very constant result across the two different threshold choices. Upon averaging over all scale choices, we find a correction with respect to the averaged NLO transverse-momentum distribution of +5.2% at the peak and around +10.5% in the tail of the distribution. Sometimes, resummed results in direct QCD are obtained using an approximation on the kinematics of the soft-anomalous dimension matrices. This approximation is justified by claiming that the resummed results are only valid in the large-N limit (N → ∞). However, we show that this large-N limit does not apply to the kinematics of the soft-anomalous dimension matrices. The large-N limit has to be used for computing the effects of soft-gluon contributions, since we have no control over subleading O(1/N ) results that originate from for example the emissions of next-to-soft gluons and soft quarks. In computing the effects of soft-gluon contributions, the large-N limit leads to the wellknown resummation coefficients g (i) . However, the large-N limit may not be used when performing the inverse-Mellin transform, as there all values of N can be probed. Using a saddle-point argument we show that the average value of N in this inverse-Mellin transform integral is around 1 for the qq-induced channel, and around 2 for the gg-induced channel. This shows that the value of N in the inverse Mellin transform is determined by the interplay between the shapes of the PDFs, and the value of the hadronic threshold variable τ . This means that the N → ∞ limit does not apply to the factor ρ N present in the Mellin transform of the partonic cross section. It follows that the partonic threshold limit of ρ → 1 is not obeyed, and that the simplification of the soft-anomalous dimension matrices using this limit is invalid. The numerical effect of this approximation is different for different observables. For the invariantmass distribution it leads to an overall normalization that is wrong by around +3%. Similarly, the approximation leads to an overestimation of the total cross section of +3% as well. In contrast, the shape of the transverse-momentum distribution is altered, as the correction obtained after approximating the kinematics of the soft-anomalous dimension matrices is not constant for all p T values: at the peak of the distribution the overestimation is around +2.9%, while in the tails this grows to about +6.5%. These differences lie within the scale uncertainties of the results. A natural follow-up of our analysis would be to investigate the effects of NNLL threshold resummation. It is known that NNLL threshold resummation further stabilizes the scale-dependence of the results [12,13], but it does not impact the central values to a great extent. Therefore, we expect that the extension to NNLL does not change our conclusions. Another extension to our work would be to examine the impact of NLP effects on the tth distribution to its full extent. The NLP leadinglogarithmic terms would originate from either next-to-soft gluon emissions, or from the emission of soft quarks, and it would certainly be interesting to study their effect once a resummation framework for these contributions exists. ST/T000864/1).

A Useful definitions for NLL resummation
We use the following definition of the QCD β-function For NLL resummation we only need b 0 and b 1 , defined by [33][34][35][36][37] with T R = 1/2, C A = 3 and C F = 4 3 . The number of active flavors is denoted by n f and is set equal to 5 in this work. The resummation functions of Eq. (15) read with the coefficients A (1),(2) a given by [38] A (1) a = C a , The λ coefficient is defined through λ = α s b 0 lnN with α s ≡ α s (µ 2 R ) andN = e γ E N with N the Mellin moment and γ E the Euler-Mascheroni constant. At lowest order, the soft-anomalous dimension matrix S (0) ij→tth is given by in the bases (c q 1 , c q 8 ) and (c g 1 , c g 8S , c g 8A ) respectively. Denoting q(c i )q(c j ) → t(c t )t(ct)h with c k the color indices belonging to the fundamental representation, we may write the base tensors for the qq channel as where t a c i c j denotes the generator of SU(3) in the fundamental representation, normalized via Tr[t a t b ] = δ ab /2. Repeated indices are summed over. For the g(a i )g(a j ) channel, with a k indicating a color index in the adjoint representation, we may write with f abc the structure constants of SU(3) defined through [t a , t b ] = if abc t c , and d abc the symmetric tensor of SU(3). The color tensors of Eq. (87) and (88) are orthogonal, which is the reason why the soft anomalous dimension matrix is diagonal. The color structure of the hard scattering matrix element needs to be projected onto these bases. For the qq channel this is straightforward since there is only one color channel (t e c j c i t e ctct ). Denoting the matrix element (stripped of the color tensors) as M qq , the lowest-order hard function in the basis (c q 1 , c q 8 ) reads For the gg channel this projection is slightly more involved. There are three color structures, given by The hard scattering matrix element (stripped of the color tensors) in the basis t a i ctk t The hard scattering matrix element h in the basis (c g 1 , c g 8S , c g 8A ) then becomes With this, the lowest-order hard function for gg scattering is with

A.1 The soft-anomalous dimension matrices
The procedure on how to calculate the soft-anomalous dimension matrix is outlined in Ref. [20,39] for tt production and given for tth production in Ref. [11] with full dependence on the 2 → 3 particle kinematics. To write things down in a compact form, it is useful to introduce , With these definitions the one-loop soft-anomalous dimension matrix for the qq channel in the basis with Γ qq 11 = −C F (L tt + 1) , The one-loop soft-anomalous dimension matrix for the gg channel in the basis (c g 1 , c g 8S , c g 8A ) is with Γ gg 11 = −C F (L tt + 1) , For s tt = 4m 2 t , we may write L tt s tt =4m 2 t = lim The function Ω tt vanishes for s tt = 4m 2 t , while Λ tt = ln 1 + p 2 T /(4m 2 t ) − 1 + iπ. Therefore, at s tt = 4m 2 t , we have a diagonal one-loop soft-anomalous dimension matrix with With this, the soft function becomes This result only holds for s tt = 4m 2 t . Corrections to the soft function that involve using the full kinematics of the soft anomalous dimension matrices are of NLP. In that case, we also need to diagonalize the one-loop soft-anomalous dimension matrices. We do this using the matrix R with RR −1 = 1. We may then write (dropping the subscripts) with H (0) In the qq case, the form of R is simple. The eigenvalues of the one-loop soft-anomalous dimension matrix (denoted by λ qq ± ) are given by with Γ qq 12 = C F C A Ω tt and Γ qq 21 = 2Ω tt . The eigenvectors are such that R qq becomes R qq = such that the eigenvalues read such that with With this, the resummation of the soft function for the gg channel becomes with β x the x-component of β, and β 2 = β · β. With this, we may write the four-vectors p t and pt in the partonic CM frame. For the invariants t 1t , t 2t , t 1t , and t 2t , we only need E t , Et, p t,z and pt ,z . These read E t = γE *

C The derivative method
The derivative method was introduced in Ref. [25], and later extended in Ref. [11] to also include a second derivative. This method results in one factor of 1/N for each PDF and each derivative. This factor suppresses Eq. (58), and one may hope that then the numerical integration converges quickly enough. Consider where we have allowed for a possible x min = 0 as the lower limit of the integral, resulting from demanding (x 1 x 2 ρ)/τ ≥ 1. Eq. (117) may be rewritten to The PDFs vanish at the upper limit of the integral (x = 1). For x min = 0, the integrand also vanishes at the lower limit, and we may neglect the boundary term. The lower bound x min is precisely the value of x 1,2 where x 1 x 2 ρ/τ = 1. Therefore, for x min = 0, it is to be expected that the boundary term results in a finite contribution, although it will be suppressed by a factor of 1 N (before taking the inverse Mellin transform). The second factor on the second line of Eq. (118) also has a suppression factor, but this is compensated by the large-N -dependence of the integral.
Since we are interested in the N → ∞ limit, we may hope that its contribution is negligible. It is important that one checks this assumption explicitly for every process, as it cannot be guaranteed from the outset that this assumption holds true. If the contribution is indeed negligible, we may replace where we see that we have introduced one suppression factor of 1 N . We can further manipulate the second factor on the last line of Eq. (118) and introduce a second factor of 1 N via Again, we have no control over what happens to the boundary term. We do not know whether it vanishes at x min , but also the derivative of xf (x, µ 2 F ) has to vanish at x = 1, and it is not clear that this happens for every PDF. If we still press on and assume that the boundary term goes to zero, we arrive at Using this result to perform the inverse Mellin transform as in Eq. (25), we introduce an 1/N 4 suppression factor where τ and ρ are generic threshold variables, and dσ ij (ρ) (dσ(τ )) is a generic partonic (hadronic) cross section or differential distribution. The derivative method has the clear advantage that it is flexible in the PDFs that one can use, as one does not need to assume a specific parameterization. However, considering the assumptions on the boundary terms, there are doubts about whether the method can be trusted. The boundary term is artificially introduced as a method to stabilize the numerical integration (see discussion below Eq. (58)). Therefore, the boundary term should not have been there in the first space as in fact x min = 0, and one has to verify numerically that the boundary term results in a negligible contribution to the final result. But even if the contribution of the boundary term is negligible, we still do not have access to the analytic form of the N -space PDFs. Therefore, one cannot be certain on what the strip of definition actually is, and consequently, what value of C MP can be chosen.