Top quark pair production near threshold: single/double distributions and mass determination

We investigate top quark pair production near the threshold where the pair invariant mass $M_{t\bar{t}}$ approaches $2m_t$, which provides sensitive observables to extract the top quark mass $m_t$. Using the effective field theory methods, we derive a factorization and resummation formula for kinematic distributions in the threshold limit up to the next-to-leading power, which resums higher order Coulomb corrections to all orders in the strong coupling constant. Our formula is similar to those in the literature but differs in several important aspects. We apply our formula to the $M_{t\bar{t}}$ distribution, as well as to the double differential cross section with respect to $M_{t\bar{t}}$ and the rapidity of the $t\bar{t}$ pair. We find that the resummation effects significantly increase the cross sections near the threshold, and lead to predictions better compatible with experimental data than the fixed-order ones. We demonstrate that incorporating resummation effects in the top quark mass determination can shift the extracted value of $m_t$ by as large as 1.4 GeV. The shift is much larger than the estimated uncertainties in previous experimental studies, and leads to a value of the top quark pole mass more consistent with the current world average.


Introduction
The top quark is the heaviest elementary particle in the Standard Model (SM). Its large mass plays important roles in many frontiers of particle physics. In the SM, the top quark mass m t comes exclusively from the O(1) Yukawa coupling between the top quark and the Higgs field. Therefore, the top quark is believed to be crucial to understand the electroweak symmetry breaking and properties of the Higgs sector. For example, the stability of the electroweak vacuum is quite sensitive to the top quark mass. The same is true for the fine-tuning of the Higgs boson mass and the indirect constraints on new physics beyond the SM. Consequently, precise measurement of the top quark mass is a highly important quest of the Large Hadron Collider (LHC) and future high energy colliders.
Traditionally, the top quark mass is measured by reconstructing the top quark from its decay products, and fitting the resulting invariant mass distribution against that generated by Monte Carlo (MC) event generators. Such a mass is often referred to as the "MC mass". Thanks to the large amount of data collected by the ATLAS and CMS detectors at the LHC, the precision for the measured MC mass has been greatly improved in recent years. The current world average for the MC mass is given by m MC t = 172.9 ± 0.4 GeV [1]. Despite the high precision of the experimental result, it turns out to be difficult to relate the MC mass to a well-defined mass parameter in the Lagrangian of the associated quantum field theory with a certain renormalization scheme (see, e.g., Refs. [2,3]). The difficulties are mostly related to the fact that top quarks (and their decay products) are strongly-interacting particles who may radiate additional gluons and quarks which end up as hadrons in the detectors. These effects are described approximately by parton shower algorithms and hadronization models in MC event generators. Both the perturbative and non-perturbative aspects of the generators need to be carefully studied in order to relate the MC mass to a field-theoretic mass. There have been ongoing researches on these issues [4][5][6][7], but no final quantitative conclusion has been reached.
Instead of measuring the MC mass from the decay products of the top quark, it is possible to directly extract a Lagrangian mass by comparing experimental measurements and theoretical predictions for certain observables (e.g., total or differential cross sections of scattering processes involving the top quark). For that purpose, not only the experimental measurements, but also the theoretical predictions for these observables have to achieve rather high accuracies in order to extract a relatively precise value of the top quark mass. Such theoretical predictions necessarily involve higher order perturbative corrections. In their calculations ultraviolet (UV) divergences appear at intermediate steps and one has to adopt a renormalization scheme to arrive at finite predictions. The definition of the Lagrangian mass therefore depends on the renormalization scheme. In practice, one often employs the on-shell scheme or the modified minimal subtraction (MS) scheme. In the on-shell scheme, one defines the so-called "pole mass" of the top quark in perturbation theory. 1 This is the most widely used mass scheme in perturbative calculations for top quark related scattering processes, and we will only discuss this mass definition in the current work. The current world average for the top quark pole mass, extracted from cross section measurements, is given by m pole t = 173.1 ± 0.9 GeV [1]. The value of the extracted pole mass is rather close to the MC mass, and their exact relationship is an important question to be addressed [4][5][6][7].
Following the above discussions, it is clear that to extract the top quark mass, one needs to use observables that are strongly dependent on m t , and in the same time can be experimentally measured and theoretically calculated with high precisions. An often used observable is the tt pair invariant-mass distribution and related multi-differential cross sections in the top quark pair production process [13,14]. It can be easily anticipated that the kinematic region most sensitive to m t is where the pair invariant mass M tt is near the 2m t threshold. Precision theoretical predictions for this observable, especially in the threshold region, are therefore highly demanded to achieve the goal of extracting the top quark mass. A closely related observable ρ s (and similar ones) in tt + jet production was employed in [15][16][17][18][19], where ρ s is defined as where m 0 is an arbitrarily chosen scale of the order of m t , and s ttj is the invariant mass of the top quark, the anti-top quark and the additional jet. It was shown in [15] that the region most sensitive to m t is where ρ s is near its maximal value. In that region, the tt invariant mass M tt is pushed to the 2m t threshold. Consequently, understanding the threshold behavior of M tt is crucial also when using the ρ s variable to extract the top quark mass.
In this work, we will investigate the M tt distribution in top quark pair production, especially its behavior in the threshold region. The tt + jet production process will be studied in a forthcoming article. In perturbation theory, the differential cross section receives corrections from both strong and electroweak (EW) interactions. One can therefore organize the theoretical result as a double series in the strong coupling constant α s and the fine-structure constant α. We will mainly be concerned with strong-interaction contributions described by quantum chromodynamics (QCD). It is possible to incorporate EW effects in the future in a similar way as in [20][21][22]. In QCD, the current benchmark of fixed-order calculations is at the level of next-to-next-to-leading order (NNLO) [23][24][25][26][27][28][29][30][31]. Upon the NNLO result, all-order resummation of soft logarithms [32][33][34] combined with resummation of small-mass logarithms [35][36][37] up to the NNLL accuracy can be added which improves the theoretical precision, particularly in the high M tt (a.k.a. boosted) region. This results in the state-of-the-art QCD prediction at NNLO+NNLL [38].
The high precision theoretical predictions are compared to the experimental measurements by the ATLAS and CMS collaborations at the 13 TeV LHC in, e.g., Refs. [39][40][41][42][43]. Overall excellent agreement between theory and data is found in almost all phase space regions. However, there exists an interesting discrepancy in the threshold region of the M tt distribution found in both the lepton+jets and di-lepton data of the CMS experiment [39,41]. To see that more clearly, we show in Fig. 1 the CMS result in the di-lepton channel [39] for the averaged M tt distribution in the [300, 380] GeV range, where the green band reflects the combined statistical and systematical uncertainty of the experimental measurement. The central values of various theoretical predictions (NNLO from [29,31], NNLO+EW from [22], and NNLO+NNLL from [38]) are shown in comparison. It can be seen that there exists a clear gap between the experimental and theoretical results. While this is just a small discrepancy in a vast collection of observables which is normally not very important, the threshold region of the M tt distribution is somewhat special since it is strongly sensitive to the top quark mass. This can be easily observed from Fig. 1, where we have shown theoretical predictions using two values of m t : 172.5 GeV (blue points) and 173.3 GeV (red points). Therefore, this small discrepancy has profound implications on the top quark mass measurement. As a matter of fact, such a measurement using the data of [39] has already been performed in [14]. It is found that the extracted top quark pole mass is around 171 GeV (with an uncertainty of about 0.7 GeV), which is significantly smaller than the current world average m pole t = 173.1 ± 0.9 GeV and m MC t = 172.9 ± 0.4 GeV. The main driving force towards the lower value is exactly the mismatch between theory and data in the threshold region M tt ∼ 2m t . Note that Ref. [14] has only used an integrated luminosity of 35.9 fb −1 compared to the full LHC Run 2 dataset of 150 fb −1 . The Run 3 of the LHC will further collect much more data in the near future. With the large amount of tt events, future extractions of the top quark mass will have much smaller experimental uncertainties. One should therefore take this discrepancy seriously if it persists in the future.
It is known that in the threshold region M tt ∼ 2m t , there is a class of higher-order contributions not included in the current state-of-the-art QCD predictions of Refs. [29,36,38]. They are of the form α n s /β m where β ≡ 1 − 4m 2 t /M 2 tt is the speed of the top quark in the tt rest frame. In the threshold region where the top and anti-top quarks are slowly moving with respect to each other, one has β ∼ 0, and the α n s /β m contributions are enhanced. These corrections arise from exchanges of Coulomb-like gluons, and can be systematically resummed to all orders in α s [44][45][46][47]. A physical effect of this resummation is that the value of M tt can be lower than the 2m t threshold, due to bound-state effects caused by the virtual gluon exchanges. In Ref. [14], the authors use the result of [47] to estimate that these higher-order corrections will lead to a shift of +0.7 GeV to the extracted m t , which is of the similar size of the total experimental uncertainty. However, there are a few concerns which may invalidate the direct application of the result of [47]. First of all, Ref. [47] only gives numeric results for M tt ≥ 335 GeV which does not fully cover the range M tt ≥ 300 GeV used in the experimental analysis. While the contributions below 335 GeV may not be very important, it is best to be clarified quantitatively. 2 Secondly, the prediction of Ref. [47] (as well as the first bin of the experimental data) extends to M tt = 380 GeV, where β ≈ 0.4 is not so small. One should therefore carefully treat the subleading-power contributions in β in order not to introduce unrealistic corrections into the theoretical prediction. Last but not least, on top of the small-β threshold limit, Ref. [47] also considers the "soft" limit z ≡ M 2 tt /ŝ → 1, where √ŝ is the center-of-mass energy of initial-state partons in the hard scattering. 3 Given the high energy (13 TeV) of the LHC compared to 2m t ≈ 345 GeV, it is necessary to assess the validity of the z-soft limit in the current context.
The goal of this paper is two-fold. Firstly, we reexamine the three points raised above. Our main findings can be summarized as following: 1) The contribution from the region M tt ∈ [300,335] GeV is about 4% of the integrated cross section in the bin [300,380] GeV, which is non-negligible for current and future high precision measurements; 2) It is necessary to modify the resummation formula to take into account the subleading power corrections such that the formula is valid up to M tt ∼ 380 GeV; 3) The soft limit z → 1 does not provide a reasonable approximation for the kinematic region of interest, therefore soft resummation should either not be performed, or be applied very carefully. The second goal of this paper is to combine the Coulomb resummation with the NNLO results of [29,31], to achieve the best prediction in the threshold region, and to extend the prediction to higher M tt values. For that purpose, we need to modify the factorization formula of [44,45,47] to deal with the dynamic renormalization and factorization scales used in the NNLO calculation. We also need to calculate a new hard function with kinematic dependence which is an essential ingredient in our factorization formula. Note that some of the results in this work have already been presented in [48]. This paper aims at a more thorough analysis with more technique details and more phenomenological results and discussions. This paper is organized as follows. In Section 2 we discuss the fixed-order QCD corrections for the M tt distribution and derive the factorization and resummation formula relevant in the threshold region. In Section 3 we calculate the hard function which is an essential ingredient in the factorization formula. We then use these analytic results to perform numeric calculations and present the phenomenological results in Section 4. We summarize in Section 5 and give additional details in the Appendices.
2 Fixed-order results and factorization

Fixed-order results
In this work we consider the hadronic process

1)
2 Note that the shape of the distribution below 2mt threshold strongly depends on the decay width Γt of the top quark. 3 Later we will also study the behaviors of soft gluons in the β → 0 limit. To avoid confusion, we will refer to the z → 1 soft limit as the z-soft limit, and refer to the β → 0 soft limit as the β-soft limit, respectively.
where h 1 and h 2 are two incoming hadrons, while X h denotes all final-state particles except the top quark and the anti-top quark. We are mainly interested in the invariant mass of the tt pair, which is defined as In QCD factorization [49], the invariant-mass distribution can be written as a convolution of partonic differential cross sections and non-perturbative parton luminosity functions: where i, j ∈ {q,q, g} denote partons within the colliding hadrons; z ≡ M 2 tt /ŝ, τ ≡ M 2 tt /s, with √ s and √ŝ being the hadronic and partonic center-of-mass energies, respectively; and µ f is the factorization scale. The symbol Θ denotes a collection of extra kinematic variables (other than m t and M tt ) upon which µ f may depend. The functions ff ij (y, µ f ) are the parton luminosity functions defined by where f i/h is the parton distribution function (PDF) of the parton i in the hadron h. They are non-perturbative objects which can be extracted from experimental data, and can be obtained using, e.g., the program package LHAPDF [50]. The partonic differential cross sections can be calculated in perturbation theory. In this work, we are concerned with QCD corrections to this quantity. At the leading order (LO) in the strong coupling constant α s , only the qq and gg channels give non-vanishing contributions where µ r is the renormalization scale, θ t is the scattering angle of the top quark in the tt rest frame (which coincides with the partonic center-of-mass frame at LO). The coefficient functions c ij,α , with α = 1, 8 labelling the color configuration of the tt system, are given by Plugging Eq. (2.5) into Eq. (2.3), we obtain the LO hadronic differential cross sections (2.8) At the next-to-leading order (NLO) and the next-to-next-to-leading order (NNLO) in QCD, there are no analytic formulas for the partonic differential cross sections, and one relies on numeric methods to perform the phase-space integrals as well as loop integrals (at NNLO). The NLO results were calculated in [51][52][53], and can be obtained using the program package MCFM [54]. The NNLO results were calculated in [23][24][25][26][27][28][29][30][31], and we obtain the invariant-mass distribution from [29,31,55,56].

Factorization near threshold
In the threshold region M tt ∼ 2m t , higher order QCD corrections are enhanced by contributions of the form (α s /β) n as well as α n s ln m β, which arise from exchanges of Coulomb-type gluons and soft gluons between the top and anti-top quarks. Using the method of regions, we identify the following relevant momentum regions in the tt rest frame: collinear: k µ = (n i · k, n i · k, k ⊥ ) ∼ M tt (1, β 2 , β) . (2.9) Note that later we will also consider the ultrasoft region in the z → 1 limit, i.e., the z-soft limit introduced in footnote 3 on page 5. That should not be confused with the β-soft limit here. In the last equation above, the light-like 4-vector n µ i is along the momentum of each massless energetic parton in the initial and final states. The light-like 4-vectorsn µ i satisfy n i ·n i = 2. Later we will show that the collinear modes are irrelevant at the order considered in this work. We nevertheless list them here for completeness.
The momentum modes in Eq. (2.9) can be described in the language of effective field theories (EFTs). The relevant EFT is potential non-relativistic QCD (pNRQCD) [73][74][75][76], possibly supplemented by soft-collinear effective theory (SCET) [77][78][79][80][81]. The EFT of pNRQCD describes interactions among potential, soft and ultrasoft fields, while SCET describes interactions among ultrasoft and collinear fields. Both theories admit a power expansion in the small parameter β 1. In this work, we will consider the power expansion up to the next-to-leading power (NLP). In order to resum the (α s /β) n terms up to all orders in α s , pNRQCD adopts an additional power counting α s ∼ β, such that all (α s /β) n terms are O(1) and are incorporated already at the leading power (LP).
We begin with the partonic differential cross section with respect to M 2 where the summation over final-state polarization and color indices and the average over initial-state ones are understood. In the tt rest frame, the momenta of the top and anti-top quarks are given by where v µ = (1, 0, 0, 0) and the relative momentum q µ behaves as the potential mode in Eq. (2.9). The extra radiations X are generically counted as the hard mode in our setup, since we count 1 − z = 1 − M 2 tt /ŝ as an O(1) quantity. In other words, we do not consider the limit z → 1 besides the threshold limit β → 0. The reason will be clear later.
In the β → 0 limit, the scattering amplitude in Eq. (2.10) can be described in pNRQCD up to the NLP as ij,X (p 1 , p 2 , P tt , P X ) t a 1t a 2 |ψ † χ|0 , (2.12) where the fields ψ and χ are heavy quark fields in pNRQCD describing the top and antitop quarks, respectively; and C a 1 a 2 ij,X are Wilson coefficients which encode fluctuations at the hard scale M tt . They receive contributions from both virtual exchanges and real emissions of hard gluons. They depend on total momentum of the tt pair as well as the momenta of other external particles. They also depend on the color indices of the external particles, in particular, the color indices a 1 and a 2 of the top and anti-top quarks, which are contracted with the corresponding indices of the operator matrix elements in Eq. (2.12). The squared amplitude in Eq. (2.10) can then be expressed as where the summation over polarization and color indices are understood, and the 1/N ij factor takes into account the average over initial states. The contraction of color indices in Eq. (2.13) can be simplified by inserting a complete set of orthonormal color projectors P α {a} given by P 1 a 1 a 2 a 3 a 4 = 1 3 δ a 1 a 2 δ a 3 a 4 , P 8 a 1 a 2 a 3 a 4 = 2T c a 1 a 2 T c a 4 a 3 , (2.14) where α = 1, 8 denote the singlet and octet color configurations of the tt pair. We can now define the hard functions as where Q T and Y are the transverse momentum and the rapidity of the tt pair in the initial-state center-of-mass frame, respectively. The reason for keeping their dependence in the hard functions will be clear later. The hard functions can be calculated in perturbation theory, where both ultraviolet (UV) and infrared (IR) divergences appear. The UV divergences are removed via renormalization. Part of the IR divergences cancels when adding virtual and real contributions, while the remaining collinear divergences are absorbed into the PDFs. After these procedures, the hard functions develop dependencies on the renormalization scale µ r and the factorization scale µ f . Plugging Eqs. (2.13) and (2.15) into Eq. (2.10), we find that the remaining integrals are over p t and pt, or equivalently, over the potential-scaling relative momentum q µ as given in Eq. (2.11). We can then define a potential function describing fluctuations of the potential, soft and ultrasoft modes as where E ≡ M tt − 2m t represents the residue kinetic energy of the top and anti-top quarks in the tt rest frame. The partonic differential cross section can then be written in the factorized form up to the NLP: where the coefficient functions c ij,α are included such that the leading order expansion of the factorization formula coincides with the exact results in Eq. (2.5). They are given in Eq. (2.6) for (ij, α) = (qq, 8), (gg, 1), (gg, 8), and we choose c ij,α = 1 for all other cases. The kinematic variables contained in Θ include Q 2 T , Y , as well as θ t and φ t being the scattering angle and the azimuthal angle of the top quark in the tt rest frame.
The formula (2.17) holds for rather generic choices of µ r and µ f . Near the threshold M tt ∼ 2m t , it is reasonable to associate the scales to either m t or M tt . On the other hand, we have in mind that our results can be extended to a much larger range of M tt through a combination with fixed-order calculations [23][24][25][26][27][28][29][30][31] and with soft-gluon resummation calculations [34][35][36][37]. We will therefore also consider the scale choices adopted by those calculations, where the scales are correlated with the variable where p T,t and p T,t are the transverse momenta of the top and anti-top quarks in the initialstate center-of-mass frame. The variable H T is a (complicated) function of M tt , θ t , φ t , Q T and Y . This is essentially the reason why we need to keep these variables unintegrated in Eq. (2.17), as collected in the symbol Θ.

Absence of additional structures up to NLP
At this point, it is worthwhile to briefly discuss the derivation of the factorization formula (2.12). Such a factorization is straightforward if one could count all parton exchanges and radiations (except those within the tt system) as hard. In this case the only EFT required to describe the process is pNRQCD, and hence the standard matching formula (2.12). On the other hand, IR divergences appearing at higher orders in perturbation theory may spoil this simple assumption. If that happens, one will need to utilize other EFTs such as the SCET to describe, e.g., the collinear modes, and introduce new structures into the factorization formula. In the following, we will show that such new structures are not required at LP and NLP. Besides the dynamics described by pNRQCD, the remaining IR divergences arise from soft and/or collinear interactions. The strategy we are going to take is then to use SCET (combined with pNRQCD) to analyze the behavior of the differential cross section in those limits. At LP in β, the interactions of ultrasoft gluons with initial-state and final-state partons are both governed by the eikonal approximation. The interactions among collinear fields are the same as in the full QCD. The cancellation of soft divergences and final-state collinear divergences therefore follows similarly as the KLN theorem [82,83]. The remaining initial-state collinear divergences can be absorbed into the PDFs through factorization [49]. Note that the above discussions apply to arbitrary orders in α s at LP in β. We will explicitly demonstrate these cancellations through the calculation of the NLO hard functions in the next section.
Using the EFT language, the ultrasoft and collinear interactions are described by the LP Lagrangians of SCET and pNRQCD, written as where n µ takes each of the light-like 4-vectors n µ i along initial-state and final-state massless energetic partons; ξ n is the collinear quark field along the n direction; ψ and χ are Pauli spinor fields annihilating the top quark and creating the anti-top quark, respectively; A n (in the covariant derivative D n ) and A us represent the collinear and ultrasoft gluon fields, with F µν n(us) their field strength tensors. The ultrasoft eikonal interactions are manifest in the n · A us terms in the above Lagrangians. One can perform the field redefinitions [79,84] such that these interactions do not appear explicitly in the LP Lagrangians, where S v (x) and S q n (x) are ultrasoft Wilson lines in the fundamental representation along the directions implied by the subscripts, while S g n (x) are ultrasoft Wilson lines in the adjoint representation. These interactions reappear in the effective operators describing the tt production process.
The partonic differential cross sections can then be decomposed into a hard sector (containing Wilson coefficients from matching the full QCD to the EFT), a potential sector (containing top and anti-top quarks as well as potential and soft modes), an ultrasoft sector (containing the ultrasoft Wilson lines), and several collinear sectors (containing the collinear fields along each of the incoming and outgoing energetic partons). Within each sector, one needs to perform the well-known multipole expansion [80,81] to have a uniform power counting in β. However, the only physical scale which may enter the ultrasoft sector and the collinear sectors is given by the residue momentum p 1 + p 2 − P tt , which is counted as hard in our approach. As a result, the loop and phase-space integrals in the ultrasoft sector and the collinear sectors become scaleless and vanish in dimensional regularization. This effectively means that we do not need to consider them at LP in β to start with, and hence the differential cross sections are factorized as in Eq. (2.17).
At NLP in β, we need to consider the subleading Lagrangians of pNRQCD and SCET, as well as the subleading effective operators relevant for the process. The NLP pNRQCD Lagrangians are given by [75,76,85] where E i us = F i0 us are the chromoelectric components of the ultrasoft field strength tensor. The coefficient a 1 was calculated in [86,87] and is given by where W n is the collinear Wilson line and q us is the ultrasoft quark field. It can be shown that single insertions of L 1a pNRQCD give rise to vanishing results due to angular momentum conservation [84,88,89], while L 1c SCET does not contribute due to baryon number conservation. The terms in L 1b pNRQCD involve subleading potentials between the top and anti-top quarks. These contributions can be incorporated by upgrading the potential function J α (E) to the NLO, which we will discuss in the next subsection.
Besides the NLP Lagrangians which describe the low-energy interactions in the EFTs, we also need to consider the NLP effective operators describing the hard scattering processes. These are constructed out of gauge-invariant building blocks of pNRQCD and SCET fields, with the overall power counting of order β 1 (the LP operators are of order β 0 ). This extra power of β comes either from the collinear fields or from the fields in the potential sector. Note that the ultrasoft mode scales as β 2 and therefore cannot provide a single power of β. The new operators from the potential sector may lead to new potential functions in addition to the LP one in Eq. (2.16). For example, there could be contributions from matrix elements of the form 0|χ † ψ|t a 3t a 4 t a 1t a 2 |ψ † ∂χ|0 .
( 2.27) However, such terms have an odd parity and always lead to a vanishing result when integrating over the phase space as in Eq. (2.16). For the NLP operators in the collinear sector, and for the single insertions of L 1a,1b SCET , the situation is quite similar. Only the transverse component of a collinear momentum or a collinear gluon field can give rise to an order β 1 contribution. In the NLP collinear functions (beam or jet functions), one therefore generically encounters integrals similar as where Ξ n represents gauge-invariant building blocks of collinear fields, and ∂ µ ⊥ might be replaced by x µ ⊥ or A µ ⊥ . Note that at NLP, such dependence on the transverse component can only appear once. This kind of contributions either vanish trivially, or vanish after phase-space integration. We therefore conclude that the factorization formula (2.17) is not modified by NLP contributions, except that the potential function J α (E) should be calculated up to order β.

The perturbative ingredients and resummation at NLP
The hard functions H ij,α can be expanded in powers of the strong coupling α s : Due to soft and collinear divergences, H ij,α are singular (in terms of distributions) in both the limits z → 1 and Q T → 0. We work in dimensional regularization with the spacetime The LO hard functions are simply given by where we have kept the dependence on which is needed for renormalization. The NLO hard functions are much more complicated, and serve as one of the major new ingredients of this work. We will discuss their calculation in the next section.
We now turn to the potential function J α (E), which can be related to the imaginary part of the pNRQCD Green function G α ( r 1 , r 2 ; E) of the tt pair at origin [84]: (2.31) Up to the NLP, the potential function can be written as The Green function can be obtained by solving a differential equation [75,76,90,91]. It depends on an additional (hard) scale other than E, which is usually chosen as m t . It is equally well to write the Green function in terms of M tt and E, which corresponds to a reorganization of the power expansion in β. Since M tt = 2m t (1 + O(β 2 )), at NLP it is sufficient to simply replace m t → M tt /2. We can then write the Green function as Here From the form of the logarithm, it appears that the natural choice of the potential scale µ J is √ 2M tt E. However, as E approaches zero, this scale enters the non-perturbative regime. We therefore follow the prescription in [84,89] to set a lower bound µ cut J for the potential scale. It is set to be the solution to the equation µ cut J = C F m t α s (µ cut J ), with a numeric value µ cut J ≈ 32 GeV. Finally, when E is small, the top quark width effect becomes important. To deal with that we replace E → E + iΓ t , where Γ t ≈ 1.4 GeV.
Combining the hard functions and the potential functions and convoluting with the parton luminosities, we define the NLP resummed hadronic differential cross section as In Eq. (2.35), the integration domain of Q T and Y is determined by It is evident that in the limit z → 1, whereŝ → M 2 tt , both Q T and Y must approach zero. In practice, it is often useful to have the perturbative expansion of the NLP kernel for where the coefficients for the first few orders are given by We note that 2E/M tt = β + O(β 3 ), and the above expansion makes the 1/β corrections explicit.
We still need to specify how to perform the integrations in Eq. (2.35), and how to compute the variable H T in Eq. (2.18). These are in general quite complicated, but are simplified at NLP, where the extra radiation X satisfies M 2 X = 0. In this case the transverse momenta of the top and anti-top quarks can be written as It is then straightforward to compute the variable H T which enters the scales µ r and µ f . The integrals in Eq. (2.35) can now be performed numerically. The only subtlety is that the NLP kernel K NLP ij,α contains singular distributions involving z, Q T and Y , which arise from the NLO hard functions to be discussed in the next section.

Matching with fixed-order results
The resummed result of Eq. (2.35) contains contributions enhanced by 1/β or ln β to all orders in α s at the NLP accuracy. It is possible to add back the β-power suppressed contributions at NLO and NNLO to achieve a more precise prediction through a matching procedure. This is straightforward given the fixed-order expansion Eq. (2.38) of the resummation formula. We define the n k LO differential cross sections (with k = 0, 1, 2, . . .) as Note that the n 0 LO cross section is exactly the same as the LO cross section (2.8) with our choice of normalization in the resummation formula, while the n k LO cross sections provide approximations to the exact N k LO results (with N 1 LO ≡ NLO and N 2 LO ≡ NNLO). The validity of these approximations is very important for applying the resummation, which we will study numerically in Section 4. At the moment, we just note that the difference contains β-power suppressed contributions beyond NLP at N k LO, which are exactly what we would like to incorporate through the matching procedure. The matching formula is then simply given by where nLO ≡ n 1 LO and nnLO ≡ n 2 LO as defined in Eq. (2.42). The matched results at NLO+NLP and NNLO+NLP precisions are then our main results in this paper, based on which we will present our best predictions in Section 4. Before going into that, we first perform the calculation of the hard functions at NLO in the next section.

The hard functions at NLO
In this section, we discuss the calculation of the NLO hard functions, which were not available in the literature. The hard functions receive contributions from both virtual gluon exchanges and real emission subprocesses. We first consider one-loop virtual corrections where no extra radiation is present. As a result they must be proportional to the tree-level results in Eq. (2.30). We generate the one-loop amplitudes using FeynArts [92], manipulate them with FeynCalc [93][94][95], and reduce the relevant integrals to a set of master integrals using Reduze2 [96]. The calculation of the master integrals is straightforward and we collect the results in Appendix A. Supplemented with the trivial one-body phase space integral, the bare virtual contributions to the NLO hard functions can be written as where L M = ln(µ 2 r /M 2 tt ). Note that we have put in the numerical values of the color factors C F = 4/3 and C A = N c = 3 here and below for simplicity. The above results contain both UV and IR divergences. The UV ones are removed by renormalization. We renormalize the fields and the top quark mass in the on-shell scheme, and renormalize the strong coupling in the MS scheme with the top quark integrated out and N l = 5 active flavors. We collect the relevant renormalization constants in Appendix A. After renormalization, we get the UV-finite virtual contributions as follows: We now turn to the real emission subprocesses The sum over X in the definition Eq. (2.15) of the hard function now involves integrating over the momentum k. This leads to the two-body phase-space integral At NLO, the kinematic variables either do not appear in the Wilson coefficient in Eq. (2.15), or are fixed by the delta functions in Eq. (3.4). Therefore the whole integral can be carried out which leads to where Y max is a function of Q T and z defined in Eq. (2.37), satisfying where again Q T,max is defined in Eq. (2.37) as a function of z. Later we will often invoke the value of Y max at Q T = 0. It therefore deserves a separate symbol which we write as which satisfies The Wilson coefficients in the definition (2.15) of the hard functions are divergent in the limits z → 1 and Q T → 0 which correspond to soft and collinear singularities. These singularities are regularized in dimensional regularization by the factor of Q −2 T appearing in Eq. (3.5). In practice, it is useful to write One can then perform the expansion in using where the plus-distributions and star-distributions satisfy for some test functions f (z) and f (Q 2 T ). It will be convenient to introduce the scattering angle θ of the tt pair in the partonic center-of-mass frame. It satisfies the relations The inverse relation reads Using the delta functions in Eq. (3.5), it can be further expressed as It should be stressed that while there is a factor of 1 − z in the denominator above, the value of y is well-defined in the limit z → 1. In fact, it is easy to see from Eq. (3.8) that where the sign depends on the sign of Y = ±Y max . We further introduce a few abbreviations to shorten the expressions: The reason to include a factor of 1 − z in the last equation is that the combination of δ Y max and (1/Q 2 T ) * will produce a singularity as z → 1 upon integration over Y and Q 2 T . This can be easily seen from the integral This singularity has to be cancelled by a corresponding factor of 1 − z in the numerator, and we therefore include that factor explicitly here. This will help to identify the leading singular terms in the z → 1 limit later. The results of the hard functions will also involve the one-loop splitting functions given by We can now write the real emission contributions as Combining the virtual contributions in Eq. (3.2) and the real contributions in Eq. (3.19), the soft divergences cancel according to the KLN theorem. However, there are still collinear divergences remaining. These divergences must be absorbed into the PDFs, which is equivalent to adding the following counter-terms (3.20) Finally, we obtain the UV and IR finite NLO hard functions: where L f = ln(µ 2 f /M 2 tt ). The above expressions, when integrated over Q T and Y as in Eq. (2.35), can be rewritten in terms of integrations over y. Namely, we may defineH ij,α as functions of y which satisfy for a test function f (Q 2 T , Y ), where on the right side one should understand that Q T and Y are determined by y and z through Eq. (3.12). It is straightforward to obtainH ij,α from the expressions of H ij,α , Eq. (2.30) and (3.21), by the following replacements: To illustrate the idea, we give the results for the qq channel: The results for the NLO hard functions serve as an important ingredient in the factorization formula at NLP. Combining them with the other ingredients, we are now ready to perform various numerical analyses, which is the main topic of the next section.

Numerical results and discussions
In this section, we use our resummation formula to carry out several numerical studies and present phenomenologically relevant results. We will discuss in more detail the three points raised in the Introduction concerning the difference between our result and the result of Ref. [47]. Throughout this section we take Γ t = 1.4 GeV, use the NNPDF3.1 NNLO PDFs [97] with α s (m Z ) = 0.118, and set the renormalization scale µ r to be the same as the factorization scale µ f . The default scale is chosen to be H T /4, if not otherwise stated. To estimate the scale uncertainties of the differential cross sections, the two scales are varied simultaneously up and down by a factor of 2.

Validity of the threshold approximation
Any factorization and resummation formula is only valid in kinematic regions where higher order power corrections are small compared to the required accuracy. It is therefore necessary to check the validity of the relevant approximation in the region of interest before performing the resummation. One way to do that is to compare the fixed-order expansion of the resummation formula against the exact perturbative results. In the region of validity, the expansion should provide reasonable approximations to the exact results order-by-order.
In this subsection we carry out the validity check of our resummation formula in the region 300 GeV ≤ M tt ≤ 380 GeV at the 13 TeV LHC. This is straightforward since we already have the fixed-order expansion of the resummation formula in Eq. (2.42). We just need to check whether the n k LO results are good approximations to the exact N k LO ones. We first note that due to our normalization of the factorization formula Eq. (2.17), the n 0 LO result (i.e., the first term in the fixed-order expansion) is precisely the same as the exact LO one in Eq. (2.8). The factorization formula of Ref. [47], on the other hand, has a different normalization than ours. Consequently, the first term of their expansion would not be the same as the exact LO. The difference, of course, is formally power-suppressed in β, but it has significant impact on the validity of the formula when β is not so small, e.g., when M tt ∼ 380 GeV.
We now proceed to perform the comparison at NLO. We show the exact NLO M tt distribution in the range [340-380] GeV in the left plot of Fig. 2 as the red band, while the nLO one from the expansion (labelled as "NLO β → 0") is shown as the blue shaded band. It can be clearly seen that the nLO result provides an excellent approximation to the exact NLO one in the whole range, including scale variations. Since both the NLO and nLO results include the common LO term, it is interesting to compare just the corrections (i.e, the second term in the perturbative series). We show this comparison in the right plot of Fig. 2. Again, the agreement is remarkable. The plot also shows clearly that the deviation between the two results gradually increases from small β to larger β, but remains under-control even when M tt is as large as 380 GeV. The agreement we just observed is a strong implication for the validity of the resummation formula Eq. (2.17) in the region of interest. We emphasize again that such an agreement is only possible due to the fact that we have correctly taken into account the subleading-power contributions in β at LO in α s . If we had used a different normalization factor, the agreement at the upper edge of the region of interest would not be as good.
At this point, it is worthwhile to discuss the z-soft limit where z ≡ M 2 tt /ŝ → 1. Such a limit in the context of the M tt distribution has been extensively studied in the literature [32][33][34]. By taking this limit it is possible to resum logarithms of 1 − z to all orders in α s , at the price that power corrections in 1 − z are neglected. As such, it can be expected that this limit works better at larger values of M tt than the threshold region. Furthermore, Ref. [47] employed the double limit β → 0 and z → 1, which neglects power corrections in both β and 1 − z. Given the high collision energy of the LHC compared to the values of M tt we are considering (hence z is not necessarily close to 1), and the fact that β is not so small at M tt ∼ 380 GeV, one must carefully check the validity of such a double approximation in the region of interest.
The NLO result in the z → 1 limit can be obtained from [34]. The result in the double limit β → 0 and z → 1 can be obtained from our formula Eq. (2.17) by further taking z → 1. This amounts to keeping only the singular plus-and delta-distributions in the hard functions, which is straightforward given their expressions in Eq. (3.21). In this limit, only the flavor-diagonal channels (i.e., the qq and gg channels) contribute. We collect the relevant analytic expressions in Appendix B, and show the numeric results in plot, we compare the exact NLO corrections with that in the z-soft limit z → 1. We see that although the agreement is not so good (as expected), the z-soft limit still captures a dominant portion of the NLO corrections. This is a justification for the application of the soft gluon resummation to this region as in [34,36,38]. On the other hand, the NLO corrections in the double limit β → 0 and z → 1 are shown in the right plot of Fig. 3. It is obvious that the double limit does not provide a reasonable approximation at all. Therefore, the factorization formula valid in the double limit cannot be applied to the region we are considering. Although such a factorization formula can be used to resum certain logarithmic terms to all orders in α s , they are not the dominant contributions and such a resummation may even lead to incorrect estimation of higher order corrections. In other words, the power corrections in 1 − z are not under-control in this situation and consequently the results cannot be trusted. Based on the above observations, we do not perform the z-soft gluon resummation in the β → 0 limit in our work, in contrast to [47].

NLP Resummation at 13 TeV LHC
Given the perfect agreement between the approximate (β → 0) and exact results at NLO, we will apply the small-β resummation at NLP to the range 300 GeV ≤ M tt ≤ 380 GeV at the 13 TeV LHC. Our starting point is the matching formula Eq. (2.44), which combines the all-order resummation with the fixed-order results at NLO or NNLO. We will compare our numeric predictions with the experimental data [39], and therefore we use m t = 172.5 GeV in accordance. In this subsection and the subsequent ones, whenever we present numeric results for a broader range of M tt , it should always be understood that the resummation is only applied to M tt ≤ 380 GeV. We have checked that the results are insensitive to the exact point at which resummation is switched off. This should be clear from the analyses below.
First of all, given the matching formula (2.44), it is interesting to ask in which region the resummation effects (which are added onto the fixed-order results) are important. This information is encoded in the correction term of Eq. (2.44). The first term in the above difference contains all-order information in the strong coupling. It is instructive to see its perturbative behavior order-by-order. This is shown in the left plot of Fig. 4, up to the 5th order in α s . We see that the perturbative expansion converges rather quickly for values of M tt not too close to the 2m t threshold. However, in the threshold region, the perturbative behavior goes wild. While the LO vanishes and the nLO approaches a constant value in the threshold limit M tt → 2m t , the differential cross section becomes divergent starting from nnLO. The nnLO and n 3 LO distributions are still integrable, but the n 4 LO one will give rise to infinite total cross section if one integrates down to the threshold. Such a breakdown of the perturbation theory in the threshold region is a natural reflection of the (α s /β) n terms from Coulomb gluon exchange. The divergent behavior observed above is cured by the resummation. We show a comparison between the NLP resummed result and its perturbative expansion in the right plot of Fig. 4. We also show the LP resummed result for reference. The divergence in the threshold region is replaced by a small peak in the NLP resummed distribution. One can also observe that the NLP distribution extends below the 2m t threshold, where the difference 2m t − M tt can be viewed as the binding energy of the tt "bound-state". The low-energy tail of the distribution is rather long, all the way down to M tt ∼ 300 GeV. This is due to the relatively large decay width of the top quark. On the other hand, we have checked that the integrated cross section in the [300, 380] GeV bin is insensitive to Γ t . It is also clear that in and below the threshold region, the LP and NLP distributions are rather similar, showing the good convergence of the power expansion in β. Above the threshold, the difference between the LP and NLP results are mainly induced by the O(α s ) corrections including the NLO hard functions.
It is already evident from Fig. 4 that the resummation effects are only important in and below the threshold region. As M tt increases, the nLO and nnLO curves quickly approach the NLP one, meaning that the NLP corrections defined by Eq. (4.1) become small with respect to the fixed-order results when M tt is far above the threshold. To see this more clearly, we directly plot the correction terms dσ NLP − dσ nLO and dσ NLP − dσ nnLO of Eq. (4.1) in Fig. 5. These quantify the corrections induced by resummation upon the NLO and NNLO results. The plots make it clear that the resummation effects concentrate in the region near and below the threshold, or more precisely, where M tt < 350 GeV. In this region β < 0.17 and pNRQCD is perfectly applicable. On the other hand, for M tt > 350 GeV, the corrections are almost negligible. As a result, the NLO+NLP and NNLO+NLP predictions are dominated by the fixed-order terms away from the threshold. This demonstrates that our resummation has not been applied to regions where subleading corrections in β might be important, and makes our predictions more robust. Later on, we will sometimes show predictions for a broader range of M tt , where resummation is switched   Figure 6. The NLO+NLP and NNLO+NLP predictions for the absolute M tt distribution against the CMS data in the di-lepton channel [39]. Fixed-order results are shown for comparison. The left plot shows the first bin M tt ∈ [300, 380] GeV, while the right plot shows the full M tt range. off beyond 380 GeV. From Fig. 5, it should be clear that the results are insensitive to the the exact switch-off point, as long as it is larger than ∼ 360 GeV.
We are now ready to present the matched results combining the resummation and fixed-order calculations, namely, the NLO+NLP and NNLO+NLP predictions. We show the results for the absolute differential cross sections in Fig. 6, where the NLO and NNLO results are also given for comparison. The uncertainties estimated from scale variations are shown as the vertical bars. At central scales µ r = µ f = H T /4, resummation effects increase the cross section in the first bin by 13% with respect to NLO, and by 9% with respect to NNLO. It should be noted that the uncertainty bar of the NNLO result does not overlap with that of the NNLO+NLP one. This shows that scale variations alone cannot faithfully account for the uncertainties of fixed-order calculations in this situation, due to the fact that the Coulomb resummation is genuinely non-perturbative. After adding the resummation effects, the NLO+NLP and NNLO+NLP predictions become more consistent with the CMS data than the fixed-order ones. This has significant impacts on the top quark mass determination, as we will discuss in the next subsection.
The experimental collaborations often quote the normalized differential cross sections (dσ/dM tt )/σ in addition to the absolute ones, where σ is the total cross section. Normalization of the distribution has the benefit that part of the systematic uncertainties drops out when taking the ratio. On the theoretical side, normalized differential cross sections often exhibit smaller scale uncertainties as well. In Fig. 7, we show the NLO, NNLO, NLO+NLP and NNLO+NLP predictions for the normalized differential cross section in the first bin M tt ∈ [300, 380] GeV, in comparison with the CMS data [39]. We see that indeed, the scale uncertainties of all predictions are significantly reduced compared to those of the absolute differential cross sections of Fig. 6. We also find that the NLO and NNLO results are rather close to each other. This shows that the NNLO correction to the normalized distribution is not very large. On the other hand, the resummation still shows big impact in this case: about 11% increase from NLO to NLO+NLP, and about 8% increase from NNLO to NNLO+NLP. This demonstrates that our conclusions in the last paragraph drawn from the absolute distribution remain unchanged when considering the normalized differential cross sections.
So far we have only discussed the single differential cross section with respect to M tt . Thanks to the full kinematic dependence of the hard functions, our framework is flexible enough to be applied to double or triple differential cross sections, which were measured and employed to fit the top quark mass in, e.g., Ref. [14]. To illustrate the idea, we have  [14]. calculated the double differential cross sections with respect to M tt and the rapidity Y tt of the top quark pair in the laboratory frame. This can be performed using the formula 2) where the partonic differential cross sections can be obtained using Eq. (2.17) as before. We show the normalized double differential cross sections in the threshold region in Fig. 8, compared with the CMS data from [14]. The plot corresponds to the first bin in M tt , namely, M tt ∈ [300, 400] GeV, and contains four bins in Y tt . Again, the resummation effects enhance the differential cross sections by about 7% with respect to the NLO, making the theoretical predictions better consistent with experimental data. The increase here is not as big as that observed in Fig. 7, mainly due to the larger size of the first M tt bin which covers a broader range above the threshold.

Influence on the top quark mass determination
In this subsection, we discuss the influence of our resummed result on the determination of m t from kinematic distributions. Although we cannot repeat the experimental analyses in, e.g., Ref. [14], it is instructive to roughly estimate the impact of including the resummation effects in the fitting procedure.
To determine the top quark mass from kinematic distributions, one collects a set of observables {O i } which are theoretically functions of m t , but can be experimentally }. 4 It can be understood that in such a procedure, the observables most sensitive to m t are the main driving force to decide the outcome. These include, in particular, the M tt distribution near threshold and related double/triple differential cross sections.
From the above description, it is clear that the outcome of the procedure strongly depends on the theoretical predictions entering the fit. Especially, the theoretical inputs for the m t -sensitive observables are of crucial importance. For illustration, we calculate the averaged M tt differential cross sections in the range [300, 380] GeV using different top quark masses. The results are shown as functions of m t in Fig. 9 for the absolute distribution (left plot) and the normalized distribution (right plot). As expected, we observe a strong (and nearly linear) dependence of the differential cross sections on m t , and a large horizontal gap between the NLO and the NLO+NLP curves.
Ref. [14] has used the NLO predictions for the normalized differential cross sections to fit the top quark mass, with the outcome m t ≈ 171 GeV. From the horizontal dashed line in Fig. 9, one can see that the NLO result with m t = 171 GeV is roughly the same as the NLO+NLP result with m t ≈ 172.4 GeV. This 1.4 GeV shift caused by the threshold effects is much more significant than that estimated in [14]. Given that the normalized NLO+NLP and NNLO+NLP results are rather close to each other, we expect a similar shift in the outcome of the fit if one uses the NNLO+NLP result as the theoretical input. We have also check that similar conclusions can be draw if the first bin is chosen as [300, 400] GeV. Therefore, we see that the impact of the resummation effects on the top mass fit is rather concrete: the result of the fit should be much closer to the world average if one takes into account the precision theoretical predictions for the threshold region.

Results at the 8 TeV LHC
The ATLAS and CMS collaborations have also performed measurements of the M tt distribution at the center-of-mass energy √ s = 8 TeV [98,99]. No significant inconsistency between theory and data was spotted in those measurements, which is at first sight confusing. In this subsection, we show that the reason is simply due to the different choices of bins in the 8 TeV measurements than the 13 TeV ones.
To begin with, we repeat the exercises we've done for the 13 TeV LHC. In the left plot of Fig. 10 we compare the exact NLO distribution and the approximate one in the β → 0 limit, while in the right plot we compare the NLP resummed distribution against its fixed-order expansions. As expected, we observe similar behaviors as the 13 TeV case: 1) The approximate result agrees with the exact one rather well up to M tt ∼ 380 GeV; 2) The resummed result regularizes the divergence near threshold, and tends to coincide with fixed-order results far above the threshold. One can then conclude that our resummation framework is reliable also for this case.
We now apply the resummation to the first bin of the experimental result in the lepton+jets channel from the CMS collaboration [98], which is 345 GeV ≤ M tt ≤ 400 GeV. Note that the lower edge has been chosen as 345 GeV instead of 300 GeV used in the 13 TeV measurements. We already know from Fig. 5 that the resummation effects concentrate in the region slightly below the 2m t threshold. Therefore, it can be expected that the numeric impact of resummation should not be significant for this choice of bin. Indeed, we show in Fig. 11 the NLO, NLO+NLP, NNLO and NNLO+NLP predictions for the normalized differential cross sections in this bin. It can be seen that all calculations give similar numeric results, and agree with the experimental data remarkably well.  On the other hand, if the experimental data extends to lower values of M tt , things will be a bit different and the results will show some sensitivity to the threshold effects. Indeed, in the same CMS paper [98]   below the threshold. We show the NLO and NLO+NLP predictions for such a bin choice in Fig. 12. We do observe a slight deficit of the NLO result compared to the experimental measurement. And a small correction from the resummation is also evident.
Had the experimental data extended further downwards, the sensitivity to the resummation effects would be more obvious. In Fig. 13 we compare two choices of the lower edge of the first bin in the M tt distribution, while keeping the upper edge at 400 GeV. The left plot uses the same bin choice as the experimental data in the lepton+jets channel [98], and is in fact an enlarged version of Fig. 11. We see that all 4 results are similar here. In the right plot, we extend the bin down to 300 GeV. One immediately finds that resummation has a big impact on the normalized differential cross sections in this case. We suggest that it is possible to experimentally verify the difference if one reanalyze the data in an extended range of the invariant mass.

Summary
To summarize, we have investigated single and double differential cross sections for tt production involving the pair invariant mass M tt , particularly in the threshold region M tt ∼ 2m t or β ∼ 0. Theoretical predictions for these observables are rather sensitive to the value of m t , such that they can be used to extract the top quark mass from experimental data. The existing experimental studies at the 13 TeV LHC have employed the fixed-order calculations which did not take into account Coulomb effects of the form 1/β and ln β at and below the threshold. In this paper, we have performed a comprehensive study of these effects. Using the framework of effective field theories, we have derived a resummation formula which allows for dynamic renormalization and factorization scales. Such scale choices are often adopted in current theoretical calculations, including fixed-order ones and those with all-order resummation of soft gluon effects. As an important ingredient of our resummation formula, we have analytically calculated the hard functions up to the next-to-leading order. This enables us to perform the resummation of the Coulomb effects to all orders in α s at the next-to-leading power. We further combine our resummed results with the NLO and NNLO calculations through a matching procedure. Our final predictions therefore reach the precision of NLO+NLP or NNLO+NLP.
Our resummation formula is similar to those in the literature, but differs in several important aspects. We have incorporated the leading-order coefficients with the exact dependence on β. As a result, the fixed-order expansion of our resummation formula reproduces the exact LO differential cross section, and to a good approximation the NLO one in the phase-space region of interest. Our resummation formula allows for dynamic renormalization and factorization scales, which are necessary for the combination with the existing NNLO results and for extending the prediction to a broader range of M tt . In our formalism, we do not consider the soft limit z = M tt /ŝ → 1 upon the small-β limit, since we have found that the double limit does not provide a reasonable approximation to the exact result in the threshold region. All the above make our predictions concrete and reliable. In particular, we have extensively checked that we have not introduced spurious corrections in phase-space regions where the small-β approximation might break down. Last but not least, the full kinematic information contained in our resummation formula also enables us to study double differential cross sections, which were not available in previous studies.
In our phenomenological studies, we have concentrated on single and double differential cross sections which were employed by experimental groups to extract the top quark pole mass. We find that for the range M tt ∈ [300, 380] GeV at the 13 TeV LHC, the resummation effects increase the cross sections by about 13% with respect to NLO, and by about 9% with respect to NNLO. The combined NLO+NLP and NNLO+NLP results show better consistency with the experimental data. The resummation effects have a strong impact on the top quark mass determination from the M tt distribution, and can change the result by about 1.4 GeV, which is much larger than the estimated uncertainties in previous experimental studies. The shifted top quark mass is much more consistent with the current world average measured using other methods. We have also investigated the double differential distribution in terms of M tt and the rapidity Y tt of the tt pair, and drawn similar conclusions. We therefore conclude that future experimental studies should include the Coulomb effects at and below the threshold in order to consistently extract the top quark mass.
We have also performed numeric studies for the 8 TeV LHC. Due to the fact that the experimental result does not cover the main portion of the phase-space below the threshold, the resummation effects do not show a big impact if using the same choice of bins. However, we have demonstrated that if one reanalyze the experimental data with an extended first bin, the threshold effects should be visible in the normalized differential cross sections.
Our NNLO+NLP result can be further combined with the NNLO+NNLL result of [34][35][36][37][38] to achieve the best prediction in the whole phase-space region. Inclusion of electroweak effects can also be done similar as [22]. Our formalism can be applied to more kinds of double and even triple differential cross sections in the future. It can be extended to study the associated production of tt with an extra jet, which is also employed in the top quark mass determination. With suitable modifications, it can be applied to tt + Z or tt + H  The NLO results for the integrated hard functions were already obtained in [45]. There

E Possible contributions at NNLP
While it is beyond the scope of the current paper, it is interesting to discuss possible contributions at the next-to-next-to-leading power (NNLP). At this order, there can be double insertions of the NLP Lagrangians and effective operators, as well as single insertions of the NNLP ones. One of the complications here is that crosstalk among different sectors through sub-leading ultrasoft interactions is activated, which cannot be removed by the decoupling transformations of Eq. (2.21). As an example, we consider the double insertion of the NLP pNRQCD Lagrangian term L 1a pNRQCD , which contains ultrasoft interactions. In particular, the double insertion of the first term in Eq. (2.22) induces a new contribution to the potential function with the matrix element 0|T χ † (0)ψ(0) ψ † (x 1 ) x 1 ψ(x 1 ) ψ † (x 2 ) x 2 ψ(x 2 ) |t a 3t a 4 t a 1t a 2 |ψ † (0)χ(0)|0 , (E.1) and a new contribution to the soft function with the matrix element where O s is a product of ultrasoft Wilson lines. These two functions are convoluted together in momentum space due to their common dependence on the coordinates x 1 and x 2 . As a result, the ultrasoft integrals are no longer scaleless and may have a non-zero contribution.
Note that similar contributions have also been discussed in the context of heavy quarkonium fragmentation [100,101]. It remains unknown whether this kind of corrections persist when considering the full NNLP contributions, which is an interesting question for future investigations.