Pole mass renormalon and its ramifications

I review the structure of the leading infrared renormalon divergence of the relation between the pole mass and the $\overline{\rm MS}$ mass of a heavy quark, with applications to the top, bottom and charm quark. That the pole quark mass definition must be abandoned in precision computations is a well-known consequence of the rapidly diverging series. The definitions and physics motivations of several leading renormalon-free, short-distance mass definitions suitable for processes involving nearly on-shell heavy quarks are discussed.


Introduction
The existence of a short-distance scale $Q \gg \Lambda_{\rm QCD}$ and infrared (IR) finiteness are important requirements for the application of perturbation expansions to strong interaction (QCD) processes, but they are not always enough in practice. The perturbative series is divergent, most likely asymptotic. One of the sources of divergent behaviour, called IR renormalon [1,2,3,4,5], arises from the sensitivity of the process to the inevitable long-distance scale $\Lambda_{\rm QCD}$. The degree of IR sensitivity limits the ultimate accuracy of the perturbative approximation. For the pole mass of a heavy quark, this observation has been exceptionally important for particle physics phenomenology, leading to a better understanding of quark mass renormalization at scales of order and below the mass of the quark, and to much improved precision in heavy quark and quarkonium physics.
The quark two-point function
$$\int d^4x\, e^{ipx}\, \langle\Omega|T(q_a(x)\bar q_b(0))|\Omega\rangle \;\stackrel{p^2\to m^2}{\longrightarrow}\; \delta_{ab}\,\frac{Z\, i(\not{p}+m)}{p^2-m^2} + \mbox{less singular} \qquad (1)$$
has a pole$^1$ to any order in the perturbative expansion. The location of the pole in the complex $p^2$ plane defines the pole mass of the quark, $m$. The pole is shifted off the real axis by a small amount due to the weak decay of the quark, but for the discussion in this article the imaginary part is not relevant and hence will be ignored. The pole mass of a quark is IR-finite [6,7]. It can be related to other renormalized quark mass definitions order by order in perturbation theory. It is nevertheless not physical, as quarks do not exist as free, asymptotic particle states and the scattering matrix of QCD does not exhibit a pole at $m^2$.
$^1$ Due to the masslessness of the gluon, there is no gap between the single-particle pole and the multiparticle cut starting at $p^2 = m^2$, and the residue $Z$ is IR divergent.
It is intuitively obvious that the strong IR physics of QCD, which is not captured by perturbation theory, should contribute an amount of order $\Lambda_{\rm QCD}$ to hadron masses. For a meson state $M$ composed of $\bar q_i q_j$,
$$M = m_i + m_j + {\rm const}\times\Lambda_{\rm QCD} + \ldots, \qquad (2)$$
which renders the notion of the pole mass useless for light quarks with masses $m_i \lesssim \Lambda_{\rm QCD}$. However, for mesons containing heavy quarks with $m_Q \gg \Lambda_{\rm QCD}$, the pole mass provides a first approximation to the meson mass up to power corrections of relative order $\Lambda_{\rm QCD}/m_Q$. Starting from, say, the $\overline{\rm MS}$ mass of the heavy quark, the pole mass represents the perturbatively calculable leading-power approximation to the meson mass, just as the perturbative calculation in terms of quarks and gluons of the total $e^+e^-\to$ hadrons cross section at high energy does to the physical hadroproduction cross section.
There is a deep connection between power corrections and IR renormalon divergence of the QCD perturbative expansion. The existence of a linear power correction is related to the strong IR renormalon divergence of the pole mass series that was discovered in [8,9] and is the subject of this article. In consequence, while the pole mass of a quark appeared as the natural choice for processes involving nearly on-shell heavy quarks, the concept has since largely been abandoned in precision calculations in favour of alternative, (leading) renormalon-free mass definitions.

Pole mass series

Basic definitions
The pole mass $m$ is related to the $\overline{\rm MS}$ mass by
$$m = m(\mu_m)\left[1+\sum_{n=1}^\infty c_n(\mu,\mu_m,m(\mu_m))\,\alpha_s^n(\mu)\right]. \qquad (3)$$
Here $\alpha_s(\mu)$ is the $\overline{\rm MS}$ coupling at scale $\mu$ in QCD with $n_l$ light quarks, and $m(\mu_m)$ stands for the heavy quark $\overline{\rm MS}$ mass evaluated at the scale $\mu_m$. In the following I will often set $\mu_m = \overline{m}$, where $\overline{m}$ refers to the $\overline{\rm MS}$ mass evaluated self-consistently at the scale equal to the mass itself, i.e. $\overline{m} = m(\overline{m})$.
We shall see below that the series (3) diverges for any value of $\alpha_s \neq 0$. Although a proof does not exist for QCD, it is reasonable to assume that it is asymptotic, and approximates the heavy-meson mass up to exponentially small terms in $\alpha_s$, equivalent to power corrections in $\Lambda_{\rm QCD}/m$. Asymptotic expansions can sometimes be summed using the Borel transform. Given a power series
$$f(\alpha_s) = \sum_{n=0}^\infty r_n\,\alpha_s^{n+1},$$
the corresponding Borel transform is defined by
$$B[f](t) = \sum_{n=0}^\infty r_n\,\frac{t^n}{n!}.$$
By convention, the tree-level term "1" in (3) is excluded from the definition. A factorially divergent series of the form
$$r_n = K a^n\,\Gamma(n+1+b) \qquad (7)$$
has the Borel transform
$$B[f](t) = \frac{K\,\Gamma(1+b)}{(1-at)^{1+b}} \qquad (8)$$
with a singularity at $t = 1/a$. The Borel integral
$$\int_0^\infty dt\, e^{-t/\alpha_s}\, B[f](t) \qquad (9)$$
has the same series expansion as $f(\alpha_s)$ and provides the exact result under suitable conditions. However, for our case of interest, there will be a singularity on the integration contour, rendering the Borel integral as given ill-defined. Deformations of the contour around the pole or branch cut, or the principal-value prescription, result in an ambiguity of the Borel integral, which has the form of a power correction proportional to
$$e^{-1/(a\alpha_s)} \sim \left(\frac{\Lambda_{\rm QCD}}{\mu}\right)^{-2\beta_0/a}. \qquad (10)$$
This ambiguity provides a quantitative measure of the limit to the accuracy of a purely perturbative calculation.
The linear power correction to the pole mass therefore corresponds to $a = -2\beta_0$. More generally, the pole mass can be regarded as the first term of an asymptotic expansion of the meson mass in powers of $\alpha_s$ and $\Lambda_{\rm QCD}/m$, which in modern language has a trans-series structure (again, no proof).
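As a rough numerical illustration of the statements above, the following Python sketch evaluates the terms $r_n\alpha_s^{n+1}$ of a series of the form (7), locates its smallest term, and compares its size with the exponential $e^{-1/(a\alpha_s)}$. The values of $\alpha_s$, $a$, $b$ and $K$ are illustrative assumptions and are not taken from the text; the function names are hypothetical.

```python
import math

def series_terms(alpha, a, b=0.0, K=1.0, n_max=40):
    """Terms r_n * alpha**(n+1) with r_n = K * a**n * Gamma(n + 1 + b), as in (7)."""
    return [K * a**n * math.gamma(n + 1 + b) * alpha**(n + 1) for n in range(n_max)]

def minimal_term(terms):
    """Return the index and absolute size of the smallest term."""
    n_min = min(range(len(terms)), key=lambda n: abs(terms[n]))
    return n_min, abs(terms[n_min])

# Illustrative inputs only (assumed): a is roughly -2*beta_0 for five light flavours.
alpha, a = 0.1, 1.2
terms = series_terms(alpha, a)
n_min, t_min = minimal_term(terms)

# The terms decrease until n is close to 1/(a*alpha) and grow factorially afterwards.
print("minimal term at n =", n_min, " 1/(a*alpha) =", 1.0 / (a * alpha))
print("minimal term size =", t_min, " exp(-1/(a*alpha)) =", math.exp(-1.0 / (a * alpha)))
```

The minimal term and the exponential differ only by a slowly varying prefactor, which is why the size of the smallest term is commonly used as a pragmatic estimate of the accuracy limit.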

Linear IR sensitivity and the large-$n_f$ approximation
To gain intuition, we start from the leading IR renormalon divergence of the one-loop correction to the pole mass with fermion-loop insertions into the gluon line, often referred to as the large-$n_f$ approximation. The relevant expression is

[Eq. (11)]

The all-order $\overline{\rm MS}$ counterterms can be found in [8]. They do not diverge factorially, and can therefore be ignored when discussing the large-$n$ behaviour. Strictly speaking, the fermion-loop insertions provide only the $n_f$-dependent part of $\beta_0$ in (11). In full QCD, consistency of the trans-series interpretation of short-distance expansions requires the full expression for $\beta_0$, as will be seen below. The diagrammatic recovery of the full $\beta_0$ is discussed in [10]. The substitution of the full $\beta_0$ in fermion bubble-chain diagrams is often referred to as "naive non-abelianization" [12,13].
For $p^2 = m^2$ the integral scales as $d^4k/k^3$ for small $k$. It is thus IR finite, but the contribution from $k$ smaller than $\Lambda_{\rm QCD}$, where perturbation theory is not valid, is of order $\Lambda_{\rm QCD}$. That this should imply that the pole mass cannot be defined to better accuracy than $O(\Lambda_{\rm QCD})$ was noted in [14]. The connection to the IR renormalon divergence of the perturbative expansion was established shortly after [8,9]. Indeed, the increasing powers of logarithms in (11) enhance the IR region and yield (after Wick rotation)

[Eq. (12)]

with typical $k \sim \mu\, e^{-n}$. One can also take the Borel transform of (11) and sum the series, which yields an effective gluon propagator. The exact expression for the Borel transform of $m - m(\mu)$ can be found in [8]. Approximating the integrand by its leading term in the small-$k$ expansion is sufficient to obtain the dominant IR renormalon singularity at $t = -1/(2\beta_0)$ (closest to the origin of the Borel plane), resulting in

[Eq. (13)]

The corresponding asymptotic behaviour of the series expansion in $\alpha_s = \alpha_s(\mu)$ is

[Eq. (14)]

As expected, with $a = -2\beta_0 > 0$ the ambiguity of the Borel integral and hence of the pole mass is proportional to $\Lambda_{\rm QCD}$, independent of the arbitrary renormalization scale $\mu$.
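The $\mu$ independence stated at the end of the preceding paragraph can be checked numerically at one loop. The sketch below assumes the convention $\beta(\alpha_s) = \beta_0\alpha_s^2 + \ldots$ with $\beta_0 = -(11 - 2n_l/3)/(4\pi) < 0$ and one-loop running of the coupling. The input values $\alpha_s = 0.108$ at $\mu_0 = 163.6$ GeV are illustrative assumptions, and the function names are hypothetical.

```python
import math

def beta0(n_l):
    # Assumed convention: d alpha_s / d ln(mu^2) = beta0 * alpha_s**2 + ..., beta0 < 0.
    return -(11.0 - 2.0 * n_l / 3.0) / (4.0 * math.pi)

def alpha_s_one_loop(mu, mu0, alpha0, n_l):
    # One-loop solution: 1/alpha_s(mu) = 1/alpha_s(mu0) - 2*beta0*ln(mu/mu0).
    return 1.0 / (1.0 / alpha0 - 2.0 * beta0(n_l) * math.log(mu / mu0))

def linear_ambiguity_scale(mu, mu0, alpha0, n_l):
    # mu * exp(-1/(a*alpha_s(mu))) with a = -2*beta0, the one-loop form of Lambda_QCD.
    a_s = alpha_s_one_loop(mu, mu0, alpha0, n_l)
    return mu * math.exp(1.0 / (2.0 * beta0(n_l) * a_s))

# Illustrative inputs (assumed, not from the text).
mu0, alpha0, n_l = 163.6, 0.108, 5
scales = [linear_ambiguity_scale(mu, mu0, alpha0, n_l) for mu in (50.0, 163.6, 500.0)]
print(scales)  # three values in GeV that agree up to rounding
```

Beyond one loop the same statement holds with the corresponding higher-order definition of $\Lambda_{\rm QCD}$; the sketch only covers the leading-order case.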

Exact characterization of the divergence
The relation of IR and ultraviolet (UV) renormalon divergence with the small- and large-momentum behaviour of Feynman diagrams, respectively, allows for a precise characterization of the corresponding singularities in the Borel transform in terms of the factorization properties of observables and correlation functions in these limits. In asymptotically free, renormalizable field theories the UV renormalon singularities occur at $t = n/\beta_0 < 0$ and can be related to local operators of dimension $4 + 2n$ in the regularized theory expanded for large values of a dimensionful cut-off [15] (see [16] for the case of QCD). If the theory has power UV divergences, as is usually the case for effective field theories, the UV renormalon singularities extend onto the positive real axis of the Borel plane. Likewise, the IR behaviour of correlation functions is often amenable to expansions in the ratio of $\Lambda_{\rm QCD}$ and a hard scale, in which case the IR renormalons at $t = -n/\beta_0 > 0$ are related to higher-dimensional terms in the operator product expansion (OPE). For example, for the two-point function of two vector currents, the position $t = -2/\beta_0$ of the leading IR renormalon divergence of its perturbative series $C_0(\alpha_s, Q/\mu)$ is determined by the dimension (four) of the gluon condensate correction [2,3,4,5]. For both UV and IR renormalons, the parameter $b$ in (8) is determined by the anomalous dimension(s) of the relevant operators. Through renormalization-group equations (RGEs), one can determine the $\alpha_s$ dependence of the ambiguity of the Borel integral and thus determine $1/n$ corrections to the leading large-order behaviour in terms of OPE coefficient functions, the anomalous dimensions of all operators of a given dimension, and the beta function coefficients [17].
This leads to the remarkable conclusion that the singular points of the Borel transform due to IR and UV renormalons can be completely specified, except for a set of normalization constants $K$ in (8), whose number matches (at most) the number of operators. They appear as initial conditions of the RGE and should be viewed as non-perturbative [17,18]. The application of these ideas to the large-order behaviour of the pole mass series exhibits some unique features: a) due to the linear IR sensitivity, the IR renormalon divergence is particularly strong and dominates over the sign-alternating UV renormalon series; b) the leading IR renormalon singularity at $t = -1/(2\beta_0)$ involves only a single operator and therefore a single unknown normalization constant; c) the operator has vanishing anomalous dimension, hence the sub-asymptotic $1/n^k$ corrections are determined only from the QCD beta-function, which is known to high order in perturbation theory.
The derivation of these statements in [19] builds on the observation [8] that the leading IR renormalon in the pole mass is related to a UV renormalon pole at the same position $t = -1/(2\beta_0)$ of the self-energy $\Sigma_{\rm static}$ of the static quark field in heavy quark effective theory (HQET) with Lagrangian

[equation]

This UV renormalon pole exists because, in contrast to full QCD, the static self-energy is linearly UV divergent. The only operator with the required mass dimension three is $\bar h_v h_v$. It follows that the imaginary part of the Borel integral of $\Sigma_{\rm static}$ is given by

[equation]

where $\Sigma_{\rm static}^{\bar h h}$ is the static self-energy with a zero-momentum insertion of $\bar h_v h_v$. The coefficient $E(\alpha_s, \mu)$ satisfies the RGE

[equation]

However, $\gamma_{\bar h_v h_v}$ vanishes to all orders in perturbation theory, since $\bar h_v h_v$ is the conserved heavy quark number current of HQET. This justifies statements b) and c). One then shows that [19]

[equation]

The $\alpha_s$-dependence of the imaginary part of the Borel integral determines the large-order behaviour of the perturbative expansion of $\Delta m$ according to (7), (9) up to a single normalization constant, $N$. Defining

[equation]

the result is

[equation]

where [19,20] $b = -\beta_1/(2\beta_0^2)$ and

[equation]

The pole mass series is particularly simple, because the large-order behaviour is completely determined in terms of the $\beta$-function coefficients. Since the five-loop beta-function coefficient $\beta_4$ is now known [21,22,23], the sub-asymptotic behaviour including $1/n^3$ corrections is known. Numerically, for the interesting cases discussed below, the corrections to the leading large-$n$ behaviour do not exceed 3% for $c_4$.
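To give a feeling for the size of the exponent $b = -\beta_1/(2\beta_0^2)$, the following Python sketch evaluates it for $n_l = 3, 4, 5$. It assumes the convention $\beta(\alpha_s) = \beta_0\alpha_s^2 + \beta_1\alpha_s^3 + \ldots$ with the standard one- and two-loop coefficients $\beta_0 = -(11 - 2n_l/3)/(4\pi)$ and $\beta_1 = -(102 - 38n_l/3)/(4\pi)^2$; this normalization is an assumption consistent with $a = -2\beta_0 > 0$, and the function names are hypothetical.

```python
import math

def beta0(n_l):
    # One-loop coefficient, negative in the assumed convention.
    return -(11.0 - 2.0 * n_l / 3.0) / (4.0 * math.pi)

def beta1(n_l):
    # Two-loop coefficient in the same (assumed) normalization.
    return -(102.0 - 38.0 * n_l / 3.0) / (4.0 * math.pi) ** 2

def b_exponent(n_l):
    # b = -beta1 / (2 * beta0**2); the factors of 4*pi cancel in this ratio.
    return -beta1(n_l) / (2.0 * beta0(n_l) ** 2)

for n_l in (3, 4, 5):
    print(n_l, b_exponent(n_l))
```

In this convention $b$ is positive and of order 0.3 to 0.4 for the physical flavour numbers, so it modifies the leading factorial growth only mildly.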

The top, bottom and charm mass series
The normalization constant $N$ in (20) cannot be determined exactly by purely perturbative methods. However, given that the pole-$\overline{\rm MS}$ mass relation (3) is known to the four-loop order [24,25] and the asymptotic behaviour is known including $1/n^3$ corrections, one may attempt to match the two at $n = 4$. In other words, while

[equation]

we evaluate the ratio for $n = 4$ and check the stability of the result by comparison with $n = 3$ [20]. In the following, all $n_l$ quarks other than the heavy quark will be assumed to be massless. Internal quark mass effects will be discussed below. The coupling $\alpha_s$ in (3) is the $\overline{\rm MS}$ coupling in the $n_l$-flavour theory. Since the IR theories are different, $N$ is expected to depend on $n_l$. It is interesting to apply this method to the large-$n_l$ limit (more precisely, $n_l \to -\infty$). In this limit, $b = s_i = 0$, and $N$ can be calculated exactly from (14) to be
$$\lim_{n_l\to-\infty} N = \frac{C_F}{\pi}\, e^{5/6},$$
which equals 0.97656 ($C_F = 4/3$ for $N_c = 3$). This can be compared to evaluating (25) for $n = 4$ at $\mu = \mu_m = \overline{m}$, which gives 0.971, in very good agreement with the exact result. According to (20) the dependence of the asymptotic behaviour on $\mu$ and $\mu_m$ is very simple, since the logarithms of $\mu$ in perturbation theory must exponentiate to powers asymptotically. The approximate determination of $N$ is most accurate when choosing $\mu \approx m$ and exhibits a plateau around this value [20]. With this validation of the method we consider the pole to $\overline{\rm MS}$ mass series for the top, bottom and charm quark, corresponding to $n_l = 5, 4, 3$, respectively. For a detailed analysis of the top-quark case, see [20].
[Eq. (27)]

where the tiny difference relative to [20] is due to the inclusion of $s_3$, which was not fully available at the time (lack of $\beta_4$). A conservative estimate of the uncertainty of these values from the independent variations of $\mu$ and $\mu_m$ is $\pm 10\%$ (error symmetrized, see [20]), but the accuracy of the large-$n_l$ result at $\mu = \mu_m = \overline{m}$ suggests that it may be considerably smaller. It is worth noting that, for physical values of $n_l$, $N$ is only about half as large as the large-$n_l$ result, implying that the intrinsic ambiguity of the pole mass is smaller than inferred from the one-loop correction dressed by fermion loops.
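The numbers quoted above can be cross-checked with a few lines of Python. The sketch assumes that the large-$n_l$ normalization has the closed form $C_F\, e^{5/6}/\pi$, which reproduces the quoted value 0.97656 for $C_F = 4/3$; this closed form is an assumption made here for illustration. The four- and three-flavour normalizations 0.5048 and 0.5366 are those quoted later in the discussion of internal quark mass effects. The variable names are hypothetical.

```python
import math

C_F = 4.0 / 3.0   # colour factor for N_c = 3, as stated in the text

# Assumed closed form of the large-n_l normalization.
N_LARGE_NL = C_F / math.pi * math.exp(5.0 / 6.0)

# Normalizations quoted in the text for the four- and three-flavour theories.
N_PHYSICAL = {4: 0.5048, 3: 0.5366}

print("large-n_l normalization:", N_LARGE_NL)
for n_l, n_value in N_PHYSICAL.items():
    # Ratio of the physical normalization to the large-n_l one.
    print("n_l =", n_l, " ratio =", n_value / N_LARGE_NL)
```

The ratios come out slightly above one half, in line with the remark that the large-$n_l$ approximation overestimates the normalization by about a factor of two.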
To display the numerical properties of the series, I use the $\overline{\rm MS}$ mass values $\overline{m}_t = 163.643$ GeV, $\overline{m}_b = 4.20$ GeV and $\overline{m}_c = 1.28$ GeV. The strong coupling is taken to be

[equation]

which gives

[Eqs. (28)-(30)]

for the series expansion of the mass conversion formula. In these expressions the first five numbers correspond to the five exactly known terms including the four-loop order, and the subsequent numbers in italics are obtained from the asymptotic formula (21). By construction, the asymptotic formula agrees with the exact one for the fifth term. The minimal term of the series is highlighted in bold face. The asymptotic formula corresponds to the "prediction" at $\mu = \mu_m = m_Q$ for the presently unknown five-loop conversion coefficient. The behaviour of the series is illustrated in Figure 1, including an extrapolation of the asymptotic formula to all $n > 0$. It is apparent that it is already accurate at the three-loop order, $n = 3$. Comparing (28)-(30), we observe that the top mass series attains its smallest term at the eighth order in perturbation theory, far beyond the four-loop order currently known. On the other hand, the bottom series reaches its minimal term at this order, while the charm series starts to diverge from the two-loop order, which renders the charm pole mass of limited use for phenomenology. From a pragmatic point of view, the minimal term represents the ultimate accuracy beyond which the purely perturbative use of the pole quark mass ceases to be meaningful. The minimal term scales as $\sqrt{\alpha_s(m_Q)}\,\Lambda_{\rm QCD}$ and decreases with larger $m_Q$, which reflects the fact that the minimum is shallower in this case. A renormalization-group invariant measure of the intrinsic limitations of the concept of the pole mass can be defined in terms of the ambiguity (10) of the Borel integral of the series,

[equation]

exactly proportional to $\Lambda_{\rm QCD}$, as it should be. Dividing by $\pi$ gives a numerical value close to the minimal term for the top mass series, and this definition of ultimate accuracy has been adopted in [20].

Figure 1: The series (28)-(30), exhibiting the typical dependence of the size of terms on the order for a factorially divergent series. The (blue) circles for $n < 5$ represent exactly known terms, the (yellow) squares the asymptotic formula applied to all $n > 0$.
Determining the non-perturbative normalization $N$ of the leading pole mass renormalon singularity from matching to the highest known order is perhaps the simplest and most intuitive method, but not the only one that has been suggested. I refer to [28,29,30,31,32,33,34,35] for other work, noting that earlier work did not have access to four-loop accuracy. As will be seen in Sec. 3.4 below, $N$ is related to a similar leading renormalon constant of the static heavy quark potential by a factor $-1/2$ [36,37,38]. Some of the quoted works apply this relation to infer $N$ from an analysis of the series expansion of the colour-singlet Coulomb potential. The normalization constants are much more difficult to obtain when there is more than one constant, or an interference of sign-alternating UV and fixed-sign IR renormalon behaviour, as is the case for generic observables. In this case one can resort to simplified parameterizations of the Borel transform as was done for $\tau$-decay spectral moments [39], but the level of rigour and precision that has been achieved for the pole mass of a heavy quark is unmatched by any other series in QCD.

Internal quark mass effects
The analysis assumed up to now that the lighter quarks are massless. In low orders of perturbation theory, this is often a good approximation, especially for the top quark pole mass, where $m_c, m_b \ll m_t$. (We do not consider the effect of the up, down and strange quark mass and always neglect quark masses smaller than $\Lambda_{\rm QCD}$.) However, in the regime where the series is dominated by the leading renormalon divergence, the typical loop momentum at order $\alpha_s^{n+1}$ is of order $m_Q\, e^{-n}$. Internal quark mass effects from the bottom and charm quark are expected to become important in higher orders. The minimal term of the series is attained when the typical loop momentum is of order $\Lambda_{\rm QCD}$. At this scale the theory is a theory of three massless flavours, independent of whether the heavy quark was the top, bottom or charm quark. Hence, the true large-$n$ behaviour of the series beyond the minimal terms is always determined by the $n_l = 3$ result, and likewise the ambiguity (32) should involve the $\Lambda$-parameter $\Lambda^{(3)}_{\rm QCD}$ in the three-flavour scheme, always excluding the bottom and charm quark, independent of $n_l$.
The decoupling of internal quarks with masses $m_q$ larger than $\Lambda_{\rm QCD}$ in the renormalon asymptotic behaviour was studied analytically and numerically in the large-$n_l$ limit [40], and the described behaviour has been demonstrated. More precisely, the analysis showed that the asymptotic behaviour of the series in a theory with $n_l$ quarks, of which $n_m$ are massive, approaches the series of the theory with $n_l - n_m$ massless quarks when both are expressed in terms of the $\overline{\rm MS}$ coupling $\alpha_s^{(n_l - n_m)}(m_Q)$ in the $(n_l - n_m)$-flavour scheme. However, as noted above, the large-$n_f$ limit overestimates the normalization of the leading renormalon by about a factor of two. Furthermore, one is rarely interested in the formal large-$n$ behaviour of the series beyond the minimal term, but rather in the approach to it. At such intermediate orders, the typical loop momentum crosses the flavour thresholds as the order of perturbation theory increases, and the internal quark masses are neither negligible nor decoupled.
The issue is especially relevant for the top quark, since the masses of the bottom and charm quark are too small in relation to $m_t$ to express the entire series in terms of the four- or three-flavour coupling. In contrast, one may argue [32] that the bottom pole to $\overline{\rm MS}$ mass conversion factor should be expressed in terms of $\alpha_s^{(3)}(m_b)$. For the two- and three-loop coefficients, for which the mass dependence is known [41,42,43], this substitution indeed renders the charm mass effect almost negligible. A quantitative investigation of bottom and charm mass effects on the top pole mass series was undertaken in [20,44]. The following discussion is adapted from [20].
I first recall that the numerical series (28)-(30) for $m_Q$ include internal loops of $Q$, but are expressed in terms of $\alpha_s^{(n_l)}(m_Q)$, where $n_l$ is the number of massless quarks, including bottom and charm for the case of $Q$ = top ($n_l = 5$). To estimate the effect of the finite bottom and charm mass, we switch from the five- to the four-flavour scheme at the order where the typical internal loop momentum is of order $m_b$, which is $O(\alpha_s^5)$, and from the four- to the three-flavour scheme at $O(\alpha_s^6)$. Since the mass effect is not known for $c_4$ at the four-loop order, and since $c_n$ beyond the four-loop order can only be estimated assuming dominance of the first renormalon (as done above), this implies the following procedure: a) At two and three loops, we include the known mass dependence, but $c_4$ is approximated by the massless value. Since the $c_n$ increase as $n_l$ decreases, the mass effect is also expected to be positive in higher orders. Hence approximating $c_4$ by its massless value underestimates the mass effect. b) At the five-loop order, we use $c_5^{(\rm as)} [\alpha_s^{(4)}(m_t)]^5$ with $c_5^{(\rm as)}$ determined by matching to the exactly known four-loop coefficient for $n_l = 4$, that is with normalization $N_m = 0.5048$ and beta-function coefficients for the four-flavour theory. c) Beyond five loops, the remainder of the series is computed with the three-flavour scheme coupling $\alpha_s^{(3)}(m_t)$ and normalization $N_m = 0.5366$. Since the bottom and charm quarks are not yet completely decoupled at the five- to seven-loop order, and since an extra quark flavour decreases the $c_n$, we expect that b) and c) overestimate the mass effect, since the approximation assumes that bottom and charm are already decoupled completely. The sum of b) and c) adds another 54 MeV to the top pole mass, such that the total mass effect is estimated to be 80 MeV. Explicitly, the series (28) becomes

[equation]

where the increasing importance of finite-quark mass effects with order is evident.
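The orders at which the flavour scheme is switched follow from the estimate that the typical loop momentum at order $\alpha_s^{n+1}$ is of order $m_Q\, e^{-n}$. The Python sketch below solves $m_Q\, e^{-n} = m_q$ for the top quark with the mass values used in the text. Rounding $n$ to the nearest integer is a choice made here for illustration, and the function name is hypothetical.

```python
import math

def threshold_order(m_heavy, m_internal):
    """Power of alpha_s at which the typical loop momentum m_heavy*exp(-n) reaches m_internal."""
    n = math.log(m_heavy / m_internal)   # solves m_heavy * exp(-n) = m_internal
    return round(n) + 1                  # momentum m_heavy*exp(-n) belongs to order alpha_s**(n+1)

# MS-bar mass values quoted in the text, in GeV.
m_t, m_b, m_c = 163.643, 4.20, 1.28

print("bottom threshold reached at order alpha_s^%d" % threshold_order(m_t, m_b))
print("charm threshold reached at order alpha_s^%d" % threshold_order(m_t, m_c))
```

With these inputs the bottom threshold is crossed at $O(\alpha_s^5)$ and the charm threshold at $O(\alpha_s^6)$, matching the orders used in the procedure above.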
In the case of the top quark pole mass, the decoupling of the bottom and charm quark in internal loops increases the intrinsic uncertainty of the pole mass concept by almost 50% due to the more rapid divergence of the series in the three-massless flavour theory. Note that this ambiguity is independent of the precise value of the bottom and charm mass, as long as $m_b, m_c \gg \Lambda_{\rm QCD}$. This also implies that it is the same for any heavy quark, including the bottom quark, since it depends only on the infrared properties of the theory, which is QCD with three approximately massless flavours.
Since the bottom quark is neither heavy enough to be decoupled in low orders, nor light enough to be ignored, where in both cases a massless approximation can be justified, there is an inherent uncertainty in the above estimate. However, as argued above, the errors in the approximations are expected to go in opposite directions, hence we consider $(80 \pm 30)$ MeV a conservative estimate of the total internal bottom and charm quark mass effect on the top pole mass. The 30 MeV error estimate arises from an estimate of the neglected mass effect on $c_4$ by extrapolation from the known lower orders. The approximation described here has been checked to work well in models for the series inspired by the large-$n_l$ limit.

Finite width
With the electroweak interaction turned on, the heavy quarks become unstable. In perturbation theory, the pole of the heavy-quark propagator is shifted to

[equation]

which defines the pole mass $m$ and on-shell decay width $\Gamma$. Unlike the quark mass, the width is not a parameter of the Standard Model (SM): for heavy quarks it can be computed in perturbation theory in terms of $m$ and other SM parameters. The width is negligibly small compared to $m$ except for the top quark, where $\Gamma_t \approx 1.4$ GeV. The large width $\Gamma_t > \Lambda_{\rm QCD}$ does not eliminate the renormalon divergence of the top pole mass, as was emphasized in [45]. This does not mean that the large width is not relevant, since it does provide a cut-off on IR effects for physical observables. For example, measurements on jets containing top quarks are generically linearly sensitive to $\Lambda_{\rm QCD}$ and accordingly display a strong renormalon divergence, which can be screened by the sizeable width [46]. In effect, as is intuitive, there is simply no quantity for which the pole mass of a quark is ever the relevant parameter, once the quark's width is larger than $\Lambda_{\rm QCD}$.
Interestingly, the on-shell width of a quark, $\Gamma(m)$, is itself an observable, which is less sensitive to IR physics than the pole mass. When the final state masses can be neglected, $\Gamma(m) \propto G_F^2 m^5$, where $G_F$ denotes the Fermi constant. The leading power corrections are of relative order $(\Lambda_{\rm QCD}/m)^2$. However, when the series is expressed in terms of the pole mass, an IR renormalon divergence indicating an ambiguity of linear order $\Lambda_{\rm QCD}/m$ appears in the series of QCD corrections to the tree-level width. This ambiguity is spurious and a consequence of using a parameter with stronger IR sensitivity than the observable itself. Once the width of the quark is expressed in terms of the $\overline{\rm MS}$ mass or (better) another leading renormalon-free mass definition such as will be discussed in Sec. 3, the leading renormalon is cancelled [47], and the series of loop corrections shows a much better behaviour. This is particularly important for the decay width of the bottom and charm quark, for which $\Gamma$ could otherwise be obtained only with large uncertainty.

Beyond the leading renormalon
Much less is known about the renormalon singularities of the pole mass series beyond $t = -1/(2\beta_0)$. On general grounds one expects a sign-alternating UV renormalon divergence from a singularity at $t = 1/\beta_0$, and an IR renormalon singularity at $t = -1/\beta_0$ related to the $\Lambda^2_{\rm QCD}/m$ kinetic-energy correction to the meson mass (2). The Borel transform of the series is known exactly in the large-$n_f$ limit [8]. Table 11 in [10] displays the breakdown of the $n$th order term into the contributions from the first three IR renormalon and the first UV renormalon poles, and the $\overline{\rm MS}$ subtraction terms. At least in the large-$n_f$ approximation, the subleading poles never contribute more than one permille of the dominant asymptotics from $t = -1/(2\beta_0)$ for $n$ beyond the four-loop order. For practical purposes, dealing with the leading singularity appears to be enough.
Curiously, the Borel transform in the large-$n_f$ limit does not exhibit the expected next IR renormalon singularity at $t = -1/\beta_0$. The authors of [8] speculated that Lorentz invariance might forbid a quadratic power-divergent mixing of the kinetic energy operator $\bar h_v (iD_\perp)^2 h_v$ into $\bar h_v h_v$, in which case there would be no matrix element to compensate the ambiguity of the Borel transform from $t = -1/\beta_0$, and hence it should be absent. Invoking the virial theorem of HQET, the power-divergent mixing of the kinetic energy operator was related to the one of $\bar h_v\, i g_s G_{\mu\nu} h_v$ into $\bar h_v h_v$ [48]. This work confirmed that Lorentz invariance forbids one-loop mixing, which explains the absence of the $t = -1/\beta_0$ singularity in the large-$n_f$ limit, but also showed that there is no reason for this to hold beyond this limit. Recent investigations [49] of a remainder series with the leading renormalon subtracted show sign alternation more fitting to UV renormalon behaviour. The question whether the subleading renormalon singularity at $t = -1/\beta_0$ is absent or simply suppressed by a loop factor is therefore still undecided.

Renormalon-free "on-shell" masses
Many important properties of heavy quarks are less IR sensitive than the pole quark mass itself, for example the inclusive decay width discussed above, or the production cross section of a heavy quark-antiquark pair. Provided they are not expressed in terms of the pole mass, their perturbative expansions do not display the IR renormalon singularity at $t = -1/(2\beta_0)$ that leads to rapid divergence. In other words, although the pole mass is IR finite, it is for many purposes not a useful renormalized mass parameter. Instead, one should use a renormalization convention that is not only IR finite but also insensitive to the IR at least at linear order $\Lambda_{\rm QCD}$.
The $\overline{\rm MS}$ definition suggests itself. However, in physical systems where heavy quarks are nearly on-shell and have primarily soft fluctuations, the $\overline{\rm MS}$ mass is not the appropriate choice. Being essentially a bare object, it does not include the short-distance fluctuations, which should have been integrated out to describe soft heavy-quark systems. In practice, this means that the $\overline{\rm MS}$ mass value is too far away (by $O(m_Q\alpha_s)$) from the pole of the heavy-quark propagator. While the spurious pole mass renormalon is eliminated and the asymptotic behaviour improved, there still appear large corrections in low orders. It does not help to evolve the $\overline{\rm MS}$ mass $m(\mu)$ to scales $\mu < m$, since the $\overline{\rm MS}$ quark-mass anomalous dimension applies only to the logarithmic evolution above $m$.
The resolution to the problem consists in quark mass concepts that are numerically closer to the pole mass, yet are constructed such that their perturbative relation to the $\overline{\rm MS}$ mass is free from the leading IR renormalon. This has several benefits: 1) The concept is unambiguous, at least up to $O(\alpha_s \Lambda^2_{\rm QCD}/m)$, which is sufficient for practical purposes. 2) Such masses can be determined accurately from measurements or lattice calculations, and 3) they can be precisely related to the $\overline{\rm MS}$ mass. 4) The impact of light internal quark mass effects is reduced, since the leading IR loop momentum contributions have been removed. The $\overline{\rm MS}$ mass is then the convenient reference parameter (similar to $\alpha_s(m_Z)$ for the strong coupling) to which different leading renormalon-free, "on-shell" mass definitions can be related.

General considerations
We start from the observation that the asymptotic behaviour (20) of the pole to $\overline{\rm MS}$ mass conversion (3) has a very simple, exact linear dependence on the coupling renormalization scale $\mu$, which follows on very general grounds [17], as well as on $\mu_m$, which appears only through $m(\mu_m)$. The asymptotic coefficients $\tilde c^{(\rm as)}_{n+1}$ themselves in (21) are independent of $m(\mu_m)$, $\mu$ and $\mu_m$. Since the Borel-integral ambiguity of the asymptotic series is always $\Lambda_{\rm QCD}$, we can replace $m(\mu_m)$ by another scale $\mu_f$. We therefore define
$$\delta m_X(\mu_f) = \mu_f \sum_{n=1}^\infty s_n^X(\mu/\mu_f)\,\alpha_s^n(\mu). \qquad (35)$$
The series coefficients $s_n^X(\mu/\mu_f)$ are polynomials of order $n-1$ in $\ln(\mu/\mu_f)$ and must be chosen to satisfy
$$s_n^X(\mu/\mu_f) \;\stackrel{n\to\infty}{\longrightarrow}\; N\,\frac{\mu}{\mu_f}\,\tilde c_n^{(\rm as)}, \qquad (36)$$
where $N$ and $\tilde c_n^{(\rm as)}$ are exactly the same as for the coefficients in the pole to $\overline{\rm MS}$ mass relation (20) and (21), respectively.$^7$ Once such $s_n^X$ have been found, we can define a leading renormalon-free, "short-distance" mass $m_X(\mu_f)$ by subtracting $\delta m_X(\mu_f)$ from the pole mass $m$:
$$m_X(\mu_f) \equiv m - \delta m_X(\mu_f) = m(\mu_m) + \sum_{n=1}^\infty \left[ m(\mu_m)\, c_n(\mu,\mu_m,m(\mu_m)) - \mu_f\, s_n^X(\mu/\mu_f) \right] \alpha_s^n(\mu).$$
$^7$ It is also assumed that the series defines an RGE invariant, such that all logarithms $\ln(\mu/\mu_f)$ can be absorbed into the running coupling at scale $\mu_f$; $s_n^X \equiv s_n^X(1)$.
By construction the leading IR renormalon divergence of the series cancels in the square bracket. This in turn guarantees that the series that relates $m_X(\mu_f)$ to the $\overline{\rm MS}$ mass $m(\mu_m)$ is well-behaved (no leading IR renormalon divergence).
The new scale $\mu_f$ should be chosen such that $\Lambda_{\rm QCD} \ll \mu_f \ll m$. The first inequality is required for perturbativity. The second guarantees that the difference between the pole mass and $m_X(\mu_f)$ is only of order $\mu_f \alpha_s$ and can be made sufficiently small to avoid the problem with the $\overline{\rm MS}$ mass (where $\mu_f \sim m$) discussed above. A common feature of all renormalon-free quark mass definitions suitable for the description of nearly on-shell heavy-quark physics is therefore the existence of a new "subtraction scale" $\mu_f$ and a linear dependence on this scale. This reflects that the running of the quark mass changes from logarithmic to linear below the scale $m$, in accordance with the fact that the self-energy of a point charge is linearly divergent in the static or non-relativistic regime and turns logarithmic only when the anti-particle fluctuations become relevant in the relativistic theory.
The leading renormalon-free masses satisfy a simple renormalization group equation in the subtraction scale $\mu_f$, which may be used to relate $m_X(\mu_f)$ at different scales $\mu_{f2}$, $\mu_{f1}$, when logarithms of $\mu_{f2}/\mu_{f1}$ might have to be summed. The anomalous dimension $\gamma_X(\alpha_s)$ that governs this evolution is defined through the general form (35) of the subtraction term. In the following, I discuss several suitable mass definitions with no claim to completeness. With respect to (2), note that any definition of $\delta m_X(\mu_f)$ automatically yields an unambiguous, renormalon-free definition $\bar\Lambda_X(\mu_f)$ of the $\bar\Lambda$ parameter that appears in the heavy-quark expansion of the heavy meson mass and many other HQET expressions by the rearrangement
$$m + \bar\Lambda = \left[m - \delta m_X(\mu_f)\right] + \left[\bar\Lambda + \delta m_X(\mu_f)\right] \equiv m_X(\mu_f) + \bar\Lambda_X(\mu_f).$$
This can be turned around: any renormalon-free definition of the HQET parameter $\bar\Lambda$ can be turned into a renormalon-free, "short-distance", on-shell mass definition.

RS mass
The renormalon subtracted (RS) mass definition [28] is the first of two schemes that implement the condition (36) in a very direct way. Namely, for RS, one simply defines the $s^{\rm RS}_n$ to equal the asymptotic coefficients, that is
$$s^{\rm RS}_n(\mu/\mu_f) = N\,\frac{\mu}{\mu_f}\,\tilde c^{(\rm as)}_n.$$
While this expression could be used for any $\mu/\mu_f$, the implementation proposed in [28] first assumes $\mu = \mu_f$, in which case
$$\delta m_{\rm RS}(\mu_f) = \mu_f\, N \sum_{n=1}^{\infty} \tilde c^{(\rm as)}_n\, \alpha_s^n(\mu_f),$$
which by construction subtracts the leading renormalon divergence of the series (3), and then replaces $\alpha_s(\mu_f)$ by its series expansion in $\alpha_s(\mu)$, where $\mu$ is the scale at which the pole to $\overline{\rm MS}$ series is evaluated. For the bottom and charm mass, the effectiveness of this subtraction is analyzed in detail in [32], which also discusses variants of this definition.
A drawback of the RS mass definition is that it needs a precise determination of the normalization $N$, which depends on the method employed and further on the number of light flavours, see (27). To fully define the RS mass, one needs to provide the order to which $\tilde c^{(\rm as)}_n$ is included according to (21), and specify the value of $N$.

MSR mass
Another simple realization of the general subtraction condition (36) that avoids the drawback of the RS mass definition is to set the $s_n$ equal to $c_n$ and simply replace $m(\mu_r)$ in (20) by the subtraction scale $\mu_f$ [31,34]. More precisely, for $\mu = \mu_f$, which is assumed here, we define
$$s^{\rm MSR}_n \equiv c_n,$$
where the pole to $\overline{\rm MS}$ mass conversion coefficients $c_n$ are evaluated at $\mu = \mu_r = m(\mu_r) = \overline{m}$, in which case they are pure numbers. This gives the "practical version" [34] of the MSR mass definition
$$m_{\rm MSR}(\mu_f) = m - \mu_f \sum_{n=1}^{\infty} s^{\rm MSR}_n\, \alpha_s^n(\mu_f). \qquad (44)$$
The requirement (36) is satisfied since, according to (20), $s^{\rm MSR}_n \to N\,\tilde c^{(\rm as)}_n$ for $n\to\infty$. The MSR mass subtraction is straightforward to implement, once the pole to $\overline{\rm MS}$ mass conversion coefficients are given. The MSR mass interpolates between the $\overline{\rm MS}$ mass $\overline{m}$ for $\mu_f = \overline{m}$ and the pole mass for $\mu_f = 0$, although the latter limit cannot be taken as the coupling $\alpha_s(\mu_f)$ flows into the strong-coupling regime. The efficiency of the MSR mass subtraction is analyzed in detail in [34]. Both the RS and MSR mass satisfy a simple renormalization group equation in the subtraction scale, as discussed above. Since the MSR mass equals the $\overline{\rm MS}$ mass $\overline{m}$ for $\mu_f = \overline{m}$, the relation between $m_{\rm MSR}(\mu_f)$ and $\overline{m}$ can be obtained conveniently by solving the RGE, see [34]. Alternatively, as for the RS scheme, one can replace $\alpha_s(\mu_f)$ in (44) by its series expansion in $\alpha_s(\mu)$, where $\mu$ is the scale at which the pole to $\overline{\rm MS}$ series is evaluated, as long as $\ln(\mu/\mu_f)$ is small enough not to require resummation.
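The interpolation property of the practical MSR definition can be illustrated with a minimal numerical sketch. The following Python snippet is not the implementation of [34]. It assumes rounded literature values for the first three pole to $\overline{\rm MS}$ coefficients at $n_l = 5$ (0.4244, 0.8345, 2.375), a one-loop running coupling with fixed flavour number, and hypothetical helper names. It evaluates the two series at their own scales; a renormalon-cancelling evaluation would re-expand both in a common coupling $\alpha_s(\mu)$.

```python
import math

# ASSUMPTION: rounded pole-to-MSbar coefficients c_1..c_3 at mu = m(m), n_l = 5.
C_N = [0.4244, 0.8345, 2.375]


def alpha_s_one_loop(mu, alpha_ref=0.1180, mu_ref=91.1876, n_l=5):
    """One-loop running coupling with fixed flavour number (illustrative only)."""
    beta0 = 11.0 - 2.0 * n_l / 3.0
    return alpha_ref / (1.0 + alpha_ref * beta0 / (2.0 * math.pi) * math.log(mu / mu_ref))


def series(alpha, coeffs=C_N):
    """Truncated sum over n >= 1 of c_n * alpha^n."""
    return sum(c * alpha ** (n + 1) for n, c in enumerate(coeffs))


def pole_mass(m_msbar):
    """Truncated pole mass m = mbar * (1 + sum c_n alpha_s(mbar)^n)."""
    return m_msbar * (1.0 + series(alpha_s_one_loop(m_msbar)))


def delta_m_msr(mu_f):
    """Practical MSR subtraction: mu_f * sum c_n alpha_s(mu_f)^n."""
    return mu_f * series(alpha_s_one_loop(mu_f))


def msr_mass(m_msbar, mu_f):
    """m_MSR(mu_f) = m - delta_m_MSR(mu_f)."""
    return pole_mass(m_msbar) - delta_m_msr(mu_f)


if __name__ == "__main__":
    mbar_t = 163.643  # GeV, MSbar top mass used in the comparison
    for mu_f in (mbar_t, 50.0, 10.0, 3.0):
        print(f"mu_f = {mu_f:8.3f} GeV   m_MSR = {msr_mass(mbar_t, mu_f):.3f} GeV")
```

For $\mu_f = \overline{m}$ the two truncated series cancel identically and the sketch returns the $\overline{\rm MS}$ mass; lowering $\mu_f$ moves the result towards the truncated pole mass.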

PS mass
The potential-subtracted (PS) mass [36] is the first of two renormalon-free, short-distance, on-shell masses that are motivated and defined in terms of a physical quantity other than the pole mass. A non-relativistic system of a heavy quark and antiquark in a colour-singlet configuration experiences an attractive potential force, whose leading term is the Coulomb potential. In momentum space,
$$[\tilde V(q)]_{\rm Coulomb} = -\frac{4\pi C_F}{q^2}\, v_c(\alpha_s(\mu), q/\mu), \qquad (46)$$
where $q = |\mathbf{q}|$ and $v_c = \alpha_s + \dots$ incorporates the loop corrections to the tree-level potential.
The PS scheme is based on the observation that there is a cancellation of the leading divergent series behaviour in the combination $2m + [V(r)]_{\rm Coulomb}$. This can be seen explicitly at the one-loop order and in the large-$\beta_0$ approximation [36,37,38], and by a diagrammatic argument at two loops [36] and beyond. The cancellation expresses the fact that while the separation of the total energy of the quarkonium-like system into the quark pole masses and binding energy is ambiguous (as was the case for $m + \bar\Lambda$ for a heavy-light system), the total energy is physical and unambiguous. The PS mass at subtraction scale $\mu_f$ is defined by
$$m_{\rm PS}(\mu_f) \equiv m - \delta m_{\rm PS}(\mu_f), \qquad \delta m_{\rm PS}(\mu_f) = -\frac{1}{2} \int_{|\mathbf{q}|<\mu_f} \frac{d^3\mathbf{q}}{(2\pi)^3}\, [\tilde V(q)]_{\rm Coulomb}, \qquad (47)$$
which removes the leading IR contributions to the self-energy from $q = |\mathbf{q}| < \mu_f$. The series expansion of $v_c(\alpha_s(\mu), q/\mu)$ appearing in the Coulomb potential (46) is conventionally written in the form
$$v_c(\alpha_s(\mu), q/\mu)\big|_{\mu=q} = \alpha_s(q) \left[ 1 + \sum_{n=1}^{\infty} a_n \left(\frac{\alpha_s(q)}{4\pi}\right)^{n} + \left(\frac{\alpha_s(q)}{4\pi}\right)^{3} 8\pi^2 C_A^3 \ln\frac{\nu^2}{q^2} \right]. \qquad (48)$$
The coefficients $a_{1,2,3}$ are known, where $a_3$ refers to the three-loop colour-singlet Coulomb potential [50,51,52]. I note that the potential here is not defined in terms of a Wilson loop, but as a matching coefficient [53] to potential non-relativistic QCD (PNRQCD) [54,55], defined with minimal subtraction. The last term in (48) is the first of an infinite series of terms, which contains an explicit dependence on the factorization or PNRQCD matching scale $\nu$ (to be distinguished from $\mu$), which arises from an IR divergence related to the ultrasoft scale. Up to the third order to which the potential is currently known, performing the integration over $q$ yields the expanded formula (49) [36,56], in which the logarithms enter through $l_1 = \ln(\mu/\mu_f) + 1$ and corresponding higher-order terms $l_i$. To fully specify the PS mass definition, a value of $\nu$ must be chosen and the standard value is $\nu = \mu_f$, which sets the logarithm $\ln(\nu/\mu_f)$ to zero in the last line of (49). This still leaves a constant term $16\pi^2 C_A^3 = 4263.67\ldots$, which is large compared to $a_3 = 1461.32\ldots$ (quoted for $n_l = 5$). 
Since the former is related to an ultrasoft rather than a potential effect, another well-motivated choice is $\nu = \mu_f\, e^{-1}$, which nullifies the entire square bracket in the last line and hence, at this accuracy, the contribution of the extra term in (48). I will refer to this choice as the PS$^*$ definition. The difference between the PS and PS$^*$ mass is largely irrelevant when the expanded formula (49) is used, since for relevant values of $\mu_f$ the $n$th order term is dominated by the $l_i$ terms, which originate from the running coupling in lower-order terms. On the other hand, the difference directly affects the size of the last presently known term in the anomalous dimension and RGE below.
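The numbers quoted for the two choices of $\nu$ can be checked with a few lines of arithmetic. The sketch below assumes that the $\nu$-dependent third-order constant has the form $a_3 + 16\pi^2 C_A^3\,[\ln(\nu/\mu_f) + 1]$, as inferred from the statements above (the logarithm vanishes for $\nu = \mu_f$, the whole bracket for $\nu = \mu_f e^{-1}$); the function names are hypothetical.

```python
import math

C_A = 3.0          # SU(3) adjoint Casimir
A3_NL5 = 1461.32   # a_3 for n_l = 5, as quoted in the text


def ultrasoft_constant():
    """The constant 16 pi^2 C_A^3 multiplying the bracket [ln(nu/mu_f) + 1]."""
    return 16.0 * math.pi ** 2 * C_A ** 3


def third_order_constant(nu_over_mu_f):
    """ASSUMED form: a_3 + 16 pi^2 C_A^3 * (ln(nu/mu_f) + 1)."""
    return A3_NL5 + ultrasoft_constant() * (math.log(nu_over_mu_f) + 1.0)


if __name__ == "__main__":
    print("16 pi^2 C_A^3      =", round(ultrasoft_constant(), 2))
    print("PS  (nu = mu_f)    =", round(third_order_constant(1.0), 2))
    print("PS* (nu = mu_f/e)  =", round(third_order_constant(math.exp(-1.0)), 2))
```

With $\nu = \mu_f$ the constant is roughly four times larger than $a_3$ alone, which is the motivation for the PS$^*$ choice.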
It is instructive to interpret $v_c(q) = v_c(\alpha_s(\mu), q/\mu)\big|_{\mu=q} = \alpha_s(q) + \dots$ as an effective coupling, and write the mass subtraction term as
$$\delta m_{\rm PS}(\mu_f) = \frac{C_F}{\pi} \int_0^{\mu_f} dq\; v_c(q). \qquad (51)$$
The $\mu_f$ evolution of the PS mass is governed by the anomalous dimension
$$\frac{d\, m_{\rm PS}(\mu_f)}{d\mu_f} = -\frac{C_F}{\pi}\, v_c(\mu_f).$$
The solution is evidently (see (51))
$$m_{\rm PS}(\mu_{f2}) - m_{\rm PS}(\mu_{f1}) = -\frac{C_F}{\pi} \int_{\mu_{f1}}^{\mu_{f2}} dq\; v_c(q),$$
which allows us to compute $m_{\rm PS}(\mu_f)$ at widely different scales $\mu_{f1}$, $\mu_{f2}$ without large logarithms by integrating the expansion of $v_c(q)$ in terms of the running $\overline{\rm MS}$ coupling $\alpha_s(q)$. The leading-logarithmic solution can be expressed in terms of the exponential-integral function. The PS scheme is especially well-suited for quarkonium-like systems including open $Q\bar Q$ systems near threshold, since it subtracts the leading IR contributions explicitly already in low orders of perturbation theory. Prime examples are quarkonium masses at next-to-next-to-next-to-leading order (NNNLO) [56], the determination of the bottom-quark mass from high moments of the pair production cross section at second [57] and third order [58] in PNRQCD, and in particular precision calculations of top-quark pair production near threshold to NNLO and NNNLO [55,59] in the PS scheme. For $Q\bar Q$ systems near threshold, the scale $\mu_f$ should be chosen parametrically of order $mv \sim m\alpha_s$ such that $\delta m_{\rm PS} \sim m v^2$, in order not to violate the power counting of the non-relativistic expansion. With this choice, the relation (47) is already accurate to order $m\alpha_s^5$.
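The evolution between two subtraction scales can be sketched numerically. The snippet below assumes the integral form $m_{\rm PS}(\mu_{f2}) - m_{\rm PS}(\mu_{f1}) = -(C_F/\pi)\int_{\mu_{f1}}^{\mu_{f2}} dq\, v_c(q)$, which follows from the momentum-space definition with the Coulomb potential normalized as $-4\pi C_F v_c/q^2$. It truncates $v_c(q)$ to the one-loop running coupling (leading-logarithmic accuracy), uses Simpson's rule instead of the exponential-integral function, and takes a purely hypothetical reference value $\alpha_s(2\,{\rm GeV}) = 0.30$ with $n_l = 4$. All function names are illustrative.

```python
import math

C_F = 4.0 / 3.0  # SU(3) fundamental Casimir


def alpha_s_one_loop(q, alpha_ref, mu_ref, n_l):
    """One-loop running coupling; stands in for v_c(q) at leading-log accuracy."""
    beta0 = 11.0 - 2.0 * n_l / 3.0
    return alpha_ref / (1.0 + alpha_ref * beta0 / (2.0 * math.pi) * math.log(q / mu_ref))


def ps_mass_shift(mu_f1, mu_f2, v_c, steps=2000):
    """Return m_PS(mu_f2) - m_PS(mu_f1) = -(C_F/pi) * int_{mu_f1}^{mu_f2} dq v_c(q).

    The integral is evaluated with the composite Simpson rule.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (mu_f2 - mu_f1) / steps
    total = v_c(mu_f1) + v_c(mu_f2)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * v_c(mu_f1 + i * h)
    return -C_F / math.pi * total * h / 3.0


if __name__ == "__main__":
    # HYPOTHETICAL input: alpha_s(2 GeV) = 0.30 with n_l = 4 light flavours.
    def v_c(q):
        return alpha_s_one_loop(q, 0.30, 2.0, 4)

    shift = ps_mass_shift(2.0, 4.0, v_c)
    print(f"m_PS(4 GeV) - m_PS(2 GeV) = {shift:.4f} GeV")
```

For a constant coupling the result reduces to $-(C_F\alpha_s/\pi)(\mu_{f2} - \mu_{f1})$, the linear dependence on the subtraction scale that is common to all schemes discussed here.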

Kinetic mass
The kinetic mass scheme is another physical scheme, in this case related to the physics of semi-leptonic decays of heavy-light mesons [60,61]. The pseudoscalar $B$ meson mass has the heavy-quark expansion (cf. (2))
$$m_B = m + \bar\Lambda + \frac{\mu_\pi^2 - \mu_G^2}{2m} + \dots,$$
where $\mu_\pi^2$ and $\mu_G^2$ are the $B$-meson matrix elements of the kinetic energy and chromomagnetic operators, respectively. The kinetic mass can be understood as a perturbative evaluation of this formula, in which the matrix elements include loop momentum integration regions below the scale $\mu_f$:
$$m_{\rm kin}(\mu_f) = m - [\bar\Lambda(\mu_f)]_{\rm pert} - \frac{[\mu_\pi^2(\mu_f)]_{\rm pert}}{2\, m_{\rm kin}(\mu_f)} + \dots \qquad (55)$$
The matrix elements on the right-hand side subtract the long-distance sensitive contributions to the pole mass order by order in $\mu_f/m_Q$ and $\alpha_s$. Comparing to (40), we note that the kinetic mass definition not only subtracts the leading IR renormalon divergence through $[\bar\Lambda(\mu_f)]_{\rm pert}$, but also the IR sensitivity at subleading power $\Lambda_{\rm QCD}^2/m$. Up to now the discussion has been general. The kinetic scheme is defined by providing a concrete prescription for calculating the perturbative matrix elements in terms of the perturbative evaluation of short-distance observables. It relies on the fact that the $\bar\Lambda$ parameter and kinetic-energy matrix element appear in the heavy-quark expansion of the dilepton-differential spectrum of inclusive semi-leptonic $B \to X_c \ell\nu$ decays [62,63], which can be constructed from the imaginary part of the two-point function of the $b \to c$ transition current. The convention for the kinetic mass used in the literature employs an indirect definition of the matrix elements through heavy flavour sum rules [60,61] in the small-velocity limit, which is rather complicated when compared with the other three mass definitions above. The central quantity is the forward amplitude and its discontinuity, the spectral function $W$. Since the perturbative matrix elements (55) to be computed are spin-independent and universal, instead of the physical V-A current, one can define the kinetic scheme by adopting the scalar current $J = \bar Q Q'$ provided $Q$ and $Q'$ are heavy. 
To simplify further, one can set $Q' = Q$ and compute the forward scattering of the heavy quark off the current $J$ with momentum $q = (q_0, \mathbf{q})$ in the limit when $v = |\mathbf{q}|/m_Q \ll 1$. To isolate the IR region of the final state momenta, the total energy of the quark and gluon final state $X$ excluding the heavy quark $Q$ is restricted to values below the subtraction scale $\mu_f$. Employing the variables $(\omega, v)$ instead of $(q_0, \mathbf{q})$, the mass subtractions are defined in terms of the double limit of the first moments of the spectral function $W(\omega, v)$. The demonstration that the resulting expressions can be identified with the subtracted $\bar\Lambda$ and kinetic-energy parameters, which appear in the heavy-quark expansion of the meson mass, is given in [61]. It is sufficient to compute $W(\omega, v)$ in an expansion in $\omega, |\mathbf{q}| \ll m_Q$. The two-loop computation [64] has been known for some time, but the three-loop result has been obtained only recently [65,66]. Unlike for the other three mass schemes, the $O(\alpha_s^4)$ term of $\delta m_{\rm kin}(\mu_f)$ is presently not available. The kinetic scheme is especially well-suited for observables derived from semi-leptonic decays of heavy quarks, since it subtracts the leading IR contributions explicitly already in low orders of perturbation theory. Initially, it was suggested to eliminate the leading renormalon divergence from the semi-leptonic width by replacing the pole mass by the $\overline{\rm MS}$ mass [67], but this does not improve the behaviour in low orders. The comparison to the series convergence when renormalon-free on-shell masses are used (Table 2 of [68]) clearly shows the latter's advantage. The kinetic scheme was used for bottom quark mass and $|V_{cb}|$ determinations at second order [69]. Recently, the semi-leptonic decay width was calculated to NNNLO and the effectiveness of the kinetic mass scheme was demonstrated for the inclusive rate for the first time at this order [70].

Comparison
Once any of the leading renormalon-free, on-shell, short-distance masses has been determined from some observable, one is eventually interested in converting it to the $\overline{\rm MS}$ reference mass $\overline{m}$. Over the past twenty years the accuracy of the mass definitions and observables has improved by one order (typically from two-loop to three-loop, and three-loop to four-loop for the pole to $\overline{\rm MS}$ mass series). It is therefore timely to update and extend the comparison [68] of different definitions (see also [25]).
The purpose of the following comparison is to display the good behaviour of the relation between the various subtracted masses and the $\overline{\rm MS}$ mass, in contrast to the pole mass series (28)-(30). As above, the $\overline{\rm MS}$ masses are fixed to $\overline{m}_t = 163.643$ GeV, $\overline{m}_b = 4.20$ GeV and $\overline{m}_c = 1.28$ GeV. The strong coupling is taken to be $\alpha_s(m_Z) = 0.1180 \pm 0.0010$. The same subtraction scale $\mu_f$ is used for all schemes within a table ($\mu_f = 1$ GeV for the charm quark, see Table 3) for the sake of comparison, even if different "canonical" values are often adopted for the different schemes. Internal mass effects are neglected, since they are not always known with comparable precision. In case of mass schemes originally defined in terms of $\alpha_s(\mu_f)$, the series have been converted into expansions in $\alpha_s(m_Q)$ with the required four-loop accuracy. I use private code for the RS, MSR and PS mass. The RS and PS schemes (for $\nu = \mu_f$) are also implemented in CRunDec3.1 [27] and the PS mass also in QQbar_threshold [71]. In case of the RS scheme I adapted the normalization constants $N_m = 0.563$ ($n_l = 3$), $N_m = 0.547$ ($n_l = 4$), $N_m = 0.527$ ($n_l = 5$) given in [32] (and hard-coded in CRunDec3.1) to the values from (27). Finally, the kinetic mass $m_{\rm kin}$ is implemented for massless internal quarks with CRunDec3.1's mMS2mKIN[$m_Q$, 0, 0, "[as]"*$\alpha_s^{(n_l)}(m_Q)$, $m_Q$, $\mu_f$, $n_l$, $n_l$, 3, ""] call, presently available only to three-loop accuracy.

Table 3. Comparison of charm quark mass definitions for $\mu_f = 1$ GeV for given $\overline{\rm MS}$ mass $\overline{m}_c$. $\alpha_s(m_Z) = 0.1180 \pm 0.0010$. "n-loop" refers to the value of the n-loop contribution to the mass. All numbers in GeV.
The results are summarized in Tables 1 to 3. It is evident that in terms of the size of mass corrections up to the shown four-loop order, all mass schemes are rather similar with the exception of the kinetic scheme. In all cases, one observes a spectacular improvement of convergence relative to the pole mass series given in the first line. One expects the cancellation of the leading renormalon to become more and more effective in higher orders measured relative to the order of the minimal term, which can be seen explicitly by comparing the top, bottom and charm tables. The effect is particularly dramatic for the charm mass, for which the pole mass series starts diverging beyond the two-loop order, while the renormalon-subtracted masses are still well-behaved at the fourth order, with corrections in the few MeV range. The top pole mass series is still in the regime of decreasing coefficients at the four-loop order, albeit slowly, hence the relative improvement of the subtraction should increase in the next orders. The four-loop coefficient is typically $(40-50)$ MeV, but the size of the next unknown term can be assumed to be small enough relative to the experimental precision that can be attained in the future, even from the scan of the pair production cross section in $e^+e^-$ collisions [72].
"Sum" in the last column of the tables refers to the sum of the terms up to the four-loop order shown, and the error attached quantifies the variation of the sum under a variation of $\alpha_s(m_Z)$ by $\pm 0.001$. It is apparent that once leading renormalon-free, on-shell masses are employed, the limitation of the accuracy of their relations to the $\overline{\rm MS}$ mass is (currently) no longer determined by the convergence of the expansion, but by the precision of $\alpha_s(m_Z)$. For the case of the top quark, this uncertainty is mainly caused by the large one-loop correction of a few GeV, to be compared to the ultimate precision to which the subtracted masses can be obtained theoretically and (in principle) experimentally. Unlike for the bottom and charm masses, making use of this precision requires better knowledge of the strong coupling.