The MSR Mass and the ${\cal O}(\Lambda_{\rm QCD})$ Renormalon Sum Rule

We provide a detailed description and analysis of a low-scale short-distance mass scheme, called the MSR mass, that is useful for high-precision top quark mass determinations, but can be applied for any heavy quark $Q$. In contrast to earlier low-scale short-distance mass schemes, the MSR scheme has a direct connection to the well known $\overline{\rm MS}$ mass commonly used for high-energy applications, and is determined by heavy quark on-shell self-energy Feynman diagrams. Indeed, the MSR mass scheme can be viewed as the simplest extension of the $\overline{\rm MS}$ mass concept to renormalization scales $\ll m_Q$. The MSR mass depends on a scale $R$ that can be chosen freely, and its renormalization group evolution has a linear dependence on $R$, which is known as R-evolution. Using R-evolution for the MSR mass we provide details of the derivation of an analytic expression for the normalization of the ${\cal O}(\Lambda_{\rm QCD})$ renormalon asymptotic behavior of the pole mass in perturbation theory. This is referred to as the ${\cal O}(\Lambda_{\rm QCD})$ renormalon sum rule, and can be applied to any perturbative series. The relations of the MSR mass scheme to other low-scale short-distance masses are analyzed as well.


Introduction
Achieving higher precision in theoretical predictions in the framework of quantum chromo dynamics (QCD) is one of the main goals in high-energy physics and an essential ingredient in the indirect search for physics beyond the Standard Model. In this endeavor accurate determinations of the masses of the heavy charm, bottom and top quarks play an important role since they enter the description of many observables that are employed in consistency tests of the Standard Model and in the exploration of models of new physics. Because quark masses are formally-defined renormalized quantities and not physical observables, the quantities from which the heavy quark masses are extracted need to be computed in perturbative QCD to high order. Among the most precise recent high-order analyses to determine the heavy quark masses are QCD sum rules and the analysis of quarkonium energies for the charm and bottom quark masses [1][2][3][4][5][6][7][8][9][10] and the top pair production threshold cross section at a future lepton collider for the top quark mass [11][12][13]. Over time all of these analyses have been continuously updated and improved by computations of new QCD corrections, and more are being designed and studied currently to also allow for more precise determinations of the top quark mass from available LHC data [14][15][16][17][18][19][20][21].
In all the analyses of Refs. [1][2][3][4][5][6][7][8][9][10][11][12][13] the use of short-distance mass schemes was essential to achieve a well-converging perturbative expansion and a precision in the mass determination well below the hadronization scale Λ QCD ∼ 200 − 300 MeV. The heavy quark pole mass m pole Q , which is the perturbation theory equivalent of the rest mass of an onshell quark, on the other hand, leads to a substantially worse perturbative behavior due to its linear infrared-sensitivity, also known as the O(Λ QCD ) renormalon problem [22,23], and was therefore not adopted as a relevant mass scheme for analyses where a precision better than Λ QCD could be achieved. Nevertheless, the pole mass still served as an important intermediate mass scheme during computations because it determines the partonic (but unphysical) poles of heavy quark Green functions. Typical short-distance quark mass schemes which have been employed were the renormalization-scale dependent MS mass m Q (µ) and so-called low-scale short-distance masses such as the kinetic mass [24], the potential-subtracted (PS) mass [25], the 1S mass [26][27][28], the renormalon-subtracted (RS) mass [29] or the jet mass [30,31]. The basic difference between the MS mass to the low-scale short-distance mass schemes is that the perturbative coefficients of its relation to the pole mass scale linearly with the heavy quark mass, m Q (µ) − m pole Q ∼ m Q (α s + . . .), while for the low-scale short-distance mass schemes the corresponding series scales linearly with a scale R m Q . This feature enables the low-scale short-distance quark mass schemes to be used for predictions of quantities where the heavy quark dynamics is non-relativistic in nature and fluctuations at the scale of m Q are integrated out. This is because radiative corrections to the mass in such quantities involve physical scales much smaller than m Q . One very prominent example in the context of top quark physics is the non-relativistic heavy quarkonium dynamics inherent to the top-antitop pair production cross section at threshold at a future lepton collider [11][12][13], where the most important dynamical scale is the inverse Bohr radius m t α s ∼ 25 GeV m t . On the other hand, the MS mass is a good scheme choice for quantities that involve energies much larger than m Q , such as for high-energy total cross sections, or when the massive quark causes virtual and off-shell effects. This is because in such cases the heavy quark mass yields corrections that either scale with positive or negative powers of m Q such that QCD corrections associated with the mass have a scaling that is linear in m Q as well. The difference between the MS mass and the low-scale short-distance masses is most important for the case of the top quark because in this case the difference between m t and the dynamical low-energy scales can be very large numerically.
For the top quark mass there are excellent prospects for very precise measurements in low-scale short-distance schemes such as the PS mass or the 1S mass from the top-antitop threshold inclusive cross section at a future lepton collider [11][12][13]. Current studies indicate that a precision well below 50 MeV can be achieved accounting for theoretical as well as experimental uncertainties [32][33][34]. Currently, the most precise measurements of the top quark mass come from reconstruction analyses at the LHC [35,36] and the Tevatron [37] and have uncertainties at the level of 500 MeV or larger. Moreover, the mass is obtained from multivariate fits involving multipurpose Monte Carlo (MC) event generators and thus represents a determination of the top quark mass parameter m MC t contained in the particular MC event generator. Recently, a first high-precision analysis on how the MC top quark mass parameter can be related to a field theoretically well-defined short-distance top quark mass was provided in Refs. [38,39] and general considerations on the relation were discussed in Ref. [40,41]. For the analysis, hadron level predictions for the 2-jettiness distribution [42] for electron-positron collisions and O(α s ) QCD corrections together with the resummation of large logarithms at next-to-next-to leading order [31,43,44] were employed. Since the 2-jettiness distribution is closely related to the invariant mass distribution of a single reconstructed top quark, the relevant dynamical scales inherent to the problem are governed by the width of the mass distribution which amounts to only about 5 GeV in the peak region of the distribution where the sensitivity to the top mass is the highest. Interestingly, as was shown in Ref. [38], the dynamical scales increase continuously considering the 2-jettiness distribution further away from the peak. In the analysis of [38] the MSR mass scheme m MSR Q (R) was employed which depends on a scale R and for which the dependence on R is described by a renormalization group flow such that R can be continuously adapted according to which part of the distribution is predicted. Other applications of the MSR mass using a flavor number dependent evolution in R to account for the mass effects of lighter quarks were given in Ref. [45,46]. In contrast to the µ-dependent MS mass m Q (µ), which evolves only logarithmically in µ, the MSR mass has logarithmic as well as linear dependence on R.
The MSR mass scheme was succinctly introduced in Ref. [47] and discussed conceptually in Ref. [41], but a detailed discussion has so far not been provided. A key purpose of this paper is to provide sufficient details such that phenomenological MSR mass analyses, such as the results of Ref. [38], can be easily related to other common short-distance mass schemes that are being used in the literature. and is therefore the only low-scale short-distance mass suggested in the literature that is derived directly from on-shell heavy quark self-energy diagrams just like the MS mass. 1 The MSR mass thus automatically inherits the clean and good infrared properties of the MS mass. Furthermore, by construction, the MSR mass matches to the MS mass for R = m Q (m Q ) and is known to the same order as the series of m Q (m Q ) − m pole Q without any further effort, which is currently O(α 4 s ) from the results of Refs. [48][49][50][51][52][53][54][55]. As already argued in Refs. [40,47], the MSR mass can therefore be considered as the natural modification of the "running" MS mass scheme concept for renormalization scales below m Q , where the logarithmic evolution of the regular MS mass is known to be unphysical.
Since the MSR mass is designed to be employed for scales R < m Q , it can be useful -for applications where a clean treatment of virtual massive-flavor effects is importantto integrate out the virtual effects of the massive quark Q from the MSR mass definition. We therefore introduce two types of MSR masses, one where the virtual effects of the massive quark Q are integrated out, called the natural MSR mass, and one where these effects are not integrated out, called the practical MSR mass. The difference between these two versions of the MSR mass is quite small and very well behaved for all R values in the perturbative region, and the practical definition should be perfectly fine for most phenomenological applications. But the natural definition has conceptual advantages as its evolution for scales R < m Q does not include the virtual effects of the massive quark Q, which is conceptually cleaner since these belong physically to the scale m Q .
We note that the R-evolution concept of a running heavy quark mass scheme for scales R < m Q elaborated in Ref. [47] has already been suggested a long time ago in Refs. [56,57]. The R-evolution equation we discuss for the MSR mass was already quoted explicitly for the renormalization group evolution of the kinetic mass [24] at O(α s ) in these references, but the conceptual implications of R-evolution and its connection to the O(Λ QCD ) renormalon problem in the perturbative relations between short-distance masses and the pole mass were first studied systematically in Ref. [47]. The second main purpose of this paper is to give further details on R-evolution and also to discuss its relation to the Borel transformation focusing mainly on the case of the MSR mass. We note that the concept of R-evolution is quite general and can in principle be applied to any short-distance mass which depends on a variable infrared cutoff scale (such as the PS and the RS masses) or to cutoff-dependent QCD matrix elements with arbitrary dimensions. In fact, R-evolution has already been examined and applied in a number of other applications which include the factorizationscale dependence in the context of the operator product expansion [58], the scale dependence of the non-perturbative soft radiation matrix element in high-precision determinations of the strong coupling from e + e − event-shape distributions [59][60][61][62], even accounting for the finite mass effects of light quarks [63,64] and hadrons [61,65].
The basic feature of the R-evolution concept is that for the difference of MSR masses at two scales, m MSR Q (R) − m MSR Q (R ), its linear dependence on the renormalization scale provides, completely within perturbation theory, a resummation of the terms in the asymptotic series associated to the pole-mass renormalon ambiguity to all orders. The R-evolution then resums the factorially growing terms in a systematic way that is O(Λ QCD )-renormalon free and, at the same time also sums all large logarithms that arise if R and R are widely separated. This cannot be achieved by more common purely logarithmic renormalization group equations, but is fully compatible with a Wilsonian renormalization group setup. We note that the summations carried out by the R-evolution was achieved prior to Ref. [47] for the RS mass in [66] (see also Ref. [67]). Their method (and the RS mass) is based on using an approximate expression for the Borel transform function. The summation for a difference of RS masses (for scales R and R ) is obtained by computing the inverse Borel integral over the difference of the two respective Borel functions. This method and R-evolution lead to consistent results, but the R-evolution does not rely on the knowledge of the Borel functions.
The essential and probably most interesting conceptual feature of the perturbative series of the R-evolution equations is that it provides a systematic reordering of the terms in the asymptotic series associated to the O(Λ QCD ) renormalon ambiguity in leading, subleading, subsubleading, etc. contributions. So using the analytic solution of the R-evolution equations allows one to derive analytically (i.e. without any numerical procedure or modeling) the Borel-transform of a given perturbative series from the perspective that it carries an O(Λ QCD ) renormalon ambiguity. As a result one can rigorously derive an analytic expression for the normalization of the non-analytic terms in the Borel transform that are characteristic for the O(Λ QCD ) renormalon. The analytic result for this normalization factor was already given and discussed in Ref. [47], but no details on the derivation were provided. We take the opportunity to show the details of the derivation here. We call the analytic result for the normalization of the O(Λ QCD ) renormalon ambiguity the O(Λ QCD ) sum rule, because it can be quickly applied to any given perturbative series. To demonstrate the use and the high sensitivity of the O(Λ QCD ) renormalon sum rule we apply it also to a number of other cases, pointing out subtleties in its application to avoid inconsistencies and misinterpretations of the results.
We note that also other methods to determine the normalization factor have been used. In Ref. [29] it was determined from a computation of the residue of the Borel transform of the series following a proposal in Ref. [68]. This approach, which we call Borel method can also be carried out analytically and provides the correct result, but has been observed to converge very slowly. We can identify the reason for this analytically from the solutions for the R-evolution equations, and we also discuss the connection of this method to our O(Λ QCD ) sum rule based on explicit analytic expressions. In Ref. [69] the normalization factor was computed taking the ratio of the n-th term of the series to the asymptotic behavior. This ratio method converges very fast and provides results very similar to the O(Λ QCD ) sum rule. Recently, the ratio method was applied in Ref. [70], accounting for the O(α 4 s ) corrections to the pole-MS mass relation [54,55]. We show that our O(Λ QCD ) sum rule provides results that are in full agreement with the ones obtained in Ref. [70] and also leads to very similar uncertainties.
The paper is organized as follows: In Sec. 2 we provide the definition of the natural and practical MSR masses, m MSRn Q , and we also analyze the difference between these two MSR masses. This section provides the conventions we use for the coefficients of perturbative series, but it can otherwise be skipped by the reader not interested in the MSR masses. In Sec. 3 we present the R-evolution equations which describe the scale dependence of the MSR masses and we also show explicitly how the solutions of the R-evolution equations sum large logarithms together with the high-order asymptotic series terms related to the O(Λ QCD ) renormalon. We in particular show for the top quark mass under which conditions the use of the R-evolution equations and its resummation is essential and superior to renormalonfree fixed-order perturbation theory, which does not sum any large logarithms. To our knowledge, such an analysis has not been provided in the literature before. We also point out that the solution of the R-evolution equations is intrinsically related to carrying out an inverse Borel transform over differences of functions in the Borel plane such that the singularities related to the O(Λ QCD ) renormalon cancel. In Sec. 4 we present the analytic derivation of the O(Λ QCD ) renormalon sum rule and demonstrate its utility by a detailed analysis concerning the normalization of the O(Λ QCD ) renormalon ambiguity in the series for the difference of the pole mass and the MSR masses. The derivation of the sum rule allows to derive a new alternative expression for the high-order asymptotic behavior of a series that contains an O(Λ QCD ) renormalon which we discuss as well. To demonstrate the high sensitivity of the sum rule and to explain its consistent (and inconsistent) application we discuss its strong flavor number dependence and apply it to the massive quark vacuum polarization function, the series for the PS mass-pole mass difference, the QCD β-function, and the hadronic R-ratio. This section can be bypassed by the reader not interested in applications of the O(Λ QCD ) sum rule, but we note that Sec. 4.5.3 discusses implications for the PS mass that are relevant for Sec. 5 and may be important for high-precision top quark mass determinations. Some subtle issues in the relation of the MSR masses to the PS, 1S and MS masses are discussed in Sec. 5. Finally, we conclude in Sec. 6. The paper also contains two appendices. In App. A we specify our convention for the QCD β-function coefficients and present a number of expressions and formulae for coefficients, quantities and matching relations that arise in the discussion of R-evolution, the O(Λ QCD ) renormalon and on various mass definitions throughout this paper. In App. B we provide details on the relation of the Borel method and our sum rule method to determine the normalization of the O(Λ QCD ) renormalon ambiguity of the pole mass. Finally, in App. C we quote the coefficients that define the PS and the 1S masses for the convenience of the reader and also show how the MSR masses can be obtained from a given value of the 1S mass in the non-relativistic and Υ-expansion counting scheme [26,27].

Basic Idea of the MSR Mass
The MS mass m Q (µ) serves as the standard short-distance mass scheme for many highenergy applications with physical scales of the order or larger than the mass of the quark Q. It relies on the subtraction of the 1/ divergences in the common MS scheme in the on-shell self-energy corrections calculated in dimensional regularization. Despite the fact that it is an unphysical (i.e. theoretically designed) mass definition, it is infrared-safe and gauge invariant to all orders [48,71] and its series relation to the pole mass m pole Q thus serves as the cleanest way to precisely quantify the renormalon ambiguity of the pole mass. The relation of m Q ≡ m ) to the pole mass in the approximation that the masses of all quarks lighter than Q are zero reads  [54,55], and the quoted numerical uncertainties have been taken from Ref. [55]. Using the method of Ref. [72] the uncertainties of the n -dependent terms may be further reduced. Using renormalon calculus [22,23,73] one can show that the high-order asymptotic behavior series of Eq. (2.1) has an ambiguity of order Λ (n ) QCD , which depends on the number of massless quarks (indicated by the superscript) but is independent of the actual value of m Q .
A coherent treatment of the mass effects of lighter quarks is beyond the scope of this paper, and we therefore use the approximation that all flavors lighter than Q are massless. These mass corrections come from the insertion of massive virtual quark loops in the self-energy Feynman diagrams and start at O(α 2 s ). At this order and at O(α 3 s ) the mass corrections from the virtual massive quark loops have been calculated analytically for all mass values in Ref. [49] and [74], respectively. The dominant linear mass corrections at O(α 3 s ) were determined in Ref. [75]. At O(α 4 s ) and the mass corrections are not yet known, but the corrections in the limit of large virtual quark masses are encoded in the ultraheavy flavor threshold matching relations of the RG-evolution m Q (µ) at scales above m Q [76].
The idea of the MSR mass is based on the fact that the O(Λ QCD ) ambiguity of the perturbative series on the RHS of Eq. (2.1) does not depend on the value m Q , as already mentioned above. This is an exact mathematical statement within the context of the calculus for asymptotic series and means that we can replace the term m Q by the arbitrary scale R on the RHS of Eq. (2.1) and use the resulting perturbative series as the definition of the R-dependent MSR mass scheme. It was pointed out in Ref. [41] that, for a given value of R, one can also interpret the MSR mass field theoretically as having a mass renormalization constant that contains the on-shell self-energy corrections of the pole mass only for scales larger than R. In other words, the pole mass and the MSR mass at the scale R differ by self-energy corrections from scales below R: while the pole mass absorbs all self-energy corrections for quantum fluctuations up to scales m Q , the MSR mass at the scale R absorbs only self-energy corrections between R and m Q . Since the pole mass renormalon problem is related to the self-energy corrections from the scale Λ QCD < R, this explains why the MSR mass is a short-distance mass. In this illustrative context the MS mass absorbs no self-energy corrections up to the scale m Q . Since the scale R is variable, the MSR mass can serve as a short-distance mass definition for applications governed by different physical scales and thus can also interpolate between them. Since the MSR mass is expected to have applications primarily for R < m Q , it is further suitable to change the scheme from n + 1 dynamical flavors, which includes the UV effects of the quark Q, to a scheme with n dynamical flavors. This can be achieved in two ways, either by simply rewriting α (n +1) s in terms of α (n ) s , or by integrating out the virtual loop corrections of the quark Q. This results in two different ways to define the MSR mass, where we call the former the practical MSR mass and the latter the natural MSR mass, either one having advantages depending on the application.
We note that the notion of a scale-dependent short-distance mass which was first suggested in Refs. [56,57] has also been adopted for the kinetic [24], the PS [25], RS [29] and jet masses [30,43]. However, none of these short-distance masses is defined directly from the on-shell self-energy diagrams of the massive quark Q such as the MSR mass. This has a number of advantages, for example when discussing heavy flavor symmetry properties in the pole-MS mass relation of different heavy quarks.

Natural MSR Mass
The natural MSR mass definition is obtained by integrating out the corrections from the heavy quark Q virtual loops in the self-energy diagrams of the massive quark Q, such that its relation to the pole mass reads where the coefficients are given in Eq. (2.2). The natural MSR mass only accounts for gluonic and massless quark corrections, and has a non-trivial matching relation to the MS mass. The matching between the natural MSR mass and the MS mass can be derived from the relation [ m Q ≡ m and will be discussed in more detail in Sec. 5.3. We note that, formally, the natural MSR mass (as well as the practical MSR mass discussed in the next subsection) agrees with the pole mass in the limit R → 0. However, taking this limit is ambiguous as it involves evolving through the Landau pole of the strong coupling and dealing with its non-perturbative definition for |R | < Λ QCD . This issue is a manifestation of the renormalon problem of the pole mass.

Practical MSR Mass
The practical MSR mass definition is directly related to the MS-pole perturbative series of Eq. (2.1). To obtain its defining series one rewrites α    numerical difference between these two masses is quite small. The natural MSR mass is larger than the practical MSR mass and the difference increases with R reaching about 30 MeV at R = 170 GeV. The error bands reflect variations of the renormalization scale µ in α s between R/2 and 2R, showing very good convergence, exhibiting a perturbative error of ± 5 MeV for R ∼ 1 GeV and below ± 1 MeV for R 3 GeV due to missing terms of O(α 5 s ) and higher. This indicates that the different way how the natural and practical MSR masses treat the virtual massive quark effects does not reintroduce any infrared sensitivity, as is expected since the mass of the virtual quark provides an infrared cutoff. The numerical uncertainties in the O(α 4 s ) correction are below the level of 0.1 MeV and negligible. Note that the difference between the natural and the practical MSR masses at the common scale R starts at O(α 2 s ) and that the uncertainty band from scale variation is an underestimate at this lowest order. However, the series results and error bands at O(α 3,4 s ) show good behavior and convergence. In Ref. [38] the practical MSR mass was employed, but the numerical difference to the natural MSR mass is subdominant to the uncertainties obtained in the analysis there.
In the rest of the paper we will simply use the notation of the MSR mass with the definition m pole Q − m MSR Q (R) = R n a n α s (R)/(4π) n when the difference between the natural and practical definitions and the value of n are insignificant but we will specify explicitly our use of the practical or the natural MSR masses (or any other mass scheme) and the massless flavor number n for any numerical analysis.

R-Evolution
The dependence of the MSR mass m MSR Q on the scale R is described by the R-evolution equation [47], which is derived from the logarithmic derivative of the defining equations (2.3) and (2.5) and using that the pole mass is R independent: The overall minus sign on the RHS of Eq. (3.1) indicates that the MSR mass always decreases with R. Note that this equation applies to all MSR schemes and we have therefore suppressed the superscript on the a n 's. The crucial feature of the R-evolution equation is that it is free from the O(Λ QCD ) ambiguity contained in the series that relates the MSR mass to the pole mass because the ambiguity is R-independent. This is directly related to the fact that for determining the R-evolution equation also the overall linear factor of R on the RHS of Eqs. (2.3) and (2.5) has to be accounted for. Therefore the R-evolution equation does not only have a logarithmic dependence on R, as common to usual renormalization group equations (RGEs), but also a linear one. Both of these issues are actually tied together conceptually. The numerical expressions for the coefficients γ n for the natural and practical MSR masses are given explicitly in Eqs. (A.11) and (A.12). We implement renormalization scale variation in the R-evolution equation by simply expanding α s (R) in Eq. (3.1) as a series in α s (λR) and by varying λ, typically in the range 0.5 < λ < 2. In principle one may also consider varying the boundaries of integration, as it is common for usual RGEs, but only the former way of implementing scale variations in the R-evolution leads to variations of the scale solely in logarithms, which is the standard used for the usual logarithmic RGEs. By solving the R-evolution equation one sums, at the same time and systematically, the asymptotic renormalon series as well as the large logarithmic terms in m MSR It is straightforward to solve the R-evolution equation numerically and it shows very good perturbative stability even for low values of R very close to the Landau pole [58] in the perturbative strong coupling. Details of how to solve the R-evolution equations analytically have already been given in [47] and shall not be repeated here.
The series by itself is divergent and not summable, but is easily seen to be convergent. In the context of FOPT, when the sum over n is truncated, the unavoidable appearance of large logarithms log(R 0 /R 1 ) for let's say R 0 R 1 may degrade the convergence and cause sizable perturbative uncertainties. Due to the additional linear dependence on R 0 and R 1 , as shown in Eq. (3.5), these logarithms cannot be summed by common logarithmic renormalization group (RG) equations. The same type of logarithms also appear for example in the relation of any other low-scale shortdistance mass to the MS mass and their effects can be significant particularly for the top quark. By solving the R-evolution equation one sums, at the same time and systematically, the asymptotic terms in the renormalon series as well as the large logarithmic terms in m MSR to all orders in a manner free from the O(Λ QCD ) renormalon. It is again instructive to see how this is achieved in the β 0 /LL approximation of Eq. (3.4), which explicitly shows the factorial growth of the perturbative series. When calculating the derivative to get the R-evolution equation, the whole series collapses exactly (i.e. without any truncation!) to which is the one-loop version of Eq. (3.1). Moreover, the exact solution of the R-evolution equation at this order can be easily seen to be exactly equal to the RHS of Eq. (3.5) which sums the renormalon series and the large logarithms at the same time into a convergent series. Conceptually, the solution of the R-evolution equation is directly related to the Borel space integral over the Borel transform for the series for m MSR . Since this has not been shown in [47] we briefly outline this calculation here at the β 0 /LL level. Starting from Eq. (3.7) one can shuffle the integration over R into an integral over α s (R) by using the QCD β-function and the relation Λ LL QCD = R exp (− 2π/β 0 α s (R)). Using the variable t = − 2π/(β 0 α s (R)) one can then rewrite the integral as [ where the two integrals in the last line are just the difference of the MSR masses at R 0,1 to the pole mass, and the pole mass ambiguity is encoded in the singularity at t = 0, which arises because t 0,1 < 0,  [GeV] Difference of the natural top quark MSR mass (n = 5) at two different scales R including contributions from one to four loops. Results are shown for the difference between a high scale R 1 = 161 GeV and two lower scales R 2 = 2 GeV (top two panels) and R 2 = 50 GeV (lower two panels). The high and low scales are connected by a fixed-order perturbation theory conversion [ left two panels, as a function of the scale µ in α s (µ) ] or via R-evolution [ right two panels, as a function of the λ renormalization parameter ]. Here is the well-known Borel transform with respect to α s (µ) of the β 0 /LL series in Eq. (3.4). In Eq. (3.10) the singular and non-analytic contributions contained in the individual Borel functions cancel and the integral becomes ambiguity-free.
To illustrate the impact of using R-evolution compared to using FOPT we show in Fig. 2 for n = 5 in fixed-order perturbation theory (FOPT) and with R-evolution. The curves in Fig. 2(a) show ∆m MSRn t for (R 0 , R 1 ) = (2, 161) GeV in FOPT for the common renormalization scale µ between R 0 and R 1 at 1 loop (cyan), 2 loop (green), 3 loop (blue) and 4 loops (red). We see a good convergence for µ around √ R 0 R 1 , but a deterioration of the series when µ gets closer to either R 0 or R 1 . For µ 1/2 √ R 0 R 1 the series even gets out of bounds and breaks down completely. If one uses scale variation as an estimate of the remaining perturbative error, one therefore obtains a significant dependence on the choice of the lower bound of the variation, and one has no other choice than to abandon in an ad hoc manner scales closer to R 0 to estimate the scale variation error. The curves in Fig. 2 for (R 0 , R 1 ) = (2, 161) GeV from numerically solving the R-evolution equation as a function of the renormalization scale parameter λ between 0.5 and 2. The color coding for the order of the R-evolution equation used for the evaluation is the same as for Fig. 2(a). As explained below Eq. (3.1), the parameter λ is the renormalization scaling parameter in the R-evolution equation which determines by how much the scale in α s differs from the scale R. Thus a variation between 0.5 and 2 means that in the solution of the R-evolution equations scales between R/2 and 2R are covered at each value of R along the evolution, which in this case includes scales between 1 and 322 GeV. Comparing the curves in Fig. 2(a) and 2(b) we see that the renormalization scale variation in the R-evolved results is much smaller than the one of FOPT. For the FOPT result with scale variation between √ R 0 R 1 /2 -which we pick by hand -and R 1 we obtain ∆m t = (9.838 ± 2.504, 8.981 ± 0.361, 9.465 ± 0.222, 9.427 ± 0.047) GeV at (1, 2, 3, 4) loops. Using R-evolution with λ variation between 0.5 and 2 we obtain ∆m t = (8.817 ± 1.059, 9.440 ± 0.246, 9.512 ± 0.040, 9.486 ± 0.025) GeV which is fully compatible with the FOPT result, but shows more stability and smaller errors. It is also quite instructive to see that using R-evolution the 3-loop result is significantly closer to the 4-loop result than the corresponding 3-loop FOPT result. The results show that for R 0 R 1 employing R-evolution to calculate MSR mass differences is clearly superior to FO perturbation theory.
To compare to a situation where the scales R 0 and R 1 are of similar size we have also shown in Figs. 2(c) and 2(d) the results for ∆m t in FOPT and from R-evolution for (R 0 , R 1 ) = (50, 161) GeV. Here the results from both approaches are completely equivalent showing that the logarithm log(R 0 /R 1 ) is not large and the summation of the renormalon contributions from higher orders only constitutes very small effects. Furthermore using renormalization scales close to R 0 or R 1 in FOPT is not problematic. Numerically, using FOPT with scale variations between R 0 and R 1 we obtain ∆m t = (5.618 ± 0.498, 5.928 ± 0.086, 5.961 ± 0.010, 5.954 ± 0.004) GeV at (1, 2, 3, 4) loops, while using R-evolution with λ variations between 0.5 and 2 we obtain ∆m t = (5.555 ± 0.577, 5.919 ± 0.114, 5.959 ± 0.015, 5.954 ± 0.005) GeV. We find that FOPT and R-evolution give equivalent results even for (R 0 , R 1 ) = (20, 161) GeV, and that the use of R-evolution is essential for R 0 /R 1 < 0.1.
Overall we see that, if R 0 and R 1 are of similar size, FO perturbation theory and R-evolution lead to equivalent results, but that it is in general safer to use R-evolution. So the situation is very similar to the one we encounter when considering the relation of the strong coupling for two different renormalization scales.
We note that the possibility to sum the renormalon-type logarithms displayed in Eq. (3.5) by considering the Borel integral over the difference of Borel transforms as shown in Eq. (3.10) was pointed out already in Ref. [66] prior to Ref. [47]. However, this exact equivalence [ via a transformation of variables as given below Eq. (3.9) ] of R-evolution and the method using the integration over Borel transform differences can only be analytically shown at the β 0 /LL approximation. Beyond that, both approaches sum up the same type of logarithms but differ in subleading terms. Numerically, both approaches converge to the same result and have comparable order-by-order convergence. From a practical point of view, however, the concept of R-evolution may be considered more general. This is because R-evolution can be applied directly to any series having the form of (2.3) or (2.5) while using the Borel integration method requires that the corresponding Borel transforms are known or constructed beforehand. For general series, such as for the difference of MSR masses as discussed above, this is not possible without making additional approximations. In practice, the approach of Ref. [66] to sum the renormalon-type logarithms has therefore only been applied for series (referred to as RS-schemes) which were explicitly derived from a given expression for the Borel transform.

Analytic Borel Transform and Renormalon Sum Rule
Using the solution of the R-evolution equation it is possible to derive, analytically and rigorously, an expression for the Borel transform of the MSR-pole mass relation. This Borel transform is designed to focus on the singular contributions that quantify the O(Λ QCD ) renormalon of the pole mass. This result was already quoted in the letter [47] where, however, no details on the derivation could be given due to lack of space. In the following we provide these details on how to obtain the analytic result for the normalization of the singular terms. The analytic results for the normalization can be applied to other perturbative series as a probe of O(Λ QCD ) renormalon ambiguities, and we therefore call it the O(Λ QCD ) renormalon sum rule. This sum rule was first given in Ref. [47], and is very sensitive to even subtle effects if O(α 4 s ) corrections are known. We apply the sum rule to obtain an updated determination of the size of the pole mass O(Λ QCD ) ambiguity, accounting for the O(α 4 s ) results of Refs. [54,55] which became available recently but were unknown when Ref. [47] appeared. To demonstrate the sum rule's capabilities to probe O(Λ QCD ) renormalon ambiguities in perturbative series and to clarify subtleties in how to use it properly, we also apply it to a few other cases. Interestingly, the analytic manipulations arising in the derivation of the sum rule lead to an alternative expression for the high-order asymptotic behavior of a series that contains an O(Λ QCD ) renormalon. This expression differs from the well known asymptotic formula which is known since a long time from [77], and we therefore discuss it as well.

Derivation
The analytic derivation for the Borel transform of the MSR-pole mass relation starts from its expression related to the solution of the R-evolution equation given in Eq. (3.1) which was already derived in Ref. [47].
where in the second line we changed variable to t = − 2π/(β 0 α s (R)) and used the identity (A.6) to scale out Λ QCD , and in the third line we employed the coefficients given in Eq. (A.15). The expression in Eq. (4.1) gives an all-order representation of the original series that is more useful for analyzing O(Λ QCD ) renormalon issues than Eqs. (2.3) and (2.5). This is because using the R-evolution equation of Eq. (3.1) (which is linear in R) and its solution, provides, through the sum in k, a reordering of the original series in leading and subleading series of terms from the perspective of their numerical importance in the asymptotic high order behavior related to the O(Λ QCD ) renormalon. This allows to derive rigorously a representation of the Borel transform [ given in Eq. (4.7) ] reflecting efficiently the hierarchy of leading and subleading terms with respect to the O(Λ QCD ) renormalon, which is the information that is not contained in the original series. That such a separation is possible in a systematic way may not be obvious, but it is achieved by the R-evolution equation. We stress that the result of Eq. (4.7) should not be considered as the exact expression for the Borel transform because it does not encode information on possible poles (or non-analytic cuts) other than at u = 1/2. We note that these poles and the associated renormalons can be studied by considering solutions of R-evolution equations involving powers of R different from the linear dependence shown in Eq. (3.1), see [78]. We note that the expression in the last line of Eq. (4.1), which involves the incomplete gamma function Γ(c, t) = ∞ t dx x c−1 e −x , also arises in the analytic solution of the mass difference (3.3), Here the cut in the gamma functions Γ(c, t) for t < 0 cancels in the difference for each k in the sum, and the result on the RHS is real. We mention that the first term (k = 0) in the sum over k provides the summation of the leading terms in the β 0 /LL approximation shown in Eqs. To proceed we asymptotically expand the incomplete gamma function in inverse powers of t (i.e. powers of α s ) where the coefficients g are given in Eq. (A.13), and coincide with the s k coefficients defined in Ref. [77]. We stress that the equality in Eq. (4.3) is the asymptotic expansion and is not an identity, so that the imaginary part due to the cut in the incomplete gamma function does not arise on the RHS. Inserting Eq.
We then perform the Borel transform with respect to powers of α s (R) according to the rule Using identities for the hypergeometric function we can rewrite and the Borel transform can then be cast into the form [47] B αs(R) m MSR where and N 1/2 and P 1/2 are two conventions for the normalization. Here Since the S k coefficients are renormalon-free and further damped by the factorial in the denominator, this sum is finite. Furthermore, the sum on the second line of Eq. (4.7) is also finite for u = 1/2. Therefore one concludes that the sum of Q coefficients is regular at u = 1/2, implying that the first line of Eq. (4.7) fully contains the leading-renormalon singular behavior. In Ref. [47] the expression for the Borel transform in Eq. (4.7) was given using P 1/2 , but here we have shown an alternate convention with N 1/2 which agrees with the terms N m and N discussed in Refs. [8,70], and hence eases comparison of our numerical results with theirs. For the phenomenological relevant values n = (3, 4, 5) we have N 1/2 /P 1/2 = (1.27, 1.18, 1.09). The analytic difference between these normalizations is that P 1/2 vanishes in the limit n → −∞ while N 1/2 is finite in this limit. We will predominantly use N 1/2 for the numerical examinations in the following subsections.
The manipulations that lead to the expressions for P 1/2 and N 1/2 involve the rearrangement of the infinite sums over and k in Eq. (4.5). These can be seen to be identities if one assumes that the QCD β-function and its inverse have some region of convergence. In practice, because only the first few terms in perturbation theory are known and one truncates the sums over and k, no formal convergence issue arises. We note that the analytic manipulations involving the R-evolution equation and the derivation of Eq. (4.7) are also valid in schemes for the strong coupling other than MS, and to apply them to such schemes one simply needs to account for the perturbative rearrangement for the coefficients a n and the QCD β-function due to the scheme change. As an example, all manipulations and the results simplify considerably in a strong coupling schemeᾱ where the coefficientŝ b n vanish for n > 1 and which also implies g = 0 for > 0 and that the coefficients of the QCD β-function have the exact form β n = β 0 (β 1 /β 0 ) n . Since such a scheme change can be achieved via a relation of the form α s (µ) =ᾱ(µ) + [ β 2 /β 0 − (β 1 /β 0 ) 2 ]ᾱ 3 (µ) + . . . , which does not contain any O(ᾱ 2 s ) term, the overall normalization of N 1/2 (or P 1/2 ) remains unchanged [73]. In this scheme we have S k>0 =γ R k −b 1γ R k−1 , and Eq. (4.8) can be rewritten in the equivalent form and was derived recently in Ref. [79]. There is, however, no advantage in using this form, because the coefficientsγ R k in theᾱ scheme still have to account for the reordering of the series due to the scheme change from α s toᾱ. Other schemes, such as the 't Hooft scheme, where all coefficients of the QCD β-function beyond β 0 and β 1 vanish, have been studied in Ref. [78].
We discuss the structure of the non-analytic terms multiplied by N 1/2 in Eq. (4.7) in Sec. 4.4 below. The second term in Eq. (4.7) is purely polynomial and represents contributions in the Borel transform B(u) that account for the portions in the original series of Eqs. (2.3) and (2.5) that go beyond the pure O(Λ QCD ) renormalon corrections that numer-ically dominate the series. These terms may include renormalon contributions of a different kind [ such as O(Λ QCD ) k>1 ], which are however not probed by an R-evolution equation that is linear in R [58]. Moreover, they account for the difference of the pure O(Λ QCD ) renormalon asymptotic form of the series (encoded in the value of N 1/2 ) and the actual coefficients of the original series given in Eqs. (2.3) and (2.5). The latter are recovered in the asymptotic limit were the sums over k and are carried out up to infinity. Note that in practice, for a finite order determination of the Borel transform for a given value of N 1/2 or P 1/2 , one truncates the sum over k and in Eq. (4.9), and in this case the terms coming from the Q represent finite polynomials. For the construction of a Borel transform that reproduces the known coefficients exactly, it may then be more suitable to simply fit the coefficients of the remaining polynomial terms such that the known coefficients in the original series are reproduced exactly.

Renormalon Sum Rule
The analytic expression for N 1/2 is quite useful as it can be applied to any perturbative series as a probe for O(Λ QCD ) renormalons, given the information on the available coefficients of a perturbative series. We therefore call the formula for N 1/2 (or equivalently P 1/2 ) in Eq. (4.8) the O(Λ QCD ) renormalon sum rule [47]. Formally to any given order in k, N 1/2 is a linear functional acting on perturbative series in powers of α s since the coefficients S k in Eq. (4.10) As a word of caution, we emphasize that applying the N 1/2 sum rule to a truncated series does (like any other type of renormalon calculus in the context of perturbative QCD) not rigorously and mathematically prove or disprove the existence of an O(Λ QCD ) renormalon, since the existence of renormalons is by definition related to the asymptotic high-order behavior and mathematically strict proofs, if they exist, are related to elaborate all-order studies of Feynman diagrams. So using the sum rule should be better thought of as an analytic projection of the known terms of a perturbative series onto the known pattern of a pure O(Λ QCD ) renormalon series, which is generated from the singular terms in the Borel transform in Eq. (4.7) that are multiplied by N 1/2 or P 1/2 and known to all orders. This projection becomes more accurate the more terms of a series are known and mathematically converges (only) if the yet unknown high order terms keep following the renormalon pattern expected from the low order terms. 3 Although the series in k for N 1/2 in Eq.  in the original series of Eqs. (2.3) and (2.5) and subsequently expanding again in α s (R). This leads to and one can show that in the asymptotic limit, i.e. to all orders in k, the sum rule expression for N 1/2 or P 1/2 is invariant under variations of λ. Thus for a finite order determination of N 1/2 the λ-dependence decreases with order, and the remaining variation with λ can be taken as an estimate for the uncertainty due to the missing higher order terms in the same way as renormalization scale variation in RG-invariant power series in α s is commonly used to estimate perturbative uncertainties. The invariance under changes of λ is directly related to the facts that the O(Λ QCD ) renormalon ambiguity of the series in Eqs. (2.3) and (2.5) is R-independent and that carrying out the Borel transform of Eq. (4.5) in the previous section with respect to α s (µ) instead of α s (R) leads to the simple rescaling factor µ/R of all the non-analytic terms proportional to N 1/2 .

Sum Rule for the Pole Mass Renormalon
We now apply the sum rule to the series of the MSR-pole mass relations to quantify the O(Λ QCD ) renormalon of the pole mass. Note, that to fully determine the order k result, the  [54,55] and the O(α 6 s ) 5-loop correction to the QCD β-function [80] are required. To simplify terminology we call the result that truncates the series for N 1/2 after the k-th term the "(k + 1)-loop" or "O(α k+1 s ) result", referring to the order to which the series is being probed with the sum rule.
In Fig. 3a the numerical results for N 1/2 (n = 5) are shown for the natural (solid lines) and practical (dashed lines) MSR masses for 0.5 < λ < 2 using terms in the series for N 1/2 up to k = 0 (cyan), k = 1 (blue), k = 2 (green) and k = 3 (red). The thickness of the O(α 4 s ) curves correspond to the numerical error of the coefficients quoted in [55] and shown in Eqs. (2.6) and (2.2) and indicates that this error is more than an order of magnitude smaller than the uncertainty due to missing higher order terms and therefore negligible. We therefore do not account for this uncertainty any further and adopt the central values given in Eqs. (2.6) and (2.2). Using the λ dependence in the range 0.5 < λ < 2 as an error estimate due to the missing higher orders we obtain for Both results are fully compatible, as is expected since the difference of the natural and practical MSR masses is free from an O(Λ QCD ) renormalon as already discussed in Sec. 2.3. We see that the λ-dependence of N 1/2 nicely decreases when including more higher-order terms and that there is excellent convergence. The convergence and the reducing λ-dependence both indicate that the numerical size of the recently calculated 4-loop correction in the MS-pole mass relation [54,55] is fully compatible with the expectations based on the knowledge of the corrections up to 3 loops and the proposition that the MS-pole mass is dominated by an O(Λ QCD ) renormalon behavior already at the known low orders.
It is quite instructive that one can invert this line of arguments and use the sum rule as a tool to determine a prediction for higher order terms in the perturbative series under the assumption that the O(Λ QCD ) renormalon-type behavior observed at lower orders persists also at higher orders. Indeed, using for example the O(α 3 s ) result for the practical MSR mass N prac 1/2 (n = 5) = 0.494 ± 0.032 and the coefficients a  [55]. The prediction for the O(α 4 s ) coefficient based on the sum rules has a larger error but is fully compatible with the results from the explicit loop calculations. This is remarkable given that the sum rule result is obtained with essentially no additional computational effort. We note that estimates for the coefficient a MS an uncertainty for the estimate using the known corrections up to O(α 3 s ) and obtained the results a MS 4 (n = 5, 1) = 241920 ± 23552 and a MS 4 (n = 5, 1) = 229632 + 7936 − 44800 , respectively, which are fully compatible with the sum rule estimate we showed above at the same order.
The results for N 1/2 (n = 5) represent the O(Λ QCD ) renormalon ambiguity for the top quark pole mass assuming that the other quark flavors including the charm and bottom quarks are massless. The other cases of phenomenological interest are n = 3 and n = 4 and the corresponding results for the natural and practical MSR masses are given in Tab Note that the uncertainties are slightly larger than the ones quoted in Tab. 1. Following Ref. [70] we have also included an additional uncertainty coming from varying the defining coefficients a MS n = a MS n (n , 0) of the natural MSR mass based on the idea that using the association of R with the MS mass at the scale of the MS mass is in principle not mandatory. Since one may as well consider different renormalization scales for the MS mass and the O(Λ QCD ) renormalon ambiguity is not affected by this choice, we have determined modified coefficients a n from Eq. (2.3) by setting R = m The results of Eqs. (4.12) -(4.14) are compatible with those given in Refs. [8,70]. For example for n = 5 [70] obtained 0.4616 +0.027 −0.070 ± 0.002, where the first uncertainty is from a double scale variation similar to ours and the second uncertainty is from the numerical determination of the four loop coefficient. In Refs. [8,70] the determination of the normalization N 1/2 was based on the ratio method, which arises from a comparison of the perturbative coefficients a n from explicit QCD loop calculations to the coefficients a asy n of the series generated by a pure O(Λ QCD ) renormalon in Eq. (4.16) based on the relation that lim n→∞ a n /a asy n = 1. In Ref. [8] the static QCD potential and the MS-pole mass relation were studied, and in Ref. [70] the MS-pole mass was examined. (In Ref. [8] the static potential based numbers are roughly 1.4σ higher than those in Eqs. (4.12)-(4.14), which may be related to the points discussed below in Sec. 5.1 for the PS mass.) The agreement of our sum rule results and those obtained from the ratio method in Ref. [70] underlines the capabilities of R-evolution and the renormalon sum rule concept.
In Tab. 1 we have also shown the results for a number of other n values as these results are also of theoretical interest. Our results are in full agreement with and have compatible uncertainties to the results given in Tab. 1 of Ref. [70] and in particular confirm that N 1/2 → 1 for n → −∞, which is the classic large-n limit where the perturbative series are fully dominated by the massless quark bubble chain and the non-Abelian QCD effects are diluted away. Our result for n = 0 is also in agreement with Ref. [8] and the lattice determinations of Refs. [69,85], which found N 1/2 (n = 0) = 0.600 ± 0.029, N 1/2 (n = 0) = 0.660 ± 0.056 and N 1/2 (n = 0) = 0.620 ± 0.035, respectively. We note that our analytic expression for N 1/2 gets unstable and non-conclusive for 10 n 30 which is the so-called conformal region where the coefficient β 0 of the QCD β-function becomes small and in particularb 1 = β 1 /(2β 2 0 ) becomes large. In this region the analytic formula for N 1/2 has singularities and does not approach any stable value. This is connected to the fact that in this region no definite statement on the asymptotic large order behavior of the perturbative series and in particular on the O(Λ QCD ) renormalon can be made because the infrared and ultraviolet structure of the QCD β-function strongly depend on a complicated numerical interplay of the coefficients β i>0 , which can become quite large and have different signs. The unstable behavior of our analytical formula for 10 n 30 differs from the results obtained in Refs. [8,70], where the normalization N 1/2 was observed being tiny. However, as emphasized in Ref. [70], this feature was an artifact of the ratio method used in Refs. [8,70], and again indicates that in this n region the canonical renormalon calculus cannot be applied.
In Ref. [29] the Borel method to compute N 1/2 was suggested based on the idea that the Borel function (1 − 2u) 1+b 1 B αs (u) eliminates all non-analytic contributions in the first term on the RHS of Eq. (4.7) and thus isolates the term N 1/2 in the limit u → 1/2 [68]. This approach entails that after the low-order terms in the expansion of the Borel transform B αs (u) around u = 0 are determined from the original series, one expands (1−2u) 1+b 1 B αs (u) in powers of u and subsequently evaluates the resulting series for u = 1/2. The results of Refs. [29,68] were based on the assumption that the analytic contributions [ involving the functions Q (u) ] on the RHS of Eq. (4.7) quickly tend to zero when multiplied by (1−2u) 1+b 1 and are unimportant. This is not the case, as the Taylor expansion (1 − 2u) 1+b 1 around u = 0 converges very slowly to zero if one sets u = 1/2. This can be traced to the fact thatb 1 is non-integer and in general the convergence radius of the binomial series is 1.
Here u = 1/2 corresponds exactly to the border of this radius. These terms are therefore numerically sizable at any truncation order. As we show in App. B, neglecting them leads to a much larger dependence on the renormalization parameter λ at a given truncation order. This is because the λ dependence of these terms is multiplied by a factor converging to zero, but the convergence is rather slow. When many orders are included, as shown in Ref. [69] which accounted for terms up to O(α 20 s ), the dependence vanishes and the method converges to N 1/2 , which we have confirmed through a reanalysis. This observation is consistent with the large scale uncertainties found in the detailed numerical analysis of Ref. [8]. The Borel method to determine N 1/2 is therefore not very precise if only the first few terms of the series are known. Interestingly, accounting for the analytic terms on the RHS of Eq. (4.7), which are contained in the polynomials Q and are computed systematically from R-evolution as shown in Sec. 4.1, one can derive an improved version of the Borel approach which agrees exactly with our sum rule formula of Eq. (4.8). The corresponding analytic calculation and a brief numerical analysis are given in App. B.

Asymptotic Higher Order Behavior
In this section we use the analytic manipulations that arise in the derivation of the sum rule to derive an alternative expression for the high-order asymptotic form of a series containing an O(Λ QCD ) renormalon that differs from the well known formula derived in [77]. The latter formula is related to the sum of the non-analytic terms, which are multiplied by N 1/2 or P 1/2 in the Borel function of Eq. (4.7), and reads giving the asymptotic form of the coefficients a asy is the Pochhammer symbol. Given the value for P 1/2 or N 1/2 the structure of the perturbative coefficients of Eq.
β 0 αs(R) (4.17) with Λ QCD given in Eq. (A.6). As a side remark, we note that inserting the series in Eq. (4.15), with a given value for N 1/2 , into the sum rule expression of Eq. (4.8) one recovers N 1/2 in the limit of carrying out the sums over k, n and to infinity. Interestingly, Eq. (4.4) provides a remarkable alternative expression for the high-order asymptotic of the MSR-pole mass series as it can be rewritten in the form In contrast to Eq. (4.15) this expression still depends on the S k coefficients non-trivially and thus carries all the information contained in the original series due to the identity a n = (2β 0 ) n This relation is interesting because it provides a separation of the coefficients of the original series into leading and subleading terms with respect to the asymptotic high-order behavior.
So truncating the sums over k and in Eq. (4.19) (e.g. accounting for the coefficients S k and g up to the order they are known) provides the correct high-order asymptotic behavior for n beyond the truncation order and, at the same time, reproduces exactly the coefficients of the original series up to the truncation order.
Currently the coefficients a n for the MSR-pole and the MS-pole mass relations are known to order O(α 4 s ) and the QCD β-function is known to order O(α 6 s ) so that the coefficients S k and g are known up to k max = max = 3. We may therefore write down estimates for the still uncalculated coefficients a n>4 using the expression a asy which is the established formula from [77] shown in Eq. (4.15), and for N 1/2 . The uncertainties for the coefficients a asy n are based on the uncertainties shown in Eqs. (4.12) -(4.14) and those for the coefficients a asy n are determined from λ variations 1/2 < λ < 2, as explained in Sec. 4.2 and η variations 1/2 < η < 2, as explained below Eq. (4.14). The coefficient estimates for the MS mass have been obtained by using the second equality of (5.8) and Eq. (A.7) to the order shown. We see that both estimates are completely equivalent and have the same uncertainties. Our estimates for the MS mass coefficients for n = 5 also agree perfectly with those given in Ref. [70] which used the approach of Eq. (4.20). We note that the relation (4.19) can also be inverted to provide closed iterative expressions for the S k coefficients to all orders, which are given in App. A and in particular in Eq. (A.18).
We note that the asymptotic series coefficients a asy n in Eq. (4.16) and the expression for the coefficients a n in Eq. (4.19) allow for an alternative derivation of the renormalon sum rule formula since the ratio a n /a asy n approaches unity for n → ∞. Taking that ratio one .
To the extent that the sums over k in the sum rule formula of Eq. (4.8) and in Eq.

Other Applications of the Sum Rule
To conclude our considerations concerning the O(Λ QCD ) renormalon sum rule we discuss in this section a number of subtleties in its proper use and a few interesting applications. As it is sufficient for the purpose of the examinations, we use for simplicity only λ variations, as explained in Sec. 4.2, when quoting uncertainties of the sum rule evaluated here.

Number of Massless Flavors
An important feature of the O(Λ QCD ) renormalon sum rule is that it probes the infrared sensitivity of the perturbative series, which physically depends on the number of massless quarks, n , one employs in the computation of the series. In a computation in QCD, however, n might not be equal to the number of active flavors, n f , which governs the ultraviolet behavior and the renormalization group evolution of the strong coupling α s , as it is done in the definition of the practical MSR mass, or by integrating out the effects of the n f − n massive quarks, as it is done in the definition of the natural MSR mass. The latter approach is the physically cleaner way (which was the reason for using the name 'natural'), but both approaches are consistent as far as the application of the sum rule is concerned.
In the following we discuss the pitfalls of using the sum in an inconsistent way. To discuss the issue we recall that, since the O(Λ QCD ) renormalon sum rule is a functional on the perturbative series, it can also be seen as a function N 1/2 [ n , {a n }] acting on the coefficients a n of the [ α s /(4π) ] n terms in the series. As indicated, N 1/2 is a function of the number of massless flavors n through its dependence on β 0 and the coefficientsb k , which appear in Eq. (4.8) and a function of the coefficients a n contained in the expressions for the S k as shown in Eq. (A.15). The function N 1/2 [ n , {a n }] is therefore probing the series defined by the set of coefficients {a n } with respect to an O(Λ QCD ) renormalon for n massless flavors, and it is essential for the sum rule to work properly that the value of n agrees with the number of massless flavors used for the computation of the coefficients a n . Let us now apply the sum rule to the coefficients {a MS,n n } of the series for m pole Q − m Q (m Q ) (n +1) in Eq. (2.1), which is a series in α (n +1) s , but contains the effects of n massless flavors. Here we use the shorthand notation a MS,n n ≡ a MS n (n , n h = 1) .

(4.23)
To be specific we take n = 5. Probing the series with respect to an O(Λ QCD ) renormalon for n + 1 = 6 massless flavors, in accordance with the scheme for α s , one obtains n }] = (0.531 ± 0.318, 0.526 ± 0.1298, 0.623 ± 0.070, 0.6360 ± 0.016) at order n = (0, 1, 2, 3), where the errors are obtained from varying λ in the range 0.5 < λ < 2 and the central values are the mean value of the respective maximal and minimal values obtained in the λ variation. We see that the sum rule appears to approach a value that is much larger than the correct result of Eq. (4.14), but this is a consequence of an inconsistent application of the sum rule. Indeed, one can show by simple algebra in the β 0 /LL approximation [ whereb i≥1 = β i≥1 = 0, a asy,n n+1 = a 1 (2β 0,n ) n n! and β 0,n = 11 − 2/3 n ] that the order n expression for N 1/2 that is obtained -when probing with respect to an O(Λ QCD ) renormalon for n f massless flavors -has the form β 0,n β 0,n f n .

(4.24)
As long as β 0,n is a positive number this expression diverges for n f > n in the limit n → ∞, which explains the behavior of the sum rule results shown above. On the other hand, the expression of Eq. . We also learn that adopting for the strong coupling α (n f ) s a flavor number scheme where n f agrees with the number of massless flavors is clean conceptually, but not crucial numerically such that the sum rule works reliably. This is related to the fact that the matching relation of the strong coupling in different flavor number schemes does not suffer from an O(Λ QCD ) renormalon behavior.
This brief examination above underlines the importance that the O(Λ QCD ) sum rule, which probes the infrared sensitivity of the perturbative series, is applied consistently with respect to the number of massless quarks, which may not agree with the number of active flavors in the normalization group equation that is governed by ultraviolet effects. Of course this feature may as well be used as a tool, as studying the convergence of the sum rule may be employed to determine the number of massless flavors used, let's say, in a numerical computation of a perturbative series.

Moments of the Vacuum Polarization Function
The zero-momentum moments M i , i = 1, 2, 3, . . ., of the massive quark Q vector current , (4.25) provide one of the most precise methods to determine the charm and bottom quark MS masses [1][2][3][4][5][6][7][8][9][10] and are known to utterly fail in precision when expressed in terms of the charm and bottom pole masses. This mass sensitivity comes from the fact that the perturbative series for the moments M i is due to dimensional reasons proportional to m −2i Q in the form where n is the number of massless flavors and we use the n -flavor scheme for the strong coupling. 5 The moments M i are related to weighted integrals over the hadronic R-ratio of QQ production and thus free from the O(Λ QCD ) renormalon. They can be rewritten in the form   This analysis underlines the importance of using renormalon-free parameters for series coefficients that are being probed with the O(Λ QCD ) renormalon sum rule, but also illustrates the high sensitivity of the sum rule to even subtle high order effects.

Infrared Sensitivity of the PS Mass Definition
The PS (potential subtracted) mass [25] is based on the concept that the total static potential energy of a color singlet massive quark-antiquark pair with separation r, 2m pole whereṼ ( q 2 ) is the momentum-space static potential calculated in perturbation theory. To the extent that the total static potential is a well-defined and unambiguous quantity, the PS mass is free from an O(Λ QCD ) renormalon. The coefficients of the series for m pole Q −m PS Q (µ f ), expressed as a series in powers of α (n ) s (µ f )/(4π), are given in Eq. (C.1). We now apply the O(Λ QCD ) renormalon sum rule to the relation of the pole mass to the potential PS mass. The examination is of interest because the static potential has infrared divergences starting at O(α 4 s ) arising from higher Fock QQ-gluon states which lead to retardation effects that invalidate the frame-independent static limit [97,98]. The definition of the PS mass at O(α 4 s ) and beyond is therefore known to depend on the scheme used for the subtraction prescription for these infrared divergences. In Refs. [99] the authors defined the following convention: the infrared divergence in the O(α 4 s ) corrections to the momentumspace static potential [100,101] is regularized dimensionally (with the MS convention for the definition of µ), and the 1/ divergence together with the corresponding logarithm log(µ/µ f ) that arises from the integral over the momentum-space static potential in Eq. (4.31) are subtracted. We call this the standard convention, and it leads to the coefficient a PS 4 shown in Eq. (C.2), where the term with the logarithm log(µ/µ f ) is dropped. In a minimal subtraction convention, only the 1/ divergence is subtracted and the logarithmic term displayed in a PS 4 remains. So the convention of Ref. [99] is equivalent to the choice µ/µ f = 1 for the dimensional scale in the minimal subtraction convention.
Using the O(Λ QCD ) renormalon sum rule we can now track quantitatively if and how much the convention for the infrared subtraction may affect the higher-order behavior in the PS-pole mass relation. Applying the sum rule to the PS mass in the standard convention of Ref. [99] we obtain for n = 5, relevant for the top quark, The order n = 3 result that involves the O(α 4 s ) coefficient a PS 4 is 22% higher and within errors only marginally compatible with the result N 1/2 (n = 5) = 0.446 ± 0.026 of Eq. (4.14). This indicates that a PS 4 in the standard convention is somewhat larger than expected assuming that the pole-PS mass series is dominated by the pole mass renormalon. The same observation has also been made in Refs. [54,102] in the context of relating the PS mass to the MS mass.
It is interesting to consider other minimal subtraction scheme choices that differ from the standard scheme by reasonable variations of the subtraction scale µ. For example, for the choice µ/µ f = 1/5 we obtain N µ/µ f =1/5 1/2 = 0.455 ± 0.021 at order n = 3 for n = 5, which is fully compatible with Eq. (4.14). That the sum rule result for the PS mass agrees with the correct result of Eq. (4.14) much better for a smaller infrared subtraction scale is quite suggestive because the infrared divergence in the static potential is known to be physically regulated by the massive quark kinetic energy, which is of order q 2 /m Q ∼ µ f v where v is the relative velocity, and hence is parametrically smaller than | q | ∼ µ f . We stress that our analysis does neither validate nor invalidate the concept of the standard PS mass as a suitable mass scheme to carry out ongoing high-precision threshold studies [11,13], as the sum rule only probes the calculated orders and the effect of the retardation singularity on the perturbative coefficients in the static potential beyond O(α 4 s ) on the PS mass scheme is unknown. However, the analysis demonstrates that the scheme dependence in the PS mass coming from the infrared divergences in the static potential at O(α 4 s ) is not a numerically irrelevant issue and may become even more serious beyond O(α 4 s ). As far as the known O(α 4 s ) results are concerned the issue already seems to affect the relation of the standard PS mass to the MSR and MS masses as discussed in Sec. 5.1.

QCD β-Function and Massless Quark R-ratio
As the concluding part of the discussion in this section we now apply the O(Λ QCD ) renormalon sum rule to series that are known not to be plagued by any O(Λ QCD ) renormalon. As examples we take the series for the QCD β-function with a β n = β n−1 , (4.33) as defined in Eq. (A.1) and the hadronic R-ratio for n massless quarks where √ s stands for the center-of-mass energy, with [103][104][105][106][107] a R 1 = 4 , at order n = (0, 1, 2, 3). The errors are obtained from the variation 0.5 < λ < 2. In both cases all results for N 1/2 beyond O(α s ) are compatible with zero as expected. We note that at least for the hadronic R-ratio it is known that its perturbative series given in Eq. (4.34) has a renormalon ambiguity that is suppressed and scales with the fourth power of the hadronic scale Λ QCD . This leads to an ambiguity in the R-ratio of O(Λ 4 QCD /s 2 ), which is associated to the gluon condensate, and adding the effects of the gluon condensate in the context of an operator product expansion in terms of low-energy QCD matrix elements [108,109] this ambiguity is compensated in a physical prediction. For the QCD β-function no conclusive statements on a higher-order renormalon ambiguity exist. The results in Eqs. (4.36) and (4.37) show that the O(Λ QCD ) renormalon sum rule is only probing for an O(Λ QCD ) renormalon and not sensitive to any higher order renormalon ambiguity.
It is straightforward to generalize the sum rule discussed here to higher order renormalons, which has already been studied in Ref. [78].

Relation to Other Short-Distance Masses
From the perturbative series that relate other short-distance masses to the pole mass it is straightforward to determine the perturbative series for the difference of these shortdistance masses to the MSR masses by eliminating the pole mass systematically such that the O(Λ QCD ) renormalon is canceled exactly. If regular fixed-order perturbation theory can be applied this is achieved by simply using a common renormalization scale µ and a consistent scheme for the strong coupling throughout the calculation when the pole mass is eliminated order by order. The corresponding formulae and codes for the relation of frequently used short-distance mass schemes (such as the kinetic mass [24], the PS mass [25], the 1S mass [26][27][28], the RS mass [29] and the jet mass [30,43]) to the MSR masses can be obtained on request, and we therefore do not intend to cover all possible cases in this paper. However, we will cover several of them explicitly since there are a number of non-trivial practical and conceptual aspects that arise in the relation of the MSR masses to a number of other short-distance mass schemes we would like to point out in the following.

Potential Subtracted Mass
The relations of the PS mass [25] and the natural and practical MSR masses at the common scale R up to O(α 4 s ) have the form [ a s ≡ α    We conclude that the conversion of the MSR masses to the PS mass in the standard scheme of Ref. [99] has, even at O(α 4 s ), perturbative uncertainties due to unknown higherorder terms of about 20 -40 MeV and that this behavior is related to the fact that the O(α 4 s ) coefficient in the relation of the PS mass to the pole mass in the standard scheme appears to be unnaturally large in the context of its expected size with respect to the pole mass O(Λ QCD ) renormalon ambiguity. On the other hand, using an infrared subtraction scheme for the PS mass, where the subtraction scale is much lower, leads to a much better perturbative behavior and to much smaller uncertainties in its relation to the MSR masses. This observation is fully consistent with the conclusions from the renormalon sum rule analysis we carried out for the PS mass in Sec. 4.5.3. Since the MSR masses for R = m Q are very close or identical to the MS mass m Q (m Q ) the conclusions we draw on the perturbative relation of the standard PS mass to the MSR masses also applies to the perturbative relation of the standard PS mass to the MS mass. For R = m Q the O(α 4 s ) correction is typically at the level of 40 MeV. We note that this issue of the standard PS mass scheme becomes problematic once a precision in top quarks mass determinations below 30 -40 MeV can be reached. Given the projections of top mass determinations of a future lepton collider, see e.g. [110,111], this may become a pressing issue, but for current studies of high-precision top quark mass determinations the standard PS mass scheme is adequate for most applications.

1S Mass
The 1S mass [26][27][28] is defined as half of the mass of the heavy quarkonium spin triplet ground state. In terms of the pole mass the 1S mass is defined as where the coefficients c n,k are known up to n = 4 and given for convenience in Eq. (C.3). Because the 1S mass originates from a calculation in the non-relativistic context, there are a few subtleties when calculating its relation to the MSR masses so that the O(Λ QCD ) renormalon cancels properly. For the case R ∼ m Q it is essential that terms of order [C F α s m Q ]α n s are formally counted as O(α n s ) in the conversion. This is because [C F α s m Q ] is the inverse Bohr radius, which is the relevant physical mass scale and should not be counted as an O(α s ) correction. This counting is called the Υ-expansion [26,27] or the relativistic order counting, and must also be used when relating the 1S mass to the MS masses in fixed-order perturbation theory. The resulting formula for the 1S mass as a function of the MSR mass for µ = Here a n are the coefficients in the MSR scheme. The inverse of Eq. (5.4) is given in Eq. (C.4).
For the case R ∼ m Q α s , which is relevant for non-relativistic applications where α s may scale with the quark velocity α s ∼ v 1, the non-relativistic counting R ∼ M B ∼ m Q α s should be used, such that the leading correction in the 1S-MSR mass difference is of order α 2 s . In this case the formula for the 1S mass as a function of the MSR mass for The inverse of Eq. (5.5) is given in Eq. (C.5). We note that in order to implement a general renormalization scale µ in Eqs. (5.4) as well as (5.5), also the dependence of M B on α s needs to be accounted for consistently, which leads to quite involved expressions for the relativistic counting of the Υ-expansion. For the top quark and R ∼ m t α s ∼ 30 GeV the numerical difference between using the relativistic or the non-relativistic counting is below 10 MeV at the highest order and may be not significant. However, for all other cases the difference can be more sizable such that a consistent use of the order counting is mandatory in general.
In   are from renormalization scale variations R/2 < µ < 2R. To these uncertainties the errors from the R-evolution calculation just shown above still have to be added quadratically to obtain the complete conversion uncertainty, which is shown in the parentheses. We see that the direct relativistic conversion, which does not account for the resummation of logarithms and renormalon corrections, leads to uncertainties of ± 20 MeV at highest order, compared to ± (10 -13) MeV for the conversion that uses R-evolution from 160 GeV down to non-relativistic scales ∼ M B . Given the projections of high precision top mass determinations at future lepton colliders [110][111][112], the increased precision obtained by using the resummation of higher order terms provided by R-evolution could be relevant, but for the conversion of the MSR mass (and also the MS mass) to the 1S mass the fixed-order expansion is adequate for most current applications in top quark physics.

MS Mass
The relation of the MSR masses to the MS mass is conceptually special since the MSR masses are directly derived from the perturbative series of the pole-MS mass relation. The concept of the MSR mass addresses the conceptual question of how the MS mass evolves for scales much smaller than the quark mass. This question simply expresses the situation that the MS mass m Q (µ) for µ m Q can be readily computed solving its renormalization group equation, but does not have any physical significance, because it breaks the power counting of heavy quark problems involving (non-relativistic) physical scales much smaller than the mass. This power counting breaking comes from the perturbative series of the pole-MS mass relation that scales with m Q even for µ m Q and which spoils the perturbative series for non-relativistic problems where smaller dynamical scales govern the size of the perturbative corrections and the scale m Q is integrated out and hence not a dynamical scale any more.
Since the perturbative series for the pole-MSR mass relations scale with R, which is adjustable, but also match to the pole-MS mass series for R = m Q , one can consider the concept of the MSR mass m MSR Q (µ) as the most reasonable answer of how the MS mass concept should be extended to scales µ m Q . Thus for µ m Q R-evolution is the proper concept of the renormalization group running of a heavy quark mass for scales below m Q . Both the natural and the practical MSR masses differ by the way how the virtual massive quark Q effects are treated in their matching relation to the MS mass at the scale µ ∼ m Q , and this matching may be considered in analogy to the flavor-number matching of the strong coupling schemes α (n ) s (µ) and α (n +1) s (µ) when the scale µ crosses m Q . In this context, the natural MSR mass is conceptually cleaner than the practical MSR mass, since in the natural MSR mass the virtual massive quark loops are integrated out at the scale µ = m Q , but this issue is irrelevant for practical applications, where the practical MSR mass has an advantage due to its simpler matching relation to the MS mass.
The most efficient way to relate the MSR masses m MSRn Q (R) and m MSRp Q (R) to the MS mass m Q (µ) is to (i) evolve the MSR masses from R to m Q using the R-evolution equations Eq. (3.3) with n active flavors, (ii) employing the regular renormalization group equation for m Q (µ) to evolve it from µ to m Q with (n + 1) active flavors, The solution of the R-evolution equation is [47] where Λ QCD and the coefficients γ R n , S k andb 1 are given in Eqs.  For R < m t the MSR mass m MSR t (R) is substantially smaller than the MS mass m t (R) and approaches the pole mass for R → 0. The MSR mass remains well defined for all R Λ QCD , whereas the exact value for the limit m MSR t (R → 0) is ambiguous due to the Landau pole in the evolution of the strong coupling in the R-evolution equation (5.7). This illustrates the ambiguity of the pole mass concept.

Conclusions
This paper had two main aims. The first aim was to give a detailed presentation of the MSR mass, which is an R-dependent short-distance mass designed for high-precision determinations of heavy quark masses from quantities where the physical scales are smaller than the quark mass, R < m Q . Since such scale hierarchies can only be really large for the top quark, the MSR mass concept is most useful in the context of top quark physics, but it may be useful for bottom and charm quark analyses as well. The MSR mass is obtained from the results of heavy quark on-shell self-energy diagrams which is not the case for any earlier low-scale short-distance mass given in the literature. The MSR mass has therefore a very close relation to the well-known MS mass m Q (µ), and should be viewed as the generalization of the MS mass concept for renormalization scales below m Q , where the MS mass is known to be impractical and does not capture the proper physics. The main feature of the MSR mass is that its renormalization group evolution is linear and logarithmic in the scale R, compared to the purely logarithmic evolution of the MS mass. This linear scale dependence in the renormalization group flow of the MSR mass is called R-evolution and the MSR mass is well defined for any R Λ QCD . Formally, in the limit R → 0, the MSR mass can be evolved to the pole mass. However, taking this limit is ambiguous as it involves evolving the strong coupling through the Landau pole, which illustrates the O(Λ QCD ) ambiguity of the pole mass scheme. Since there are two options to treat the corrections coming from virtual heavy quark loops in the heavy quark self-energy diagrams, we defined two variants of the MSR mass, the natural MSR mass m MSRn Q (R), where these effects are integrated out, and the practical MSR mass m MSRp Q (R), where they are still included in the mass definition. Both MSR masses can be easily related to all other short-distance mass schemes available in the literature. We have provided all necessary formulae such that conversions can be carried out to O(α 4 s ) and we have discussed in detail the cases where there are subtleties in the conversion.
The second aim of the paper was to give a detailed presentation of how R-evolution can be used to derive an analytic expression for the normalization of the high-order asymptotic behavior of the MSR-pole (or MS-pole) mass perturbative series related to the O(Λ QCD ) renormalon ambiguity contained in the pole mass. This analytic result can be applied to any perturbative series and be used to probe the known coefficients for the series pattern related to an O(Λ QCD ) renormalon ambiguity. Since using the result does not involve any numerical comparison of the series coefficients, but is a very simple analytic function of the coefficients, we call it the O(Λ QCD ) renormalon sum rule. Using the sum rule we reanalyzed the O(Λ QCD ) renormalon in the MSR-pole (and MS-pole) perturbative series and showed that the sum rule results are fully compatible with previous available methods. We examined the relation between these methods to our sum rule analytically and explained the reason why one of them has very slow convergence. We also applied the sum rule to a number of other quantities known to high order and demonstrated its high sensitivity. These examples included the PS-pole mass relation, the moments of the massive quark vacuum polarization, the hadronic R-ratio and the QCD β-function.
where β 0 = 11 − 2/3 n with n being the number of dynamical flavors. The coefficients are known up to β 4 from Refs. [80,[114][115][116][117][118][119]. The equation can be used to write and the first four coefficients relevant for renormalon sum rule applications up to O(α 4 s ) arê One can show the following recursion relation for theb k coefficients (b 0 ≡ 1): which can be used for an automated computation. From Eq. (A.2) one can also derive the known relation that gives Λ N k LL QCD if the series in G(t i ) is truncated after the k-th term. The matching relations for the strong coupling in the n and the (n +1)-flavor schemes at the scale m Q ≡ m The uncertainties appearing in the coefficients γ Rn,Rp 3 are from numerical errors in the results of Ref. [55]. They amount to an uncertainty in the solutions of the R-evolution equation of 1 MeV or less for all relevant cases and are smaller than the uncertainty due to missing higher orders. Therefore they can be neglected for all practical purposes.
The coefficients g defined by the series ∞ =0 g (−t) − ≡ e G(t) e −t (−t) −b 1 relevant for the renormalon sum rule up to O(α 4 s ) read One can proof the following recursion relation for g : suitable for automated computation. The coefficients g agree with the coefficients s given in Refs. [73,77].
The coefficients S k defined from the series ∞ k=0 S k (−t) −k ≡ − t γ R (t)b(t) e −G(t) e t (−t)b 1 relevant up to O(α 4 s ) read [γ R k = γ R k /(2β 0 ) k+1 ] (2 +b 1 +b 2 ) + a 1 2β 0 (2 +b 1 )b 2 + 1 2 (b 2 2 +b 3 ) , The relation between the S k coefficients and the R-anomalous dimension can be compactly written as follows: In addition one can use Eq. (4.19) to write a recursion relation for the S k coefficients, which are then expressed in terms of a i : where (b) n = b (b + 1) · · · (b + n − 1) = Γ(b + n)/Γ(b) is the Pochhammer symbol. This formula can be used for an automated implementation of S k once the g coefficients have been computed. We note that in order to determine the coefficients S k , one needs all terms up to k loops in the R-evolution equation, and the (k + 1)-loop QCD β-function. There is an interesting alternative way to determine the sum rule formula which starts from the Borel function B αs(R) (u) given in Eq. (4.7) without knowing the expression for N 1/2 . This expression is equivalent to the Borel transform of the original series −R ∞ i=1 a i [ α s (R)/(4π) ] i which has the form: in the fixed-order expansion in powers of the Borel variable u. Consider now the modified Borel function (β 0 /4πR)(1 − 2u) 1+b 1 B αs(R) (u). Inserting Eq. (4.7) for B αs(R) (u) one obtains: where the role of analytic and non-analytic terms is just reversed compared to Eq. (4.7). Truncating the series in at order n (which corresponds to including the coefficients a i , S k , g up to i = n + 1, k = n and = n, respectively), one can see that expanding Eq. (B.2) in powers of u up to order n and taking the limit u → 1/2 one singles out N 1/2 on the RHS: Although lengthier, it can be checked that this formula agrees exactly with the sum rule of Eq. (4.8) at (n + 1)-loop order (i.e. when truncated with k ≤ n as shown). In Ref. [29] (see also Ref. [68]), a version of the above considerations to determine the normalization of the non-analytic terms in Eq. (4.7), which we refer to as the Borel method, was proposed. They made the additional assumption that the analytic terms on the RHS of Eq. (4.7) can be neglected because they quickly tend to zero when multiplied by (1−2u) 1+b 1 in the limit u → 1/2. Therefore they did not include the terms related to the polynomials Q . This leads to a formula for the normalization that only contains the first term on the RHS of Eq. (B.4), which they called N m . After a bit of algebra, the double sum of this term can be recast into a single summation, yielding: 6 However, the contribution from the second term on the RHS of Eq. (B.4) is actually not negligible because it involves the expansion of the (1 − 2u) 1+b 1 and setting u = 1/2 afterwards. In particular, the β-function coefficients β n>1 contained in the g are essential for the cancellation of the λ-dependence with n beyond 2-loop order, i.e. for n > 1. This is shown in Fig. 6 where we plot N (n) 1/2 (solid lines) and N (n) m (dashed lines) obtained from the natural MSR mass for n = 0 (cyan), n = 1 (green), n = 2 (blue) and n = 3 (red) for n = 5 as a function of λ in the interval [ 0. 5,2 ]. We see that the results for N (n) m differ 6 We note that no analytic formula for N (n) m was provided in Ref. [29], and that Eq. (B.6) correctly encodes the prescription given there. In formula (7) of Ref. [113] the following analytic double series formula was given: Γ(2 +b1)(−1) m a n +1 Γ(m + 1)Γ(n + 1)Γ(2 +b1 − m) where in the second line we have converted to our conventions for ease of comparison. Eq. (B.5) is not fully specified because it does not provide a prescription how to systematically truncate the two series in order to compute Nm at (n+1)-loop order. The sum for (1−2u) 1+b 1 = ∞ m=0 (2u) m Γ(2+b1)/[ Γ(m+1)Γ(2+b1 −m) ] converges to zero at u = 1/2, while the other, which is Eq. (B.1), is divergent for u = 1/2. To obtain Eq. (B.6) from Eq. (B.5) one switches variable from (m, n ) to (k, m) with k = m + n , and then finally truncates with respect to the variable k. substantially from N (n) 1/2 showing that the terms neglected in the approach of Ref. [29] are numerically sizable and, in particular, do not decrease with the order n. Moreover, the results for N (n) m do not appear to show any reduced λ-dependence beyond 2-loop order, in contrast to the results for N (n) 1/2 . Interestingly, in Ref. [69] it has been shown that when many more terms of the expansion are known [ they accounted for terms up to O(α 20 s ) for the quark and gluino QCD static potential ], Eq. (B.6) does eventually converge to the right value and shows reduced scale variation. We have numerically confirmed that using series generated from the Borel function of Eq. (4.7) setting (by hand) explicit expressions for the functions Q (u), such as Q (u) = δ ,0 . The eventual convergence at very high orders n can be understood from the fact that the contributions in the asymptotic behavior of the perturbative coefficients a n that arise from the β-function coefficients β n>1 become 1/n suppressed and eventually become also numerically small, see Eqs. (4.16) and (4.19). But in any case, its very slow convergence renders the Borel method less practical and less precise for most phenomenological applications, for which only a few terms of the perturbative expansion are known.

C Other Short Distance Masses
The PS mass [25] is defined by the integral of the momentum space color singlet static potential between a quark-antiquark pair, each having infinite mass. The relation of the PS mass to the pole mass has the form In the standard convention for the PS mass defined in Ref. [99] the term log(µ/µ f ) appearing in a PS 4 is set to zero. The definition of the 1S mass [26][27][28] in terms of the pole mass is given in Eq.