Bottom and Charm Mass determinations from global fits to $Q\bar{Q}$ bound states at N$^3$LO

The bottomonium spectrum up to $n = 3$ is studied within Non-Relativistic Quantum Chromodynamics up to N$^3$LO. We consider finite charm quark mass effects both in the QCD potential and the $\overline{\mathrm{MS}}$-pole mass relation up to third order in the $\Upsilon$-scheme counting. The $u = 1/2$ renormalon of the static potential is canceled by expressing the bottom quark pole mass in terms of the MSR mass. A careful investigation of scale variation reveals that, while $n = 1, 2$ states are well behaved within perturbation theory, $n = 3$ bound states are no longer reliable. We carry out our analysis in the $n_\ell = 3$ and $n_\ell = 4$ schemes and conclude that, as long as finite $m_c$ effects are smoothly incorporated in the MSR mass definition, the difference between the two schemes is rather small. Performing a fit to $b\bar{b}$ bound states we find $\overline{m}_b(\overline{m}_b) = 4.216\pm 0.039$ GeV. We extend our analysis to the lowest lying charmonium states finding $\overline{m}_c(\overline{m}_c)=1.273 \pm 0.054$ GeV. Finally, we perform simultaneous fits for $\overline{m}_b$ and $\alpha_s$ finding $\alpha_s^{(n_f=5)}(m_Z)=0.1178\pm 0.0051$. Additionally, using a modified version of the MSR mass with lighter massive quarks we are able to predict the uncalculated $\mathcal{O}(\alpha_s^4)$ virtual massive quark corrections to the relation between the $\overline{\mathrm{MS}}$ and pole masses.


Introduction
The precise determination of hadron spectroscopy from fundamental principles pursues to unveil QCD at its non-perturbative regime. The non-perturbative nature of QCD at hadronic scales implied the development of phenomenological approaches such as quark models [1][2][3][4][5] or, more recently, computer-based calculations using Lattice QCD [6][7][8][9]. However, the unique properties of heavy quarkonium systems allow an entire calculation in terms of non-relativistic QCD (NRQCD for short) in its weak coupling regime (that is, in pure perturbation theory). Effective Field Theories (EFT for short) provide a clean separation of scales, enabling non-perturbative effects to be either factorized or treated in an Operator Product Expansion. Within perturbative NRQCD the quarkonium masses have been computed in recent years up to N 3 LO precision, that is up to O(m Q α 5 s ) and O(m Q α 5 s log α s ) [10][11][12][13][14][15][16][17][18][19]. The development of EFTs such as velocity NRQCD (vNRQCD) [20] and potential NRQCD (pNRQCD) [21,22], which describe the interactions of a non-relativistic system with ultrasoft gluons, organizing the perturbative expansions in α s and the velocity of heavy quarks systematically, has been crucial to reach such accuracy. Finally, a closed expression for arbitrary quantum numbers can be found in Ref. [23], which uses mathematical methods to convert infinite sums into finite nested sums and transcendental numbers, and expresses the so called Bethe logarithm as a one-parameter (numerical) integral.
The central object to compute the energy levels of QQ states 1 is the QCD color-singlet static potential V QCD (r), which is known to very high accuracy : the leading correction, O(α s ) with respect to the Coulomb term, was obtained in [10], the O(α 2 s ) terms where computed in Refs. [12,24], while the O(α 3 s ) corrections where calculated in Refs. [14,[17][18][19]25]. It is well understood that the QCD static potential suffers from an O(Λ QCD ) renormalon ambiguity that exactly cancels that of the pole mass [26][27][28], such that the static energy E stat (r) = 2 m pole Q + V QCD (r) is renormalon free. 2 This cancellation is essential to make the energy levels of quarkonia perturbative objects. For this cancellation to take place, the pole mass has to be expressed in terms of so called short-distance masses. Previous studies of the bottomonium and charmonium spectra used the renormalizationscale-dependent MS mass m Q (µ) evaluated at µ = m Q [29][30][31]. Although this scheme will make the renormalon cancellation happen, it introduces a conceptual (and many times also practical) problem. In the pole scheme, perturbative logs appearing in the theoretical expression of the quarkonium energy levels have the following form: log[ nµ/(C F α s m Q ) ], with n the principal quantum number. When expressing the pole mass in terms of m Q (m Q ) a new class of logs appear, namely log(m Q /µ). 3 It is clear that no single choice of the scale µ will cause both logs to vanish simultaneously. The origin of this mismatch can be traced to the fact that in NRQCD the mass of the heavy quark is a hard scale which has been integrated out. This situation is resolved if instead a low-scale short-distance mass is used, such that the logs introduced when switching scheme either have the form log(µ/R), being R an arbitrary scale the mass depends on (MSR mass [32,33], Potential Subtracted (PS) mass [28], Renormalon Subtracted mass (RS) [34], kinetic mass [35] or jet mass [36,37]), which can be chosen of the same order as (or equal to) µ, or they are already of the exact same form as the original ones (1S mass [38][39][40]). While the recent analysis of Ref. [41] employs the RS mass, we will make use of the MSR mass, motivated by the fact that it is straightforward to include effects of lighter massive quarks [42]. 4 At this point a comment is in order : while the problem just described is indeed critical for the top quark, it is a lot less severe for the bottom and charm quarks, given that the hierarchy between their masses and the respective non-relativistic scales is not very large. It is nevertheless theoretically more sound to use a low-scale mass. 1 In the rest of the paper Q will stand for the heavy, non-relativistic quark forming the bound states, while q will refer to the next lighter quark. 2 For this cancellation to happen it is essential that the ambiguity of the pole mass and the potential are independent of the mass and r, respectively. 3 In practice these logs are present because even if the MS mass is evaluated at the scale µ = mQ, the relation of the pole mass to mQ(mQ) has to be written as a series in powers of αs(µ) for the renormalon cancellation to take place. 4 Here we will use a modified version of the MSR mass in which the effects of virtual massive quarks participate in the R-evolution.
The last piece needed to assemble the complete N 3 LO theoretical expression in a shortdistance scheme is the four-loop coefficient of the relation between the MS and pole masses, which has been computed recently [43,44]. This term is necessary to compute the MSR mass to the same accuracy as the static potential in the so called Υ-expansion scheme [38,39]. To make the list of theoretical ingredients complete one needs to include the effects of the finite charm mass both in the static potential and the MS to pole mass relation. Both are known up to O(ε 3 ) : at two [45] and three [46] loops for the latter; for any state at two loops [47] and for n = 1 states at three loops [48], for the former. Therefore, if we are to include charm quark mass effects in the static potential, we have to include them in the MSR mass as well. The MSR mass has so far been used only in phenomenological applications up to O(α s ), and always in the context of jet physics [49,50]. One (theoretical) motivation of our study is thus employing the MSR mass at O(α 4 s ) and in the context of a non-relativistic system. Furthermore, we want to test how efficiently the MSR mass deals with lighter massive flavors in a practical application.
Comparing the theoretical predictions for the bottomonium and charmonium masses with the corresponding experimental values one can determine the bottom and charm quark masses with good accuracy in a clean environment. Precise values of these heavy quark masses are key to test the validity of the Standard Model in the precision frontier, both in Flavor [51] and Higgs Physics [52]. Two recent analyses have determined the bottom quark mass from the Υ(1S) state [41], 5 and bottom and charm quark masses from n = 1 states [54]. Both analyses use N 3 LO theoretical predictions and consider finite charm quark mass corrections. While [41] uses the RS scheme for the heavy quark mass and varies µ and R independently (but one at a time) in a non-relativistic range (1.5 GeV to 4 GeV), Ref. [54] uses the MS scheme and the principle of minimal sensitivity, setting µ at scales that are larger than the heavy quark mass itself (5 GeV to 12 GeV for bottom, 2 GeV to 4.5 GeV for charm), therefore relativistic. In our analysis we will use the MSR scheme varying both renormalization scales independently and simultaneously around the value that minimizes perturbative logarithms (a non-relativistic scale). Additionally, based on the observation made in Ref. [30], that is, finite charm quark mass effects in bottomonium are closer to the decoupling (m c → ∞) than to the massless (m c → 0) limit, the authors of Ref. [41] argue that using n = 3 active flavors makes this decoupling more explicit, and therefore carry out their analysis in this scheme. On the other hand, Ref. [54] performs the bulk of their analysis in the n = 4 flavor scheme, and uses its difference to the n = 3 result to estimate the uncertainties due to missing higher terms on the finite charm mass correction. We perform fits in both flavor schemes but take the n = 3 determination for our final result.
The main motivation to carry out our analysis is to include in the determination of the bottom quark mass states with principal quantum number n ≥ 1. It has been claimed that the gross spectrum of bottomonium up to n = 4 can be described purely within perturbation theory [29][30][31]. Those analyses are based on the MS scheme for the bottom quark and on the principle of minimal sensitivity to the renormalization scale, which in practice translates into settings the scale µ far from the (low energy) values that would make perturbative logarithms vanish, and vary it in a very narrow range or towards larger scales. We shall show that the scale of minimal sensitivity strongly depends on the order of perturbation theory considered, and also on the , s and j quantum numbers which do not appear in the perturbative logarithms. We preform a dedicated study of scale variation inspired by the usual lore, that is, the argument of perturbative logarithms should vary roughly between 1/2 and 2, supplemented with the additional constraint that the renormalization scale should not become smaller than 1 GeV, such that perturbation theory is not jeopardized. We find that the central renormalization scale decreases as n increases, becoming smaller than 1 GeV already at n = 3. Even stopping scale variation at 1 GeV for n = 3, the predictions for the masses become unstable and the perturbative uncertainties grow dramatically as compared to smaller values of n.
Our analysis would not be complete if non-perturbative effects are simply ignored. 6 We will assume that the soft scale mv is much larger than Λ QCD , and that the ultra-soft scale mv 2 is either much larger than Λ QCD , or at the very worse, of the same order. The analysis of Ref. [55] suggests that for states with n = 1 we are in the former situation (therefore non-perturbative effects come in the form of local condensates), while for n = 2 likely falls in the latter (with non-perturbative effects manifesting themselves as non-local condensates). In either situation the main contribution to the energy levels is pertubation theory. Our analysis seems to indicate that for n ≥ 3 one has mv 2 < Λ QCD , being the static potential directly affected by non-local condensates. Our approach to non-perturbative effects will be rather heuristic : given that the operator product expansion ties together perturbative and non-perturbative physics, we will assume that as long as the perturbative expansion for the energy levels is well behaved, non-perturbative effects do not heavily affect the perturbative result. We will give extra validation to this ansatz by comparing the determination of the bottom quark mass from fits to states with principal quantum numbers n = 1 and n = 2. From this analysis we can however not discard that for n = 2 we are already in a situation in which perturbative and non-perturbative effects become entangled, and until a fully fledged study of non-perturbative effects becomes available, it is not possible to draw further conclusions. As emphasized in Ref. [41], one probably needs to have ultrasoft effects under control before dealing with non-perturbative physics.
This article is organized as follows : In Sec. 2 we provide the theoretical expressions for quarkonia masses including corrections from massive lighter quarks. In Sec. 3 we review the MSR short-distance scheme for the mass of a heavy quark and show how to include effects from massive lighter quarks in their R-evolution, explaining how to integrate those effects out at low scales. In Sec. 4 we perform an investigation of renormalization scale variation and the effects of finite charm mass in bottomonium. The fitting procedure and the results for the determination of the bottom and charm quark masses, as well as simultaneous fits to m b and α s , are presented in Sec. 5. We compare our results to previous determinations in Sec. 6. Our conclusions are contained in Sec. 7. Some formulas are collected in App. A.

Analytic expression for massless quarks
The energy of a QQ non-relativistic bound state characterized by the (n, j, , s) quantum numbers and with n massless active flavors reads in the pole scheme [16,23,56] : with H n the harmonic number. In Eq (2.1) ε is a bookkeeping parameter that labels the various orders in the Υ-expansion. The c i,j coefficients can be computed from the c i,0 imposing µ independence of the energy states, what implies the following recursion relation : where β i are the (i + 1)-loop coefficients of the beta function defined in Eq. (3.2). The c i,0 coefficients have been calculated up to i = 3 and their values can be found in Refs. [30,31]. In general they depend on the quantum numbers (n, j, , s) and c 3,0 also depends on log(α s ). Eq. (2.1) inherits the u = 1/2 renormalon of the static potential, what makes it unfit for a precision determination of heavy quark masses. This can be resolved by expressing the pole mass in terms of a short-distance mass. Moreover, if large logs of the ratio of the non-relativistic scale and the quark mass are to be avoided, a low-scale short-distance mass should be used. In our analysis we employ the two versions of the MSR mass discussed in Ref. [33]. Let us consider a generic short-distance mass whose relation to the pole scheme is expressed as where m SD Q depends on some energy scale denoted generically by R, and δ SD n contain powers of log(µ/R). One then has that This relation can be used to convert the sum that appears in Eq. (2.1) to a short-distance scheme : Here we have defined the δ L i,j coefficients from i=1 x i δ L i j = i=j x i δ L j,i , which can be calculated from the recursion relation in the last line of Eq. (2.5). The only thing left to express the energy levels of quarkonium in a short-distance scheme is converting the global factor m pole Q : where δ 1,i is the Kronecker symbol taking the value 1 for i = 1 and zero for higher values of i. Having in mind that ultimately we want to determine the mass of the heavy quark Q, one can devise perturbative expansions based on Eq. (2.6) in which m SD Q (or even m Q ) is singled out. We use two such expansions, which are in complete correspondence to the "linearized" and "linearized iterative" methods defined in Ref. [57], and denote them as expanded out and iterative. The former corresponds to : directly obtained from Eq. (2.6) isolating the global factor m SD Q , with E exp X the experimental value for the bound state and A exp i some easy-to-calculate coefficients. One can remove the m SD Q dependence that still persists on the A exp i in an iterative way, by introducing lowerorder expressions for m SD Q in higher-order ones and expanding consistently in ε. This way one obtains the iterative expansion : There is still some residual m Q dependence on the right hand side due to the threshold matching conditions of α s . This can be eliminated numerically by iterating Eq. (2.8) a few times.

Finite Charm Mass Effects on Bottomonium
Equation (2.1) considers that the n active flavors that contribute to the energy of a QQ bound state are all massless. However, massive charm quark loops contribute both to the binding energy and the relation of the pole and short-distance masses, which should be properly accounted for. In this section we consider both m Q m q in the pole scheme, since switching any of them to a short-distance scheme is trivial. Whenever we use a shortdistance scheme for m Q , we will take m q in the MS scheme, and more specifically we will only use m q ≡ m ). Massive lighter quarks effects on the relation between the pole and other short-distance schemes will be discussed in Secs. 3.2, 3.3 and 3.4.
Modifications to the Coulomb potential due to non-zero lighter quark masses have been calculated up to O(ε 3 ). The direct consequence of such corrections in the Coulomb potential is an energy-shift effect in the heavy quarkonium mass. In the pole scheme they read : If Eq. (2.9) is expressed in the n scheme, where the massless limit is realized, these corrections obviously vanish if m q → 0. Likewise, if Eq. (2.9) is written in the (n − 1) scheme, the decoupling limit is realized and the corrections should vanish if m q → ∞. However, in the n (n − 1) scheme the decoupling (massless) limit is not manifest. We can use this information to constrain the form of δE (i) X in the n scheme for large values of m q requiring that, once α (n ) s is expressed in terms α (n −1) s through the threshold relation in Eq. (3.9) the decoupling limit is reached. This strategy was used in Ref. [30].
The ε 2 correction has been calculated in Ref. [47], for arbitrary quantum numbers, and in the n scheme it reads : − πnρ 3 (n − − 1)(n + ) + 1 3 (n + 1)(2n + 1) + ρ 2 (n − − 1)(n + ) + (2n + 1)n and f D (n, , k, ρ) is defined in Eq. (A.1). In the (n − 1) scheme one simply has δE (1) such that the decoupling limit is manifest. The exact O(ε 3 ) finite charm mass corrections to the binding energy have been computed for the n = 1 bottomonium energy levels in Ref. [48], and for states with arbitrary n and = 0 in Ref. [58], remaining unknown for other sets of quantum numbers. Since these corrections for n > 1 are prohibitively slow, in order to include them for non-1S states we will use the results of the analysis carried out in Ref. [30], where the authors concluded that for n = 1 the m c → ∞ limit is an excellent approximation, which becomes even better for large values of the principal quantum number. The formula for such limit in the pole mass scheme is detailed in Eq. (A.2). We will supplement this approximation with the requirement that in the n scheme the correction must vanish if m c → 0. Our approach uses the decoupling approximation for values of m c larger than m * c (n), where m * c (n) = (1 GeV)/ √ n, where the coefficients a and b are adjusted such that the transition is smooth at the junction point. This functional form enforces the massless limit, and the dependence of m * c in n takes the decoupling approximation as exact for lower values of m c . We have tested our approach for n = 1 and found that it reproduces the exact known result within a few percent, as can be appreciated in

The MSR scheme
As mentioned in the Introduction, the use of the pole mass in perturbative calculations is ineffective, not only for its bad perturbative behavior but also because of the existence of confinement, which hides the quark propagator pole in the non-perturbative regime.
Alternatively, the MS mass is an adequate scheme for physical situations that involve energies much larger than the heavy quark. Ideally, for the heavy quarkonium, one would like to extend such scheme for energies below m Q without losing the good infrared properties of the MS scheme. At such scales the evolution of the logarithms in the regular MS mass is unphysical and behaves badly. A more adequate short-distance scheme is the MSR mass.
The MSR scheme, first introduced in Ref. [32], and extensively discussed in Ref. [33], is a natural extension of the MS mass for renormalization scales below the heavy quark mass. The definition of the MSR mass, expressed as the MSR-pole mass difference, is derived directly from the MS pole mass relation, exploiting the fact that the renormalon ambiguity is independent of the value of the heavy quark mass. Let us start form the perturbative relation of the MS and pole masses : where n f = n +n h and we have defined m Q ≡ m The strong coupling constant is expressed in the MS scheme with the usual renormalization group equation : (3.2) For later use we split the two-and three-loop terms into their various flavor components In contrast to the MS mass, which only depends logarithmically in the scale µ, the MSR mass has a logarithmic and linear dependence on R : where a n,k are derived from Eq. (3.1), and are different for the Practical and Natural versions of the MSR mass (see Sec. 3.1). The advantages of this mass scheme is that it can be safely employed for scales R < m Q when a clean treatment of virtual massive-quark effects is relevant, as the virtual effects of the massive flavor is integrated out from the MSR mass expression. While the first line of Eq. (3.4) defines the MSR mass and is used to derive its anomalous dimension, the second line is more useful when implementing the MSR scheme in a series expressed in powers of α s (µ). Since the MSR mass is µ-independent, the a n,k coefficients can be computed from the a k,0 with the following recursion relation : The R dependence of the MSR mass is described by the following renormalization group equation : where γ R n are the R-anomalous dimension coefficients [33], which can be calculated as follows : Since the ambiguity of Eq. (3.4) is R independent, the R-anomalous dimension is automatically renormalon-free. This RGE provides a systematic reordering of the terms in the asymptotic series related to the O(Λ QCD ) renormalon ambiguity : which results in the summation of powers of log(R 1 /R 0 ) to all orders in perturbation theory. 7

Natural and Practical MSR schemes
As we discussed in Sec. (3), in contrast to the MS mass, the MSR mass is a suitable scheme for applications requiring scales lower than the heavy quark mass. For that reason, the UV effects of the quark Q can be integrated out, changing from a scheme with n + 1 dynamical flavors to one with only n . There are, hence, two alternative but equivalent ways to achieve this transformation. On the one hand, the threshold matching relation for the strong coupling can be used to rewrite α which once implemented into Eq. (3.1) is used to define the Practical MSR mass : (3.10) The a MSRp n coefficients can be computed as follows : where we have included the n h argument in the MSRp scheme coefficients for later convenience. The coefficients ξ i,j are defined as ( For i > 1 they can be computed with the following recursion relation : On the other hand, the virtual loop corrections of the heavy quark Q can be directly integrated out of the MS to pole relation taking n h = 0 in Eq. (3.1), which leads to the definition of the Natural MSR mass: . Each materialization of the {n +1 → n } scheme change has its advantages, depending on the specific applications. The Natural MSR mass only exhibits corrections arising from gluons and massless quarks, being its relation with the MS mass mediated by a non-trivial expression : (3.14) Alternatively, the MSRp -MS relation is more direct, as the Practical MSR mass is equal to the MS mass at the scale of the mass to all orders in perturbation theory: Finally, we note that the MS scheme used in Refs. [29][30][31] exactly corresponds to the MSRp mass with R = m. Therefore their theoretical expressions are contained in ours.

MS mass with light fermion masses
Corrections from massive lighter quarks modify the relation of the pole mass to other shortdistance schemes. In this section we shall focus the discussion to the MS mass, though for our numerical analysis the MSR scheme will be used. The corrections from massive lighter quarks to the MS-pole mass relation start at O(α 2 s ) and vanish in the m q → 0 limit. Hence the corrected relation can be expressed as : where δm Q is given in Eq. (3.1) and ξ = m q /m Q . The functions ∆ i obey two constraints : 8 The ε 2 term for the non-zero charm mass corrections has been calculated exactly in Ref. [45] : 9 8 These relations will be refereed to as "the ξ = 0 constraint" and "the ξ = 1 constraint", respectively, in the rest of the paper. 9 Eq. (3.17) is written in a way in which each term is manifestly real for any value of ξ between 0 and 1. Fig. 3(c). The ξ = 1 constraint in these corrections takes the form

This correction, together with its linear approximation [ see Eq. (3.24) ], are shown in
The ε 3 correction can be split in three contributions : The ξ = 1 constraint can be decomposed in the following relations : The exact expression for ∆ (g,n ,n h ) 3 has been analytically calculated in Ref. [46], but it is terribly lengthly, therefore impractical for a numerical implementation. Ref. [46] provides 8 terms of the expansion for small ξ, which can be combined with the ξ = 1 constraint in a Padè parametrization to provide an approximation which is accurate to 8 significant digits across the range 0 ≤ ξ ≤ 1, which is enough for our purposes. Our parametrizations are explicitly shown in Eq. (A.3). In Fig. 3 is shown, together with the approximation given in Eq. (3.25).

MSR mass with light active fermion masses
To include effects from lighter massive quarks we follow the same approach that was used in Ref. [60] to implement R-evolution in the gap parameter of the soft function when secondary heavy quarks are produced through gluon splitting. This implementation in the MSR mass happens to satisfy Heavy Quark Symmetry (HQS for short) exactly. HQS states that in the limit of an infinitely heavy quark, low-energy QCD effects are flavor independent. This limit has to be taken only in the virtual effects, as only quantum fluctuations are integrated out. If one looks into m pole Q − m MS Q in this limit, n h becomes 0, what effectively converts this series to the MSRn scheme, and m q /m Q tends to 0, what makes the lighter q quark massless. For m pole q − m MS q one has that m q → 0 in the virtual effects implies n h → 0 and n − 1 → n . Therefore in this limit m pole With our definition this equality will be satisfied even for finite values of the heavy quark mass. Although there is no theoretical advantage in having (for the MSRn mass) heavy quark symmetry (accidentally) exact, and the impact of this feature in the final results is small, it has a practical advantage, as it ensures a smooth transition to an MSR mass in which the lighter quark q is also integrated out.
Our definition of the MSR mass including non-zero lighter quark masses reads : 10 where ξ R = m q /R. In the MSRn scheme one has ∆ (n) For the MSRp scheme, at two loops one has ∆ where we have used the ξ = 1 constraint. For the MSRp scheme HQS is not exact, but the correction is calculable in perturbation theory and thanks to the ξ = 1 constraint happens to be identical to the MSRn to MS matching condition : Therefore HQS corrections can be calculated even if the lighter quark finite mass effects have not been computed yet, as long as the massless corrections are known. The R-anomalous dimension is modified as a consequence of the finite lighter quark mass : This equation can be integrated either numerically as it stands, or recast into a form that factors out Λ QCD , as explained in Ref. [60]. Given that the O(α 4 s ) finite lighter quark mass corrections are unknown, we will cut the sum of the second term strictly at n = 3, even if there are lower order ∆ terms that contribute at n = 4 in the sum over j. Such decision 10 In Ref. [42] the MSR mass with light massive quarks was implemented differently : δm MSR Q (R, mq) = δm MSR Q (R) + mQ ∆m q (mQ, mq/mQ), which is well suited to study the large-order behavior of the series. With this definition the R-evolution equation is the same as for massless lighter quarks, but one might have potentially large logs of the form log(R/mQ) when implementing the MSR scheme in a perturbative series. With this definition one needs to include Heavy Quark Symmetry breaking corrections to the MSRn mass when integrating out the massive lighter quark, whereas with our implementation these corrections are absent. Various contributions to 3 is adopted because the inclusion of such terms could overestimate the real size of that correction if large cancellations happen. These cancellations are indeed expected since ∆ n are afflicted by a renormalon but δγ n R are not. In Fig. 2 this is graphically depicted at two and three loops, where it is seen that the effect is larger at smaller ξ. Since the cancellations are expected to be larger for high values of n (where the renormalon dominates), we can estimate ∆ 4 (r) by requiring that δγ R 3 (ξ) = 0. In general one would have : where ∆ n (1) are known from the ξ = 1 constraint. Eq. (3.24) automatically satisfies the ξ = 1 and ξ = 0 constraints. If this approximation is used from n = 1 one gets the following predictions : This approximation corresponds to the estimate made in Eqs. (3.12) and (3.13) of Ref. [42] if one sets µ = m Q [ see Eq. (3.14) of that reference ]. Since ∆ 2 and ∆ 3 are exactly known, we can use them to make a better prediction for ∆ 4 , which we call ∆ best 4 . To test this method, one can also use the known expression of ∆ 2 to predict ∆ 3 , calling it again ∆ best 3 , and consider the difference of ∆ best 3 and ∆ app 3 as an estimate of the uncertainty of this procedure. We observe that ∆ best 3 is a better approximation to ∆ 3 than ∆ app 3 , and that ∆ 3 is contained in the uncertainty band, although quite close to its upper bound. This is depicted in Fig. 3(a). Finally one can compute the estimate of ∆ 4 which uses the exact form of ∆ 2 and ∆ best 3 , which we call ∆ good 4 . We observe that ∆ good  will be used to estimate the uncertainty on ∆ best 4 , which happens to be 2% at worst across the whole spectrum for either 4 or 5 light flavors. Our estimate is shown graphically in Fig. 3(b). Our prediction for ∆ 4 and its uncertainty band can be parametrized with the following fit functions : (3.26) The method of requiring a vanishing R-anomalous dimension is not as powerful in predicting the massless coefficients, and for those one needs to know at least the one-loop term. For instance, for n = 4, and assuming that all orders lower than the one that is predicted are known, one gets : there the quoted uncertainty corresponds to the difference of the central value and the prediction that assumes that the previous order is also estimated with this method. The exact results for a MS n (n = 4, 0) for n = 2, 3, 4 are 130.128, 4582.54, 214100. Finally, we note that the uncertainty associated to the absence of the ∆ 4 , δγ R 3 and δE (3) X terms can be estimated by comparing the schemes with n and (n − 1) dynamical flavors.

MSR mass in the n flavor scheme
The MSR mass described in Sec. 3.3 accounts for the charm mass corrections explicitly, as it considers the charm quark a dynamical flavor. However there are physical situations in which, due to the energy scales involved, it is convenient to integrate out the charm quark as well. For bottomonium, the charm mass is sufficiently large compared to the non-relativistic scales to assume one is close to the decoupling limit, i.e., m c → ∞. Indeed, the closeness of the exact finite-charm mass corrections of Sec. 2.2 to the decoupling limit for scales around 1 to 4 GeV points to such conclusion. In that case, the charm quark can be integrated out and one should switch to an (n −1)-flavor MSR scheme with the charm quark decoupled to all orders in Υ expansion, reducing to a theory with (n − 1) dynamical flavors.
We define the MSR mass with n active flavors, either Natural or Practical, as follows : where it is understood that m MSR q has (n −1) dynamical flavors. Therefore, the R-evolution in this scheme does not include charm mass effects, and its relation to the pole mass has only (n − 1) flavors. One needs to connect the MSR (n −1) mass to m Q , and if large logs of m Q /m q are to be avoided, one should not directly subtract Eqs. (3.28) and (3.16). R-evolution can sum up these logs if the relation is organized as follows : where one can use different versions of the MSR mass in the n and (n − 1) schemes, hence we distinguish them with a prime. We have used Eq. (3.28) in the first and second terms in square brackets of the first line to get to the second line. One uses R-evolution to sum up logs of R/m q and m q /m Q in the first and fifth terms in square brackets, respectively. The second and one-to-last terms in square brackets are the MS-MSR matching conditions, which are zero in the Practical scheme. The sum of the third and fourth terms in square brackets correspond to the HQS breaking expression, which is zero in the Natural scheme.
In practice one can write Eq. (3.29) as where m MSR q and m MSR Q include matching to the MS mass and R-evolution, and HQS = 0 if MSR' is the Natural scheme. The sequence of R-running and matching can be better visualized in Fig. 4.
In Ref. [41] the finite charm quark mass corrections to a three-flavor bottom quark MS mass are computed in fixed-order, without R-evolution. In our notation that computation corresponds to m b − m pieces that sum up logarithms

Scale Variation and Charm Quark Mass Effects
In this section we perform a numerical investigation of the scale variation that one needs to perform to properly estimate uncertainties coming from missing uncalculated higher-order terms. 12 These unknown terms appear in the perturbative expansion for the quarkonium energy levels as well as on the relation between the pole and MS masses. The way one estimates the former is varying the scale µ, while varying R provides an estimate for the latter. The range in which these two parameters should be varied depends on the principal 11 Ref. [50] computes the similar quantity m b − m , finding an effect which is also compatible with zero. We find 0.04 ± 0.04 GeV, using our slightly different MSR definition. Furthermore, in our uncertainty estimate of HQS we probe scales a factor of two smaller (larger) than their lowest (highest) scale. 12 For the numerics in this article we run αs with the five-loop beta function, and we match at the various quark mass thresholds taking the five-loop expression with µq = mq. For the MSR mass we perform 4-loop massless R-evolution (and matching to mq) and include finite charm mass corrections at three loops. The matching between the MSR (n =4) and MSR (n =3) masses is performed at four loops. quantum number n. For µ this is easy to see, since n appears in the argument of the perturbative logarithms. Since there are logarithms of µ/R in the MSR mass subtractions, R has to be of the same order of µ, and that is how the R variation inherits its n dependence.
In any case we observe that changing the value of R in a given range causes variations on the predicted masses much smaller than varying µ in the same range. Therefore we interpret that µ variation estimates the error coming from missing higher order terms in the theoretical expression for the bound-state masses, whereas R variation accounts for the arbitrariness that one has in connecting the relativistic R ∼ m Q and non-relativistic regimes R ∼ m Q α s C F , in the same way that one uses the MSR mass to smoothly match the 1S mass into the MS one. For simplicity we use the same range of variation for both µ and R.
We will start with the usual procedure to figure out the "natural" value of the renormalization scale : µ nat has to be such that the argument of the perturbative logarithms equals unity. To estimate the range of variation we can simply require that the argument of the logarithms becomes 2 ±φ : (µ nat ± ∆µ) ∼ 2 ±φ C F α (n ) s (µ)m Q /n, where the specific value of m Q depends on the scheme one is using, and one can adjust the value of φ or even take different values for upwards or downwards variations. For this numerical exercise we take m Q = m MSR Q (R = µ) and half of the average of the masses of states with principal quantum number n (experimental values). Both choices yield similar results : for bottomonium and taking φ = 0.5 we find for n = 1, 2 the following ranges : µ n=1 ∼ 1.9 +1.6 −0.4 GeV, µ n=2 ∼ 1.25 ± 0.25 GeV. Most of the variation from this ranges comes from the lower edge, therefore we simplify the scale variations to 1.5 GeV ≥ µ n=1 ≥ 4 GeV and 1 GeV ≥ µ n=2 ≥ 4 GeV. We observe that using these ranges the estimated uncertainty bars make the results compatible order by order in perturbation theory, as can be seen in Figs. 5(a) and 5(b). We observe that for n = 1 a turnover takes place between N 2 LO and N 3 LO, as the error bar increases when increasing the order of perturbation theory, contrary to the expectations. We observe that this behavior persists even if the lower scale in the variation range is reduced to 1 GeV. For n = 2 this does not happen and the minimal uncertainty is achieved at the highest perturbative order, as expected. It is hard to pin down the origin of this behavior, but one could imagine that some sort of accidental numerical cancellation happens at N 2 LO and n = 1, which makes the renormalization scale variation method underestimate the actual uncertainty due to missing higher order terms. This behavior is obviously transmitted to the extraction of the heavy quark masses.
The same exercise for n = 3 yields a lower bound below 1 GeV, presumably too low for perturbation theory to work. Therefore we try the same variations as for n = 1, 2, and as shown in Figs. 5(b) and 5(c) none of these choices is satisfactory : the n = 1 variation makes the N 3 LO prediction incompatible with the LO and NLO results; on the other hand the n = 2 variation makes the error bars equally large for all orders. We therefore conclude that perturbation theory is not applicable for values of n equal or larger than 3.
The same analysis for charmonium again yields values for the natural scale µ below 1 GeV. In this case we take a completely heuristic approach and choose the scale variation such that there is order-by-order convergence and agreement. We find that varying the scales in the range 1.2 GeV ≥ µ charm ≥ 4 GeV gives a pattern very similar to the n = 1  scheme it is similar to the MS scheme, but with R-evolution summing up logarithms of m c /m b . We make this choice because it is the most similar to what is used in Refs. [30,54], while the outcome of this analysis in the MSRn scheme is very similar. Our results are shown in Fig. 6. We find that the PMS largely depends on the perturbative order one is considering : only in the two highest orders a maximum is attained, and between these two differences as large as 3 GeV for n = 1 bottomonium states occur, and around 1 GeV in the other cases we studied. There is a moderate dependence on the number of flavors one is using (1 GeV at worst). In the case of bottom, the highest order MS scheme with n = 3 shows both a minimum and a maximum for the ground state. For n = 1 the maximum position is biased towards large (relativistic) scales.
The last thing we want to explore in this section is the m c dependence of the bottomonium binding energies and the various ways in which the charm quark mass corrections can be implemented. As discussed already in Sec. 2.2, one can assume a massless charm quark to which one adds non-zero charm quark mass corrections [ what we call the α (n =4) s scheme ], or one can assume a theory with a decoupled (that is, integrated out) charm quark to which one adds corrections to the decoupling limit [ what we call the α (n =3) s scheme ]. In Refs. [30,41] it has been argued that the latter is the most accurate description, because the decoupling limit is much closer to the physical situation than the massless limit. Here we confirm this claim, and actually show that it holds for almost any value of the charm quark mass, as can be seen in Fig. 7(b) : the decoupling limit and the α (n =3) s scheme are much close to one another than the massless limit and the α

Fits to Experimental Data
We have created two independent computer codes, one (slow) written in Mathematica [61] and another one (fast) written in Fortran 2008 [62]. Both codes agree within 8 significant digits. The duplication of the code allows a cross-check of all the formulas and procedures described above, giving confidence in our results. For our numerical analysis we mainly use the Fortran code.
The aim of our code is to compute the perturbative mass of a heavy quarkonium state with arbitrary quantum numbers [ Eq. (2.1) ] in the MSR mass scheme 13 as a function of the MS mass of the heavy quark and two renormalization scales µ and R. For a given dataset, the value of m Q is extracted from the minimum of the following χ 2 function, which depends on a set of pairs of renormalization scales {µ n , R n }, one for each principal quantum number that appears in the dataset : where the sum extends to the individual states in the given dataset and M exp i and σ exp i are the experimental masses and errors extracted from the PDG [63]. Such fit would give us a best-fit MS mass value as a function of µ n and R n .
One could think the optimal way of including theoretical uncertainties in our fits is by adding these as part of the χ 2 functions. The problem is that theoretical errors are highly correlated among various states, as stems from the following facts : 1. All masses are determined from the same static potential. Therefore whatever the unknown N 4 LO static potential is, it is the same for all states. Varying the renormalization scale µ uncertainties coming from this and higher-order unknown terms are estimated. Therefore the value of µ has to be varied in a correlated way for all states included in the dataset.
2. All masses in a short-distance scheme use the same MS-pole mass relation to cancel the factorially divergent behavior, and no matter how the O(α 5 s ) correction looks like, it is going to the same for all states. Uncertainties coming from this and higher order uncalculated terms are estimated varying the scale R. Therefore the value of R has to be varied in a correlated way for all states in the dataset.
3. All masses depend on the same value of α (n f =5) s (m Z ) and m c , therefore uncertainties associated to these two quantities are obviously 100% correlated.
We performed a numerical study to estimate the theoretical error matrix and found the correlation is very close to 100%. Given that perturbative errors greatly dominate over experimental uncertainties, a fit including such a highly correlated error matrix is severely affected by the so-called d'Agostini bias [64], resulting in a global best-fit value which is considerably lower than individual determinations from each state in the dataset. Two possible ways out of this problem are : a) perform global fits with χ 2 functions which depend on µ and R; b) determine the heavy quark masses from fits to individual states and a posteriori perform a weighted average of the best-fit results. For the latter one has to compute the weighted average using only the experimental errors, since the highly correlated theoretical errors cannot judge the quality of individual results (and would again cause the bias towards small values). The theoretical uncertainty is then estimated as the average of the individual theoretical uncertainties. In a more formal way, this is achieved by using for the weighted average a theoretical correlation matrix in which each entry is substituted by the average of all the entries in the original matrix. This procedure is quite standard, and it is motivated by the idea that the parameter one is fitting for is unique, and therefore its theoretical error is unique as well, being the average of its various estimates the least biased approximation. A theoretical error matrix exactly proportional to the unit matrix avoids the d'Agostini bias and yields as the result of the fit precisely the weighted average with experimental uncertainties only. For our analysis we take approach a) as our default and use b) as a validation of our results. We find both methods are in quite good agreement, although in same cases b) yields slightly larger perturbative uncertainties.
In order to estimate the dependence of m Q with the renormalization scales we vary them, within reasonable ranges, in an independent way. The specific energy windows for each scale depend on the convergence behavior for each heavy quarkonium state. As already argued in Sec. 4, for n = 1 states one should vary the renormalization scales between (µ 1 , R 1 ) ∈ [ 1.5, 4 ] GeV, whereas for states with n = 2 one should rather take the ranges (µ 2 , R 2 ) ∈ [ 1, 4 ] GeV. For n = 3 states perturbation theory is already badly behaved such that scales around 1 GeV are too low and the perturbative series starts to fail, which significantly increases the error associated to the scales. For that reason, the ranges used for n = 3 will be (µ 3 , R 3 ) ∈ [ 1.5, 4 ] GeV, which still yields uncertainty bars much larger than for n = 1, 2 states. In any case n = 3 states are only used as an illustration and do not enter our final numbers. Since the µ n and R n scales have to be varied together to account for theory correlations we use the following parametrization : µ 2 (µ) = µ, R 2 (R) = R, µ 1,3 (µ) = 1.5 GeV + 2.5 (µ − 1 GeV)/3, R 1,3 (µ) = 1.5 GeV + 2.5 (R − 1 GeV)/3, and vary µ and R between 1 GeV and 4 GeV. This linear rescaling for n = 1, 3 implements the correct correlation to the n = 2 state avoiding unnaturally low scales, and makes the χ 2 function dependent on the global µ and R only. The central value for m Q will be calculated as the average of all the best-fits on a (µ, R) grid.
From the χ 2 minimization we extract the uncertainty coming from the experimental error of the heavy quarkonium masses, denoted as ∆ exp . Around the best-fit value of a given (µ, R) pair, the χ 2 function can be approximated by The central value for the experimental error will be taken as the average of all the ∆ exp in the (µ, R) grid (which happen to be all very similar). Due to the high accuracy of the PDG heavy quarkonium masses up to n = 3, such uncertainty is very small. The largest source of uncertainty will be that associated to the variation of the scales, dubbed ∆ pert . Its value will be taken as the difference of the maximum and minimum best-fit values in the grid : Finally, the errors associated to the uncertainty on the strong coupling ∆ αs (using the 2016 PDG world average α (n f =5) s (m Z ) = 0.1181 ± 0.0011 [63]) and, for the bottom mass fits, the charm mass ∆ mc (m c = 1.28 ± 0.03 GeV [63]) will be computed by repeating the fits using the 1-σ upper and lower values for these quantities, and taking the half of the differences of the respective resulting best-fit values. 14 For our fits we have followed two complementary methods, which we have carried out independently as an extra check. In our first strategy we created grids in m Q for the theoretical prediction of each individual state. We created different grids for each pair {µ, R} in the grid and theoretical setup. The grids are then combined into χ 2 functions for the various datasets, which are minimized to fit for the bottom mass. The second strategy computes the bottom mass from individual states numerically solving the equation M theo = M exp . We find that this equation can be solved iteratively to arbitrary precision, and that with 10 iterations one obtains 10 significant digits. In order to perform global fits to multiple states, in this second strategy we directly minimize the χ 2 function numerically using the COMPASS_SEARCH [65] routine. Both methods agree in more than three significant digits.

Bottomonium Fits and Determination of m b
In this section we present the core result of our analysis, the determination of the bottom quark mass. We provide results for fits to 14 individual bottomonium states (all states with n ≤ 3), as well as for global fits, for which we create a few datasets : 14 For charmonium fits we proceed in the same way with the bottom quark mass PDG value to determine  schemes, respectively. Within each flavor-scheme block, the uncertainty is split in columns second to fifth into experimental, perturbative, due to uncertainty in α (n f =5) s (m Z ) and due to uncertainty in m c (m c ), respectively. For the various global fits the error associated to the experimental uncertainty (that is, the error coming from the fit) is rescaled by the factor χ 2 min /d.o.f. All masses and errors are expressed in GeV.
Our results at N 3 LO are collected in Tables 1 and 2 for the MSRn and MSRp schemes, respectively, and in Fig. 8(a) for the MSRn scheme with n = 3 active flavors. The table presents the results using perturbative expansions in terms of α (n =4) s and α (n =3) s , shown in the left and right blocks, respectively. In columns second to fifth we split the error in its various contributions : experimental, perturbative, and due to the uncertainties in α and m c (m c ). We observe that the perturbative uncertainty clearly dominates over the rest, followed by the α (n f =5) s (m Z ) error. The error due to the uncertainty on the charm mass is negligible. The experimental uncertainty deserves a more detailed explanation : for individual fits it is negligibly small in all cases, but for global fits we modify it to account for the large values of the χ 2 min . For instance, in the Set n≤2 we get χ 2 min /d.o.f. = 21300, indicating that the theoretical accuracy is much less than the experimental precision. In order to correct for this deficiency of our fits, we inflate our experimental fit uncertainties by a factor such that χ 2 min /d.o.f. becomes unity. This procedure cannot be applied to individual fits as for those both χ 2 min and d.o.f are identically zero. Lower orders for Set n=1 , Set n=2 and Set n≤2 in the MSRn scheme with n = 3 active flavors are shown in Fig. 8(b). Finally in Fig. 9(a) we show the results of various global fits at N 3 LO in the MSRn scheme both with n = 4 and n = 3 active flavors.
Following the observations made in Ref. [41], which we confirmed in Sec. 4, we favor the expansion with n = 3 flavors over n = 4, and take it for our final determination. In any case, the perturbative uncertainties are slightly larger for n = 3 [ see Fig. 9(a) ], so our choice is conservative. Furthermore, we consider Set n≤2 as our default, as all states contained in it can be accurately described within perturbation theory, and is less biased than taking only ground states. Finally, we observe than the difference between the MSRn and MSRp schemes is at the sub-MeV level for the central values, (0.02%) but the MSRn scheme yields perturbative uncertainties 13 MeV (28%) smaller than the MSRp. Accordingly we consider the MSRn scheme (which is also theoretically cleaner) as our default, finding our final result for the bottom mass : The result in the n = 4 scheme is 8.6 MeV higher than our final result with n = 3. If we add this difference to our uncertainties to account for effects of missing higher-order charm mass corrections we would get a total error of 40 MeV. If we add the difference between Set n=1 and Set n=2 as an estimate of non-perturbative effects our total uncertainty would raise to 43 MeV (44 MeV if both effects are considered simultaneously). At the sight of this, we believe our uncertainty of 39 MeV is conservative enough. In Fig. 9(b) and Table 3 the predictions for n ≤ 3 bottomonium masses are shown at N 3 LO using our best-fit value for the bottom quark mass. From the results it is evident  Table 2. Same as Table 1 for the MSRp scheme.
that the experimental precision is far superior than the theoretical accuracy. Finally we explore the dependence of our final result on the value of α The central value rapidly decreases as α (n f =5) s (m Z ) grows, but the perturbative error mildly increases [ see Fig. 10(a) ].

Charmonium Fits and Determination of m c
One can use the master formula in Eq. (2.6) in the MSR scheme to compute the masses of cc bound states. One simplification with respect to bottomonium is that there are no lighter quarks with mass larger than Λ QCD , therefore there are no corrections from finite mases of lighter quarks. On the other hand the physical scales in the problem are smaller,  hence one expects much larger perturbative (as well as non-perturbative) uncertainties. As discussed in Sec. 4, we will restrict our analysis to n = 1 states and will vary µ and R between 1.2 GeV and 4 GeV, which yields order-by-order convergence and agreement with moderate perturbative uncertainties. We will perform individual (state by state) and global (n = 1) fits, which are summarized at N 3 LO in Table 4, whereas lower orders are shown graphically in Fig. 11(a). We find that the result of the global fit is almost identical to the individual fit to the J/ψ state. This is nothing but expected given that the experimental uncertainty of J/ψ is 80 times smaller than that of η c . However, the combined fit yields very large χ 2 min values, of the order of 45700, indicating that the theory accuracy is much worse than the experimental one. Therefore we apply the same rescaling procedure already applied in the bottom fits.
Our results are collected in Table 4 for the MSRn (left block) and MSRp (right) schemes. The error is divided into various contributions : experimental, perturbative, and due to the uncertainties in α s and m b (m b ). For the latter we take the 2016 world average m b = 4.18 +0.04 −0.03 GeV [63]. Same as we did for the bottom analysis, we observe than theoretical uncertainties greatly dominate over α s uncertainties, and again experimental uncertainties (coming form the fit procedure) are negligibly small. The m b uncertainty is, as expected, tiny, as the dependence on the bottom mass comes only from the α s threshold matching from 5 to 4 flavors. Within three-digit rounding, the total and perturbative errors are the same.
We observe that the difference between the MSRn and MSRp schemes is 1 MeV for the central values, (0.4%) but the MSRn scheme yields perturbative uncertainties 9 MeV (10%) smaller than the MSRp. Accordingly we again consider the MSRn scheme as our default, and taking the global fit as the least biased we find our final result for the charm mass : MeV. These variations are much larger than what we find in the bottom analysis, reflecting that perturbation theory is obviously less convergent for charmonium. Nevertheless they are well contained in our estimate for higher-order corrections. Finally, if we add the difference between the results to individual fits to J/ψ and η c to our error budget to estimate non-perturbative corrections, our combined uncertainty would increase to 68 MeV. In Fig. 11(b) and Table 5 we show the prediction for n = 1 charmonium states at N 3 LO using Eq. (5.6) for charm quark mass. It appears very clear that the experimental precision is far superior than the theoretical accuracy. We also explore the dependence of the central value and perturbative error with α (n f =5) s (m Z ). We again find a rather linear dependence of the former while the latter is best approximated by a quadratic fit form :

Simultaneous Fits to
The large value of χ 2 per degree of freedom in fits to the bottomonium spectrum and the big amount of precise data at our disposal suggests that one could fit another parameter. In this section we briefly elaborate on the possibility of performing a simultaneous fit to both the bottom quark mass and the strong coupling constant, as was first suggested in Ref. [30].

MSRp
States  Table 4. Results for m c fits for different set of states at N 3 LO. The error is split by their different contributions. All masses and errors are expressed in GeV. Columns 2 to 5 and 6 to 9 show the MSRn and MSRp schemes, respectively. Within each scheme block, the uncertainty is split in columns second to fifth into experimental, perturbative and due to uncertainty in α (n f =5) s (m Z ) and m b , respectively. For the n = 1 fits the error associated to the experimental uncertainty (that is, the error coming from the fit) is rescaled by the factor χ 2 min . (5) 3.07 ± 0.14 J/ψ(1S) 1 3 S 1 3.096900 (6) 3.10 ± 0.10 Table 5. Rescaling the experimental errors by χ 2 /d.o.f. the uncertainties increase to 0.016 GeV and 0.0010 for m b and α s , respectively. Although the uncertainty on α s is only 4% (good in absolute terms), and our determination is certainly compatible with the world average, it cannot compete with its current precision, which is at the percent level. and charm analyses.

Bottom Quark mass
In Fig. 12(a) we compare our result for the bottom quark mass with other determinations that also use bottomonium energy levels. We compare to the N 2 LO analysis of Ref. [30] and the N 3 LO analyses of Refs. [53,54]. While our analysis and those of Refs. [30,53] use the n = 3 flavor scheme as the reference calculation, Ref. [54] uses n = 4 for the main analyses and takes the n = 3 computation to estimate uncertainties due to the finite charm quark mass. While Refs. [30,53] fit only to the Υ(1S) state and Ref. [54] considers both states with n = 1, we include all states with principal quantum number n ≤ 2. Ref. [54] takes for their final determination the (unweighted) average of the individual determinations for Υ(1S) and η b (1S), while we perform global fits which take into account theoretical correlations and weight each individual state by its experimental uncertainty. Refs. [30] and [54] use the MS scheme and vary scales according to the principle of minimal sensitivity. Finally, Ref. [30] uses α Our theoretical treatment is very similar to that of Ref. [53], since both analyses use a low-scale short-distance mass (MSR vs RS). As for scale variation, for n = 1 it is identical for µ, while the lower limit for R is 1 GeV in Ref. [53], as opposed to our 1.5 GeV. On the other hand we vary both scales independently on a square grid, while Ref. [53] varies one scale at a time. Both our central values and theoretical uncertainties are in excellent agreement.
The theoretical error found in Ref. [54] is only 20 MeV, half the size of ours, while their central value is 19 MeV lower. This uncertainty estimate is based on the PMS, for which the canonical scales are found to be 5.352 GeV and 6.157 GeV for the Υ(1S) and η b (1S) states, respectively. We already argued in Sec. 4 those values are unnaturally large for a nonrelativistic system of bottom quarks. The uncertainty is then estimated by comparing the results of the canonical scale with those that use twice that scale. Since the variation is only towards larger scales, the variance found is rather small : 21 MeV and 18 MeV, respectively. We also observe that this scale variation leads to a slight inconsistency between LO and N 3 LO results.
The result of Ref. [30] is 26 MeV lower than ours, but still in full agreement. But being only an N 2 LO analysis, it claims an uncertainty due to missing higher order corrections of 25 MeV, which is 26% smaller than our estimate. This analysis also uses the PMS to fix the scale, but the perturbative uncertainty is obtained by estimating the effect of different implementations of the Υ-expansion scheme. In our code we can also use two counting schemes, depending how one considers the scaling of the parameter R. The Υ-expansion scheme considers R ∼ m Q , but in the so called non-relativistic counting one takes R ∼ m Q α s . The difference between our results in the Υ-scheme and non-relativistic counting is 6 MeV. We can also compare our results with the expanded out and iterative methods of Eqs. (2.7) and (2.8) 15 , finding differences below 6 MeV. All these are much smaller than our perturbative uncertainty.
In Fig. 12(b) we compare the outcome of our analysis with determinations of the bottom mass from other observables. We compare to a Lattice determination [66], one relativistic [67] and two non-relativistic [58,68] sum rule analyses. 16 Ref. [66] uses a nonrelativistic lattice action to compute high moments of the vector correlator. 17 These are compared to relativistic continuum perturbation theory to extract the bottom quark mass, a procedure that could be called into question. Refs. [58,68] use non-relativistic sum rules in a low-scale short-distance scheme, but the former uses fixed-order perturbation theory at N 3 LO while the latter performs large-log resummation at (partial) N 2 LL in the framework of vNRQCD. Finally Ref. [67] uses the second moment of the vector correlator in the MS scheme and varies independently the scales associated to α s and m b . The error from relativistic sum rules are generically smaller since low-energy scales are not being probed.

Charm Quark mass
In Fig. 13 we compare our charm quark mass determination with a selection of previous results, which include relativistic sum rules [67], Lattice QCD [71] and fits to the lowest lying charmonium states [54]. In the following we discuss the latter by itself, and two former together. 15 These equations can be recast into an expression for the binding energy of the state, and can therefore be implemented into our χ 2 function. 16 The result of Ref. [58] has been updated in [69] with the four-loop coefficient of the MS-pole mass relation. 17 Another recent lattice determination can be found in Ref. [70], which computes the ratio m b /mc on the lattice and determines m b from their mc lattice determination. They obtain m b = 4.184 ± 0.089 GeV. The same approach was used in Ref. [71] which found m b = 4.162 ± 0.048 GeV. These results are less precise than the direct determination of [66] m b = 4.196 ± 0.023 GeV.  shows the outcome of our analysis, a lattice determination (blue), two results from non-relativistic sum rules (purple) and one from relativistic sum rules (green). Both panels : The PDG world average value is shown as a vertical gray band. Note that the PDG world average has asymmetric uncertainties.
Let us start by comparing our result in Eq. (5.6) with the analysis of Ref. [54], which also uses non-relativistic QCD at N 3 LO, but employing the MS short-distance scheme. Furthermore, the principle of minimal sensitivity is used to set the default renormalization scale µ, finding 2.14 GeV and 2.42 GeV for J/ψ and η c , respectively. These are much larger than what we find form requiring a small size for the perturbative logarithms (∼ 1 GeV). Finally, the perturbative uncertainty is estimated by setting the renormalization scale to twice the minimal sensitivity value. That yields for the average of their individual determinations an uncertainty of 23 MeV, which is a factor of two smaller than our estimate. Their central value is 27 MeV smaller than ours, but results are compatible within errors. Finally, while we perform a global fit two both n = 1 states, which takes into account theoretical correlations, Ref. [54] determines the charm mass from J/ψ and η c individually and computes the (unweighted) average of those as their final determination.
The analyses of Refs. [67,71] use relativistic sum rules at O(α 3 s ) for the vector and pseudo-scalar correlators, respectively. 18 Although the analyses look similar at first glance, they differ in the way perturbative uncertainties are estimated. While [67] varies renormalization scales associated to α s and m c independently in the range m c to 4 GeV, Ref. [71] sets µ = 3 GeV and makes an educated guess for missing higher order corrections, resulting in a much smaller perturbative error. The investigations performed in Ref. [67] suggest that the convergence of the pseudo-scalar perturbative series is significantly worse than of the vector correlator, and that the theoretical uncertainty of Ref. [71] might be underestimated. The reader is referred to Ref. [67] for more details on the analysis.

Conclusions
In this work we have studied the bottomonium and charmonium spectra within Non-Relativistic QCD at the perturbative level up to N 3 LO. For the former we have considered corrections coming from the finite charm quark mass up to N 2 LO, and studied setups with n = 3 and n = 4, concluding that the former is very close to the decoupling limit and likely captures most of the finite charm quark mass effects. To cancel the O(Λ QCD ) renormalon of the QCD static potential we employ the low-scale short distance MSR mass, realized in its Practical and Natural versions. We make a dedicated study of scale setting by requiring that the perturbative logarithms become minimal, and choose to vary around that canonical scale such that the size of the argument of those logarithms departs 40% from unity. We vary the renormalization scale µ and the infrared scale R independently in the same range. With this prescription we find that bottomonium states with principal quantum number n smaller than 3 can be described within perturbation theory, but for n = 3 states no convergence is found. For charmonium we find that n = 1 states can be predicted although the convergence of perturbation theory is significantly worse. One could argue that a more appropriate short-distance mass for bottomonium physics is, for instance, the 1S mass [38][39][40], which does not explicitly depend on any additional low-energy scale. 19 However R variation accounts for the uncertainty associated in the conversion of the 1S mass into the MS mass, in which the MSR mass plays the role of a smooth interpolator between typical non-relativistic scales of the order of the inverse of the Bohr radius and relativistic scales as the mass itself.
For bottomonium we include effects of the charm quark mass in the R-evolution of the MSR mass, what ensures there are no Heavy Quark Symmetry breaking corrections for the Natural MSR mass. Furthermore, requiring that the massive R-anomalous dimension vanishes we predict the uncalculated O(α 4 s ) virtual massive quark corrections to the relation between the MS and pole masses, and obtain an (iterative) expression for these corrections in the asymptotic limit. We also define the MSR scheme with (n − 1) flavors, which is smoothly connected to the MSR (n ) scheme at the scale R = m q . Of course the MSR (n ) scheme is also smoothly connected to the MS scheme at R = m Q , what makes our connection between the MSR (n −1) and MS schemes smooth. In this context smooth means summing up large logarithms of ratios of scales.
To determine the bottom and charm quark masses we compare to individual states and perform global fits. Since theoretical uncertainties are highly correlated among the various states we perform fits for given values of the renormalization scales, and scan over those in a grid to determine the uncertainties due to missing higher-order corrections. Since the lowest bound in which these scales are varied depends on the principal quantum number, we correlated the values of µ and R for each value of n by rescaling the n = 1 scales to the appropriate ranges. For our final result we choose the most complete dataset that includes states with n ≤ 2 for bottom and n = 1 for charm. Taking the MSRn scheme with n = 3 active flavors in both cases we find : where all errors have been added in quadrature, but are fairly dominated by the perturbative uncertainties. We have compared our results with previous determinations which use either quarkonium spectroscopy or other methods, and find good agreement with all of them and with the world average. Our estimate of the uncertainty due to missing higher-order corrections is larger than those of Refs. [30,54], which use the principle of minimal sensitivity, but very similar to that of Ref. [53], which varies both µ and R scales independently (but one at a time) in a range similar to ours. We have also explored the possibility of performing simultaneous fits to m b and α s , finding that if both n = 1 and 2 states are included a fit to both parameters is possible. We find : where again all errors have been added in quadrature and are completely dominated by theoretical uncertainties. The bottom quark mass uncertainty grows by 64%, and the uncertainty on α s , although good in absolute terms, cannot compete with recent determinations that reach ∼ 1%. Let us close this article discussing possible improvements on our analysis. Although the study performed in this article suggests that non-perturbative effects are small for n = 1 and 2, a more systematic study of these could make our phenomenological observation into a definite statement. Having higher-order corrections in the near future seems unrealistic, but one could perform the analysis using a renormalization-group-improved theoretical expression, possibly in the context of pNRQCD. This would sum up ultra-soft effects that already arise at O(ε 4 ), deteriorating the convergence of the perturbative series. One would expect that such an RGE analysis would yield smaller perturbative uncertainties, from which the charmonium analysis would particularly benefit. On the other hand, one could incorporate the whole QCD static potential at the leading-order Hamiltonian and carry out perturbation theory around it, as in the study of Ref. [72]. Finally one could use our estimate for the O(α 4 s ) charm mass correction to the MS-pole mass relation and the decoupling limit approximation to estimate the missing finite charm quark mass corrections at O(ε 4 ), and check if this brings the n = 3 and n = 4 schemes closer to each another.
Finally, the three-loop correction from massive lighter quarks to the relation between the pole and the MS masses, ∆