1 Introduction

Highly precise values of the charm and bottom quark masses can be obtained in QCD perturbation theory, because they are sufficiently large to suppress non-perturbative effects. The object of interest is the vector current correlation function, which can be studied experimentally in a clean environment in electron-positron annihilation. Furthermore, by considering moments of the correlator one arrives at the theoretically most accessible inclusive observables, which (at least in the case of the vector current) offer excellent perturbative convergence even in the context of the charm quark mass, \(m_c\), where the strong coupling, \(\alpha _s(m_c) \sim 0.4\), is not all that small. By the specific method [1] reviewed in Sect. 2, which is a concrete implementation of the general QCD sum rule idea [2, 3], we were able to determine \(m_c\) with a controlled theory uncertainty [4], competitive even with the results from lattice gauge theory simulations [5], and in excellent agreement with them [6].

In the case of the bottom quark mass, \(m_b\), lattice gauge theory faces an impediment, as the strong interaction scale of \(\mathcal{O}(m_\rho )\) differs significantly from \(m_b\) itself. By contrast, this separation of scales is a virtue in any approach effectively utilizing the operator product expansion (OPE). Together with the smaller value of the strong coupling, \(\alpha _s(m_b) \sim 0.23\), this turns the QCD sum rule approach into the method of choice to determine \(m_b\).

On the other hand, much less experimental information is available on the bottom quark current correlator compared to that of the charm quark. This is because the \(b{\bar{b}}\) electro-production cross section is more than an order of magnitude smaller than that for non-\(b{\bar{b}}\) quark production, so that B tagging is needed in order to determine the exclusive cross section. Furthermore, while formally the domain of the dispersion integration extends to infinite energy, the experimentally scanned kinematical region for bottom meson pair production does not exceed \(\sqrt{s} \approx 11.2\) GeV, leaving a roughly four times smaller window in relative comparison to open charm production. Fortunately, this problem can be solved by considering higher moments, which in contrast to the charm case [4, 7,8,9] is a viable option for \(m_b\).

The essential feature of our approach (Sect. 2) is that the masses and electronic decay widths of the low-lying \(\Upsilon \) resonances provide sufficient experimental knowledge to determine \(m_b\), as long as the 0th moment is considered alongside the more standard positive-n moments. We may then use the limited experimental information from the continuum region that is available to test the stability of our results in Sect. 3 as a function of the moment number, and to control (in fact over-constrain) the theoretical uncertainty (see Sect. 4). We present our conclusions and a comparison with other approaches in Sect. 5.

2 Moment sum rules

The transverse part of the correlation function \({\hat{\Pi _q}}(t)\) (quantities marked with a caret are defined in the \(\overline{\textrm{MS}}\) renormalization scheme) of two heavy quark vector currents obeys the subtracted dispersion relation [10],

$$\begin{aligned} 12 \pi ^2 \frac{{\hat{\Pi _q}} (0) - {\hat{\Pi _q}} (-t)}{t} = \int \limits _{4 {{\hat{m}}}_q^2}^\infty \frac{\textrm{d} s}{s} \frac{R_q(s)}{s + t}\ , \end{aligned}$$
(1)

where \(R_q(s) = 12 \pi \text{ Im } {\hat{\Pi _q}}(s)\), and where \({{\hat{m}}}_q = {{\hat{m}}}_q ({{\hat{m}}}_q)\) is the heavy quark mass. Taking derivatives in the limit \(t \rightarrow 0\), one obtains the moments [2, 3, 11],

$$\begin{aligned} {{\mathcal {M}}}_n := \left. \frac{12\pi ^2}{n !} \frac{d^n}{d t^n} {\hat{\Pi _q}}(t) \right| _{t=0} = \int \limits _{4 {{\hat{m}}}_q^2}^\infty \frac{\textrm{d} s}{s^{n+1}} R_q(s) \quad (n \ge 1).\nonumber \\ \end{aligned}$$
(2)
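Equation (2) follows from Eq. (1) by expanding both sides in powers of t. The following lines are a short sketch of this step, assuming \(|t| < 4 {{\hat{m}}}_q^2\) so that the geometric series converges under the integral:

$$\begin{aligned} \frac{1}{s+t}&= \sum _{k=0}^\infty \frac{(-t)^k}{s^{k+1}}, \nonumber \\ \frac{{\hat{\Pi _q}}(0) - {\hat{\Pi _q}}(-t)}{t}&= \sum _{n=1}^\infty \frac{(-t)^{n-1}}{n!} \left. \frac{d^n}{d t^n} {\hat{\Pi _q}}(t) \right| _{t=0}. \end{aligned}$$

Inserting both series into Eq. (1) and comparing the coefficients of \((-t)^{n-1}\) (i.e. \(k = n-1\)) reproduces Eq. (2) for \(n \ge 1\).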

A 0th moment [1] can also be defined,

$$\begin{aligned} {{\mathcal {M}}}_0:= & {} - \lim _{t \rightarrow \infty } \left[ {\hat{\Pi }}(-t) - {\hat{\Pi ^\infty }}(-t) \right] \nonumber \\= & {} \int \limits _{{{\hat{m}}}_q^2}^\infty \frac{\textrm{d} s}{s} \left[ R_q(s) - R_q^\infty (s) \right] , \end{aligned}$$
(3)

provided the limit \(t \rightarrow \infty \) and the integration over \(\text{ Im } {\hat{\Pi }}(s)\) at \(s \rightarrow \infty \) are regularized by properly chosen subtractions \({\hat{\Pi ^\infty }}(-t)\) and \(R_q^\infty (s)\) [1, 4]. \({{\mathcal {M}}}_0\) is then obtained from the dispersion relation for the difference \({\hat{\Pi _q}} (-t) - {\hat{\Pi _q}} (0)\) where the (unphysical) constant \({\hat{\Pi _q}} (0)\) is subtracted. The subtraction and the explicit sum rule for \(\mathcal{M}_0\) will be given below.

The left-hand sides of Eqs. (2) and (3) can be calculated in perturbative QCD (pQCD) order by order in the strong coupling \({\hat{\alpha _s}}({{\hat{m}}}_q)\) as a function of \({{\hat{m}}}_q\). On the right-hand side one can use the optical theorem to relate \(R_{q}(s)\) to the cross section for heavy quark production in \(e^+e^-\) annihilation. It can be split into a contribution from a small number of narrow resonances below the heavy quark production threshold and a continuum contribution above,

$$\begin{aligned} R_q(s) = R_q^{\textrm{res}}(s) + R_q^{\textrm{cont}}(s). \end{aligned}$$
(4)

One possible method to determine \({{\hat{m}}}_q\) is thus to combine data, where available, for the evaluation of the integrals on the right-hand side of Eqs. (2) and (3) with predictions from pQCD at large s where there are no data. This approach has been followed, for example, in Refs. [7, 9, 12,13,14]. A certain amount of modeling is necessary since experimental information about \(R_q(s)\) is restricted to relatively small energies.

Here, we will choose a different strategy. The idea is to describe the continuum region above the heavy quark production threshold on average only, not having to rely on local quark-hadron duality. We follow Refs. [1, 4] and use the simple ansatz,

$$\begin{aligned} R_q^{\textrm{cont}}(s)= & {} 3 Q^2_q \lambda ^q_1 (s) \sqrt{1 - \frac{4\, {\hat{m}}_q^2 (2 M)}{s^\prime }} \nonumber \\{} & {} \times \left[ 1 + \lambda ^q_3 \left( \frac{2\, {\hat{m}}_q^2(2 M)}{s^\prime } \right) \right] , \end{aligned}$$
(5)

where \(3 Q_q^2 \lambda _1^q(s)\) is the zero-mass limit of \(R_{q}(s)\) and \(s' := s + 4 [{\hat{m}}_q^2(2M) - M^2]\). Note that using the variable \(s^\prime \) ensures that the physical threshold is at \(\sqrt{s} = 2M\) where M is taken as the mass of the lightest pseudoscalar meson, i.e. in the case of the bottom quark, \(M = M_{B^{\pm }} = 5.27934\) GeV [6]. This ansatz guarantees a smooth transition between the onset of the heavy quark production threshold at 2M and pQCD at large s, including a leading \({\hat{m}}_q^2/s\) pQCD correction. Since we only need to consider moments, fine details of the ansatz are not very important. However, we will also investigate variations of our ansatz where the resonances above the threshold \(4M^2\), \(\Upsilon (4S)\), \(\Upsilon (5S)\), and \(\Upsilon (6S)\), are explicitly added to the expression (5).
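To make the structure of the ansatz concrete, a minimal Python sketch of Eq. (5) is given here. The function name `r_cont`, its argument names, and the treatment of \(\lambda _1^q(s)\) as a constant are assumptions of this illustration, not part of the analysis itself.

```python
import math


def r_cont(s, m_hat, M, lambda3, lambda1=1.0, charge_sq=1.0 / 9.0):
    """Continuum ansatz of Eq. (5) for a given squared energy s (GeV^2).

    m_hat is the MS-bar mass evaluated at the scale 2M, M the threshold mass,
    lambda3 the free parameter. lambda1 stands in for lambda_1^q(s) and is
    held constant here, which is a simplifying assumption of this sketch.
    """
    if s <= 4.0 * M**2:
        return 0.0  # below the physical threshold sqrt(s) = 2M
    s_prime = s + 4.0 * (m_hat**2 - M**2)  # shifted variable s'
    x = m_hat**2 / s_prime
    return 3.0 * charge_sq * lambda1 * math.sqrt(1.0 - 4.0 * x) * (1.0 + 2.0 * lambda3 * x)
```

By construction the square root vanishes at \(s = 4M^2\), and the function tends to \(3 Q_q^2 \lambda _1^q\) at large s.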

The two unknowns, namely the heavy quark mass \({\hat{m}}_q({\hat{m}}_q)\), and the single free parameter in Eq. (5), \(\lambda ^q_3\), will be determined from Eq. (3) and one of the Eqs. (2). The other moments are then fixed and can be used to check the consistency of the approach [4]. Thus, besides the value of M, only the masses and electronic decay widths of the low-lying resonances are needed as the experimental input to extract \({\hat{m}}_q({\hat{m}}_q)\). The quark mass and \(\lambda _3^q\) can, in principle, be determined from any combination of two moments. But only including the 0th moment provides the leverage to sufficiently break the correlation between \(\lambda _3^q\) and \({\hat{m}}_q\). The particular details of the ansatz proposed in Eq. (5) are of minor importance when consistency across pairs of moments is found.

We now give the explicit expressions needed for our numerical evaluation. From now on we particularize to the bottom quark case, in which we may neglect higher-dimensional operators in the OPE, such as from the gluon condensate (see footnote 1). Perturbative QCD predictions for the positive moments can be cast into the form,

$$\begin{aligned} {{\mathcal {M}}}_n^{\textrm{pQCD}} = \frac{1}{4} \left( \frac{1}{2{\hat{m}}_b({\hat{m}}_b)} \right) ^{2n} {\hat{C}}_n\ , \end{aligned}$$
(6)

with

$$\begin{aligned} {\hat{C}}_n= & {} C_n^{(0)} + \left( \frac{{\hat{\alpha _s}}}{\pi }+ \frac{\alpha _{\textrm{em}}}{12 \pi }\right) C_n^{(1)} + \frac{{\hat{\alpha }}_s^2}{\pi ^2} C_n^{(2)}\nonumber \\{} & {} +\frac{{\hat{\alpha }}_s^3}{\pi ^3} C_n^{(3)} + {{\mathcal {O}}}({\hat{\alpha }}_s^4), \end{aligned}$$
(7)

where \(\hat{\alpha }_s = \hat{\alpha }_s({\hat{m}}_b( {\hat{m}}_b ) )\). The coefficients \({\hat{C}}_n\) are known [15,16,17,18,19,20] up to \({{\mathcal {O}}}({\hat{\alpha }}_s^3)\) for \(n \le 4\), and up to \({\mathcal {O}}({\hat{\alpha }}_s^2)\) for the rest [21, 22]. The numerical values required for our analysis are collected in Table 1.
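Equations (6) and (7) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the coefficients \(C_n^{(i)}\) must be supplied by the caller (for example from Table 1), since no numerical values are hard-coded here.

```python
import math


def c_hat(coeffs, alpha_s, alpha_em=0.0):
    """Perturbative series of Eq. (7).

    coeffs = (C0, C1, C2, C3) are the coefficients C_n^{(i)} for one fixed n;
    alpha_s is evaluated at the scale m_b(m_b).
    """
    c0, c1, c2, c3 = coeffs
    a = alpha_s / math.pi
    return c0 + (a + alpha_em / (12.0 * math.pi)) * c1 + a**2 * c2 + a**3 * c3


def moment_pqcd(n, m_b, coeffs, alpha_s, alpha_em=0.0):
    """Theoretical moment of Eq. (6) in GeV^(-2n) for m_b given in GeV."""
    return 0.25 * (1.0 / (2.0 * m_b)) ** (2 * n) * c_hat(coeffs, alpha_s, alpha_em)
```

The strong dependence on the mass, \({{\mathcal {M}}}_n \propto {\hat{m}}_b^{-2n}\), is what makes the higher moments sensitive to \({\hat{m}}_b\).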

Table 1 Coefficients \(C_n^{(i)}\) for the perturbative expansion of the QCD moments entering Eq. (7). The values quoted with an uncertainty are taken from Ref. [23]

Fig. 1 The theoretical moments \({{\mathcal {M}}}_n^{\textrm{pQCD}}\) in Eq. (6) (multiplied by \(10^{2n+1} \text{ GeV}^{4n+2}\)) for the reference value \({\hat{m}}_b({\hat{m}}_b) = 4.180\) GeV at different orders in \({\hat{\alpha }}_s\). Blue squares show results at \({{\mathcal {O}}}({\hat{\alpha }}_s)\), red circles include \(\mathcal{O}({\hat{\alpha }}_s^2)\) terms, and green triangles refer to the full \({{\mathcal {O}}}({\hat{\alpha }}_s^3)\). The error bars are the truncation uncertainties from Eq. (12) at the given order

Since we evaluate the moments up to \({{\mathcal {O}}} ({\hat{\alpha }}_s^3)\), we use the predictions for \(n>4\) provided in Ref. [23], inducing the uncertainties shown in the table. In the approach of Ref. [23], based on the Mellin–Barnes transform, the two-point correlator at \(\mathcal{O}(\alpha _s^3)\) is reconstructed from the Taylor expansion at \(q^2=0\), the threshold expansion at \(q^2 = 4{{\hat{m}}}_q^2\), and the high-energy expansion at \(q^2 \rightarrow \infty \). The reconstruction is analytic and systematic, and is controlled by an error function which becomes smaller as more terms in the expansions are known. Once the correlator is reconstructed, one can calculate the moments in Eq. (2). An overview of our theory errors for the moments up to \(n=7\) is shown in Fig. 1. An alternative prediction of these coefficients in Ref. [24] has quoted errors that are smaller by one order of magnitude or more, leading to a total error in the extracted \({{\hat{m}}}_b\) about an MeV smaller, but we believe that the conservative approach of Ref. [23] is a better reflection of the corresponding error.

We shall approximate the contributions of the narrow resonances by \(\delta \)-functions,

$$\begin{aligned} R_b^{\textrm{res}}(s) = \sum _{R = \Upsilon (1S), \Upsilon (2S), \Upsilon (3S)} \frac{9\pi }{\alpha _{\textrm{em}}^2(M_R)} M_R \Gamma _R^e \delta (s - M_R^2), \nonumber \\ \end{aligned}$$
(8)

where the masses \(M_R\) and electronic widths \(\Gamma ^e_R\) [6] are listed in Table 2. The values of the running fine structure constant at the resonance are also given in the table (see footnote 2).
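Inserting Eq. (8) into Eq. (2), the \(\delta \)-functions collapse the integral, so that each narrow resonance contributes \(9\pi \Gamma _R^e / [\alpha _{\textrm{em}}^2(M_R) M_R^{2n+1}]\) to \({{\mathcal {M}}}_n\). A minimal sketch of this sum follows; the function name and the tuple layout of the input are assumptions of the illustration, and the resonance parameters are to be taken from Table 2.

```python
import math


def resonance_moment(n, resonances):
    """Narrow-resonance contribution to the n-th moment, from Eqs. (2) and (8).

    resonances is an iterable of (mass, gamma_ee, alpha_em) tuples, with the
    mass and the electronic width in GeV and alpha_em taken at the resonance.
    The result is in GeV^(-2n).
    """
    total = 0.0
    for mass, gamma_ee, alpha_em in resonances:
        total += 9.0 * math.pi * gamma_ee / (alpha_em**2 * mass ** (2 * n + 1))
    return total
```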

Table 2 Resonance data [6] used in the analysis. The uncertainties from the resonance masses are negligible. The first three resonances are below the continuum threshold and define \(R_b^{\textrm{res}}(s)\), while the higher ones will be needed later when we evaluate the theoretical uncertainties

Finally, we need the regularized expression for \({{\mathcal {M}}}_0\), which requires subtracting the zero-mass limit of \(R_{b}(s)\), i.e. \(3 Q_b^2 \lambda _1^b(s)\). While \(\lambda _1^b(s)\) is known to \(\mathcal{O}({\hat{\alpha }}_s^4)\), we need only the third-order expression [26],

$$\begin{aligned} \lambda ^b_1(s)&= 1 + \frac{{\hat{\alpha }}_s}{\pi } + \frac{3 Q_b^2 \alpha _{\textrm{em}}}{4 \pi } \left( 1-\frac{1}{3} \frac{{\hat{\alpha }}_s}{\pi } \right) \nonumber \\&\quad + \frac{{\hat{\alpha }}_s^2}{\pi ^2} \left[ \frac{365}{24} - 11 \zeta (3) + n_b \left( \frac{2}{3} \zeta (3) - \frac{11}{12} \right) \right] \nonumber \\&\quad + \frac{{\hat{\alpha }}_s^3}{\pi ^3} \left[ \frac{87029}{288} - \frac{121}{8} \zeta (2) - \frac{1103}{4} \zeta (3) + \frac{275}{6}\zeta (5) \right. \nonumber \\&\quad + \left. n_b \left( - \frac{7847}{216} + \frac{11}{6} \zeta (2) + \frac{262}{9} \zeta (3) - \frac{25}{9}\zeta (5) \right) \right. \nonumber \\&\quad \left. + n_b^2 \left( \frac{151}{162} - \frac{\zeta (2)}{18} - \frac{19}{27} \zeta (3) \right) \right] , \end{aligned}$$
(9)

where \({\hat{\alpha }}_s = {\hat{\alpha }}_s (\sqrt{s})\), \(\alpha _{\textrm{em}} = \alpha _{\textrm{em}}(\sqrt{s})\), and \(n_b = 5\) is the total number of active flavors. Using the results of Refs. [27, 28], the sum rule for \({{\mathcal {M}}}_0\) defined in Eq. (3) reads explicitly,

$$\begin{aligned}&\sum \limits _{\textrm{resonances}} \frac{27\pi \Gamma ^e_R}{M_R \alpha _\textrm{em}^2 (M_R)} + \int \limits _{4 M^2}^\infty \frac{\textrm{d} s}{s} \left[ 3 R_b^{\textrm{cont}}(s) - \lambda _1^b(s) \right] \nonumber \\&\quad - \int \limits _{{\hat{m}}_b^2}^{4 M^2} \frac{\textrm{d} s}{s} \lambda ^b_1 (s) = - \frac{5}{3} + \frac{{\hat{\alpha _s}}}{\pi }\left[ 4 \zeta (3) - \frac{7}{2} \right] + \frac{{\hat{\alpha }}_s^2}{\pi ^2} \nonumber \\&\quad \times \left[ \frac{2429}{48} \zeta (3) - \frac{25}{3} \zeta (5) - \frac{2543}{48} + n_b \left( \frac{677}{216} - \frac{19}{9} \zeta (3) \right) \right] \nonumber \\&\quad + \frac{{\hat{\alpha }}_s^3}{\pi ^3} A_3 = - 1.667 + 1.308\, \frac{{\hat{\alpha _s}}}{\pi }+ 2.192 \frac{{\hat{\alpha }}_s^2}{\pi ^2} - 8.117 \frac{{\hat{\alpha }}_s^3}{\pi ^3}\ , \end{aligned}$$
(10)

where \({\hat{\alpha _s}} = {\hat{\alpha }}_s({{\hat{m}}}_b)\). The third-order coefficient \(A_3\) is available in numerical form [4, 24, 29],

$$\begin{aligned} A_3 = -9.863 + 0.399 \, n_b - 0.010 \, n_b^2\ . \end{aligned}$$
(11)

In the last line of Eq. (10) we show the numerical values for \(n_b=5\). The onset of the continuum is at 2M, the pseudoscalar threshold. The lower integration limit in the subtraction term involving \(\lambda _1^b(s)\) is, in principle, arbitrary, but is set to \({\hat{m}}_b^2\) in concordance with the choice to evaluate \(\hat{\alpha }_s\) on the right-hand side of Eq. (10) at scale \(\hat{m_b}\). In our numerical analysis we use the reference value \(\hat{\alpha }_s(M_Z) = 0.1182\) as input. With five-loop running [32,33,34,35] and four-loop matching [36] of \({\hat{\alpha _s}}\), this corresponds to \(\hat{\alpha }_s({\hat{m}}_b({\hat{m}}_b ) ) = 0.225\).
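The numerical coefficients in the last line of Eq. (10) can be cross-checked from the analytic expressions. The sketch below does this with hard-coded values of \(\zeta (3)\) and \(\zeta (5)\); the function names are hypothetical. With the rounded coefficients of Eq. (11) the third-order term evaluates to \(-8.118\) for \(n_b = 5\), which agrees with the quoted \(-8.117\) up to rounding of the inputs.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)
ZETA5 = 1.0369277551433699  # Riemann zeta(5)


def m0_coefficients(n_b=5):
    """Coefficients of (alpha_s/pi)^i, i = 0..3, on the right-hand side of Eq. (10)."""
    c0 = -5.0 / 3.0
    c1 = 4.0 * ZETA3 - 3.5
    c2 = (2429.0 / 48.0 * ZETA3 - 25.0 / 3.0 * ZETA5 - 2543.0 / 48.0
          + n_b * (677.0 / 216.0 - 19.0 / 9.0 * ZETA3))
    c3 = -9.863 + 0.399 * n_b - 0.010 * n_b**2  # A_3 from Eq. (11)
    return c0, c1, c2, c3


def m0_rhs(alpha_s, n_b=5):
    """Right-hand side of Eq. (10) for alpha_s evaluated at the scale m_b."""
    a = alpha_s / math.pi
    return sum(c * a**i for i, c in enumerate(m0_coefficients(n_b)))
```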

For both the theoretical predictions of the moments and the contributions from resonances and continuum to the sum rules one has to assess the uncertainties. To assign a truncation error to the pQCD prediction of the moments we follow the method proposed in Refs. [1, 4] and consider the largest group theoretical factor in the next un-calculated perturbative order,

$$\begin{aligned} \Delta {{\mathcal {M}}}_n^{(i)} = \pm Q_q^2 N_C C_F C_A^{i-1} \left[ \frac{{\hat{\alpha _s}} ({\hat{m}}_q)}{\pi }\right] ^i \left[ \frac{1}{2 {\hat{m}}_q({\hat{m}}_q)} \right] ^{2n}, \nonumber \\ \end{aligned}$$
(12)

(\(N_C = C_A = 3\), \(C_F = 4/3\)). Alternatively, the dependence on the renormalization scale is often used to estimate theory errors, where, for example, in Refs. [9, 13] the scale was varied between 5 and 15 GeV. Our prescription is more conservative, as has already been observed in our previous analysis [4] of the charm quark mass.
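For orientation, Eq. (12) can be evaluated with a few lines of Python. This is a sketch with hypothetical names; the argument `order` is the index i of the first un-calculated order, and the default charge corresponds to the bottom quark.

```python
import math


def truncation_error(n, order, m_q, alpha_s, charge_sq=1.0 / 9.0,
                     n_c=3.0, c_f=4.0 / 3.0, c_a=3.0):
    """Magnitude of the truncation error of Eq. (12) for the n-th moment.

    order is the perturbative order i of the first neglected term, m_q the
    MS-bar mass m_q(m_q) in GeV and alpha_s the coupling at that scale.
    """
    group_factor = charge_sq * n_c * c_f * c_a ** (order - 1)
    return group_factor * (alpha_s / math.pi) ** order * (1.0 / (2.0 * m_q)) ** (2 * n)
```

Each additional order changes the estimate by a factor \(C_A {\hat{\alpha }}_s/\pi \), which is about 0.21 for \({\hat{\alpha }}_s = 0.225\).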

In order to determine the error from the continuum contribution we proceed as follows. First, we choose a pair of moments \(({{\mathcal {M}}}_0, {{\mathcal {M}}}_n)\) from which \({\hat{m}}_b({\hat{m}}_b)\) and \(\lambda _3^{b}\) are determined. Then we input this value of \({\hat{m}}_b({\hat{m}}_b)\) into Eq. (5) and integrate with the weight corresponding to the 0th moment as in Eq. (3), but with the energy integration range restricted to the threshold region, \(2 M_B\le \sqrt{s} \le 11.20~\text{ GeV }\). As this is a function of \(\lambda _3^b\), we can adjust its value to coincide with the corresponding integral over the experimentally determined threshold region (see Sect. 4) yielding an experimental value, denoted \(\lambda _3^{b,\textrm{exp}}\). In the final step, we use \(\lambda _3^{b,\textrm{exp}}\) in the nth moment sum rule to re-calculate \({\hat{m}}_b({\hat{m}}_b)\), and treat the difference between these two \({\hat{m}}_b({\hat{m}}_b)\) values as an additional uncertainty. It serves as a control of the error component associated with the entire methodology which we will denote by \(\lambda _3^b \ne \lambda _3^{b,\textrm{exp}}\). For example, neglected non-perturbative contributions to the moments such as from condensates or from residual duality violations would become visible in the comparisons of the values \(\lambda _3^b\) from the theoretical moments with \(\lambda _3^{b,\textrm{exp}}\). The experimental errors in the threshold data induce an uncertainty \(\Delta \lambda _3^{b,\textrm{exp}}\) in \(\lambda _3^{b,\textrm{exp}}\) itself, which we will also need to account for.

Fig. 2 \(R_b(s)\) from ISR corrected BaBar [37] data (red points) and from Belle [38] data (black points) compared with different choices for our ansatz for continuum plus resonances. The blue dashed line is the pQCD prediction for \(R_b(s)\). Upper plot: Continuum ansatz without resonances extended up to \(\sqrt{s} = 13\) GeV, above the range where data are available. Middle row: Continuum ansatz including Gamma distributions for the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) resonances. In the left plot (our default choice) widths \(\Gamma _R\) from the PDG data are used (see Table 2), while in the right plot the widths are tuned to match the local description of the data, \({\tilde{\Gamma }}_{\Upsilon (4S)} = 29\) MeV and \({\tilde{\Gamma }}_{\Upsilon (5S)} = 165\) MeV. Lower row: Alternative choices including only the \(\Upsilon (4S)\) resonance on top of the continuum (left, \({\tilde{\Gamma }}_{\Upsilon (4S)} = 29\) MeV), or the three resonances \(\Upsilon (4S)\), \(\Upsilon (5S)\), and \(\Upsilon (6S)\) (right, \({\tilde{\Gamma }}_{\Upsilon (4S)} = 29\) MeV, \({\tilde{\Gamma }}_{\Upsilon (5S)} = 192\) MeV, \({\tilde{\Gamma }}_{\Upsilon (6S)} = 139\) MeV)

Table 3 Above the double line: values of \({\hat{m}}_b({\hat{m}}_b)\) (in MeV), \(\lambda _3^b\) and \(\lambda _3^{b,\textrm{exp}}\), determined from different pairs of moments as described in the text, where the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) resonances have been added explicitly to the ansatz in Eq. (5). Below the double line: breakdown of the uncertainties in \({\hat{m}}_b({\hat{m}}_b)\) followed by the total errors. The dependence on \({\hat{\alpha _s}}\) is shown in the next-to-last line, where the minus sign indicates that \({\hat{m}}_b\) decreases when \({\hat{\alpha _s}}\) is increased relative to the reference value \({\hat{\alpha _s}}(M_Z) = 0.1182\). The last line contains the uncertainty induced from \(\Delta {\hat{\alpha _s}}(M_Z) = \pm 0.0016\), i.e. the error obtained from the global fit to electroweak precision data [6]. See also Fig. 3 for a graphical representation of these results

Fig. 3 Results for \({\hat{m}}_b({\hat{m}}_b)\) using different combinations of moments, where we added the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) states explicitly to the ansatz in Eq. (5), as described in the text. Blue bars represent the full error, red bars are from the experimental uncertainties in the resonance parameters, green bars indicate the truncation errors in the theoretical moments, cyan bars are the symmetrized error combinations due to \(\lambda _3^b \ne \lambda _3^{b,\textrm{exp}}\) and \(\Delta \lambda _3^b\) (see Table 3), and the uncertainty induced by \(\Delta {\hat{\alpha _s}}(M_Z) = \pm 0.0016\) is shown in purple

3 Numerical results and determination of \({\hat{m}}_b\)

We have analyzed the determination of \({\hat{m}}_b({\hat{m}}_b)\) from different pairs of moments and using different prescriptions to include resonances on top of the continuum. The results are shown in figures and tables in this section. We find that the largest source of uncertainty is from the continuum contribution. Indeed, the values of \(\lambda _3^b\) derived from the mutual consistency of the moments deviate from \(\lambda _3^{b,\textrm{exp}}\) determined from data if none of the resonances above threshold are taken into account explicitly. The lower moments are more sensitive to the continuum region, and this deviation indicates that the simple ansatz using only \(R_q^{\textrm{cont}}(s)\) from Eq. (5) does not capture the strong onset of the cross section for energies just above the threshold for open bottom production. As a consequence, stable results are not reached for lower moments. However, the stability improves greatly with the inclusion of the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) states. We parametrize them as Gamma distributions,

$$\begin{aligned}{} & {} R_b^{\textrm{res, Gamma}}(s) = \sum _{R = \Upsilon (4S),\Upsilon (5S)}\nonumber \\{} & {} \quad \times \frac{9\pi }{\alpha _{\textrm{em}}^2(M_R)} \Gamma _R^eM_R \textrm{Gamma}(s - 4 M_B^2| \alpha , \beta ), \end{aligned}$$
(13)

where

$$\begin{aligned} \textrm{Gamma}(x|\alpha , \beta ) := \frac{\beta ^\alpha }{\Gamma (\alpha )} x^{\alpha -1} e^{-\beta x},\quad (\alpha> 0,\ \beta > 0), \nonumber \\ \end{aligned}$$
(14)

and \(\alpha \) and \(\beta \) are chosen such that the peak location \(M_R\) and the second derivative coincide with those of a relativistic Breit–Wigner distribution with width \({\tilde{\Gamma }}_R\),

$$\begin{aligned} \alpha = 1 + \frac{2}{\root 3 \of {\pi }} \frac{(M_R^2 - 4 M_B^2)^2}{{\tilde{\Gamma }}_R^2 M_R^2}\ ,\quad \beta = \frac{\alpha - 1}{M_R^2 - 4 M_B^2}\ . \end{aligned}$$
(15)
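The parametrization of Eqs. (14) and (15) can be sketched in Python as follows. The function names are hypothetical, Eq. (15) is implemented as printed (including the cube root of \(\pi \)), and the density is evaluated in log space because \(\alpha \) can become large for narrow states far above threshold.

```python
import math


def gamma_params(m_res, width, m_thr):
    """Shape parameters (alpha, beta) of Eq. (15).

    m_res is the peak position M_R, width the (possibly tuned) total width,
    and m_thr the threshold mass M_B, all in GeV.
    """
    delta = m_res**2 - 4.0 * m_thr**2  # distance of the peak from threshold in s
    alpha = 1.0 + 2.0 / math.pi ** (1.0 / 3.0) * delta**2 / (width**2 * m_res**2)
    beta = (alpha - 1.0) / delta
    return alpha, beta


def gamma_pdf(x, alpha, beta):
    """Gamma distribution of Eq. (14), with x = s - 4 M_B^2."""
    if x <= 0.0:
        return 0.0
    log_pdf = (alpha * math.log(beta) - math.lgamma(alpha)
               + (alpha - 1.0) * math.log(x) - beta * x)
    return math.exp(log_pdf)
```

The mode of the distribution, \((\alpha - 1)/\beta \), sits at \(s = M_R^2\) by construction, and the distribution vanishes at threshold, unlike a Breit–Wigner shape.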

We use the peak positions \(M_R\) and total widths \(\Gamma _R\) of the resonances as given in Ref. [6] and collected above in Table 2.

In order to understand why the inclusion of resonances above threshold leads to an improved determination of the b quark mass, we provide in Fig. 2 a graphical account of the landscape of \(R_b(s)\) above threshold. The upper plot shows that the continuum alone does not describe the data in the energy range of the \(\Upsilon (4S)\), \(\Upsilon (5S)\), and \(\Upsilon (6S)\) resonances and the pQCD limit is reached only when \(\sqrt{s}\) is above threshold by an amount of the order of the b quark mass, i.e. far above the energy range where data are available.

The \(\lambda _3^b\) parameter controls how fast our ansatz Eq. (5) reaches the asymptotic limit. If \(\lambda _3^b < 1\), Eq. (5) reaches that limit from below, while if \(\lambda _3^b > 1\) the limit is reached from above, thus crossing the perturbative result at some energy. The optimal value for \(\lambda _3^b\) is found to be around 1.5 (cf. Table 3). For a larger value of \(\lambda _3^b\), our model would cross the perturbative result at lower energies, improving the agreement with data just above threshold but worsening the description in the perturbative region.

It is therefore not a surprise that with the continuum ansatz alone one cannot obtain stable solutions from the set of sum rules.

The second row of plots in Fig. 2 shows how the global description of data for \(R_b(s)\) can be improved by the inclusion of Gamma distributions, Eq. (13), for the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) resonances. If we use the total decay widths \(\Gamma _R\) in Eq. (15) as given by the PDG [6], the local description of the data is still not good; however, the moments, i.e. integrals over \(R_b(s)\), can be matched. To see this more clearly, one can exploit the fact that the moments do not change even if the total widths are significantly increased. We denote the increased widths by \({\tilde{\Gamma }}_R\) and use them to obtain a better visual representation of the local behavior of the data, as done for the right plot of the middle row of Fig. 2. Here a good description of the data on average is clearly visible. The lower row of plots in Fig. 2 shows other possible choices, namely to add only one resonance, the \(\Upsilon (4S)\), or three resonances, \(\Upsilon (4S)\), \(\Upsilon (5S)\) and \(\Upsilon (6S)\), on top of the continuum. The former (latter) choice would lead to an underestimate (overestimate) of moments in the region above threshold. As a consequence, these choices would lead to solutions for \(\lambda _3^b\) from the set of sum rules in disagreement with \(\lambda _3^{b,\textrm{exp}}\) as determined from data.

We therefore determine the two free parameters, \(\lambda _3^b\) and \({\hat{m}}_b({\hat{m}}_b)\), from pairs of sum rules using the continuum ansatz where we include the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) as described above. The results are summarized in Table 3 and Fig. 3, including the breakdown of the uncertainties from the different sources as discussed before. Since our default description slightly overshoots the experimental data on average, \(\lambda _3^{b,\textrm{exp}}\) is smaller than \(\lambda _3^b\). Results for other options, where (i) we do not include resonances on top of the continuum, (ii) we include only the \(\Upsilon (4S)\), or (iii) we additionally include the \(\Upsilon (6S)\), parametrized as a Gamma distribution as well, are presented in Fig. 4. The shift of the value for the b quark mass induced by these different options is small; for example, including three resonances above threshold, \({\hat{m}}_b({\hat{m}}_b)\) would be reduced by 1.3 MeV, i.e. by much less than our error estimate. The most stable result and smallest overall uncertainty is obtained with our default option in Fig. 3.

Fig. 4 Same as in Fig. 3 but including only the resonances below threshold (top), with the ansatz modified to include the \(\Upsilon (4S)\) as a Gamma distribution (middle), and with the ansatz modified to include the \(\Upsilon (4S)\), \(\Upsilon (5S)\) and \(\Upsilon (6S)\) as Gamma distributions (bottom)

A summary of the best determination in each scenario is shown in Table 4. Our most precise and therefore final result for \({{\hat{m}}}_b({{\hat{m}}}_b)\) is based on the pair of moments \(({{\mathcal {M}}}_0, {{\mathcal {M}}}_7)\), and reads,

$$\begin{aligned} {{\hat{m}}}_b({{\hat{m}}}_b) = (4180.2 - 108.5 \Delta \hat{\alpha _s} \pm 7.9)~\textrm{MeV}. \end{aligned}$$
(16)

We explicitly exhibit the dependence on the input value of the strong coupling \({\hat{\alpha }}_s\) relative to the central value, i.e. \(\Delta {\hat{\alpha }}_s = {\hat{\alpha }}_s(M_Z) - 0.1182\).
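Taking Eq. (16) at face value, the central value for a different input \({\hat{\alpha }}_s(M_Z)\) follows from a linear shift. A minimal sketch, with a hypothetical function name:

```python
ALPHA_S_REF = 0.1182  # reference value of alpha_s(M_Z) used in the text


def m_b_central_mev(alpha_s_mz):
    """Central value of m_b(m_b) in MeV according to Eq. (16).

    Only the linear dependence on Delta alpha_s = alpha_s(M_Z) - 0.1182 is
    included; the +-7.9 MeV uncertainty is not propagated here.
    """
    return 4180.2 - 108.5 * (alpha_s_mz - ALPHA_S_REF)
```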

Table 4 Values and uncertainties of the bottom quark mass when adding various resonances on top of the continuum ansatz. Only the values with the smallest uncertainty and the corresponding pair of moments from which it is obtained are shown in each case

4 Experimental moments

Our determination of \({{\hat{m}}}_b({{\hat{m}}}_b)\) described above does not rely on the details of experimental data for \(R_b\) except resonance parameters. However, a comparison with data for \(R_b\) allows us to calibrate the uncertainty of the \({{\hat{m}}}_b({{\hat{m}}}_b)\) determination. As described above, this is done by calculating moments from data and extracting an experimental value for \(\lambda ^{b,\textrm{exp}}_3\) which can be compared with the value of \(\lambda ^b_3\) obtained from the consistency relations for moments. In this section we present the details of our determination of \(\lambda ^{b,\textrm{exp}}_3\).

We take data from the BaBar Collaboration [37]. These data cover the range of energies between \(\sqrt{s} = 10.54\) and 11.20 GeV (cf. Fig. 5). Data from the Belle Collaboration [38] will be used to obtain a cross-check, but they cover too short a range in energies to be useful for a calculation of moments for our purpose.

4.1 Data and corrections

The published experimental data for continuum heavy quark production must be corrected for vacuum polarization and QED radiative effects before they can be used in our analysis. Corrections due to vacuum polarization can be taken into account by substituting the value for \(\alpha _{\textrm{em}}\) used in the experimental work by the running fine structure constant, \(\alpha _{\textrm{em}}(\sqrt{s})\). Since the variation of \(\alpha _{\textrm{em}}(\sqrt{s})\) in the considered energy range is very small, we take it to be constant and use \((\alpha _\textrm{em}(0)/\alpha _{\textrm{em}}(M_R))^2 = 0.93\) (see Table 2). The measured \(R_b\) ratio is multiplied by this factor.

BaBar experimental data are available for energies above the open bottom threshold. In this energy range, the radiative tails from the \(\Upsilon (1S)\), \(\Upsilon (2S)\), \(\Upsilon (3S)\) resonances contribute. The required corrections are provided by BaBar in supplementary material to Ref. [37] and are easily subtracted from the data.

To remove initial-state radiative (ISR) effects from the continuum data after subtracting radiative tails from the resonances, we use the prescription following Refs. [30, 31] (see also Refs. [9, 14]). The measured R ratio, \({{\hat{R}}}\), is given by a convolution,

$$\begin{aligned} {\hat{R}}(s) = \int _{z_0}^1 \frac{\textrm{d} z}{z} G(z,s)R(z s), \end{aligned}$$
(17)

of the true R ratio with the radiator function \(G(z,s)\) describing QED corrections. \(G(z,s)\) is taken from Ref. [14] and includes next-to-next-to-leading order contributions. The lower limit of the integral in Eq. (17) corresponds to the onset of the continuum region, which we fix at \(z_0 = s_0/s\) with \(s_0 = (10.54~\text{ GeV})^2\). The true R ratio must be determined by inverting (i.e. unfolding) Eq. (17). This can be done iteratively, imposing the boundary condition \(R(s_0) = 0\). This condition is automatically satisfied by the BaBar data after subtraction of the radiative tails. The BaBar data corrected for vacuum polarization, radiative tails and ISR are shown in Fig. 6 (red points). We also show the uncorrected data (blue points), which are the same as shown in Fig. 5.
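The iterative inversion of Eq. (17) can be illustrated with a discretized toy version. In the sketch below the convolution is replaced by a lower-triangular matrix acting on the values of R on an energy grid, since \({\hat{R}}(s)\) only depends on R at energies \(zs \le s\). The kernel used in any test of this code is hypothetical and is not the radiator function of Ref. [14]; the function name and the fixed number of iterations are likewise assumptions.

```python
def unfold(measured, kernel, n_iter=50):
    """Fixed-point iteration R <- R + (R_measured - K R) on an energy grid.

    measured holds the ISR-affected R values, ordered by increasing energy.
    kernel is a lower-triangular matrix (list of rows) standing in for the
    discretized convolution with the radiator function. The iteration
    converges when the diagonal entries of kernel lie between 0 and 2.
    """
    n = len(measured)
    estimate = list(measured)  # zeroth approximation: no ISR correction
    for _ in range(n_iter):
        folded = [sum(kernel[i][j] * estimate[j] for j in range(i + 1))
                  for i in range(n)]
        estimate = [e + (m - f) for e, m, f in zip(estimate, measured, folded)]
    return estimate
```

If the first grid point sits at \(s_0\) with a vanishing measured value, the boundary condition \(R(s_0) = 0\) is preserved by every iteration.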

The BaBar data contain an outlier at \(\sqrt{s} = 10.86\) GeV (not shown in Fig. 6). At this energy, there are two different experimental measurements, separated by only \(\Delta \sqrt{s} = 0.0005\) GeV, which disagree with each other. Instead of removing this point, as has been suggested in Ref. [9], we take the average of the two points and ascribe, as an error, the difference of the two measured R values. We have checked that either option, removing the outlier or averaging with the close-by point, translates into a tiny difference for the experimental moments.

Fig. 5 Data for \(R_b(s)\) (blue points) from the BaBar Collaboration. The orange points show the initial-state radiative tail of the first three narrow states below threshold. Both \(R_b\) data and ISR tail are taken from Ref. [37]

Fig. 6 BaBar data for \(R_b(s)\) corrected for vacuum polarization, radiative tails and ISR (red points). The blue points are the same uncorrected data as shown in Fig. 5

4.2 Numerical results for moments

Experimental moments are calculated as numerical integrals over the ISR corrected R values, using the trapezoidal rule. We collect our results in Table 5. The experimental moments \(\mathcal{M}_n^{\textrm{exp}}\) are affected by statistical and systematic uncertainties, propagated from the corresponding data errors, and we take into account correlated and uncorrelated systematic uncertainties following the prescription given by the BaBar Collaboration [37]. For comparison, we also show in Table 5 the moments calculated from our ansatz for \(R_b(s)\) using the value \(\lambda _3^b = 1.53\). This value was obtained in our preferred scenario where the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) resonances are included on top of the continuum and using the pair of moments \(({{\mathcal {M}}}_0, {{\mathcal {M}}}_7)\). In the last column of Table 5 we also show moments calculated from uncorrected data. One can see that ISR corrections are indeed very small and do not introduce an additional source of uncertainties.
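The trapezoidal integration of the data points can be sketched as follows. The function name and the input layout (two lists of equal length, ordered by increasing energy) are assumptions of this illustration; error propagation and the scale factor \(10^{2n+1}\) used in Table 5 are not included.

```python
def experimental_moment(n, sqrt_s, r_values):
    """Trapezoidal estimate of the integral of R(s) / s^(n+1) over s.

    sqrt_s lists the centre-of-mass energies in GeV in increasing order and
    r_values the corresponding corrected R values. The result is in GeV^(-2n).
    """
    s = [e**2 for e in sqrt_s]
    f = [r / x ** (n + 1) for r, x in zip(r_values, s)]
    return sum(0.5 * (f[i] + f[i + 1]) * (s[i + 1] - s[i])
               for i in range(len(s) - 1))
```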

Table 5 Contributions to the moments (\(\times 10^{2n+1}\, \text{ GeV}^{4n+2}\)) from the restricted energy range \(2M_B \le \sqrt{s} \le 11.2~\text{ GeV }\). The column labeled \({{\mathcal {M}}}_n^\textrm{exp}\) is obtained by direct integration over corrected data. The first error is due to the uncorrelated statistical and systematic uncertainties of the data, while the second is the correlated one. The third and fourth columns show the moments calculated from our ansatz for \(R_b(s)\) with \(\lambda ^{b, \textrm{exp} }_3 = 0.82(20)\) (column 3) and \(\lambda ^b_3 = 1.53\) (column 4) as input. In both cases, \({\hat{m}}_b=4.1802\) GeV was used. The last column collects the experimental moments when BaBar data is used without any kind of correction or subtraction

In Table 6 we compare our determination of moments with those from Refs. [9] and [14]. To do so, we have to adjust the energy range correspondingly. For both references the lower limit of the energy range was chosen at \(\sqrt{s} = 10.62~\text{ GeV }\). The upper integration limit was \(\sqrt{s} = 11.20~\text{ GeV }\) in Ref. [9] and \(\sqrt{s} = 11.24~\text{ GeV }\) in Ref. [14]. We also follow Refs. [9, 14] and subtract the \(\Upsilon (4S)\) resonance, which is parameterized by a Breit–Wigner distribution, as well as its radiative tail. Above \(\sqrt{s} = 11.20~\text{ GeV }\), we use our ansatz to extrapolate up to 11.24 GeV. As can be seen from Table 6, we find good agreement with both references.

4.3 Determination of \(\lambda _3^{b, \textrm{exp}}\)

Now that the experimental moments are determined, we proceed to calculate \(\lambda _3^{b,\textrm{exp}}\) by solving the equation,

$$\begin{aligned} \int _{(2 M_B)^2}^{(11.20\mathrm{\, GeV})^2} \frac{\textrm{d} s}{s} R_b^{\textrm{cont}}(s) = {{\mathcal {M}}}_0^{\textrm{Data}} = 0.446 \pm 0.011, \end{aligned}$$
(18)

where \(R_b^{\textrm{cont}}(s) \) is defined in Eq. (5). The b quark mass \({\hat{m}}_b({\hat{m}}_b)\) is fixed in Eq. (18) to the value obtained from a selected pair of moments \(({{\mathcal {M}}}_0, {{\mathcal {M}}}_n)\) as described in the previous section. The solution of Eq. (18) is called \(\lambda _3^{b, \textrm{exp}}\). Results are shown in Table 3 already discussed above. In each case, the value obtained for \(\lambda _3^{b, \textrm{exp}}\) is compared with \(\lambda _3^b\) determined from the corresponding pair of sum rules. For the default case where we add the \(\Upsilon (4S)\) and \(\Upsilon (5S)\) resonances on top of the continuum and use the \(0^{\textrm{th}}\) and \(7^{\textrm{th}}\) moments, we find \(\lambda _3^{b, \textrm{exp}} = 0.82\pm 0.20\). The difference between this value and the one determined from the pair of moments (\(\lambda _3^b = 1.53\), see Table 3) corresponds to a difference in terms of the b quark mass of \(3.5 \pm 1.0\) MeV.
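As a rough consistency check of the quoted numbers, one may assume that the mass responds linearly to a change in \(\lambda _3^b\) over this range. This linearity is an assumption made here for illustration, not a statement from the analysis itself. Under that assumption, the uncertainty of \(\pm 0.20\) on \(\lambda _3^{b, \textrm{exp}}\) maps onto the quoted \(\pm 1.0\) MeV:

```python
# Values quoted in the text.
lam_sum_rule = 1.53      # lambda_3^b from the (M_0, M_7) pair of moments
lam_exp = 0.82           # lambda_3^{b,exp} from Eq. (18)
lam_exp_err = 0.20       # uncertainty on lambda_3^{b,exp}
delta_m_mev = 3.5        # quoted mass difference in MeV

# Assumption: the mass shift is linear in lambda_3 over this range.
delta_lam = lam_sum_rule - lam_exp
slope_mev = delta_m_mev / delta_lam          # MeV per unit of lambda_3
delta_m_err_mev = slope_mev * lam_exp_err    # propagated uncertainty in MeV

print(round(delta_lam, 2), round(slope_mev, 2), round(delta_m_err_mev, 1))
```

The implied sensitivity is about 5 MeV per unit of \(\lambda _3^b\), so even a sizeable change in the continuum parameter moves the mass by only a few MeV.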

We have used the BaBar data since they cover an energy range large enough to extract a reliable description of the continuum region. The Belle Collaboration [38] also provides a measurement of \(R_b(s)\), but it covers only the narrow energy range between \(\sqrt{s} = 10.620\) and 11.047 GeV, with the first three experimental points quite disconnected from the fine scan around the \(\Upsilon (5S)\) and \(\Upsilon (6S)\) resonances, i.e. \(10.754~\text{ GeV } \le \sqrt{s} \le 11.047~\text{ GeV }\). If we use the Belle data, we find that this short energy range contributes 0.198(7) to the 0th experimental moment, to be compared with 0.172(5) from the BaBar data for the same energy region. These results are compatible at the \(3 \sigma \) level only. Such a difference could be attributed to the different treatment of QED radiative effects of the narrow resonances in the case of the Belle data. For our calculation of the 0th moment we have used Belle data corrected for vacuum polarization effects, but without subtracting radiative tails, because the Belle Collaboration does not provide the corresponding information. If we had used the radiative tail provided by BaBar, we would find 0.167(7) for the 0th moment. This would bring the values of the 0th moment calculated from Belle and from BaBar data into very good agreement.
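The level of agreement between the two data sets can be checked with a simple pull, i.e. the difference divided by the combined uncertainty. The sketch below assumes that the Belle and BaBar uncertainties are uncorrelated and can be added in quadrature; the helper name `pull` is ours.

```python
import math


def pull(a, sigma_a, b, sigma_b):
    """Difference of two values in units of their combined uncertainty.

    Assumes the two uncertainties are uncorrelated (added in quadrature).
    """
    return (a - b) / math.hypot(sigma_a, sigma_b)


# Contributions to the 0th moment quoted in the text.
belle = (0.198, 0.007)            # Belle, radiative tails not subtracted
babar = (0.172, 0.005)            # BaBar, same energy region
belle_tail_sub = (0.167, 0.007)   # Belle with the BaBar radiative tail subtracted

pull_raw = pull(*belle, *babar)
pull_sub = pull(*belle_tail_sub, *babar)
print(round(pull_raw, 1), round(pull_sub, 1))
```

The first pull is close to three standard deviations, while after subtracting the radiative tail the two values differ by well under one standard deviation.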

Table 6 Comparison of moments with Ref. [9] (left section) and Ref. [14] (right section). In the first case, moments \({{\mathcal {M}}}_n\) are calculated from data in the range \(10.62~\text{ GeV } \le \sqrt{s} \le 11.20~\text{ GeV }\), while in the second case the energy range is \(10.62~\text{ GeV } \le \sqrt{s} \le 11.24~\text{ GeV }\). Our calculation uses experimental data up to \(\sqrt{s} = 11.20\) GeV and an extrapolation based on our ansatz to cover the energy range up to \(\sqrt{s} = 11.24\) GeV. In both cases, the \(\Upsilon (4S)\) resonance including its radiative tail is subtracted
Table 7 Predictions for the moments in the region \(2 M_B \le \sqrt{s} \le 15.0~\text{ GeV }\) from data and an extrapolation using pQCD including the known heavy quark mass corrections (2nd column) or our ansatz with \({\hat{m}}_b({\hat{m}}_b) = 4.1802\) GeV and \(\lambda _3^{b, \textrm{exp}}= 0.82(20)\) (3rd column). All numbers are given in units of \(10^{- (2n+1)}~\textrm{GeV}^{-2n}\). Errors in the 2nd column are from experimental moments only as no uncertainty is assigned to the contribution from pQCD. The errors in the 3rd column combine those from the experimental moments and from \(\Delta \lambda _3^b = \pm 0.20\)

In our previous analysis of data for charm quark production [4] we found that using data in an energy range between 3.7 and 5 GeV, extended by using pQCD above, can lead to a consistent picture and a reliable determination of the charm quark mass. A correspondingly large energy window for the bottom quark would cover energies up to 15 GeV, i.e. roughly one unit of the heavy quark mass above threshold. Unfortunately, data are available only for \(\sqrt{s} \le 11.2\) GeV. The treatment of the energy range \(11.2~\text{ GeV } \le \sqrt{s} \lesssim 15~\text{ GeV }\) requires special care and may lead to additional uncertainties. In Ref. [14] it was argued that the gap above \(\sqrt{s} = 11.2\) GeV should be described by pQCD. However, this introduces a discontinuity with the experimental data. In Ref. [9] a smooth polynomial fit was used instead. We opt for using our own ansatz, which approaches pQCD only for \(s\rightarrow \infty \). In Table 7 we compare two possible options: calculating moments from data in the window \(2 M_B \le \sqrt{s} \le 11.2~\text{ GeV }\), combined either with pQCD or our ansatz for \(R_b(s)\), both up to \(15~\text{ GeV }\). If one uses data and pQCD above \(\sqrt{s} = 11.2\) GeV, i.e. a prescription with a discontinuity, one obtains moments which are larger than for the case where we use our ansatz with a smooth \(\sqrt{s}\)-dependence. Correspondingly, this will result in smaller values for the bottom mass, as found in Ref. [14].

Differences in the b quark mass determination between Refs. [9, 14] and our work can thus be traced to a different prescription for including contributions to the moments from an energy range where no data are available. One could argue that this should be considered as an additional systematic error for the b quark mass. Experimental data covering the energy range between 11 and 15 GeV are definitely needed to ultimately resolve this issue. Until such data become available, we believe that a description of the unknown part of \(R_b(s)\) with a smooth function is preferable to one with a discontinuity.

5 Conclusions

We have presented a determination of the bottom-quark mass from QCD sum rules with a careful assessment of the uncertainty. We use an over-constrained system of two different sum rules, pairing the zeroth with the nth moment, and explore the stabilization of the result for the bottom-quark mass across different pairs of sum rules, always including the one for the zeroth moment. The main results are summarized in Fig. 3. Our analysis is based on Ref. [1], but goes beyond that previous work in several respects. First, we have included terms of one order higher in \(\hat{\alpha }_s\) than previously. Second, more precise experimental data are now available, both in the resonance region and for the strong coupling \(\hat{\alpha }_s\). Finally, we have scrutinized in detail the role of above-threshold resonances. The properties of these resonances, together with the fact that \(R_b\) data are available only in a relatively small window in energy, are responsible for the main differences between the bottom- and the charm-quark case.

The experimental information that determines the b-quark mass in our approach comes from the threshold and the below-threshold resonance data. Details of the data above the threshold are not important since we use that information only for the moments, i.e. on average, to calibrate the uncertainties. The particular form of our ansatz for the continuum is therefore also of minor importance when consistency across pairs of moments is found. This observation is corroborated by noting that the difference for the moments when using either \(\lambda _3^{b, \textrm{exp}}\) or \(\lambda _3^b\) (the parameters controlling our model when using or not using above-threshold experimental data) is very small, of the order of \(1\%\), when we include high moments in the analysis, in accordance with Ref. [39].

Our final result, \({{\hat{m}}}_b({{\hat{m}}}_b) = (4180.2 - 108.5 \Delta {\hat{\alpha }}_s \pm 7.9)~\textrm{MeV}\) with \(\Delta {\hat{\alpha }}_s = \hat{\alpha }_s(M_Z) - 0.1182\), is in good agreement with other determinations of the bottom quark mass that can be found in the literature. We show a comparison in Fig. 7 where we group the results in two sets and in chronological order within each set.

Fig. 7

Recent bottom quark mass determinations from phenomenological studies (upper part; red, orange, brown and green symbols) and lattice calculations (lower part; blue points). See text for details and references

The first set is based on phenomenological approaches, extracting \({\hat{m}}_b({\hat{m}}_b)\) by comparing theory predictions with data. This includes other results based on relativistic sum rules, Chetyrkin [14], Bodenstein [42], and Dehnadi [9], shown as red diamonds. The methodology of these publications is closest to our own approach. Our higher value for \({\hat{m}}_b({\hat{m}}_b)\) can be traced to the treatment of the intermediate energy behavior where our method approaches the perturbative regime of QCD at higher energies, as discussed in detail above. We also display results based on non-relativistic sum rules (orange squares), Laschka [43], Penin [44], Kiyo [45], Beneke [46, 47], and Peset [48], as well as on other sum rule methods (green stars), Lucha [49] and Narison [39, 40] (see footnote 3). The bottom quark mass (brown stars) was also determined as a by-product in a global fit to inclusive semileptonic B-meson decays to obtain the CKM matrix element \(V_{cb}\), Alberti [51], and from an analysis of deep inelastic scattering data at HERA compared with perturbative QCD calculations, Abramowicz [52].

The results in the lower part of Fig. 7 (blue points) are lattice QCD calculations. They are based on an improved non-relativistic QCD action, Lee [53], on Heavy Quark Effective Theory non-perturbatively matched to QCD, Bernardoni [54], on using time-moments of the vector current–current correlator, Colquhoun [55], as well as the MILC highly improved staggered quark ensembles with four flavors of dynamical quarks, Bazavov [56]. We also show the average of the 2021 FLAG Review [57] for \(N_f=2+1+1\).