1 Introduction

Experiments in high energy physics have established the Standard Model as a very good effective theory for elementary particle physics up to the TeV scale. Consequently, the discovery of new physics will require excellent quantitative control over Standard Model (SM) predictions including QCD effects [1]. In particular, for the strong coupling, \(\alpha _s(m_Z)\), as one of the fundamental Standard Model parameters, a sub-percent uncertainty will be required, which is significantly less than the current error for the PDG (non-lattice) average \(\alpha _s(m_Z)=0.1176(11)\) [2].

At present, the most accurate results for \(\alpha _s\) are obtained from lattice QCD, as illustrated by the FLAG 2019 average \(\alpha _s(m_Z)=0.1182(8)\) [3], which was recently updated to 0.1184(8) for FLAG 2021 [4]. While lattice QCD does not require any model assumptions on hadronization, the determination of \(\alpha _s(m_Z)\) requires the solution of a multiscale or “window” problem (for an introduction cf. [5, 6]). Therefore, most lattice studies attempt to extract the coupling at relatively low energy scales where perturbative truncation effects are hard to control. In particular, there is now some evidence that error estimates obtained within perturbation theory can be rather misleading unless large energy scales are reached non-perturbatively [7, 8]. As a result, many lattice determinations of \(\alpha _s\) are now limited by systematic errors, cf. [4, 5].

A solution to this multiscale problem has been known for 30 years in the form of the recursive step-scaling method [9]. The method has since been applied to the running of the coupling in QCD with \({N_{\textrm{f}}}=0\) [10,11,12,13], \({N_{\textrm{f}}}=2\) [14, 15], \({N_{\textrm{f}}}=3\) [16, 17] and \({N_{\textrm{f}}}=4\) [18, 19] quark flavours (for a review cf. [20]), and in various candidate models of physics beyond the Standard Model (for reviews, cf. [21, 22]). Its most recent application in QCD has allowed the ALPHA collaboration to non-perturbatively trace the scale evolution of the coupling in 3-flavour QCD between 0.2 and 128 GeV [16]. The corresponding result obtained for \(\alpha _s(m_Z) = 0.11852(84)\) in 5-flavour QCD defines a benchmark against which to measure progress. Knowing the scale dependence of the coupling is a pre-requisite for the step-scaling solution of other renormalization problems. This is illustrated by a recent step-scaling study of the running quark mass in 3-flavour QCD [23, 24] which will provide essential input for this paper.

Current lattice QCD simulations include the light up, down and strange quarks (\({N_{\textrm{f}}}=2+1\)), and sometimes also the much heavier charm quark as active degrees of freedom (\({N_{\textrm{f}}}=2+1+1\)). The Appelquist–Carazzone decoupling theorem [25] ensures that the effects of heavy sea quarks on low energy observables can be absorbed in parameter renormalizations up to effects that are power suppressed in the heavy quark masses [26]. In Ref. [27] the perturbative treatment of decoupling, known to 4-loops in the \(\overline{{\textrm{MS}}}\) scheme [28,29,30,31,32,33], was shown to provide an excellent quantitative description of decoupling, even at scales as low as the charm quark mass. Hence, perturbative matching of the \({N_{\textrm{f}}}=3\) coupling across the charm and bottom quark thresholds yields a reliable estimate of the \({N_{\textrm{f}}}=5\) coupling \(\alpha _s(m_Z)\) in terms of the 3-flavour \(\Lambda \)-parameter. The corresponding error is small and will remain sub-dominant for the foreseeable future.

The high accuracy of perturbative decoupling means that the inclusion of the charm quark is not required for a lattice determination of \(\alpha _s(m_Z)\). There is much more to gain from focusing on a reliable and precise determination of \(\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}\). The currently best lattice result by the ALPHA collaboration, \(\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}=341(12)\) MeV, has an error of \(3.5\%\) [16]. For comparison, the recent high precision study in the pure gauge theory [10] quotes a result for the \({N_{\textrm{f}}}=0\) \(\Lambda \)-parameter with an error of \(1.6\%\). Given that the error due to the physical scale setting is subdominant, it is thus conceivable that a substantial error reduction for \(\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}\) can still be achieved by pushing the 3-flavour step-scaling method to higher precision. While such a project would be very interesting, we emphasize that it is just as important to corroborate results with a different method.

The decoupling project, introduced in [34] and reviewed in [6], aims to deliver on both counts. It uses decoupling as a tool to connect \({N_{\textrm{f}}}=3\) QCD to \({N_{\textrm{f}}}=0\) QCD, and thus to leverage the higher precision that can be achieved with step-scaling methods in the \({N_{\textrm{f}}}=0\) theory [10, 11]. This connection is achieved by varying the RGI mass M of a fictitious triplet of mass-degenerate quarks compared to a hadronic scale \(\mu _{\textrm{dec}}\approx 800\) MeV. We call this low energy scale the decoupling scale. Reaching values for M of up to O(10) GeV and using perturbative 4-loop decoupling then establishes a relation between the \(\Lambda \)-parameters of both theories, with corrections that are either perturbative in the \(\overline{{\textrm{MS}}}\)-coupling at the scale of the heavy quark mass, M, or power suppressed in 1/M. Obviously, the heavy quark mass defines another scale and thus creates a potentially difficult multi-scale problem. To alleviate this problem, the choice of \(\mu _{\textrm{dec}}\) somewhat above \(\Lambda _{\text {QCD}}\) is convenient, since the matching to a hadronic scale can be safely performed in a separate computation. The use of a finite volume renormalization scheme for the coupling at scale \(\mu _{\textrm{dec}}=1/L\) then reduces the decoupling project to a problem involving two physical scales, \(\mu _{\textrm{dec}}\) and M, where the challenge remains to reach large \(z=M/\mu _{\textrm{dec}}\gg 1\) while keeping the lattice spacing small enough so that \(M \ll 1/a\). In this paper we discuss the details of the decoupling strategy and the numerical simulations we performed. When combined with earlier scale setting results [35] and precision \({N_{\textrm{f}}}=0\) studies [10, 11], a very accurate result for \(\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}\) is obtained, with uncertainties still dominated by statistical errors. Relying on the usual perturbative matching to 5-flavour QCD this translates to our result \(\alpha _s(m_Z)= 0.11823(84)\).

The paper is organized as follows. In Sect. 2 we give a step-by-step overview of the decoupling strategy, in a language aimed also at non-lattice experts. Section 3 takes a closer look at the continuum and decoupling limits. In particular, the leading corrections are derived in order to guide the analysis of the numerical data. Section 4 starts with the chosen set-up of non-perturbatively O(a) improved lattice QCD with Wilson quarks, continues with a summary of the simulations with massive quarks and then presents the continuum and heavy mass extrapolations of the data which lead to the \(\Lambda \)-parameter. In Sect. 5 we obtain the corresponding result for \(\alpha _s(m_Z)\) and we conclude with an outlook (Sect. 6). A number of appendices have been included. Appendix A explains how we estimated heavy mass effects of order 1/M originating from the space-time boundaries. Appendix B summarizes the required \({N_{\textrm{f}}}=0\) results, obtained either by dedicated \({N_{\textrm{f}}}=0\) simulations or taken from the literature. Appendix C contains details about \({N_{\textrm{f}}}=3\) simulations, both with massive and massless quarks. The latter are required to ensure O(a) improvement of the renormalized quark masses. Appendix D discusses the derivation and numerical implementation of formulas (4.2) and (4.6), which allow us to perform shifts to the data and correct for small mistunings to the relevant lines of constant physics. The ingredients for quark mass renormalization and O(a) improvement, as well as some consistency checks, are then given in Appendix E. The impact of errors in the quark mass determination is discussed in Appendix F. Finally, Appendix G explains how bare parameters were chosen on the larger lattices to ensure constant physical conditions.

2 The decoupling strategy

The decoupling strategy has been introduced and explained in Ref. [34]. We assume the reader to be familiar with this reference and try to keep the overlap minimal. Some aspects of the chosen strategy may not seem obvious at first sight, as they are conditioned by previous projects of the ALPHA collaboration [10, 11, 16, 23, 35, 36]. Besides the lattice action, this concerns the choice of boundary conditions and renormalized couplings. Further technical issues arise from the necessity of O(a) improvement and the need to control boundary effects both at O(a) and in the heavy mass expansion. We will address these points in the following sections. Here we use a continuum language to discuss how the decoupling strategy is set up in principle.

2.1 Renormalization group invariant parameters

To set the stage we consider continuum QCD with \({N_{\textrm{f}}}\) mass-degenerate quarks. The theory only has two bare parameters, the coupling and the quark mass. If these are renormalized in a mass-independent scheme s, their scale dependence gives rise to the definition of the \(\beta \)-function

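$$\begin{aligned} \beta _s(\bar{g}_s) = \mu \frac{\textrm{d}\bar{g}_s(\mu )}{\textrm{d}\mu } \;\overset{\bar{g}_s\rightarrow 0}{\sim }\; -\bar{g}_s^3\left( b_0 + b_1 \bar{g}_s^2 + b_2 \bar{g}_s^4 + \cdots \right) , \end{aligned}$$
(2.1)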

and the quark mass anomalous dimension,

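$$\begin{aligned} \tau _s(\bar{g}_s) = \frac{\mu }{\overline{m}_s(\mu )} \frac{\textrm{d}\overline{m}_s(\mu )}{\textrm{d}\mu } \;\overset{\bar{g}_s\rightarrow 0}{\sim }\; -\bar{g}_s^2\left( d_0 + d_1 \bar{g}_s^2 + \cdots \right) . \end{aligned}$$
(2.2)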

The renormalization scheme dependence begins with \(b_2\) and \(d_1\), and the universal first coefficients are, for 3 colours,

$$\begin{aligned}{} & {} b_0 = \left( 11-\frac{2}{3}{N_{\textrm{f}}}\right) \times (4\pi )^{-2},\nonumber \\{} & {} \quad b_1 = \left( 102-\frac{38}{3}{N_{\textrm{f}}}\right) \times (4\pi )^{-4} , \quad d_0 = 8 \times (4\pi )^{-2}.\nonumber \\ \end{aligned}$$
(2.3)
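As a quick numerical illustration of Eq. (2.3), the following sketch evaluates the universal coefficients for a few flavour numbers. The function name `rg_coefficients` is introduced here for illustration only and does not refer to any existing analysis code.

```python
import math


def rg_coefficients(n_f: int) -> tuple[float, float, float]:
    """Universal coefficients b0, b1, d0 of Eq. (2.3) for 3 colours."""
    four_pi_sq = (4.0 * math.pi) ** 2
    b0 = (11.0 - 2.0 * n_f / 3.0) / four_pi_sq
    b1 = (102.0 - 38.0 * n_f / 3.0) / four_pi_sq**2
    d0 = 8.0 / four_pi_sq
    return b0, b1, d0


for n_f in (0, 3, 5):
    b0, b1, d0 = rg_coefficients(n_f)
    # d0/(2*b0) is the exponent that appears in the definition of the RGI mass
    print(f"Nf={n_f}: b0={b0:.6f}  b1={b1:.6f}  d0/(2*b0)={d0 / (2.0 * b0):.4f}")
```

For \({N_{\textrm{f}}}=3\) the exponent \(d_0/(2b_0)\) equals 4/9, since \(b_0 = 9\times (4\pi )^{-2}\).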

Given a non-perturbative definition of the running parameters in a scheme s and thus non-perturbative results for the RG functions \(\beta _s\) and \(\tau _s\), one may define the renormalization group invariant (RGI) parameters,

$$\begin{aligned} \Lambda _s= & {} \mu \varphi _s^{}(\bar{g}_s (\mu )), \end{aligned}$$
(2.4)
$$\begin{aligned} \varphi _s^{}(\bar{g}_s)= & {} ( b_0 \bar{g}_s^2 )^{-b_1/(2b_0^2)} \textrm{e}^{-1/(2b_0 \bar{g}_s^2)}\nonumber \\{} & {} \times \exp \left\{ -\int \limits _0^{\bar{g}_s} \textrm{d}x\ \left[ \frac{1}{\beta _s(x)} +\frac{1}{b_0x^3} - \frac{b_1}{b_0^2x} \right] \right\} , \nonumber \\ M= & {} \overline{m}_s(\mu ) \left[ 2b_0\bar{g}_s^2(\mu ) \right] ^{-\frac{d_0}{2b_0}}\nonumber \\{} & {} \times \exp \left\{ -\int \limits _0^{\bar{g}_s(\mu )} \left[ \frac{\tau _s(x)}{\beta _s(x)}-\frac{d_0}{b_0x} \right] \, dx \right\} , \end{aligned}$$
(2.5)

where the RGI quark mass M is scheme independent, while \(\Lambda _s\) depends on the renormalization scheme s, the standard reference being the \(\overline{{\textrm{MS}}}\) scheme of dimensional regularization. The running coupling, the quark mass, and the RGI parameters are defined for QCD with a fixed flavour number \({N_{\textrm{f}}}\). We occasionally indicate this dependence by a superscript; when omitted we refer to QCD with non-zero \({N_{\textrm{f}}}\), implying \({N_{\textrm{f}}}=3\) in our numerical work. Note that the ratio \(\Lambda _s/\mu \) is, for large enough \(\mu \), in one-to-one correspondence with the coupling \(\bar{g}_s^2(\mu )\) in scheme s. Hence, the running parameters in scheme s at large \(\mu \) can be traded for the \(\Lambda \)-parameter and the scheme independent RGI quark mass M.
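The definition of \(\varphi _s\) can be made concrete with a minimal sketch, assuming that the \(\beta \)-function is truncated at 2-loop order. This truncation is chosen purely for illustration; in the non-perturbative setting a numerically determined \(\beta _s\) would be passed in instead. With this truncation the integral can be done in closed form, giving the factor \((1+b_1\bar{g}^2/b_0)^{b_1/(2b_0^2)}\), which serves as a check of the quadrature. All function names are hypothetical.

```python
import math


def b_coeffs(n_f):
    """b0 and b1 of Eq. (2.3)."""
    s = (4.0 * math.pi) ** 2
    return (11.0 - 2.0 * n_f / 3.0) / s, (102.0 - 38.0 * n_f / 3.0) / s**2


def beta_2loop(x, b0, b1):
    """2-loop truncation of the beta-function (illustrative assumption)."""
    return -x**3 * (b0 + b1 * x**2)


def phi_numeric(g, n_f, beta=beta_2loop, n=2000):
    """Lambda/mu from the integral representation, midpoint rule on (0, g)."""
    b0, b1 = b_coeffs(n_f)
    h = g / n
    integral = 0.0
    for k in range(n):
        x = (k + 0.5) * h  # midpoints avoid the endpoint x = 0
        integral += (1.0 / beta(x, b0, b1) + 1.0 / (b0 * x**3) - b1 / (b0**2 * x)) * h
    prefactor = (b0 * g**2) ** (-b1 / (2.0 * b0**2)) * math.exp(-1.0 / (2.0 * b0 * g**2))
    return prefactor * math.exp(-integral)


def phi_2loop_closed(g, n_f):
    """Closed form of the same quantity for the 2-loop truncated beta-function."""
    b0, b1 = b_coeffs(n_f)
    p = b1 / (2.0 * b0**2)
    return ((1.0 + b1 * g**2 / b0) / (b0 * g**2)) ** p * math.exp(-1.0 / (2.0 * b0 * g**2))


for g_sq in (1.0, 2.0, 4.0):  # illustrative values of the squared coupling
    g = math.sqrt(g_sq)
    print(g_sq, phi_numeric(g, 3), phi_2loop_closed(g, 3))
```

Since \(\textrm{d}\ln \varphi _s/\textrm{d}\bar{g} = -1/\beta _s(\bar{g}) > 0\), the function is monotonic in the coupling, which is the one-to-one correspondence between \(\Lambda _s/\mu \) and \(\bar{g}_s^2(\mu )\) used in the text.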

2.2 Decoupling relations

So far we have assumed that the coupling and quark mass are renormalized in a mass-independent scheme. In practice, this is achieved by imposing the renormalization condition at vanishing quark mass [37]. A renormalized coupling at finite quark mass defines a function \(\bar{g}_s^{}(\mu ,M)\) such that, for \(M\ll \mu \), it coincides with the coupling in a massless scheme, \(\bar{g}_s^{}(\mu ,0) = \bar{g}_s(\mu )\), whereas, for \(M \gg \mu \) its scale dependence is well described by an effective theory with the \({N_{\textrm{f}}}\) heavy quarks removed. In the absence of other light quarks, this simply is the pure gauge theory or “quenched (\({N_{\textrm{f}}}=0\)) QCD”. We can thus write

$$\begin{aligned} \bar{g}_s^{}(\mu ,M) = \bar{g}_s^{(0)}(\mu ) +{\textrm{O}}(\mu ^2 /M^2) , \end{aligned}$$
(2.6)

where on the r.h.s. the scale \(\mu \) in units of the pure gauge theory \(\Lambda \)-parameter has an implicit M-dependence. The mass-dependent coupling is thus seen to interpolate the couplings in QCD with \({N_{\textrm{f}}}\) and zero flavours. This in turn implies a relation between the respective mass independent couplings. In perturbation theory and in the \(\overline{{\textrm{MS}}}\) scheme the ensuing relations have been computed up to 4-loop order [28,29,30,31,32,33]:

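$$\begin{aligned} \left[ \bar{g}^{(0)}_{\overline{{\textrm{MS}}}}(m_\star )\right] ^2 = \left[ \bar{g}^{({N_{\textrm{f}}})}_{\overline{{\textrm{MS}}}}(m_\star )\right] ^2 \times C\left( \bar{g}^{({N_{\textrm{f}}})}_{\overline{{\textrm{MS}}}}(m_\star )\right) , \qquad m_\star = \overline{m}_{\overline{{\textrm{MS}}}}(m_\star ), \end{aligned}$$
(2.7)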

where the scale choice eliminates the 1-loop coefficient in

$$\begin{aligned} C(x) = 1 + c_2x^4 + c_3 x^6 + c_4x^8 + \cdots , \end{aligned}$$
(2.8)

and, for \({N_{\textrm{f}}}=3\), one obtains [33]

$$\begin{aligned}{} & {} c_2 = 2.940776\times 10^{-4},\quad c_3 = 4.435355\times 10^{-5},\nonumber \\{} & {} \quad c_4 = 5.713208\times 10^{-6} . \end{aligned}$$
(2.9)
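The size of the perturbative decoupling correction can be gauged with a short sketch that evaluates C(x) from Eq. (2.8) with the coefficients of Eq. (2.9). It assumes that the relation between the couplings has the multiplicative form "squared \({N_{\textrm{f}}}=0\) coupling equals squared \({N_{\textrm{f}}}=3\) coupling times C"; the sample values of \(\alpha =\bar{g}^2/(4\pi )\) are illustrative and not taken from the analysis.

```python
import math

C_COEFFS = (2.940776e-4, 4.435355e-5, 5.713208e-6)  # c2, c3, c4 of Eq. (2.9), Nf = 3


def decoupling_factor(g):
    """C(g) = 1 + c2 g^4 + c3 g^6 + c4 g^8, Eq. (2.8), truncated at 4 loops."""
    c2, c3, c4 = C_COEFFS
    g2 = g * g
    return 1.0 + g2**2 * (c2 + g2 * (c3 + g2 * c4))


def decoupled_coupling_sq(g_sq):
    """Squared Nf = 0 coupling from the squared Nf = 3 coupling (assumed form)."""
    return g_sq * decoupling_factor(math.sqrt(g_sq))


for alpha in (0.2, 0.3, 0.4):  # illustrative MS-bar couplings at the mass scale
    g_sq = 4.0 * math.pi * alpha
    rel = decoupled_coupling_sq(g_sq) / g_sq - 1.0
    print(f"alpha={alpha:.1f}  relative shift of g^2: {rel:.2e}")
```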

Beyond perturbation theory, the limit of infinite \(M/\mu \) requires the extrapolation of numerical data and one would like to understand how it is approached. To this end, the language of effective field theory is most helpful, cf. [27]. Assuming that the heavy quark limit is described by a local effective theory, one obtains a systematic expansion in inverse powers of the quark mass M. In particular, with all fermions decoupled simultaneously the effective theory takes the form of the pure gauge theory where the inverse mass corrections are proportional to insertions of higher dimensional local gluonic operators. One naturally wonders whether all powers of 1/M may appear in this expansion. In the absence of space-time boundaries, and for gluonic observables defining the running couplings, the locality of the effective decoupling theory, Euclidean O(4) symmetry and gauge invariance rule out any odd-dimensional terms so that the expansion is effectively in \(1/M^2\).

The situation changes if space-time is a manifold with boundaries, as this allows for additional local boundary terms at order 1/M in the effective Lagrangian. This is relevant in our case, where we use the standard Schrödinger functional set-up on a space-time hyper-cylinder of 4-volume \(L^3\times T\), with Dirichlet boundary conditions imposed on some of the fermionic and gauge field components at Euclidean times \(x_0=0\) and \(x_0=T\) [38, 39]. In order to minimize the impact of such boundary contributions, we chose a geometry such as to have large distances from the boundary to the observable defining the coupling in the mass-dependent scheme. Using the decoupling effective field theory, with a single local term at the boundaries \(x_0=0\) and \(x_0=T\), we are able to compute the 1/M contributions to the coupling. They are given in terms of a 1-loop matching coefficient (see Appendix A.1) and a non-perturbative matrix element. The latter is evaluated by simulations in the decoupled theory with \(N_{\textrm{f}}=0\) and extrapolated to the continuum (see Appendix A.2). We can thus confirm that the boundary effects are negligible for \(T=2L\).

2.3 Master formula and strategy breakdown

Following Ref. [34], the decoupling strategy can be cast in the form

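$$\begin{aligned} \frac{\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}}{\mu _{\textrm{dec}}} = \lim _{M/\mu _{\textrm{dec}}\rightarrow \infty } \left[ \frac{\dfrac{\Lambda ^{(0)}_{\overline{{\textrm{MS}}}}}{\Lambda ^{(0)}_s}\, \varphi _s^{(0)}\left( \bar{g}_s(\mu _{\textrm{dec}},M)\right) }{P\left( \dfrac{M}{\mu _{\textrm{dec}}}\Big / \dfrac{\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}}{\mu _{\textrm{dec}}}\right) }\right] . \end{aligned}$$
(2.10)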

Note that the desired ratio on the l.h.s. in \({N_{\textrm{f}}}=3\) QCD also appears on the r.h.s. in the argument of the function P, i.e. the equation is implicit. In order to solve it one needs to be able to evaluate both the pure gauge theory function \(\varphi _s^{(0)}\) and the function P. The latter corresponds to \(P_{0,{N_{\textrm{f}}}}\) in the notation of [27], where it was shown that non-perturbatively P has an ambiguity of order \((\Lambda /M)^2\), which arises once the reference to a specific matching condition is removed. A major result of [27] is the observation that the perturbative evaluation of P using the \(\overline{{\textrm{MS}}}\)-scheme is numerically very accurate already for quark masses in the charm region. With the heavy quark mass setting the scale for the \(\overline{{\textrm{MS}}}\)-coupling, the accuracy further improves towards the decoupling limit. In practice, Eq. (2.10) will be used for a range of finite but reasonably large values \(M/\mu _{\textrm{dec}}\). When using a perturbative approximation for P in the \(\overline{\textrm{MS}} \) scheme, deviations from the limit are expected to be, on the one hand, proportional to \(1/M^2\) and, on the other hand, logarithmic corrections of \(\text {O}(\alpha _{\overline{\textrm{MS}} }^4(M))\). This assumes that linear terms in 1/M are either completely absent by symmetry or that they can be controlled or subtracted explicitly.

While decoupling can be studied in the infinite volume regime [27], for a lattice QCD approach it is advantageous to separate the determination of the hadronic scales from the study of decoupling, by using a finite volume renormalization scheme [34]. We use the GF scheme with SF boundary conditions, first introduced in [40], with details given in [35]. In a continuum language it is given by

$$\begin{aligned} \bar{g}^2_{\text {GF}}(\mu )= & {} {{\mathcal {N}}}^{-1} \nonumber \\{} & {} \times \sum _{k,l=1}^3\left. \dfrac{t^2\langle {{\textrm{tr}}} \left\{ G_{kl}(t,x)G_{kl}(t,x)\right\} \delta _{Q,0} \rangle }{\langle \delta _{Q,0}\rangle }\right| _{\mu =1/L,T=L,\, M=0}^{x_0=T/2,\,c = \sqrt{8t}/L}\nonumber \\ \end{aligned}$$
(2.11)

where \(G_{\mu \nu }(t,x)\) denotes the field tensor for the gauge field at flow time t, and \({{\mathcal {N}}}\) is a known normalization factor which ensures \(\bar{g}^2_{\text {GF}}(\mu ) = g_0^2 + {\textrm{O}}(g_0^4)\), with \(g_0\) the bare coupling. The projection onto the topological charge \(Q=0\) sector is part of the scheme definition and merely introduced in order to avoid technical difficulties with the numerical simulation algorithms. The remaining parameter c fixes the ratio between the scales set by the flow time and the finite volume.

We may now break down the decoupling strategy into several steps:

  1.

    Decoupling scale \(\mu _{\textrm{dec}}\): Given the coupling in the massless fundamental theory, we fix \(\mu _{\textrm{dec}}\) by setting

    $$\begin{aligned} \bar{g}^2_{\text {GF}}(\mu _{\textrm{dec}}) = 3.949. \end{aligned}$$
    (2.12)

    From previous work [16, 35] one finds \(\mu _{\textrm{dec}}= 789(15)\,{\textrm{MeV}}\), which is a typical QCD scale. Equation (2.12) defines a so-called line of constant physics (LCP); following it towards the continuum, lattice spacing \(a\rightarrow 0\), means that the limit is approached at fixed \(\mu _{\textrm{dec}}\). Evaluating the LCP for a given lattice size \(L/a=1/(a\mu _{\textrm{dec}})\) defines a corresponding value for the bare coupling, \(g_0^2\equiv 6/\beta \), and vice versa. We have implicitly assumed here that M vanishes. With Wilson fermions this requires a further tuning condition on the bare mass parameter. Details are discussed in Sect. 4.1.

  2.

    Definition of \(z=M/\mu _{\textrm{dec}}\): A further set of constant physics conditions is obtained by fixing the RGI quark mass in units of \(\mu _{\textrm{dec}}\). Choosing a set of values \(z\in [2,12]\), one needs to work out, for a given lattice spacing (as obtained from Eq. (2.12)), the corresponding bare mass parameters \(am_0\). The details of this procedure will be discussed in Sect. 4.2.

  3.

    Determination of \(\bar{g}^{(0)}_{\text {GFT}}(\mu _{\textrm{dec}})\): The value of a renormalized coupling in the \({N_{\textrm{f}}}=0\) theory, at a known scale \(\mu _{\textrm{dec}}\), is obtained by evaluating the same coupling in the fundamental theory at a heavy mass M and assuming decoupling. The main problem with a mass-dependent GF coupling is the presence of boundary 1/M terms, which render decoupling slower than necessary. In order to minimize these effects we use a variant of the coupling with \(T=2L\),

    $$\begin{aligned} \bar{g}^2_{\text {GFT}}(\mu ,M) = \left. \bar{g}^2_{\text {GF}}(\mu )\right| _{T\rightarrow 2L, M\rightarrow z\mu }. \end{aligned}$$
    (2.13)

    Compared to the GF coupling (2.11) this doubles the distance of the magnetic energy density to the boundaries, thereby reducing the coefficient of the 1/M boundary contribution substantially. Calling this scheme GFT, the main computational effort was required for the evaluation of \(\bar{g}^2_{\text {GFT}}(\mu _{\textrm{dec}}, M)\), at the lattice spacings and bare quark masses which follow from the chosen lines of constant physics.

  4.

    Determination of \(\bar{g}^{(0)}_{\text {GF}}(\mu _{\textrm{dec}})\): Obtaining this input value for the precisely known \(\varphi _{\text {s}}^{(0)}\) (with the scheme \(s=\text {GF}\)) in Eq. (2.10) requires the establishment of a non-perturbative relation between the GF and the GFT schemes in the \({N_{\textrm{f}}}=0\) theory at scale \(\mu _{\textrm{dec}}\). This is achieved by evaluating the GF coupling along a LCP defined by a fixed value of the GFT coupling, and continuum extrapolating.

  5.

    Determination of \(\Lambda ^{(0)}_{\overline{{\textrm{MS}}}}/\mu _{\textrm{dec}}\): The recent step-scaling study of the GF coupling in [10] allows us to evaluate \(\varphi _\textrm{GF}^{(0)}(\bar{g}^{(0)}_{\text {GF}}(\mu _{\textrm{dec}})) = \Lambda ^{(0)}_{\text {GF}}/\mu _{\textrm{dec}}\) which completes the numerator in the square brackets of Eq. (2.10). The conversion to the \(\overline{{\textrm{MS}}}\) \(\Lambda \)-parameter then simply requires the one-loop matching between the GF and \(\overline{{\textrm{MS}}}\)-couplings in the pure gauge theory [41].

  6.

    The function P gives the ratios of \(\Lambda \)-parameters between the fundamental and effective theories and can be reliably evaluated in massless continuum perturbation theory

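    $$\begin{aligned} P\left( M/\Lambda ^{({N_{\textrm{f}}})}_{\overline{{\textrm{MS}}}}\right) = \frac{\varphi ^{(0)}_{\overline{{\textrm{MS}}}}\left( g_\star \sqrt{C(g_\star )}\right) }{\varphi ^{({N_{\textrm{f}}})}_{\overline{{\textrm{MS}}}}(g_\star )}, \end{aligned}$$
    (2.14)

    with C(g) known to 4-loop order, cf. Eq. (2.9), and the notation \(g_\star = \bar{g}^{({N_{\textrm{f}}})}_{\overline{{\textrm{MS}}}}(m_\star )\), where \(m_\star =\overline{m}_{\overline{{\textrm{MS}}}}(m_\star )\) is the running quark mass at its own scale. In particular, the quark mass M only enters to set the scale. For given \(z=M/\mu _{\textrm{dec}}\), the l.h.s. and the function P in Eq. (2.10) only depend on \(\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}/\mu _{\textrm{dec}}\), and with \(\varphi _\textrm{GF}^{(0)}\) known it remains to numerically solve for \(\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}/\mu _{\textrm{dec}}\). This is to be repeated for all available z-values and the result for \(\Lambda ^{(3)}_{\overline{{\textrm{MS}}}}/\mu _{\textrm{dec}}\) is then obtained by extrapolation to the decoupling limit.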
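The implicit structure of the master formula, as described in the last step, can be illustrated by a small root-finding sketch. It rests on several simplifying assumptions that are not those of the actual analysis: the \(\overline{{\textrm{MS}}}\) functions \(\varphi \) and \(M/\overline{m}\) are evaluated with a 2-loop \(\beta \)-function and a 1-loop \(\tau \), P is taken as the ratio \(\varphi ^{(0)}(g_\star \sqrt{C(g_\star )})/\varphi ^{(3)}(g_\star )\), and the input \(\Lambda ^{(0)}/\mu _{\textrm{dec}}\) is a toy number constructed from a hypothetical value of \(\Lambda ^{(3)}/\mu _{\textrm{dec}}\). All function names are introduced here for illustration.

```python
import math

C_COEFFS = (2.940776e-4, 4.435355e-5, 5.713208e-6)  # c2, c3, c4 of Eq. (2.9), Nf = 3


def coeffs(n_f):
    """b0, b1, d0 of Eq. (2.3)."""
    s = (4.0 * math.pi) ** 2
    return (11.0 - 2.0 * n_f / 3.0) / s, (102.0 - 38.0 * n_f / 3.0) / s**2, 8.0 / s


def phi(g, n_f):
    """Lambda/mu for coupling g, 2-loop truncated beta-function (illustrative)."""
    b0, b1, _ = coeffs(n_f)
    p = b1 / (2.0 * b0**2)
    return ((1.0 + b1 * g * g / b0) / (b0 * g * g)) ** p * math.exp(-1.0 / (2.0 * b0 * g * g))


def rgi_mass_factor(g, n_f):
    """M / m(mu) for 2-loop beta and 1-loop tau (illustrative truncation)."""
    b0, b1, d0 = coeffs(n_f)
    return ((1.0 + b1 * g * g / b0) / (2.0 * b0 * g * g)) ** (d0 / (2.0 * b0))


def big_c(g):
    """C(g) of Eq. (2.8)."""
    c2, c3, c4 = C_COEFFS
    return 1.0 + c2 * g**4 + c3 * g**6 + c4 * g**8


def bisect(f, lo, hi, n_iter=100):
    """Root of a monotonic function f that changes sign on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def g_star(m_over_lambda, n_f=3):
    """Coupling at the scale m*, from M/Lambda = (M/m*) / (Lambda/m*)."""
    target = math.log(m_over_lambda)
    return bisect(lambda g: math.log(rgi_mass_factor(g, n_f) / phi(g, n_f)) - target, 0.3, 20.0)


def big_p(m_over_lambda, n_f=3):
    """P = Lambda^(0)/Lambda^(Nf) as a function of M/Lambda^(Nf)."""
    g = g_star(m_over_lambda, n_f)
    return phi(g * math.sqrt(big_c(g)), 0) / phi(g, n_f)


def solve_lambda3(lambda0_over_mu, z):
    """Solve rho * P(z / rho) = Lambda^(0)/mu_dec for rho = Lambda^(3)/mu_dec."""
    return bisect(lambda rho: rho * big_p(z / rho) - lambda0_over_mu, 1e-3, 0.5 * z)


rho_in = 0.43  # hypothetical Lambda^(3)/mu_dec, used only to build toy input
for z in (4.0, 6.0, 8.0, 10.0, 12.0):
    toy_input = rho_in * big_p(z / rho_in)  # toy value of Lambda^(0)/mu_dec
    print(f"z={z:4.1f}  input={toy_input:.5f}  solved={solve_lambda3(toy_input, z):.6f}")
```

Bisection is sufficient here because, within this truncation, both \(M/\Lambda \) as a function of \(g_\star \) and the product \(\rho \,P(z/\rho )\) as a function of \(\rho \) are monotonic. In the real analysis the input on the right-hand side changes with z, and the solutions at different z are extrapolated to the decoupling limit.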

Concluding this overview, we see that, besides the evaluation of the GFT coupling in a mass-dependent scheme, the main ingredients are precision results in the pure gauge theory for the running GF coupling and the matching between GF and GFT schemes. Together with available 4-loop perturbative results for the function P, this allows us to infer the \({N_{\textrm{f}}}=3\) \(\Lambda \)-parameter in units of \(\mu _{\textrm{dec}}\) and thus in MeV, given the relation of \(\mu _{\textrm{dec}}\) to a hadronic scale from [35].
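The two-scale constraint of the strategy, large \(z=M/\mu _{\textrm{dec}}\) together with \(M\ll 1/a\), reduces to simple arithmetic once \(\mu _{\textrm{dec}}=1/L\) is fixed: \(aM = z/(L/a)\). The sketch below tabulates this for a few lattice sizes. The values of L/a are hypothetical examples, \(\mu _{\textrm{dec}}=789\) MeV is the central value quoted above, and the conversion constant \(\hbar c\) is rounded.

```python
HBARC_MEV_FM = 197.327  # hbar*c in MeV fm (rounded)
MU_DEC_MEV = 789.0      # central value of mu_dec quoted in the text


def lattice_spacing_fm(l_over_a, mu_mev=MU_DEC_MEV):
    """Lattice spacing a = L / (L/a) with L = 1/mu_dec, converted to fm."""
    return HBARC_MEV_FM / (mu_mev * l_over_a)


def a_times_m(z, l_over_a):
    """aM = z / (L/a), since M = z * mu_dec and mu_dec = 1/L."""
    return z / l_over_a


for l_over_a in (12, 16, 24, 32, 48):  # hypothetical lattice sizes
    row = "  ".join(f"{a_times_m(z, l_over_a):.3f}" for z in (2.0, 4.0, 8.0, 12.0))
    print(f"L/a={l_over_a:2d}  a={lattice_spacing_fm(l_over_a):.4f} fm  aM(z=2,4,8,12): {row}")
```

The table makes the difficulty explicit: for the largest z, values of aM well below one require the finest lattices.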

3 The continuum and decoupling limits: a closer look

In this section we discuss the approach to the continuum and decoupling limits in some more detail, in order to provide the theoretical underpinning for the analysis of the lattice data. While the limits are conceptually independent, in practice they are best dealt with together, in terms of effective continuum field theories. The methods of Refs. [42,43,44,45,46] allow us in principle to go beyond power law behaviour and use renormalization group improved perturbation theory to obtain the correct leading asymptotics. This holds true for both limits, with the small parameter being either the lattice spacing or the inverse quark mass. While the information for the bulk effects is still incomplete, the discussion serves to motivate the fit ansätze which will be used in the data analysis, cf. Sect. 4. For the boundary 1/m effects we are able to estimate the full contribution in Sect. 3.3 without a fit to the data. We will focus on the bulk effects first and address the influence of the boundaries in the Euclidean time direction at the end.

3.1 Symanzik’s effective theory for lattice QCD

Following Symanzik [47,48,49,50], the approach of a connected lattice correlation function to the continuum limit can be described in terms of an effective continuum theory, with action

$$\begin{aligned} S_{\text {eff}} = S_0 + a S_1 + a^2 S_2 + \cdots . \end{aligned}$$
(3.1)

Here, \(S_0\) is the continuum action and \(S_k\) are space-time integrals over linear combinations of local composite fields of mass dimension \(4+k\), \(k=1,2,\ldots \), which respect all the symmetries of the lattice action. We will omit \(S_1\) in the following, assuming a non-perturbatively O(a) improved lattice set-up. Residual effects due to \(S_1\) are dealt with separately (cf. Sects. 3.3 and 4, and Appendix E).

Local fields O defining the observables are represented by corresponding effective fields and expanded similarly,

$$\begin{aligned} O_{\text {eff}} = O_0 + a O_1 + a^2 O_2 + \cdots . \end{aligned}$$
(3.2)

Gluonic gradient flow observables \(O_{\text {gf}}\) can be formulated in terms of a local 4 + 1 dimensional field theory [51, 52] with the flow time t as the extra coordinate. This allows us to work entirely in terms of local observables O and improve them to O(\(a^2\)), such that \(O_1\) and \(O_2\) vanish in the effective field description. We also assume that the O(\(a^2\)) effects originating from the 4 + 1 dimensional bulk action are removed by an appropriate O(\(a^2\)) modification of the flow equation [52]. The Symanzik expansion for such observables then takes the form

$$\begin{aligned} \langle O_{\text {gf}}\rangle _{\text {lat}}= \langle O_{\text {gf}} \rangle _{\text {cont}} - a^2 \langle O_{\text {gf}} S_2\rangle _{\text {cont}} + \cdots , \end{aligned}$$
(3.3)

in terms of connected correlation functions. Although \(S_2\) contains an integral over space-time, no contact terms are generated for gradient flow observables \(O_{\text {gf}}\), as they are separated from \(S_2 \) by the finite flow time t. For the lattice set-up with non-perturbatively O(a) improved, mass degenerate Wilson quarks, \(S_2\) is given as a linear combination of 18 local dimension-6 operators,

$$\begin{aligned} S_2 = \int d^4x\, \sum _{i=1}^{18} \omega _i {{\mathcal {O}}}_{i}(x), \end{aligned}$$
(3.4)

integrated over space-time. This constitutes an operator basis after the use of the equations of motion, the elimination of total derivative terms and the use of relations among 4-quark operators due to Fierz transformations. In the absence of gradient flow observables, the equations of motion simplify and allow for the elimination of 2 operators [45].

Note that Eq. (3.3) makes the power dependence on the lattice spacing explicit. An additional a-dependence arises through the coefficients \(\omega _i\) of the operators in \(S_2\), as these can be understood as functions of a renormalized coupling at the cutoff scale, \(\mu =1/a\). Close to the continuum limit asymptotic freedom implies that their leading asymptotic behaviour is exactly computable. This was first used by Balog, Niedermayer and Weisz [53, 54] in their analysis of the 2-dimensional O(n) \(\sigma \)-model. The technique has recently been extended and applied to gauge theories in various lattice regularisations, including lattice QCD with quarks of both the Wilson and Ginsparg–Wilson type [42,43,44,45,46]. Technically, one needs to compute, to 1-loop order, the anomalous dimension matrix for the set of mass dimension 6 operators entering \(S_2\). The operator basis mixes under renormalization, \({{\mathcal {O}}}_{\text {R},i} = \sum _{j=1}^{18} Z_{ij} {{\mathcal {O}}}_{j}\), and, following our conventions from Sect. 2, we define the corresponding anomalous dimension matrix

$$\begin{aligned} \gamma _{ij}^{{\mathcal {O}}}= & {} \sum _{k=1}^{18} \left( \mu \dfrac{d}{d\mu } Z_{ik}\right) \left( Z^{-1}\right) _{kj}\nonumber \\= & {} - g^2 \left[ {(\gamma _0^{{\mathcal {O}}})}_{ij} + {(\gamma _1^{{\mathcal {O}}})}_{ij} g^2 + {\textrm{O}}(g^4)\right] . \end{aligned}$$
(3.5)

A change of basis, \({{\mathcal B}}_i = \sum _j V_{ij}{{\mathcal {O}}}_j\), may then be performed in order to diagonalize the one-loop anomalous dimension matrix, \(\gamma _0^{{\mathcal {O}}}\), and determine its eigenvectors and eigenvalues. Denoting the transformed anomalous dimension matrix by

$$\begin{aligned}{} & {} \gamma ^{{\mathcal B}} = V\gamma ^{{\mathcal {O}}}V^{-1} = -g^2\left( \gamma ^{{\mathcal B}}_0 + \gamma ^{{\mathcal B}}_1 g^2+ \cdots \right) ,\nonumber \\{} & {} \quad \left( \gamma _0^{{\mathcal B}}\right) _{ij} = \delta _{ij}\gamma _{0,i}^{{\mathcal B}}, \end{aligned}$$
(3.6)

renormalization group invariant (RGI) operators can be defined through

$$\begin{aligned} {{\mathcal B}}_{i}^{\text {RGI}} = \lim _{\mu \rightarrow \infty } \left[ \alpha _{\overline{{\textrm{MS}}}}(\mu )\right] ^{-\hat{\gamma }_i^{\mathcal B}}{{\mathcal B}}_{i}(\mu ),\quad \hat{\gamma }^{\mathcal B}_i = \gamma _{0,i}^{{\mathcal B}}/(2b_0). \nonumber \\ \end{aligned}$$
(3.7)

At finite \(\mu \) there are corrections of O(\(\alpha \)) stemming from the two- and higher loop anomalous dimensions. Note also the \({N_{\textrm{f}}}\)-dependence of \(\hat{\gamma }^{\mathcal B}_i\), due to the normalization by \(2b_0\).

In terms of the eigenbasis of operators, \(\{{{\mathcal B}}_i\}\), the cutoff effects take the form

$$\begin{aligned}{} & {} \langle O_\text {gf}\rangle _\text {lat} = \langle O_\text {gf} \rangle _\text {cont} - a^2 \sum _i b_i\left( \alpha _{\overline{{\textrm{MS}}}}(1/a)\right) [\alpha _{\overline{{\textrm{MS}}}}(1/a)]^{\hat{\gamma }^{\mathcal B}_i}\nonumber \\{} & {} \int d^4 x\langle O_\text {gf}{{\mathcal B}}^\text {RGI}_i(x)\rangle _\textrm{cont}+\ldots \,, \end{aligned}$$
(3.8)

where the coefficient functions

$$\begin{aligned} b_i(\alpha ) = \sum _{j=1}^{18} \omega _j(\alpha )\,(V^{-1})_{ji} = \sum _{n\ge 0} \alpha ^{n} b_i^{(n)}, \end{aligned}$$
(3.9)

are given as a linear combination of the \(\omega _i\), Eq. (3.4), and are thus perturbatively computable. For tree-level O(\(a^2\)) improved lattice actions, \(b^{(0)}_i = 0\), and the higher coefficients can be successively eliminated by perturbative O(\(a^2\)) improvement of the lattice action. For lattice QCD with O(a) improved Wilson quarks, one then expects that the leading cutoff effects in the bulk are of the form

$$\begin{aligned} \langle O_{\text {gf}}\rangle _{\text {lat}}= & {} \langle O_{\text {gf}} \rangle _{\text {cont}} - a^2 \sum _{i=1}^{18} A_i [\alpha _{\overline{{\textrm{MS}}}}(1/a)]^{\hat{\Gamma }_i}\nonumber \\{} & {} \times \left\{ 1 + {\textrm{O}}(\alpha _{\overline{{\textrm{MS}}}}(1/a))\right\} + {\textrm{O}}(a^3), \end{aligned}$$
(3.10)
$$\begin{aligned} A_i= & {} b_i^{(n_i^{\textrm{I}})} \int d^4x\,\left\langle O_{\text {gf}}{{\mathcal B}}^{\text {RGI}}_i(x)\right\rangle _{\text {cont}}, \end{aligned}$$
(3.11)

where the neglected powers in \(\alpha \) include both the expansion of \(b_i(\alpha )\) and terms containing the (non-diagonal) higher order anomalous dimensions. The constants \(A_i\) contain the insertions of the scale-independent RGI operators and \(\hat{\Gamma }_i=\hat{\gamma }^{\mathcal B}_i +n_i^{\textrm{I}}\) depends on the degree of perturbative O(\(a^2\)) improvement of the lattice action. For example, a tree-level (completely) improved action leads to \(n_i^{\textrm{I}}\ge 1\) and in general we have \(b_i=\hat{b}_i\alpha ^{n_i^{\textrm{I}}} (1+{\textrm{O}}(\alpha ))\).
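To see how the logarithmic factor in Eq. (3.10) can enter a continuum extrapolation, the following sketch fits synthetic data to a single term of the form \(A\,[\alpha (1/a)]^{\hat{\Gamma }}(a/L)^2\). Several ingredients are assumptions made only for this illustration: the coupling is evaluated with the 1-loop formula \(\bar{g}^2=1/(b_0\ln (\mu ^2/\Lambda ^2))\), the value \(L\Lambda =0.43\) is hypothetical, the data are generated exactly from the model, and a plain unweighted least-squares fit replaces a proper error analysis.

```python
import math

B0 = 9.0 / (4.0 * math.pi) ** 2  # b0 for Nf = 3, Eq. (2.3)


def alpha_1loop(mu_over_lambda):
    """1-loop coupling alpha = g^2/(4 pi), with g^2 = 1/(b0 ln(mu^2/Lambda^2))."""
    return 1.0 / (4.0 * math.pi * B0 * math.log(mu_over_lambda**2))


def cutoff_variable(l_over_a, gamma_hat, l_lambda):
    """x = [alpha(1/a)]^gamma_hat * (a/L)^2, with 1/a in units of Lambda."""
    return alpha_1loop(l_over_a / l_lambda) ** gamma_hat / l_over_a**2


def fit_line(xs, ys):
    """Unweighted least squares for y = c0 + c1 * x; returns (c0, c1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return my - c1 * mx, c1


def continuum_extrapolation(l_over_a, values, gamma_hat, l_lambda):
    """Fit values = continuum + A * [alpha(1/a)]^gamma_hat * (a/L)^2."""
    xs = [cutoff_variable(n, gamma_hat, l_lambda) for n in l_over_a]
    return fit_line(xs, values)


sizes = [12, 16, 24, 32, 48]  # hypothetical lattice sizes L/a
gamma_true = -1.0 / 9.0       # smallest exponent quoted in the text for Nf = 3
synthetic = [4.0 - 3.0 * cutoff_variable(n, gamma_true, 0.43) for n in sizes]

with_logs, _ = continuum_extrapolation(sizes, synthetic, gamma_true, 0.43)
pure_a2, _ = continuum_extrapolation(sizes, synthetic, 0.0, 0.43)
print("continuum value, fit with alpha^Gamma * a^2:", with_logs)
print("continuum value, fit with pure a^2:         ", pure_a2)
```

With a negative exponent the factor \([\alpha (1/a)]^{\hat{\Gamma }}\) grows slowly towards the continuum limit, so a pure \(a^2\) fit to such data returns a slightly shifted continuum value.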

For \({N_{\textrm{f}}}=3\) lattice QCD, with O(a) improved Wilson quarks, Husung et al. [44,45,46] found that the spectrum for the 1-loop anomalous dimensions is bounded from below by \(\hat{\Gamma }_i \ge -1/9\), for the basis of 16 operators needed for observables not involving the gradient flow. There are then 6 operators found with 1-loop anomalous dimensions \(-1/9 \le \hat{\Gamma }_i < 8/9\). The remaining operators describe cutoff effects accompanied by powers of \(\alpha \) equal to or higher than those of other neglected terms and may therefore be discarded. Explicit expressions for the eigen-operators of the 1-loop anomalous dimension matrix are, in general, rather complicated and will not be required here. For the case of gradient flow observables this result is not complete, as there are two further dimension 6 operators which must be included to obtain the full matrices V and \(\gamma ^{{\mathcal {O}}}_0\) [45]. They have so far only been computed in the pure gauge theory [45].

In view of the heavy mass expansion, there is a very interesting block structure in \(\gamma ^{{\mathcal {O}}}_0\), for the subset of operators in \(S_2\) which come with a positive power of the quark mass. For \({N_{\textrm{f}}}=3\) with non-degenerate quarks there are eleven operators [44,45,46], which reduce to just three for degenerate quarks, namely

$$\begin{aligned}{} & {} {{\mathcal {O}}}_{m,1} = \frac{1}{g_0^2}\sum _{\mu ,\nu } m^2 \,\hbox {tr}\left( F_{\mu \nu }F_{\mu \nu }\right) ,\quad {{\mathcal {O}}}_{m,2} = m^3 \overline{\psi }\psi ,\nonumber \\{} & {} \quad {{\mathcal {O}}}_{m,3} = \frac{1}{4} \sum _{\mu ,\nu } m\overline{\psi }\, i\sigma _{\mu \nu } F_{\mu \nu } \psi . \end{aligned}$$
(3.12)

Note that this subset will remain the same for gradient flow observables, as the additional operators do not come with mass factors. Moreover, the tridiagonal block structure of \(\gamma ^{{\mathcal {O}}}_0\) means that their anomalous dimensions will not be affected by enlarging the basis and their renormalization can be consistently carried out ignoring the remainder of the basis [44,45,46]. This is fortunate, as it means that the structure of the leading \(a^2m^2\) lattice effects can be inferred with current knowledge. We denote the corresponding basis of eigen-operators for the 1-loop anomalous dimension matrix by \(\{{{\mathcal B}}_{m,i}\}_{i=1,2,3}\), and their 1-loop anomalous dimensions are then given by [44,45,46],

$$\begin{aligned} \hat{\gamma }^{\mathcal B}_{m,1} = -1/9,\quad \hat{\gamma }^{\mathcal B}_{m,2} = 14/27, \quad \hat{\gamma }^{\mathcal B}_{m,3} = 8/9. \end{aligned}$$
(3.13)

Furthermore, from [44,45,46] one infers that, for our lattice action (cf. Appendix C), we have \({b}_{m,i} = \hat{b}_{m,i} + {\textrm{O}}(\alpha )\) so that \(\hat{\Gamma }_{m,i}=\hat{\gamma }^{\mathcal B}_{m,i}\) for \(i=1,2,3\). We note that one only needs to retain the first two operators as the difference \(\hat{\gamma }^{\mathcal B}_{m,3}-\hat{\gamma }^{\mathcal B}_{m,1}=1\) translates to a relative factor of \(\alpha \), i.e. \({{\mathcal B}}_{m,3}\) contributes at the same order as other neglected contributions.
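The power counting behind this truncation can be checked with exact rational arithmetic. The following is a minimal sketch, using only the exponents of Eq. (3.13); the function names and the threshold of one full power of \(\alpha \) are illustrative choices, not part of the original analysis.

```python
from fractions import Fraction

# One-loop anomalous dimensions of the massive eigen-operators, Eq. (3.13).
GAMMA_HAT_M = {1: Fraction(-1, 9), 2: Fraction(14, 27), 3: Fraction(8, 9)}


def relative_alpha_power(i, reference=1):
    """Extra power of alpha carried by operator i relative to the reference operator."""
    return GAMMA_HAT_M[i] - GAMMA_HAT_M[reference]


def retained_operators(threshold=Fraction(1)):
    """Operators suppressed by less than `threshold` powers of alpha (illustrative cut)."""
    return [i for i in sorted(GAMMA_HAT_M) if relative_alpha_power(i) < threshold]


if __name__ == "__main__":
    for i in sorted(GAMMA_HAT_M):
        print(f"B_m,{i}: relative power of alpha = {relative_alpha_power(i)}")
    print("retained:", retained_operators())
```

With a cut at one full power of \(\alpha \), only \({{\mathcal B}}_{m,1}\) and \({{\mathcal B}}_{m,2}\) survive, in line with the statement above.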

3.2 The decoupling expansion

The Symanzik expansion renders the a-dependence explicit, both for the powers of a and the leading logarithmic terms given as fractional powers of \(\alpha _{\overline{{\textrm{MS}}}}(1/a)\). The connected correlation functions which appear in this expansion are thus defined in the continuum limit, with respect to the continuum QCD action. In a second step, we now determine how the continuum correlation functions, \( \langle O_{\text {gf}}\rangle _{\text {cont}}\), \(\langle O_{\text {gf}} S_2\rangle _{\text {cont}},\) behave as the quark mass m is taken large. The effective decoupling theory bears formal similarities with Symanzik’s effective theory, in particular it renders both the powers in 1/m and the logarithmic corrections explicit. The effective decoupling action can be expanded,

$$\begin{aligned} S_{\text {dec}} = S_{0,\text {dec}} + \frac{1}{m} S_{1,\text {dec}}+\frac{1}{m^2} S_{2,\text {dec}} + {\textrm{O}}(1/m^3), \end{aligned}$$
(3.14)

with

$$\begin{aligned} S_{0,\text {dec}}= & {} -\frac{1}{2} \int d^4x \,{\mathcal {D}}_0(x),\quad {\mathcal {D}}_{0} = \frac{1}{g_0^2} \,\hbox {tr}(F_{\mu \nu } F_{\mu \nu }), \end{aligned}$$
(3.15)
$$\begin{aligned} S_{2,\text {dec}}= & {} \int d^4x \left( d^{\textrm{S}}_1 {\mathcal {D}}_{1}(x) + d^{\textrm{S}}_2 {\mathcal {D}}_{2}(x)\right) . \end{aligned}$$
(3.16)

Due to the simultaneous decoupling of all quarks, the leading term, \(S_{0,\text {dec}}\), is given by the pure gauge action. The \(S_{k,\text {dec}}\) are given by space-time integrals of gauge invariant local operators of mass dimension \(4+k\), polynomial in the gauge field and its derivatives. Gauge and O(4) symmetries do not allow for odd values of k, so that the first order term must vanish. The dimension-6 pure gauge operators in Eq. (3.16) take the form,

$$\begin{aligned}{} & {} {\mathcal {D}}_{1} = \frac{1}{g_0^2}\sum _{\mu ,\nu ,\rho }\,\hbox {tr}\left( D_\mu F_{\mu \nu }D_\rho F_{\rho \nu }\right) ,\nonumber \\{} & {} \quad {\mathcal {D}}_{2} = \frac{1}{g_0^2}\sum _{\mu ,\nu ,\rho }\,\hbox {tr}\left( D_\mu F_{\rho \nu } D_\mu F_{\rho \nu }\right) -\frac{23}{7}{\mathcal {D}}_{1}, \end{aligned}$$
(3.17)

where we have directly chosen the eigenbasis of the one-loop anomalous dimension matrix, with eigenvalues \(\hat{\gamma }^{\mathcal {D}}_{0,1,2} = -1,0,7/11\), respectively. The coefficients \(d^{\textrm{S}}_{1,2}\) are matching coefficients between QCD with \({N_{\textrm{f}}}=3\) heavy quarks and the \({N_{\textrm{f}}}=0\) effective theory and can be perturbatively expanded in the coupling \(\alpha _{\overline{{\textrm{MS}}}}\), taken at the decoupling scale. In perturbation theory, this scale is most naturally defined as the running quark mass \(\overline{m}^{}_{\overline{{\textrm{MS}}}}\left( \mu \right) \) at its own scale,

$$\begin{aligned} m_\star = \overline{m}^{}_{\overline{{\textrm{MS}}}}\left( m_\star \right) , \end{aligned}$$
(3.18)

which also defines the (inverse) expansion parameter of the effective decoupling theory.
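Equation (3.18) is a fixed-point condition and can be solved by simple iteration. The sketch below is illustrative only: it assumes leading-order running, \(\overline{m}(\mu ) = M\,[\ln (\mu /\Lambda )]^{-4/9}\), which uses the exponent \(\hat{\gamma }_m=4/9\) quoted in Sect. 3.2.2 together with an assumed one-loop normalization of the coupling. The values of M and \(\Lambda \) in the usage lines are placeholders, not results of this work.

```python
import math

GAMMA_M_HAT = 4.0 / 9.0  # leading-order mass exponent for N_f = 3


def running_mass_lo(mu, M, Lambda):
    """Leading-order running mass (assumption): m(mu) = M * ln(mu/Lambda)^(-4/9)."""
    return M * math.log(mu / Lambda) ** (-GAMMA_M_HAT)


def solve_m_star(M, Lambda, tol=1e-12, max_iter=200):
    """Solve m_star = m(m_star) by fixed-point iteration, starting from m = M."""
    m = M
    for _ in range(max_iter):
        m_new = running_mass_lo(m, M, Lambda)
        if abs(m_new - m) <= tol * m_new:
            return m_new
        m = m_new
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    # Placeholder inputs in arbitrary but common units; M must be well above Lambda.
    m_star = solve_m_star(M=10.0, Lambda=0.3)
    print(f"m_star = {m_star:.6f}")
```

The iteration is a contraction as long as \(\ln (m_\star /\Lambda ) > 4/9\), i.e. for masses well above \(\Lambda \), which is the regime relevant for decoupling.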

Besides the effective action, observables O have an effective large mass description, too. For the case of linear combinations of the fields \({\mathcal B}_i\), \(O= \sum _i c_i{{\mathcal B}}_i\), it starts with a term of O(\(m^2\)). We thus expect the form

$$\begin{aligned}{}[O]_\text {dec} = m^2\sum _{k\ge 0} \frac{1}{m^{2k}} O_{2k,\textrm{dec}}, \end{aligned}$$
(3.19)

where the fields \(O_{2k,\textrm{dec}}\) are linear combinations of gauge invariant local composite fields, polynomial in the gauge field and its derivatives, of mass dimension \(d_O+2(k-1)\), where \(d_O\) is the dimension of the observable. For gradient flow observables \(O_{\text {gf}}\) we will assume that the effective observable description reduces to the term with \(k=1\).

We will now look at the combined Symanzik and decoupling expansion in a and 1/m and discuss in turn the correction terms of order O(\(1/m^2\)), O(\(a^2m^2\)) and O(\(a^2)\). We emphasize that the decoupling expansion is applied to the Symanzik effective theory. Hence, the combined expansion is valid for

$$\begin{aligned} q \ll m \ll 1/a , \end{aligned}$$
(3.20)

for all scales q present in the observable considered. In our application these are \(q\in \{1/\sqrt{8t},\Lambda _{\textrm{QCD}}\}\).

3.2.1 Corrections of O(\(1/m^2\))

To order \(1/m^2\) in the heavy mass expansion, we formally have,

$$\begin{aligned} \langle O_{\text {gf}} \rangle _{\text {cont}} = \langle O_{\text {gf}}\rangle _{\text {dec}}- \frac{1}{m^2} \langle O_{\text {gf}} S_{2,\text {dec}}\rangle _{\text {dec}} + \cdots , \end{aligned}$$
(3.21)

which evaluates to

(3.22)

where we have converted to the RGI operators in the \({N_{\textrm{f}}}=0\) theory. Without performing an explicit matching calculation, the leading order, \(l^{\textrm{S}}_i\), in \(d^{\textrm{S}}_i = \hat{d}^{\textrm{S}}_i \alpha ^{l^{\textrm{S}}_i}+{\textrm{O}}(\alpha ^{l^{\textrm{S}}_i+1})\), \(i=1,2\), is not known and we will have to use assumptions for \(l^{\textrm{S}}_i\). Also converting to the RGI quark mass, M,

(3.23)

then leads to the asymptotic large mass behaviour in the continuum limit of the form

(3.24)

where we have used that the couplings of the \({N_{\textrm{f}}}=3\) and 0 theory coincide at the decoupling scale, i.e. \(\alpha ^{(3)}_{\overline{{\textrm{MS}}}}(m_\star )=\alpha ^{(0)}_{\overline{{\textrm{MS}}}}(m_\star )\), up to terms of O(\(\alpha ^2\)), which are neglected here. The constants \(D_i\) parametrize the matrix elements in the decoupled theory and the exponents of \(\alpha \) are further specified as

$$\begin{aligned} l^{\textrm{S}}_i-2\hat{\gamma }_m +\hat{\gamma }_i^{\mathcal {D}} = {\left\{ \begin{array}{ll} l^{\textrm{S}}_1 - 8/9 &{} (i=1), \\ l^{\textrm{S}}_2 -25/99 &{} (i=2).\end{array}\right. } \end{aligned}$$
(3.25)

Assuming, e.g. \(l^{\textrm{S}}_1=l^{\textrm{S}}_2=1\) (at least one fermion loop has to be present in QCD), then fixes a possible ansatz for the heavy mass extrapolation of continuum extrapolated data for the gradient flow observable, with leading correction terms \(\propto [\alpha _{\overline{{\textrm{MS}}}}(m_\star )]^{1/9}/M^2\) and \(\propto [\alpha _{\overline{{\textrm{MS}}}}(m_\star )]^{74/99}/M^2\), for \(i=1,2\), respectively.
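The exponents in Eq. (3.25) follow from simple rational arithmetic, which can be verified as below. This is a small sketch with hypothetical function names; the inputs are \(\hat{\gamma }_m=4/9\) and \(\hat{\gamma }^{\mathcal {D}}_{1,2}=0,7/11\) as quoted in the text, and the value of \(l^{\textrm{S}}_i\) remains an assumption.

```python
from fractions import Fraction

GAMMA_M_HAT = Fraction(4, 9)                         # mass anomalous dimension, N_f = 3
GAMMA_D_HAT = {1: Fraction(0), 2: Fraction(7, 11)}   # dimension-6 operators D_1, D_2


def alpha_exponent(i, l_S):
    """Exponent l^S_i - 2*gamma_m + gamma^D_i of alpha in the 1/M^2 correction."""
    return Fraction(l_S) - 2 * GAMMA_M_HAT + GAMMA_D_HAT[i]


if __name__ == "__main__":
    for i in (1, 2):
        # l_S = 1 is the assumption of at least one fermion loop.
        print(f"i={i}: exponent for l_S=1 is {alpha_exponent(i, 1)}")
```

For \(l^{\textrm{S}}_1=l^{\textrm{S}}_2=1\) this gives the exponents 1/9 and 74/99.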

3.2.2 Corrections of O(\(a^2m^2\)) and O(\(a^2\))

We now turn to the large mass expansion of \(\langle O_{\text {gf}}{{\mathcal B}}_i^{\text {RGI}}\rangle _{\text {cont}}\) which appears at O(\(a^2\)) in the Symanzik expansion. We first transform the RGI operators to the relevant scale \(\mu =m_\star \), by applying Eq. (3.7)

$$\begin{aligned} {{\mathcal B}}_{i}^{\text {RGI}} = \left[ \alpha _{\overline{{\textrm{MS}}}}(m_\star )\right] ^{-\hat{\gamma }^{\mathcal B}_i}{{\mathcal B}}_{i}(m_\star )\left[ 1 + {\textrm{O}}(\alpha (m_\star ))\right] , \end{aligned}$$
(3.26)

and inserting into the Symanzik expansion coefficient,

(3.27)

where we have neglected terms of relative \({\textrm{O}}(\alpha )\) and introduced the notation,

(3.28)

In this approximation, we expect that less than half of the 18 operators contribute terms that are parametrically leading in \(\alpha \). However, a precise statement can only be made once the full one-loop anomalous dimension matrix and the coefficients \(b_i\) are known.

With these preliminaries we use the effective decoupling description for the operators \({{\mathcal B}}_i\),

$$\begin{aligned}{}[{{\mathcal B}}_{i}]_{\text {dec}} = m^2 d^{{\mathcal B}}_{i,0} {\mathcal {D}}_0 + d^{{\mathcal B}}_{i,1} {\mathcal {D}}_1 + d^{{\mathcal B}}_{i,2} {\mathcal {D}}_2 + {\textrm{O}}(1/m^2) \end{aligned}$$
(3.29)

with matching coefficients \(d^{{\mathcal B}}_{i,j}\). Inserting the expansion of both the decoupling action (3.14) and these fields we obtain,

$$\begin{aligned}{} & {} \langle O_\text {gf}{{\mathcal B}}_i(m_\star ;x) \rangle _\text {cont} = m^2 d^{{\mathcal B}}_{i,0} \langle O_\text {gf}{\mathcal D}_0(x) \rangle _\text {dec} \nonumber \\{} & {} - d^{{\mathcal B}}_{i,0} \sum _{j=1}^2 d^{\textrm{S}}_j\int d^4 y \langle O_\text {gf}{\mathcal D}_0(x) {\mathcal D}_j(y) \rangle _\text {dec}\nonumber \\{} & {} + \sum _{j=1}^2 d^{{\mathcal B}}_{i,j} \langle O_\text {gf}{\mathcal D}_j(x) \rangle _\text {dec}\,, \end{aligned}$$
(3.30)

up to terms of order \(1/m^2\). In the next step we pass back to RGI operators, now in the decoupled, \({N_{\textrm{f}}}=0\) theory. With the anomalous dimension of \({\mathcal {D}}_0\) given by \(\hat{\gamma }^{\mathcal {D}}_{0}=-1\) [45], and after conversion to the RGI quark mass with \(\hat{\gamma }_m=4/9\), we find

(3.31)

where we have used once again that the couplings coincide at the decoupling scale, i.e. \(\alpha ^{(3)}_{\overline{{\textrm{MS}}}}(m_\star )=\alpha ^{(0)}_{\overline{{\textrm{MS}}}}(m_\star )\), up to terms of O(\(\alpha ^2\)), which are negligible in this context. Inserting this expansion into Eq. (3.27) one notices that each term is weighted by \(b_i \times R_\alpha ^{\hat{\gamma }^{\mathcal B}_i}\). The matching coefficients \(d^{{\mathcal B}}_{i,j}\) for the observable and \(d^{\textrm{S}}_j\) for the action have expansions in \(\alpha \), but their leading orders are not known. However, as we are interested in the leading \(M^2\) behaviour, we focus on the massive operators \({\mathcal B}_{m,i}\), for \(i=1,2\) (cf. Sect. 3.1). Both operators contain the gluonic component \({{\mathcal {O}}}_{m,1}\) [44,45,46], so that one expects the expansion of their matching coefficients to start at tree level, i.e. \(d^{{\mathcal B}}_{(m,i),0}=\hat{d}^{\mathcal B}_{i,0} +{\textrm{O}}(\alpha )\). Combining this with the Symanzik expansion we obtain the form of the leading \(a^2M^2\) lattice effects,

(3.32)

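where \(D_0\) denotes the matrix element of \({\mathcal {D}}^{\text {RGI}}_0\). Note that \(\alpha _{\overline{{\textrm{MS}}}}(m_\star )\) accidentally cancels out in the leading term, since its exponents add up to \(-\hat{\gamma }^{\mathcal B}_{m,1}+2\hat{\gamma }_m+\hat{\gamma }^{\mathcal {D}}_0 = 1/9+8/9-1=0\).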

Proceeding to the subleading \(a^2\)-effects, there are two types of contributions in Eq. (3.31). The first arises from the cancellation of the \(m^2\) leading term with the subleading \(1/m^2\) contribution from the effective decoupling action \(S_{2,{{{\textrm{dec}}}}}\), Eq. (3.16). Counting powers of \(\alpha \), we expect that only the massive operators \({\mathcal B}_{m,i}\) have a tree level matching coefficient to \({\mathcal {D}}_0\), rendering all non-massive operators negligible. For the matching coefficients in \(S_{2,{{{\textrm{dec}}}}}\) we assume \(d^{\textrm{S}}_{1,2} = {\hat{d}}^{\textrm{S}}_{1,2}\alpha \), so that we obtain the form of the first subleading \(a^2\)-effect,

(3.33)

Here, we have used \(\hat{\gamma }_{1,2}^{\mathcal {D}}=0,7/11\) and \(D_{0i}\) denotes the matrix elements of \({\mathcal {D}}^{\text {RGI}}_0{\mathcal {D}}^{\text {RGI}}_i\), for \(i=1,2\). The leading term is proportional to and thus cancels yet again.

For the second subleading \(a^2\)-term, the main question is which operators \({\mathcal B}_i\) match to \({\mathcal {D}}_{1,2}\) with a non-zero tree-level coefficient \(d^{{\mathcal B}}_{i,j}=\hat{d}^{{\mathcal B}}_{i,j} + {\textrm{O}}(\alpha )\). This is certainly the case for those operators in \(S_2\) which contain the gluonic dimension-6 operators of the same form as \({\mathcal {D}}_{1,2}\). Including only such operators \({\mathcal B}_i\) in \(S_2\) we then expect

(3.34)

where \(D_i\) denotes the matrix element of \({\mathcal {D}}^{\text {RGI}}_i\), \(i=1,2\). The possible powers \(\hat{\Gamma }_i\) could be obtained from a complete basis of operators for gradient flow observables. Until this becomes available we assume that the \(a^2\)-effects in Eq. (3.34) are subleading, i.e. \(\hat{\Gamma }_i > \hat{\Gamma }_1 =-1/9\), with \(\hat{\Gamma }_1\) corresponding to the massive operator \({\mathcal B}_{m,1}\).

3.3 Boundary effects

So far we have not considered the effect of boundaries, where chiral symmetry can be broken by the boundary conditions. This is the case for standard SF [39], open [55], and open-SF [56] boundary conditions. Locality means that these effects can be discussed separately. In particular, boundary O(\(a^k\)) and O(\(1/m^k\)) effects can be respectively described in the Symanzik and decoupling effective theory, in terms of local gauge-invariant fields of dimension \(3+k\) localized at the boundaries [57]. In fact, the counterterm fields that appear at O(a) in the Symanzik expansion and at O(1/m) in the large-mass expansion are the same. A complete set of fields can be found in Ref. [58], where a detailed discussion of the O(a) contributions to the Symanzik effective action in the presence of SF boundary conditions is presented. Below we shall focus on the decoupling expansion in the presence of SF boundary conditions. For a discussion on the boundary O(a) effects affecting the observables of interest, instead, we refer the reader to Refs. [10, 35]. In these references, a detailed analysis for the case of the GF-couplings in the \({N_{\textrm{f}}}=0\) and 3 theory is presented. Here we note that for the case of the GFT-couplings, due to the larger separation between the flow energy density defining the couplings and the SF boundaries (cf. Eq. (2.13)), these effects are expected to be significantly smaller than the estimates obtained in Refs. [10, 35]. In practice, this renders these effects irrelevant in the context of the analysis presented in Sect. 4.4 and we neglect them.

As in the previous subsections, we are interested in the situation where the resulting effective theory for large quark masses is the pure gauge theory, i.e. all quarks simultaneously decouple. In this case, we have (cf. Eq. (3.14)),

$$\begin{aligned} S_{\text {dec}} = S_{0,{\textrm{dec}}} + \frac{1}{m} S_{1,{\textrm{dec}}} + \frac{1}{m^2} S_{2,{\textrm{dec}}} + {\textrm{O}}(1/m^3), \end{aligned}$$
(3.35)

where

$$\begin{aligned} S_{1,{\textrm{dec}}}= \sum _{i=1}^{2}\int d^3x\,\omega _{i,{{{\textrm{b}}}}}\, \big [{\mathcal {O}}_{i,{{{\textrm{b}}}}}(0,{\varvec{x}})+ {\mathcal {O}}_{i,{{{\textrm{b}}}}}(T,{\varvec{x}})\big ], \end{aligned}$$
(3.36)

with

$$\begin{aligned} {\mathcal {O}}_{1,{{{\textrm{b}}}}}= & {} -{1\over g_0^2}\sum _{k=1}^3\,\hbox {tr}(F_{0k}F_{0k}), \nonumber \\ \quad {\mathcal {O}}_{2,{{{\textrm{b}}}}}= & {} -{1\over 2g_0^2}\sum _{k,l=1}^3\,\hbox {tr}(F_{kl}F_{kl}). \end{aligned}$$
(3.37)

The SF boundary conditions commonly considered in applications are defined in terms of spatially constant Abelian fields [12, 38]. These include in particular the case of vanishing boundary conditions for the gauge field. For this class of fields, the only operator that contributes to the effective action is \({\mathcal {O}}_{1,{{{\textrm{b}}}}}\), since \(F_{kl}(x)=0\) at \(x_0=0,T\). Thus, in this situation, we can take,

$$\begin{aligned} S_{1,{\textrm{dec}}}= \int d^3x\,\omega _{{{{\textrm{b}}}}} \big [{\mathcal {O}}_{{{{\textrm{b}}}}}(0,{\varvec{x}})+ {\mathcal {O}}_{{{{\textrm{b}}}}}(T,{\varvec{x}})\big ], \end{aligned}$$
(3.38)

where we simplified the notation by setting \(\omega _{{{\textrm{b}}}}\equiv \omega _{1,{\textrm{b}}}\) and \({\mathcal {O}}_{{{\textrm{b}}}}\equiv {\mathcal {O}}_{1,{\textrm{b}}}\).

Knowledge of the matching coefficient \(\omega _{{{\textrm{b}}}}\) between QCD with \({N_{\textrm{f}}}\) heavy quarks and the pure gauge theory would allow us to compute the O(1/m) corrections to any observable stemming from the effective action. In the case of gradient flow quantities, \(O_{\text {gf}}\), these are the only O(1/m) effects. As a result, we have that,

$$\begin{aligned} \langle O_{\text {gf}}\rangle _{\textrm{cont}}= & {} \langle O_{\text {gf}} \rangle _{{{\textrm{dec}}}} - {1\over m_\star }\omega _{{{\textrm{b}}}} \int d^3x\, \big \langle O_{\text {gf}}\big [{\mathcal {O}}_{{{{\textrm{b}}}}}(0,{\varvec{x}})\nonumber \\{} & {} + {\mathcal {O}}_{{{{\textrm{b}}}}}(T,{\varvec{x}})\big ]\big \rangle _{{{\textrm{dec}}}} +{\textrm{O}}(1/m_\star ^2). \end{aligned}$$
(3.39)

Comparing Eqs. (3.39) and (3.22), the attentive reader might have noticed the absence of a factor \([\alpha ^{(0)}_{\overline{{\textrm{MS}}}}(m_\star )]^{\hat{\gamma }_{{{\textrm{b}}}}}\), with \(\hat{\gamma }_{{{\textrm{b}}}}\) the anomalous dimension of the relevant boundary field. This can be understood by noticing that for \(x_0=0,T\), \({\mathcal {O}}_{{{\textrm{b}}}}\) coincides with the Hamiltonian density operator of the pure gauge theory,

$$\begin{aligned} {\mathcal {H}}=-{1\over g_0^2}\bigg [\sum _{k=1}^3\,\hbox {tr}(F_{0k}F_{0k})-{1\over 2}\sum _{k,l=1}^3\,\hbox {tr}(F_{kl}F_{kl})\bigg ]. \end{aligned}$$
(3.40)

As is well-known, in continuum regularizations where the Euclidean space-time symmetries are preserved, the latter is protected against renormalization and its insertion in correlation functions is \(x_0\)-independent. The result in Eq. (3.39) is thus expected to be valid to all orders in the perturbative expansion. On the lattice, where the continuum space-time symmetries are reduced to the symmetries of the hypercube, \({\mathcal {O}}_{{{\textrm{b}}}}\) still has vanishing anomalous dimension, but it requires a scale-independent multiplicative renormalization [59].

Since the matching coefficient \(\omega _{{{\textrm{b}}}}\) is independent of the specific correlator considered, we may impose the validity of Eq. (3.39) for some convenient observable (neglecting O(\(1/m^2\)) terms) in order to determine \(\omega _{{{\textrm{b}}}}\) (up to O(1/m) ambiguities). It can then be used to compute the corresponding O(1/m) corrections to any other quantity of interest. While in principle Eq. (3.39) could be imposed non-perturbatively, in practice this is expected to be very challenging, since the 1/m contributions have to be separated numerically from other powers. For large enough masses \(m_\star \), however, we may rely on a perturbative determination of \(\omega _{{{{\textrm{b}}}}}\), since \(\alpha _{\overline{{\textrm{MS}}}}(m_\star )\) is then small. A convenient observable that can be used to determine \(\omega _{{{\textrm{b}}}}\) is the finite-volume SF coupling, \(\bar{g}^2_{\textrm{SF}}(\mu )\) [12, 38, 60]. Although it is not a gradient flow quantity, it receives 1/m corrections only from terms in the action. This is so because it is defined through the variation of the (logarithm of the) partition function with respect to the boundary conditions for the gauge fields, and not by the correlation function of some field. Thus, Eq. (3.39) still holds in this form. Solving this equation in perturbation theory, where on the l.h.s. the coupling is computed in \({N_{\textrm{f}}}\)-flavour QCD with \({N_{\textrm{f}}}\) heavy quarks, while on the r.h.s. the correlators are computed in the pure gauge theory, we can extract

$$\begin{aligned} \omega _{{{\textrm{b}}}}(\alpha _\star ) = \omega _{{{{\textrm{b}}}}}^{(1)} \alpha _\star +\omega _{{{{\textrm{b}}}}}^{(2)} \alpha _\star ^2+ \cdots , \quad \alpha _\star \equiv \alpha _{\overline{{\textrm{MS}}}}(m_\star ), \end{aligned}$$
(3.41)

by studying the limit \(m_\star \rightarrow \infty \). We refer the interested reader to Appendix A.1 for the details. Here we simply quote the result,

$$\begin{aligned} \omega _{{{{\textrm{b}}}}}^{(1)}=-0.0541(5) N_{\textrm{f}} . \end{aligned}$$
(3.42)
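As an illustration of how Eqs. (3.41) and (3.42) are used, the following sketch evaluates the leading-order matching coefficient for a given coupling at the decoupling scale. The function name is hypothetical and the value of \(\alpha _\star \) in the usage lines is a placeholder; only the one-loop coefficient and its quoted error are taken from Eq. (3.42), and higher orders in \(\alpha _\star \) are ignored.

```python
OMEGA_B1_PER_FLAVOUR = -0.0541      # central value of omega_b^(1) / N_f, Eq. (3.42)
OMEGA_B1_PER_FLAVOUR_ERR = 0.0005   # quoted uncertainty on that coefficient


def omega_b_leading(alpha_star, n_f):
    """Leading-order omega_b = omega_b^(1) * alpha_star, with the propagated coefficient error."""
    central = OMEGA_B1_PER_FLAVOUR * n_f * alpha_star
    error = OMEGA_B1_PER_FLAVOUR_ERR * n_f * alpha_star
    return central, error


if __name__ == "__main__":
    # alpha_star = 0.2 is a placeholder, not a value determined in this work.
    central, error = omega_b_leading(alpha_star=0.2, n_f=3)
    print(f"omega_b (leading order) = {central:.5f} +/- {error:.5f}")
```

The returned error reflects only the numerical uncertainty of \(\omega _{{{{\textrm{b}}}}}^{(1)}\), not the truncation of the perturbative series.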

A couple of remarks are in order at this point. While the strategy based on Eq. (3.39) is a general way to compute (and therefore eliminate) the O(1/m) effects due to the SF boundary conditions, other strategies are in principle possible. For an even number of flavours \(N_{\textrm{f}}\), considering a twisted mass \(\mu _{\textrm{tw}}\) rather than a standard mass for the heavy quarks, would imply having leading O(\(1/\mu _{\textrm{tw}}^2\)) corrections to observables [61]. Entirely equivalent in the continuum is the choice of having a standard mass for the quarks, but with chirally rotated SF boundary conditions [62]. Regular periodic boundary conditions are in principle possible for any value of \(N_{\textrm{f}}\) with leading corrections of O(\(1/m^2\)), however, perturbation theory becomes unduly complicated [63]. In QCD with \({N_{\textrm{f}}}=3\) flavours 1/m effects could also be avoided by choosing twisted-periodic boundary conditions [64].

4 3-flavour QCD: set-up, simulations and results

We simulate three flavors of non-perturbatively \({\textrm{O}}(a)\)-improved Wilson fermions with the tree-level Symanzik \({\textrm{O}}(a^2)\)-improved gauge action [65]. The same discretization was employed in our previous work [35]. The parameter \(\beta =6/g_0^2\) in the gauge action and a parameter \(\kappa =1/(2am_0+8)\) in the mass-degenerate fermion action need to be fixed for each lattice size, \(L/a = 1/(a\mu _{\textrm{dec}})\) and for each physical heavy quark mass M. Our line of constant physics (LCP) is identified in terms of the value of the massless coupling \(\bar{g}_{{\textrm{GF}}}^2(\mu _{\textrm{dec}}) = 3.949\) and the values of \(z=ML\). Our error analysis takes all (auto-)correlations into account using the publicly available implementations of the \(\Gamma \)-method [66, 67] and a second independent analysis. A preliminary analysis of our results was presented in [68].

4.1 Line of constant physics at \(M=0\)

The first four columns of Table 1 show results of simulations tuned such that the PCAC mass vanishes and \(\bar{g}_{{\textrm{GF}}}^2\approx 3.949\). In order to precisely tune to our LCP, we apply a small shift,

$$\begin{aligned} g^2_{0,{\textrm{LCP}}} = g^2_{0,{\textrm{sim}}} +\frac{3.949 - \overline{g}_{{\textrm{GF}}}^2}{S} ,\quad S=\left. \frac{\partial \overline{g}_{{\textrm{GF}}}^2}{\partial {\tilde{g}}_0^2}\right| _{L/a} , \end{aligned}$$
(4.1)

where \(g^2_{0,{\textrm{sim}}}\) are the simulated bare couplings (cf. second column of Table 1) and the slope

$$\begin{aligned} S= & {} \left. \frac{\partial \overline{g}_{{\textrm{GF}}}^2}{\partial \log (a)}\right| _{ L/a} \,\frac{\textrm{d}\log (a)}{\textrm{d}g_0^2} = \left. \frac{\partial \overline{g}_{{\textrm{GF}}}^2}{\partial \log (L)}\right| _{ L/a} \,\nonumber \\{} & {} \times \frac{\textrm{d}\log (a)}{\textrm{d}g_0^2}\, = \frac{\overline{g}_{{\textrm{GF}}}\beta ^{(3)}_{\textrm{GF}}(\overline{g}_{{\textrm{GF}}})}{ g_0 \beta _0^{(3)}(g_0) }. \end{aligned}$$
(4.2)

All quantities here are defined at vanishing quark mass, but we note that the shift in \(\kappa \) to maintain the \(m=0\) condition at \(\beta _{\textrm{LCP}}\) is negligible. We also convert the uncertainty in \(\bar{g}^2_{{\textrm{GF}}}\) into an uncertainty in the LCP \(\beta \)-value using the slope S. The last column of Table 1 lists the resulting values of \(\beta _{\textrm{LCP}}=6/g^2_{0,{\textrm{LCP}}}\). Note that the decoupling scale \(\mu _{{{\textrm{dec}}}}\) is implicitly defined by our LCP, i.e. \(\bar{g}^2_{{\textrm{GF}}}(\mu _{{{\textrm{dec}}}}) = 3.949\). Our estimates used for the three-flavor renormalized (\(\beta ^{(3)}_{\textrm{GF}}\)) and bare (\(\beta ^{(3)}_0\)) beta-functions are described in Appendix D.
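The tuning step of Eq. (4.1), together with the conversion of the coupling uncertainty into an uncertainty on \(\beta _{\textrm{LCP}}\), can be sketched as follows. The function name and all numerical inputs in the usage lines are placeholders; the slope S is treated as an externally supplied number (in the text it is estimated from the beta-functions described in Appendix D), and linear error propagation is assumed.

```python
TARGET_COUPLING = 3.949  # value of the massless GF coupling defining the LCP


def shift_to_lcp(g0sq_sim, gbar_sq, gbar_sq_err, slope, target=TARGET_COUPLING):
    """Apply Eq. (4.1) and propagate the coupling error linearly through the slope S.

    Returns (g0^2 on the LCP, beta_LCP = 6/g0^2, uncertainty of beta_LCP).
    """
    g0sq_lcp = g0sq_sim + (target - gbar_sq) / slope
    g0sq_err = gbar_sq_err / abs(slope)
    beta_lcp = 6.0 / g0sq_lcp
    beta_err = 6.0 * g0sq_err / g0sq_lcp**2
    return g0sq_lcp, beta_lcp, beta_err


if __name__ == "__main__":
    # Placeholder inputs, not taken from Table 1.
    g0sq, beta, dbeta = shift_to_lcp(g0sq_sim=1.5, gbar_sq=3.909,
                                     gbar_sq_err=0.010, slope=2.0)
    print(f"g0^2_LCP = {g0sq:.5f}, beta_LCP = {beta:.5f} +/- {dbeta:.5f}")
```

When the simulated coupling already equals the target value, the shift vanishes and only the error propagation remains.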

Table 1 Massless line of constant physics. The simulations described in the first four columns are taken from [36]. They are used to fix \(\beta _{\textrm{LCP}}\) such that the renormalized massless coupling \(\bar{g}^2_{{\textrm{GF}}}(\mu ) = 3.949\). The last row (\(L/a=48\)) is obtained indirectly from our knowledge of the non-perturbative running of the coupling, while the previous one (\(L/a=40\)) is an interpolation of the other data, see Appendix G for more details

4.2 Massive simulations

With the value of the massless coupling fixed by our LCP, we proceed to simulate massive quarks with (again) homogeneous SF boundary conditions but with \(T=2L\) and various z.

Namely, for a given L/a, the massive simulations have to be performed at fixed lattice spacing (defined in a massless scheme and with O(a) improvement [58]) and for a set of renormalized quark masses, common to all L/a. Therefore, for each L/a we fix the simulation parameters \(\beta , \kappa \) for a prescribed value of \(z = M /\mu _{{{\textrm{dec}}}}=ML\). This last quantity is given by

$$\begin{aligned} z = \frac{L}{a} \frac{M}{\overline{m}(\mu _{\textrm{dec}})}Z_{\textrm{m}}({\tilde{g}}_0^2,a\mu _{\textrm{dec}}) \left[ 1 + b_{\textrm{m}}({\tilde{g}}_0) am_\textrm{q}\right] \,am_\textrm{q}, \nonumber \\ \end{aligned}$$
(4.3)

where the running factor \({M}/{\overline{m}(\mu _{\textrm{dec}})} = 1.474(11)\) (with \(\overline{m}(\mu _{\textrm{dec}})\) defined in the SF scheme) can be determined from results available in the literature [23] (see Appendix E). The renormalization constant \(Z_\textrm{m}(\tilde{g}_0^2,a\mu _\textrm{dec})\) and improvement coefficient \(b_{\textrm{m}}({\tilde{g}}_0)\), instead, are determined in Appendix E. Once z is fixed, Eq. (4.3) is just a quadratic equation in \(am_{\textrm{q}}\). For our O(a)-improved Wilson fermions, a fixed lattice spacing corresponds to a fixed improved bare coupling \({\tilde{g}}_0^2\). The simulation parameter \(\beta \) of the massive simulation is thus obtained from

$$\begin{aligned} \beta = \frac{6(1+b_{\textrm{g}}\, am_\textrm{q})}{{\tilde{g}}_0^2}, \end{aligned}$$
(4.4)

where the values of \({\tilde{g}}_0^2 =6/\beta _{\textrm{LCP}}\) are taken from Table 1, since at zero mass the improved and unimproved couplings coincide. For \(b_{\textrm{g}}\) as well as for all other improvement coefficients that are known only perturbatively, we use 1-loop values and treat the difference between tree-level and 1-loop as uncertainties, see below and Appendix E. The largest effect arises from \(b_{\textrm{g}}\).
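The determination of \(am_\textrm{q}\) from Eq. (4.3) and of \(\beta \) from Eq. (4.4) can be sketched as follows. The function names are hypothetical and the values of \(Z_{\textrm{m}}\), \(b_{\textrm{m}}\), \(b_{\textrm{g}}\) and \({\tilde{g}}_0^2\) in the usage lines are placeholders rather than the numbers of Appendix E; only the central value of the running factor is taken from the text. The root of the quadratic is chosen such that \(am_\textrm{q}\rightarrow 0\) as \(z\rightarrow 0\).

```python
import math

RUNNING_FACTOR = 1.474  # central value of M / m-bar(mu_dec) quoted in the text


def solve_amq(z, L_over_a, Z_m, b_m, running_factor=RUNNING_FACTOR):
    """Solve z = (L/a) * (M/m-bar) * Z_m * (1 + b_m*x) * x for x = a*m_q.

    Uses the form x = 2t / (1 + sqrt(1 + 4*b_m*t)), which is the root
    continuously connected to x = 0 and remains valid for b_m = 0.
    """
    t = z / (L_over_a * running_factor * Z_m)
    disc = 1.0 + 4.0 * b_m * t
    if disc < 0.0:
        raise ValueError("no real solution for a*m_q with these inputs")
    return 2.0 * t / (1.0 + math.sqrt(disc))


def beta_massive(g0sq_tilde, b_g, amq):
    """Eq. (4.4): beta = 6 * (1 + b_g * a*m_q) / g0_tilde^2."""
    return 6.0 * (1.0 + b_g * amq) / g0sq_tilde


if __name__ == "__main__":
    # Placeholder inputs, chosen only to exercise the formulas.
    amq = solve_amq(z=4.0, L_over_a=20, Z_m=1.1, b_m=-0.5)
    print(f"a*m_q = {amq:.6f}, beta = {beta_massive(1.5, 0.036, amq):.6f}")
```

Writing the root in this form avoids the cancellation that affects the textbook expression \((-1+\sqrt{1+4b_{\textrm{m}}t})/(2b_{\textrm{m}})\) when \(b_{\textrm{m}}t\) is small.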

The other simulation parameter, \(\kappa \), is obtained from the critical mass, \(am_{{\textrm{c}}}(g_0^2)\). Since Table 1 provides the values of \(\kappa _{{\textrm{c}}} = 1/(2am_{{\textrm{c}}}({\tilde{g}}_0^2)+8)\) we perform a small shift

$$\begin{aligned} am_{{\textrm{c}}}(g_0^2) = am_{{\textrm{c}}}({\tilde{g}}_0^2) + \left( g_0^2-{\tilde{g}}_0^2\right) \frac{\partial }{\partial {\tilde{g}}_0^2} \,(am_{{\textrm{c}}}), \end{aligned}$$
(4.5)

where the needed derivative can be obtained either from the literature [35], or from the simulations used to extract \(Z_{\textrm{m}}, b_{\textrm{m}}\) (see Appendix E). Both determinations of the derivative agree at the percent level. We thus obtain the values of \(\beta ,\kappa \) needed to simulate at fixed values of z. The uncertainty in z, which arises from our determinations of \(Z_{\textrm{m}}\), \(b_{\textrm{m}}\) and \(\kappa \), is propagated into an error in the coupling according to the discussion in Appendix F. The error in z contributes only a small part to the uncertainty of \(\bar{g}^2_{{\textrm{GFT}}}\).
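A minimal sketch of the determination of \(\kappa \) via Eq. (4.5) is given below. It relies on two assumptions not spelled out in this section: the standard definition of the subtracted bare mass, \(am_0 = am_\textrm{q} + am_{{\textrm{c}}}(g_0^2)\), and the relation \(g_0^2={\tilde{g}}_0^2/(1+b_{\textrm{g}}\,am_\textrm{q})\) implied by Eq. (4.4) with \(\beta =6/g_0^2\). The function name and all numerical inputs are placeholders.

```python
def kappa_massive(amq, g0sq_tilde, b_g, kappa_c, damc_dg0sq):
    """Hopping parameter for a massive simulation at fixed improved coupling.

    amq         : subtracted bare quark mass a*m_q
    g0sq_tilde  : improved bare coupling (equal to g0^2 on the massless LCP)
    b_g         : improvement coefficient of the bare coupling
    kappa_c     : critical hopping parameter at g0sq_tilde
    damc_dg0sq  : derivative of a*m_c with respect to the bare coupling squared
    """
    amc_tilde = (1.0 / kappa_c - 8.0) / 2.0             # invert kappa_c = 1/(2 a m_c + 8)
    g0sq = g0sq_tilde / (1.0 + b_g * amq)               # bare coupling of the massive run
    amc = amc_tilde + (g0sq - g0sq_tilde) * damc_dg0sq  # linear shift, Eq. (4.5)
    am0 = amq + amc                                     # assumed definition of a*m_q
    return 1.0 / (2.0 * am0 + 8.0)


if __name__ == "__main__":
    # Placeholder inputs, not taken from Table 1 or Appendix E.
    print(kappa_massive(amq=0.1, g0sq_tilde=1.5, b_g=0.036,
                        kappa_c=0.135, damc_dg0sq=-0.3))
```

At \(am_\textrm{q}=0\) the function returns \(\kappa _{{\textrm{c}}}\), as it should.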

Our simulations were performed when only an incomplete data-set for the determination of the LCP was available. This can be observed by comparing our simulation parameters at \(z=0\) in Table 12 with the final values of the LCP available in Table 1. We correct for this small mismatch by a linear shift in \({\tilde{g}}_0^2\) using

$$\begin{aligned} \left. \frac{\partial {\overline{g}}_{{\textrm{GFT}}}^2}{\partial {\tilde{g}}_0^2} \right| _{z, L/a}= & {} \left. \frac{\partial {\overline{g}}_{{\textrm{GFT}}}^2}{\partial \log (a)}\right| _{z, L/a} \,\frac{\textrm{d}\log (a)}{\textrm{d}{\tilde{g}}_0^2} = \left. \frac{\partial {\overline{g}}_{{\textrm{GFT}}}^2}{\partial \log (L)}\right| _{z, L/a} \,\frac{\textrm{d}\log (a)}{\textrm{d}{\tilde{g}}_0^2}\, \nonumber \\= & {} \frac{{\overline{g}}_{{\textrm{GFT}}}\beta ^{(0)}_{\textrm{GFT}}({\overline{g}}_{{\textrm{GFT}}})(1-\eta ^{\textrm{M}}(g_\star ))}{{\tilde{g}}_0 \beta _0^{(3)}({\tilde{g}}_0) }[1+R_{\textrm{z}}+R_{\textrm{a}}]. \end{aligned}$$
(4.6)

Here, the numerator uses decoupling and thus the pure gauge theory \(\beta \)-function of the GFT coupling appears together with the factor \((1-\eta ^{\textrm{M}}(g_\star )) \approx b_0^{(3)}/b_0^{(0)}=9/11\) [27]. The denominator is the \(N_{\textrm{f}}=3\) bare \(\beta \)-function at \(z=0\). The terms \(R_{\textrm{z}},\;R_{\textrm{a}}\) are corrections for O(\(1/z^2\)) and O(\(a^2\)) terms, respectively. The derivation of this equation as well as its numerical approximation are explained in Appendix D. In all cases the resulting shifts in \(\bar{g}^2_{{\textrm{GFT}}}\) are small. In general they are well below our statistical uncertainties; only at \(L/a=40\) do the shifts amount to more than one standard deviation.
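The leading-order value 9/11 of the factor \((1-\eta ^{\textrm{M}})\) follows from the universal one-loop coefficient of the SU(3) \(\beta \)-function, \((4\pi )^2 b_0 = 11 - \tfrac{2}{3}{N_{\textrm{f}}}\), as the following short check illustrates. The function name is an illustrative choice.

```python
from fractions import Fraction


def b0_times_16pi2(n_f):
    """One-loop beta-function coefficient for SU(3): (4*pi)^2 * b0 = 11 - 2*n_f/3."""
    return Fraction(11) - Fraction(2, 3) * n_f


if __name__ == "__main__":
    ratio = b0_times_16pi2(3) / b0_times_16pi2(0)
    print(f"b0(N_f=3) / b0(N_f=0) = {ratio}")
```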

Fig. 1 Global continuum fit of our data for \(c = 0.3\) (\(\bar{g}^2_z \equiv \bar{g}^2_{0.3}(z)\) in the plots) and two values of the mass cuts. Left and right plots represent the same data as a function of \((aM)^2\) and \((a/L)^2\) respectively. Note that the assumed 100% uncertainty of \(b_g\) is not included in the error bars of the points. However, it is propagated into the uncertainties of the global fit shown by the shaded areas

4.3 Choice of c

Within the same simulation, the massive coupling \(\bar{g}^2_{{\textrm{GFT}}}(\mu _{{{\textrm{dec}}}},M)\) can be obtained at different values of \(c = \mu _{{{\textrm{dec}}}}\sqrt{8t}\), which defines the given gradient flow scheme (cf. Eqs. (2.11),(2.13)). For better clarity, in the following discussion we shall thus employ the notation \(\bar{g}^2_{{{\textrm{GFT}}},{{\textrm{c}}}}(\mu _{{{\textrm{dec}}}},M)\). Typically, in finite size scaling studies (in massless renormalization schemes), the choice of c represents a compromise between scaling violations (larger at small values of c) and statistical uncertainties (larger at large values of c), with c in the range 0.3–0.5 representing a good choice [40]. Here, however, we do not intend to use the massive coupling to compute a step-scaling function, but rather as an observable to which decoupling can be applied. Its value is the same as in the pure gauge theory up to power corrections in the inverse heavy-quark mass. In particular, the leading corrections are expected to be of \({\textrm{O}}(\mu _{\textrm{max}}^2/M^2)\), where \(\mu _{\textrm{max}}\) is the largest mass scale present. For \(\bar{g}^2_{{{\textrm{GFT}}},{{\textrm{c}}}}\) we have: \(\mu _{\textrm{max}}=1/\sqrt{8t} = \mu _{\textrm{dec}}/c\). This implies that at a fixed scale \(\mu _{{{\textrm{dec}}}}\), a scheme with larger c is expected to have smaller corrections to the infinite mass limit. In addition, contrary to standard finite-size scaling studies, in the present context we do not expect a larger value of c to reduce the discretization errors in our data. Discretization errors are in fact dominated by \({\textrm{O}}((aM)^2)\) terms.

We have performed the analysis for \(c=\sqrt{8t}/L=0.30,0.33,0.36,0.39,0.42\). In general, we will focus the discussion on the cases \(c=0.30\) and \(c=0.36\), although the conclusions are similar for other values (see Table 2). The case \(c=0.3\) represents the most precise dataset. It is therefore an ideal choice to explore different mass cuts and study the systematics involved in the continuum extrapolations. On the other hand, \(c=0.36\) is an intermediate value from which we will extract our central results.

4.4 Continuum extrapolations

We turn to the continuum extrapolations of the massive couplings \(\bar{g}^2_{{{\textrm{GFT}}},{{\textrm{c}}}}(\mu _{{{\textrm{dec}}}},M,a\mu _{{{\textrm{dec}}}})\) for different \(z = M/\mu _{{{\textrm{dec}}}}\) and c. In Sect. 3 we gave a detailed description of the scaling violations in the framework of the Symanzik effective theory. In particular, we showed the absence of corrections \(\sim a^2M\mu _{{{\textrm{dec}}}}\). Still, continuum extrapolations are difficult due to the complicated functional form, Eq. (3.34), of the \({\textrm{O}}(a^2)\) corrections. Even when the leading anomalous dimensions are known (see Sect. 3.2.2) there are higher order corrections in a and in \(\alpha (1/a)\). Hence, in practice, the extrapolations have to be approached from an empirical point of view. The effect of the different logarithmic corrections can be explored by varying the values of their exponents, see below. Our values for \(a\mu _{{{\textrm{dec}}}}\) are very small, but having large masses we expect to have significant cutoff effects of \({\textrm{O}}((aM)^2)\). Different cuts in aM will thus be used to test the different assumptions regarding logarithmic corrections and higher order terms.

Fig. 2 Continuum extrapolations for \(c=0.36\). Details as in Fig. 1

Given these considerations, we opt for two approaches to obtain the continuum coupling \(\bar{g}^2_{{\textrm{c}}}(z_i,0)\! \equiv \!\bar{g}^2_{{{\textrm{GFT}}},{{\textrm{c}}}}(\mu _{{{\textrm{dec}}}},\!M,0)\) from the values \(\bar{g}^2_{{\textrm{c}}}(z_i,a) \equiv \bar{g}^2_{{{\textrm{GFT}}},{{\textrm{c}}}}(\mu _{{{\textrm{dec}}}},M, a\mu _{{{\textrm{dec}}}})\) at non-zero lattice spacing.Footnote 5

Extrapolations at fixed z: The measured values of \(\bar{g}^2_{{\textrm{c}}}(z_i,a)\) for each value of \(z_i=M_i/\mu _{{{\textrm{dec}}}}\) are extrapolated with the ansatz

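$$\begin{aligned} \bar{g}^2_{{\textrm{c}}}(z_i,a) = C_i(c) + p_i(c)\,[\alpha _{\overline{\textrm{MS}}}(a^{-1})]^{\hat{\Gamma }}\,(a\mu _{{{\textrm{dec}}}})^2 , \end{aligned}$$
(4.7)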

where \(C_i(c),\, p_i(c)\) are independent fit parameters for each value \(z_i\) (with the continuum limits being \(\bar{g}^2_{{\textrm{c}}}(z_i,0)=C_i(c)\)), and we use \(\hat{\Gamma }\in [-1,1]\).

Global extrapolations: The measurements of the coupling for all \(z_i,a\mu _{{{\textrm{dec}}}}\) at a fixed c are combined in a single fit using the ansatz

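$$\begin{aligned} \bar{g}^2_{{\textrm{c}}}(z_i,a) = C_i(c) + p_1(c)\,[\alpha _{\overline{\textrm{MS}}}(a^{-1})]^{\hat{\Gamma }}\,(a\mu _{{{\textrm{dec}}}})^2 + p_2(c)\,[\alpha _{\overline{\textrm{MS}}}(a^{-1})]^{\hat{\Gamma }'}\,(aM_i)^2 . \end{aligned}$$
(4.8)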

In this case the fit parameters are the continuum values \(C_i\) and the two parameters \(p_{1,2}\), while we consider \(\hat{\Gamma }\in [-1,1]\), and \(\hat{\Gamma }'\in [-1/9,1]\). This simple form is the result of expanding the \(a^2\)-terms of the Symanzik effective theory in 1/M and dropping \({\textrm{O}}(1/M^2)\) corrections (see Sect. 3.2.2). We therefore need to check which values of z are large enough to be included in the global fit.
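For the simpler of the two approaches, the extrapolation at fixed z, the procedure can be sketched in a few lines. The sketch assumes that for \(\hat{\Gamma }=0\) the ansatz reduces to a straight line in \((a\mu _{{{\textrm{dec}}}})^2\) with intercept \(C_i\) and slope \(p_i\), and that the errors are uncorrelated. The function name `fit_fixed_z` and all input numbers are hypothetical and are not the data of this work.

```python
from typing import Sequence, Tuple


def fit_fixed_z(
    a_mu_sq: Sequence[float],
    g2: Sequence[float],
    g2_err: Sequence[float],
) -> Tuple[float, float, float]:
    """Weighted straight-line fit g2 = C + p * (a*mu_dec)^2.

    Returns (C, p, sigma_C), where C is the continuum value at this z.
    Assumes Gamma_hat = 0 and uncorrelated errors (illustration only).
    """
    w = [1.0 / e**2 for e in g2_err]
    s = sum(w)
    sx = sum(wi * x for wi, x in zip(w, a_mu_sq))
    sy = sum(wi * y for wi, y in zip(w, g2))
    sxx = sum(wi * x * x for wi, x in zip(w, a_mu_sq))
    sxy = sum(wi * x * y for wi, x, y in zip(w, a_mu_sq, g2))
    det = s * sxx - sx * sx
    c = (sxx * sy - sx * sxy) / det      # intercept = continuum limit
    p = (s * sxy - sx * sy) / det        # slope of the a^2 term
    sigma_c = (sxx / det) ** 0.5         # standard error of the intercept
    return c, p, sigma_c


# Hypothetical inputs: lattice sizes and couplings are NOT the paper's data.
l_over_a = [24, 32, 40, 48]
x = [(1.0 / n) ** 2 for n in l_over_a]   # (a*mu_dec)^2 with mu_dec = 1/L
y = [5.0 + 30.0 * xi for xi in x]        # exact line with C = 5.0, p = 30.0
err = [0.02] * len(x)

c, p, sigma_c = fit_fixed_z(x, y, err)
print(f"C = {c:.4f} +/- {sigma_c:.4f}, p = {p:.2f}")
```

A global fit follows the same logic with one intercept per value of z and shared slope parameters, at the price of solving a larger linear system.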

Due to the shifts to the proper LCP, the data are slightly correlated across different values of z. We performed uncorrelated fits, but judged the quality of the fits from the value of \(\chi ^2\) computed from the known covariance matrix [69], which however was in all cases very close to the naive number of d.o.f. Using data with \((aM)^2 > 0.25\) leads to biased results and fits with bad \(\chi ^2\). We therefore use only the two mass cuts \((aM)^2 \le 0.25\) and \((aM)^2 \le 0.16\) in the following analysis.
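The effect of such mass cuts on a set of ensembles can be illustrated with a short sketch. It uses \(aM = z\,(a/L)\), which follows from \(M=z\mu _{{{\textrm{dec}}}}\) together with \(\mu _{{{\textrm{dec}}}}=1/L\) (implied by the two expressions for c given in Sect. 4.3). The grid of \((z, L/a)\) values and the helper names are hypothetical; the actual ensembles are those listed in Table 2.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, int]  # (z, L/a)


def a_m_squared(z: float, l_over_a: int) -> float:
    """Return (aM)^2, using aM = z * (a/L) since M = z * mu_dec and mu_dec = 1/L."""
    return (z / l_over_a) ** 2


def apply_mass_cut(points: Iterable[Point], cut: float, eps: float = 1e-12) -> List[Point]:
    """Keep points with (aM)^2 <= cut; eps guards against rounding at the boundary."""
    return [(z, n) for z, n in points if a_m_squared(z, n) <= cut + eps]


# Hypothetical simulation grid, for illustration only (not the paper's ensembles).
grid: List[Point] = [(z, n) for z in (4.0, 6.0, 8.0, 12.0) for n in (16, 24, 32, 48)]

for cut in (0.25, 0.16):
    kept = apply_mass_cut(grid, cut)
    print(f"(aM)^2 <= {cut}: {len(kept)} of {len(grid)} points kept")
```

The tighter cut removes the coarsest lattices at the largest masses first, which is why the uncertainties of the heaviest points are the most sensitive to it.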

Figures 1 and 2 show the different extrapolations for \(c=0.30\) and \(c=0.36\), respectively. We make the following observations concerning the fits.

  • Discretization effects proportional to \((a\mu _{{{\textrm{dec}}}})^2\) are very small. For the case \(c=0.30\) the fit coefficient \(p_1\) in the global analysis is very small, well compatible with zero. For \(c\ne 0.3\), these scaling violations are slightly larger, but still all our lattices are large enough to be included in the fit. This justifies that we will only discuss cuts in aM.

  • The data at \(c=0.3\) shows a very different behavior in \((aM)^2\) for our smallest value of the mass \(z=1.972\) (see Fig. 1). This suggests that \(z>2\) is needed for the large mass expansion to be reliable. For \(c=0.36\) (see Fig. 2) the behavior in \((aM)^2\) even for \(z=1.972\) looks well compatible with the behavior at \(z\ge 4\). Since the effective decoupling scale is smaller in this case, the data at \(c=0.36\) is closer to the large mass limit. In any case, to be on the safe side, we only include in the global analysis \(z\ge 4\) for all values of c, while the data with \(z=1.972\) is always fitted with an independent slope.

The extrapolations at fixed values of z and the global analysis always lead to compatible results. The uncertainties of the continuum limits are also very similar, except for the case \(z=12\), where the error of the extrapolation at fixed z (which uses only two points) is twice as large as that of the global analysis. Given that our global formula is theoretically sound, particularly so at large values of z, we have no reason to suspect that the results of the global analysis are not accurate for \(z=12\). Figures 1 and 2 show the results of the global analysis with two cuts \((aM)^2 \le 0.16,0.25\). Results are compatible, with the extrapolations using only data with \((aM)^2 \le 0.16\) resulting in slightly larger uncertainties.

We shall now discuss the logarithmic corrections to scaling. We have tried several values of \(\hat{\Gamma }\in [-1,1], \hat{\Gamma }'\in [-1/9,1]\) in the continuum extrapolations. We see deviations much smaller than our uncertainties. Our analysis shows that the logarithmic corrections have little influence in our case.Footnote 6 This can be understood from the fact that in our finite volume setup we reach very small lattice spacings, i.e. in the range \(a\in [0.006,0.015]\, {\textrm{fm}}\). At the high scales 1/a defined by these lattice spacings, the coupling runs very slowly, rendering the effect of the logarithmic corrections very small. This feature should be considered a virtue of our strategy.

These considerations lead us to quote as final values for the continuum extrapolation the results of the global fit with \(aM\le 0.4\), i.e. \((aM)^2\le 0.16\), and \(\hat{\Gamma }= \hat{\Gamma }' = 0\). This particular analysis has uncertainties larger than or similar to those of other choices, and provides an excellent description of our data for \(z\ge 4\). Table 2 shows the data entering the analysis together with the final results of the extrapolations. Thanks to the use of large lattices, the continuum extrapolations are under reasonable control. The deviation of our finest lattice spacing data from the continuum values is at most two standard deviations. Obviously this “gap” grows with increasing z. Given the importance of large z for the extrapolation of \(\Lambda _{\overline{\textrm{MS}}}\) to \(z\rightarrow \infty \), it would be worthwhile to close the gap further by simulating even larger values of L/a when an improved overall precision is desired.

Table 2 Values of the massive coupling \(\bar{g}^2_{{\textrm{c}}}(z)\) and its continuum extrapolated values. The results quoted for the continuum extrapolations correspond to a global fit of the data with \(z\ge 4\) and \((aM)^2\le 0.16\) and fixing \(\hat{\Gamma }= \hat{\Gamma }' = 0\). At finite lattice spacings, the uncertainty in \(b_{\textrm{g}}\) is omitted, but the continuum values include it. We also give just the \(b_{\textrm{g}}\)-uncertainty with 100% correlation across all data

4.5 Large mass extrapolations and the determination of \(\Lambda ^{(3)}_{\overline{\textrm{MS}}}\)

4.5.1 Estimates of the three flavor \(\Lambda \)-parameter

Table 3 The massive couplings, \(\bar{g}^2_{{\textrm{c}}}(z)\), together with the associated pure gauge coupling, \([\bar{g}^{(0)}_{\text {GF}}(\mu _{\textrm{dec}})]^2\), after a non-perturbative matching to the scheme with \(T=L, c=0.3\). The coupling \([\bar{g}^{(0)}_{\text {GF}}(\mu _{\textrm{dec}})]^2\) is used to obtain \(\rho = \Lambda ^{(3)}_{\overline{\textrm{MS}},{\textrm{eff}} }/\mu _{{{\textrm{dec}}}}\) and \(\Lambda ^{(3)}_{\overline{\textrm{MS}},{\textrm{eff}} }\), which is \(\Lambda ^{(3)}_{\overline{\textrm{MS}}}\) up to power corrections in 1/M. We show results for two representative values of \(c=0.3, 0.36\)

Once the values of the massive coupling \(\bar{g}^2_{{{\textrm{GFT}}},{{\textrm{c}}}}(\mu _{{{\textrm{dec}}}}, M)\) are known in the continuum, decoupling tells us that the values of these couplings are the same as in the pure gauge theory, up to heavy mass corrections. In order to make use of this together with the results of Ref. [10], we first need to match our coupling to the GF coupling definition of Ref. [10]. The difference is our choice of \(T=2L\) as well as of c-values in the massive coupling \(\bar{g}^2_{{{\textrm{GFT}}},{{\textrm{c}}}}(\mu _{{{\textrm{dec}}}}, M)\) (cf. Sect. 4.3), compared to the choice \(T=L\) and \(c=0.3\) made in the pure gauge theory [10]. The two different schemes can be matched non-perturbatively in the pure gauge theory. There, the couplings \(\bar{g}^{(0)}_{{\textrm{GF}}}(\mu )\) (with \(T=L\) and \(c=0.3\)) and \(\bar{g}^{(0)}_{{{\textrm{GFT}}}, {{\textrm{c}}}}(\mu )\) (with \(T=2L\) and arbitrary c) are related by

$$\begin{aligned} \bar{g}^{(0)}_{{\textrm{GF}}}(\mu ) = \chi _{{\textrm{c}}}\left( \bar{g}^{(0)}_{{{\textrm{GFT}}},{{\textrm{c}}} }(\mu ) \right) . \end{aligned}$$
(4.9)

The functions \(\chi _{{\textrm{c}}}\) for the relevant values of \(c=0.30,0.33,0.36,0.39,0.42\), are precisely determined as described in Appendix B.1.

We define \(\bar{g}^{(0)}_{\text {GF}}(\mu _{\textrm{dec}})\) as the values of the pure gauge coupling (\(T=L\), \(c=0.3\)) that correspond to the values of the massive coupling extrapolated to the continuum, i.e. (cf. Table 2)

$$\begin{aligned} \bar{g}^{(0)}_{\text {GF}}(\mu _{\textrm{dec}}) = \chi _{{\textrm{c}}}\left( \bar{g}^{(3)}_{{{\textrm{GFT}}},{{\textrm{c}}}}(\mu _{{{\textrm{dec}}}}, M)\right) . \end{aligned}$$
(4.10)

Pure gauge theory results for the function \(\varphi _{{\textrm{GF}}}^{(0)}\) (see Appendix B.2) then yield values for

$$\begin{aligned} \frac{\Lambda ^{(0)}_{\overline{\textrm{MS}}}}{\mu _{{{\textrm{dec}}}}} = \frac{\Lambda ^{(0)}_{\overline{\textrm{MS}}}}{\Lambda ^{(0)}_{\text {GF}}} \varphi _{{\textrm{GF}}}^{(0)}(\bar{g}^{(0)}_{\text {GF}}(\mu _{\textrm{dec}})). \end{aligned}$$
(4.11)

Since \(z=M/\mu _{{{\textrm{dec}}}}\) is a known input, the non-linear equation (cf. Eq. (2.10))

$$\begin{aligned} \rho P \left( z/\rho \right) = \frac{\Lambda ^{(0)}_{\overline{\textrm{MS}} }}{\mu _{{{\textrm{dec}}}}} \end{aligned}$$
(4.12)

allows us to determine \(\rho = \Lambda ^{(3)}_{\overline{\textrm{MS}},{\textrm{eff}} }/\mu _{{{\textrm{dec}}}} = \Lambda ^{(3)}_{\overline{\textrm{MS}} }/\mu _{{{\textrm{dec}}}}+{\textrm{O}}(1/z)\), see Table 3. With \(\mu _{{{\textrm{dec}}}} = 789(15)\) MeV obtained in \(N_{\textrm{f}} = 3\) QCD [35], we convert these ratios to the effective three flavor \(\Lambda \)-parameter, again equal to \(\Lambda ^{(3)}_{\overline{\textrm{MS}}}\) up to \({\textrm{O}}(1/M)\) corrections. Results are also listed in Table 3.
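The numerical solution of Eq. (4.12) can be sketched as follows. Since the high-order perturbative expression for P is not reproduced here, the sketch replaces it by a toy stand-in that keeps only the leading power, \(P(x)\approx x^{\eta _0}\) with the one-loop exponent \(\eta _0 = 1-b_0(N_{\textrm{f}}=3)/b_0(N_{\textrm{f}}=0)=2/11\), and drops all logarithmic corrections. The input ratio \(\Lambda ^{(0)}_{\overline{\textrm{MS}}}/\mu _{{{\textrm{dec}}}}\) is a hypothetical number, and the function names are illustrative.

```python
from typing import Callable

ETA0 = 2.0 / 11.0  # one-loop exponent 1 - b0(Nf=3)/b0(Nf=0) = 1 - 9/11


def p_leading_order(x: float) -> float:
    """Toy stand-in for P(M/Lambda): leading power only, no logarithms."""
    return x ** ETA0


def solve_rho(
    z: float,
    ratio: float,
    p: Callable[[float], float] = p_leading_order,
    lo: float = 1e-3,
    hi: float = 10.0,
    tol: float = 1e-12,
) -> float:
    """Solve rho * P(z / rho) = ratio for rho by bisection on [lo, hi].

    Assumes rho * P(z / rho) increases with rho, which holds for the toy P.
    """

    def f(rho: float) -> float:
        return rho * p(z / rho) - ratio

    if f(lo) > 0.0 or f(hi) < 0.0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


z = 8.0              # z = M / mu_dec
ratio = 0.73         # hypothetical value of Lambda^(0)/mu_dec, not from Table 3
mu_dec_mev = 789.0   # central value of mu_dec quoted in the text

rho = solve_rho(z, ratio)
print(f"rho = {rho:.4f}, Lambda_eff = {rho * mu_dec_mev:.1f} MeV")
```

For the toy P the equation can also be solved in closed form, \(\rho = (r\,z^{-\eta _0})^{1/(1-\eta _0)}\) with r the right-hand side of Eq. (4.12), which provides a check of the root finder; with the full perturbative P only the numerical route is available.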

4.5.2 \(M\rightarrow \infty \) extrapolation

According to the discussion in Sect. 3 we expect the estimates of \(\Lambda ^{(3)}_{\overline{\textrm{MS}} ,{\textrm{eff}}}\) of Table 3 to approach \(\Lambda ^{(3)}_{\overline{\textrm{MS}}}\) with power corrections of the form \(z^{-k}\), accompanied by logarithmic corrections. The function P is approximated by high order perturbation theory. Since the used masses \(m_\star \) are large, the associated \({\textrm{O}}(\alpha ^4_{\overline{\textrm{MS}} }(m_\star ))\) uncertainties can be neglected. Linear terms of \({\textrm{O}}(z^{-1})\) are a consequence of our boundary conditions. The choice \(T=2L\) suppresses their effects to a level below our statistical precision, as we were able to show by an explicit computation (cf. Appendix A.2). We therefore assume leading \(1/z^2\) corrections, with logarithmic corrections as discussed in Sect. 3.2.1. In practice we fit the parameters \(A,\,B\) in

$$\begin{aligned} \Lambda ^{(3)}_{\overline{\textrm{MS}} ,{\textrm{eff}}} = A + \frac{B}{z^2}[\alpha (m_\star )]^{\hat{\Gamma }_{m}}, \end{aligned}$$
(4.13)

to the data, in order to obtain \(\Lambda ^{(3)}_{\overline{\textrm{MS}}}=A\). Since the leading exponent \(\hat{\Gamma }_{m}\) is presently not known, we vary it in a reasonable range \(\hat{\Gamma }_{m} \in [0,1]\) (cf. Sect. 3.2.1).

The first issue that we have to deal with is what values of z are included in this extrapolation. Part of the difficulty here is that the estimates of \(\Lambda ^{(3)}_{\overline{\textrm{MS}} }\) coming from different values of c and z are strongly correlated. Correlations are due to many sources: \(b_{\textrm{g}}\), the running in the pure gauge theory, and the scale \(\mu _{\textrm{dec}}\) all enter in the same way for all c and z. There are also less obvious correlations, e.g. the global fit performed to obtain the continuum limit has common parameters \(p_1,p_2\) describing the cutoff effects. All of these correlations are precisely known; they do not involve difficult-to-estimate correlation matrices from Monte Carlo chains.

We therefore performed correlated fits to Eq. (4.13). Visually they all look very good; an example is displayed in Fig. 3. The \(\chi ^2\)-values are found above the numbers of d.o.f., but the quality of fit, Q, reported in Table 4, is generally good enough. Only fits including \(z=4\) and the smallest values of c are statistically discouraged. As a precaution against higher order corrections (i.e. \({\textrm{O}}(z^{-3})\), etc.) we exclude the \(z=4\) data also for the larger values of c and use \(c=0.36,\,z\ge 6\) as our central result. Note that the Q-value is relatively small for the \(z\ge 8\) fits since they only contain one degree of freedom. The fact that Q improves when more data are included supports our choice of the \(z\ge 6\) fits.

As a check of this analysis we also performed uncorrelated fits, computed their Q-value from the known covariance matrix [69] and found entirely consistent results.
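A correlated fit of the form of Eq. (4.13) with \(\hat{\Gamma }_{m}=0\), i.e. \(A+B/z^2\), can be sketched as a generalized least-squares problem. The sketch below is not the analysis code of this work: the data are synthetic, the covariance matrix (an uncorrelated part plus a fully correlated common part) is a hypothetical stand-in for the known covariance, and the helper names are illustrative.

```python
from typing import List, Sequence, Tuple


def solve(mat: Sequence[Sequence[float]], rhs: Sequence[float]) -> List[float]:
    """Solve mat @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(mat, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def correlated_fit(
    z: Sequence[float], lam: Sequence[float], cov: Sequence[Sequence[float]]
) -> Tuple[float, float, float]:
    """Minimise chi^2 = r^T cov^{-1} r with r = lam - (A + B / z^2).

    Returns (A, B, chi2). Assumes a symmetric, positive-definite covariance.
    """
    ones = [1.0] * len(z)
    inv_z2 = [1.0 / zi**2 for zi in z]
    w_one = solve(cov, ones)      # cov^{-1} applied to the constant column
    w_z = solve(cov, inv_z2)      # cov^{-1} applied to the 1/z^2 column
    m11, m12, m22 = dot(ones, w_one), dot(ones, w_z), dot(inv_z2, w_z)
    v1, v2 = dot(w_one, lam), dot(w_z, lam)
    det = m11 * m22 - m12 * m12
    a = (m22 * v1 - m12 * v2) / det
    b = (m11 * v2 - m12 * v1) / det
    resid = [y - (a + b * x) for y, x in zip(lam, inv_z2)]
    chi2 = dot(resid, solve(cov, resid))
    return a, b, chi2


# Synthetic example (all numbers hypothetical, in MeV and MeV^2).
z_vals = [6.0, 8.0, 10.0, 12.0]
a_true, b_true = 340.0, -600.0
lam_eff = [a_true + b_true / zi**2 for zi in z_vals]
cov = [[36.0 + (25.0 if i == j else 0.0) for j in range(4)] for i in range(4)]

a_fit, b_fit, chi2 = correlated_fit(z_vals, lam_eff, cov)
print(f"A = {a_fit:.3f} MeV, B = {b_fit:.2f} MeV, chi2 = {chi2:.2e}")
```

Because the synthetic data lie exactly on the model curve, the fit returns the input parameters and a vanishing \(\chi ^2\); with real data the common component of the covariance shifts all points together and therefore contributes mainly to the error of A.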

Fig. 3 Values for \(\Lambda ^{(3)}_{\overline{\textrm{MS}},{\textrm{eff}} }\) from Table 3 (\(c=0.36\)) and their extrapolation \(M\rightarrow \infty \) using Eq. (4.13) with \(\hat{\Gamma }_{ m}=0\)

Table 4 Estimates of \(\Lambda ^{(3)}_{\overline{\textrm{MS}},{\textrm{eff}}}\) (see Table 3) are extrapolated to \(M\rightarrow \infty \) according to Eq. (4.13) with \(\hat{\Gamma }_{m} = 0\). We also quote the \(\textrm{Q}-\)value of the fit

We now proceed to investigate the effect of the logarithmic corrections. Fits with \(\hat{\Gamma }_{m} =1\) yield only about 3 MeV higher values for \(\Lambda \) when the \(z=4\) data is excluded. Further excluding also \(z=6\) reduces these shifts to only 1–2 MeV. We take the result with \(z\ge 6\) and \(c=0.36\) as our final result, and add 3 MeV as our estimate of the systematic effect due to the logarithmic corrections or higher orders in 1/M in the \(M\rightarrow \infty \) extrapolation, see Fig. 3.

Taking all these points into account, we quote as our final result

$$\begin{aligned} \Lambda ^{(3)}_{\overline{\textrm{MS}} } = 336(10)(6)_{b_{\textrm{g}}}(3)_{\Gamma _{m}}\, {\textrm{MeV}} = 336(12)\, {{\textrm{MeV}}}. \end{aligned}$$
(4.14)

Here the first error is statistical, the second is due to \(b_{\textrm{g}}\) and the third results from the estimated uncertainty in the z-extrapolation. The combined error covers all central results that we obtained by varying the cuts in z, \((aM)^2\), and the different \(\hat{\Gamma }_{m}\) except for two cases. These extreme cases have small \(c\le 0.33\) and include \(z=4\) data, where corrections to decoupling are expected to be the largest. They yield Q-values below 2%.
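The combined error in Eq. (4.14) is consistent with adding the three components in quadrature, which the following short check makes explicit. That the components are treated as independent is an assumption of this sketch; the function name is illustrative.

```python
import math


def combine_in_quadrature(*errors: float) -> float:
    """Total uncertainty of independent error components."""
    return math.sqrt(sum(e * e for e in errors))


# Components of Eq. (4.14) in MeV: statistical, b_g, z-extrapolation.
total = combine_in_quadrature(10.0, 6.0, 3.0)
print(f"combined error = {total:.1f} MeV")
```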

We further note that there is a significant correlation of the above statistical error with the one of the previous work [16],

$$\begin{aligned} \Lambda ^{(3)}_{\overline{\textrm{MS}} } = 341(12)\,{{\textrm{MeV}}}, \end{aligned}$$
(4.15)

using step scaling in the three-flavor theory up to high energy. The common piece is exactly the scale \(\mu _{\textrm{dec}}=789(15)\,{\textrm{MeV}}\). The off-diagonal element of the covariance matrix of the two determinations is

$$\begin{aligned} {\textrm{Cov}}((4.14),(4.15)) = 41\,{\textrm{MeV}}^2 , \end{aligned}$$
(4.16)

compared to the diagonal ones of \(144\,{\textrm{MeV}}^2\), which at present happen to be about the same for each of the individual determinations. As a quantitative measure of the compatibility of the two determinations we note that their difference is not significant at all: \(\Lambda _{(4.15)}-\Lambda _{(4.14)} = 5(14)\,{\textrm{MeV}}\).
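The quoted error of the difference follows from standard error propagation for two correlated quantities, \(\sigma ^2_{\textrm{diff}} = \sigma _1^2+\sigma _2^2-2\,{\textrm{Cov}}\). A minimal numerical check, using only the numbers given in Eqs. (4.14)–(4.16), reads:

```python
import math

# Inputs from Eqs. (4.14)-(4.16), in MeV and MeV^2.
lam_decoupling, lam_step_scaling = 336.0, 341.0
var_decoupling, var_step_scaling = 144.0, 144.0
cov = 41.0

diff = lam_step_scaling - lam_decoupling
sigma_diff = math.sqrt(var_decoupling + var_step_scaling - 2.0 * cov)
correlation = cov / math.sqrt(var_decoupling * var_step_scaling)

print(f"difference = {diff:.0f} +/- {sigma_diff:.1f} MeV")
print(f"correlation coefficient = {correlation:.2f}")
```

The correlation coefficient of roughly 0.3 confirms that the two determinations are largely, but not fully, independent.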

5 Result for \(\alpha _s(m_Z)\)

Our result for \(\Lambda ^{(3)}_{\overline{\textrm{MS}} }\) (Eq. (4.14)) can be translated, after running across the charm and bottom quark thresholds, into a value of the four and five flavor \(\Lambda \)-parameter. Using the FLAG values [4] (based on [70,71,72,73]) \(m_{{{\textrm{c}}}, \star } = 1275(5)\) MeV, \(m_{{\textrm{b}}, \star } = 4171(20)\) MeV for the charm and bottom quark mass thresholds,Footnote 7 we obtain the following values for the four and five flavor \(\Lambda \)-parameters

$$\begin{aligned} \Lambda ^{(4)}_{\overline{\textrm{MS}} }= & {} 294(10)(6)_{b_{\textrm{g}}}(3)_{\Gamma _{ m}}(0.7)_{3\rightarrow 4,{\textrm{PT}}}(1)_ {3\rightarrow 4,{{\textrm{NP}}}}\,{\textrm{MeV}}\nonumber \\= & {} 294(12)\, {\textrm{MeV}}, \end{aligned}$$
(5.1)
$$\begin{aligned} \Lambda ^{(5)}_{\overline{\textrm{MS}} }= & {} 211.3(8.1)(5.0)_{b_{\textrm{g}}}(2.4)_{\Gamma _{ m}}(0.7)_{3\rightarrow 5, {\textrm{PT}}}(0.8)_{3\rightarrow 5,{\textrm{NP}}}\,{\textrm{MeV}}\nonumber \\= & {} 211.3(9.8)\, {\textrm{MeV}}. \end{aligned}$$
(5.2)

where the first error is statistical, the second is due to \(b_\textrm{g}\), and the third represents the uncertainty associated with the logarithmic corrections in the limit \(M\rightarrow \infty \) (see Sect. 4.5). The last two errors come instead from crossing the charm and bottom thresholds: first a perturbative error (determined by taking the difference in the decoupling relations and RG functions between the last two known orders [28,29,30,31,32,33, 74,75,76,77,78,79]), and second an estimate of 0.3% in \(\Lambda ^{(3)}_{\overline{\textrm{MS}}}\) for the non-perturbative corrections in the decoupling of the charm [27].

Using the experimental value \(m_Z = 91187.6(2.1)\) MeV for the Z boson pole mass [2] we get

$$\begin{aligned} \alpha _s(m_Z)= & {} 0.11823(69)(42)_{b_{\textrm{g}}}(20)_{\Gamma _{\textrm{m}}}(6)_{3\rightarrow 5, {\textrm{PT}}}(7)_{3\rightarrow 5,{\textrm{NP}}}\nonumber \\= & {} 0.11823(84). \end{aligned}$$
(5.3)
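The size of the numbers in Eqs. (5.2) and (5.3) can be cross-checked with a simplified running. The sketch below uses the standard truncated asymptotic expansion of \(\alpha _s(\mu )\) in terms of \(\Lambda _{\overline{\textrm{MS}}}\) up to the three-loop coefficient, which is an assumption of this illustration and not the higher-order running used for the result above. The function name is illustrative.

```python
import math


def alpha_s_three_loop(mu: float, lam: float, nf: int) -> float:
    """Truncated asymptotic expansion of alpha_s(mu) in the MS-bar scheme.

    mu and lam must be given in the same units; lam is Lambda for nf flavors.
    """
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    b1 = (153.0 - 19.0 * nf) / (24.0 * math.pi**2)
    b2 = (2857.0 - 5033.0 / 9.0 * nf + 325.0 / 27.0 * nf**2) / (128.0 * math.pi**3)
    t = math.log(mu**2 / lam**2)
    lt = math.log(t)
    bracket = (
        1.0
        - b1 * lt / (b0**2 * t)
        + (b1**2 * (lt**2 - lt - 1.0) + b0 * b2) / (b0**4 * t**2)
    )
    return bracket / (b0 * t)


m_z = 91187.6              # MeV, Z boson mass used in the text
lam5, lam5_err = 211.3, 9.8  # MeV, Eq. (5.2)

alpha = alpha_s_three_loop(m_z, lam5, nf=5)
# Symmetric finite difference to propagate the error of Lambda^(5).
alpha_err = 0.5 * (
    alpha_s_three_loop(m_z, lam5 + lam5_err, nf=5)
    - alpha_s_three_loop(m_z, lam5 - lam5_err, nf=5)
)
print(f"alpha_s(m_Z) ~ {alpha:.5f} +/- {alpha_err:.5f}")
```

Even at this truncated order the output should land close to the value in Eq. (5.3), which illustrates that at the scale \(m_Z\) the conversion from \(\Lambda ^{(5)}_{\overline{\textrm{MS}}}\) to \(\alpha _s\) is only weakly sensitive to higher-order terms.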

Figure 4 shows a comparison of our results with other lattice computations [16, 17, 70, 80,81,82,83,84] that enter the FLAG average [4]. Our result shows a good agreement with the FLAG average, our previous determination of the strong coupling [16], and the other lattice works that enter in the FLAG average. It is important to point out that the result of this work is largely independent from our previous determination [16]. Only the value of \(\mu _{{{\textrm{dec}}}} = 789(15)\) MeV is shared between both determinations of the strong coupling (see Sect. 4.5.2). This amounts to 28% of the squared error.

Fig. 4 Our result compared with other lattice computations [16, 17, 70, 81,82,83,84] that enter in the FLAG average [4] (acronyms taken from the FLAG report [4])

6 Conclusions and outlook

The determination of the strong coupling on the lattice faces particular challenges compared with low energy hadronic quantities. One has to connect a low energy scale with the perturbative high energy regime of QCD. Due to the slow running of the coupling, perturbative scales are very large and these two regimes cannot be comfortably simulated on a single lattice. This “window problem”, which arises because only a limited range of scales can be simulated on a single lattice, is the reason why most lattice determinations of the strong coupling have uncertainties dominated by the truncation errors of the perturbative series: they apply perturbation theory at intermediate energy scales (see [5] for a review). One exception is the step scaling method [9], which was designed to cover a large scale difference non-perturbatively. In practice, however, the method is quite demanding, and a reduction of the current uncertainty in the strong coupling using this technique is possible but requires large computational resources.

An alternative strategy based on the decoupling of heavy quarks, building on [27, 85], was formulated in Ref. [34]. In short, one connects the theory with physical quark masses to the one where the up, down and strange quarks have masses far above the low energy QCD scales. Decoupling of the heavy quarks relates the theory to the pure gauge theory and we can use the knowledge of a pure gauge intermediate energy scale, \(\mu _{\textrm{dec}}\approx 800~{\textrm{MeV}}\), in units of the pure gauge \(\Lambda \)-parameter. Thus the non-perturbative running between \(\mu _{\textrm{dec}}\) and perturbative scales is taken from the pure gauge theory where it is much more tractable from the numerical point of view. The connection of \(\mu _{\textrm{dec}}\approx 800~{\textrm{MeV}}\) to the physical scales \(f_{\pi }\), \(f_{\textrm{K}}\) requires only one or two step-scaling steps with light quarks; we could here take it from previous work [16].

In this paper we have worked out practical and theoretical aspects in detail and in particular demonstrated how systematic effects of various kinds can be controlled by numerical extrapolations and/or explicit computations. This is far from trivial, since a very good precision is required in all steps to reach the desired accuracy of the strong coupling. For practical reasons intermediate scales of the theory are always defined by values of associated renormalized couplings and those are defined in the Schrödinger functional. We then need to control corrections to the continuum limit and the decoupling limit of order a and 1/M besides the ones of order \(a^2\) and \(1/M^2\) present also when space-time has no boundaries. We showed how the decoupling effective theory can be used to remove the 1/M corrections, again by non-perturbative information in the pure gauge theory, and how Symanzik and decoupling effective theories, applied in that order, restrict the form of a combined continuum and \(1/M^2\) extrapolation of the couplings in the massive theory. Together with the high accuracy [27] of perturbation theory [33, 74,75,76,77,78,79] in the relation between the \(\Lambda \)-parameters of the \(N_{\textrm{f}}\)- and \({N_{\textrm{l}}}\)-flavor theories, which we use for \(N_{\textrm{f}}=3,{N_{\textrm{l}}}=0\), this is a key to the precision reached in the result.

Building on these important theoretical steps, we have shown that precise results can be obtained using the decoupling strategy: our result, \(\alpha _s(m_Z) = 0.11823(84)\), is among the most precise determinations existing so far. The error is still statistically dominated, with negligible perturbative uncertainties. This also opens the way to further reduce the current uncertainty in the strong coupling with moderate additional effort. The main sources of uncertainty are first the single step-scaling step in QCD, second the missing knowledge of the improvement parameter \(b_{\textrm{g}}\), a parameter that affects the continuum extrapolations of our massive couplings, and third the pure gauge theory non-perturbative running at high energies. The first and third sources of uncertainty are statistical in nature and can be substantially improved at a modest cost with existing techniques. Lastly, a non-perturbative determination of \(b_{\textrm{g}}\) would completely eliminate the second largest source of uncertainty on our result.

Finally, it is worth mentioning that the good agreement between the result of this work and the previous determination by the ALPHA collaboration [16] (using the step-scaling method in three flavor QCD) represents a highly non-trivial cross-check of the methods.

We expect that the use of heavy quarks as a tool for non-perturbative renormalization will have more applications in the future. For example, the determination of the strong coupling directly in large volume is possible, in principle [34]. The idea can straightforwardly be applied to the determination of quark masses. Other renormalization problems, such as the determination of RGI 4-fermion operators may be tractable, but there remains work to be done in continuum perturbation theory: the high accuracy available for the perturbative decoupling of the QCD parameters [33, 74,75,76,77,78,79] needs to be extended also to such operators.