1 Introduction

Lattice regularizations with Wilson type fermions [3] are widely used in current lattice QCD simulations [4,5,6,7,8,9,10]. The ultra-locality of the action enables numerical efficiency and thus access to a wide range of lattice spacings and spatial volumes. Furthermore, Wilson fermions maintain the full flavour symmetry of the continuum action, as well as the discrete symmetries such as parity, charge conjugation and time reversal. Unitarity is either realized exactly, or, in the case of Symanzik-improved actions, approximately up to cutoff effects which vanish in the continuum limit.

The price to pay for these advantages consists in the explicit breaking of all chiral symmetries by the Wilson term in the action. Well-known consequences include the additive renormalization of quark masses, the mixing under renormalization of composite operators in different chiral multiplets and discretization effects linear in a, the lattice spacing. Furthermore, the Noether currents of chiral symmetry are no longer protected against renormalization.

The matrix elements of the axial Noether currents between pion or kaon states and the vacuum, parametrized by the decay constants \(f_{\pi ,K}\), e.g.

$$\begin{aligned} \langle 0 \vert A_\mu ^{ud}(0) \vert \pi ^-, \mathbf{p}\rangle = i p_\mu f_\pi ,\quad A_\mu ^{ud}(x) = \overline{\psi }_{u}(x) \gamma _\mu \gamma _5 \psi _{d}(x), \end{aligned}$$
(1.1)

can be related to the measured lifetimes of pions and kaons. The decay constants are finite in the chiral limit, can be precisely measured in numerical simulations and are ideally suited to set the scale in physical units. In order to achieve this with Wilson quarks one needs to determine the correctly renormalized axial currents,

$$\begin{aligned} \left( A_{\mathrm{R}}\right) _\mu ^{f_1f_2}(x) = Z_{\mathrm{A}} A_\mu ^{f_1f_2}(x), \end{aligned}$$
(1.2)

(with flavour indices \(f_{1,2} = u,d,s\)), which are to be inserted into the matrix elements. Of course it is desirable that the error of the matrix elements is not dominated by the uncertainty of the current normalization constant.

Over the last 30 years many efforts have been made to control the consequences of explicit chiral symmetry breaking with Wilson quarks. The main strategy consists in imposing continuum chiral symmetry relations as normalization conditions at finite lattice spacing [11, 12]. This is usually done using chiral Ward identities, which follow from an infinitesimal chiral change of variables in the QCD path integral. An example is the PCAC relation which determines the additive quark mass renormalization constant, as the “critical value” of the bare mass parameter, where the axial current is conserved. The fact that chiral symmetry is fully recovered only in the continuum limit implies that the choice of normalization condition matters at the cutoff level; at a fixed value of the lattice spacing the numerical results may occasionally differ substantially between any two such choices. Rather than interpreting this scatter as a systematic error, the modern approach consists in choosing a particular normalization condition and in fixing all dimensionful parameters (such as momenta or distances or background fields) in terms of a physical scale. This defines a so-called “line of constant physics” (LCP), along which the continuum limit is taken. As the lattice spacing a (or, equivalently, the bare coupling, \(g_0^2=6/\beta \)), is varied, this defines a function \(Z_{\mathrm{A}} = Z_{\mathrm{A}}(\beta )\). Obviously, another choice for the LCP will result in a different function \(Z'_{\mathrm{A}}(\beta )\). However, their difference will be, within errors, a smooth function of \(\beta \) which vanishes asymptotically \(\propto a\) or \(\propto a^2\) if \(\hbox {O}(a)\) improvement is implemented. Hence, following an LCP ensures that cutoff effects are smooth functions of \(\beta \) and the choice of LCP becomes irrelevant in the continuum limit. Adopting this viewpoint, the relevant systematic error is therefore determined by the precision to which a chosen LCP can be followed.

In this paper we apply a recently developed method to lattice QCD with \({N_{\mathrm{f}}}=2\) and \({N_{\mathrm{f}}}=3\) flavours, matching the lattice actions chosen by the CLS initiative [8, 10]. Our method is based on the chirally rotated Schrödinger functional (\(\chi \mathrm{SF}\)) [13, 14]. The theoretical foundation of this framework has been explained in [14] and it has passed a number of perturbative and non-perturbative tests [15,16,17,18,19,20]. In contrast to the Ward identity method the axial current renormalization conditions follow from a finite chiral rotation in the massless QCD path integral with Schrödinger functional (SF) boundary conditions. The renormalization constants are then obtained from ratios of simple 2-point functions. For the axial current, this represents a significant advantage over the Ward identity method [12, 21, 22] which involves 3- and 4-point functions. Hence, we observe a dramatic improvement in the attainable statistical precision for \(Z_{\mathrm{A}}\) and some care is required to ensure that systematic errors are under control at a similar level of precision. We also discuss the normalization procedure for the local vector current. While flavour symmetry remains unbroken on the lattice with (mass-degenerate) Wilson quarks, the corresponding Noether current lives on neighbouring lattice points connected by a gauge link, so that the use of the local vector current is often more practical.

This paper is organized as follows: after a short reminder of the \(\chi \mathrm{SF}\) correlation functions in the continuum and the normalization conditions derived from them in Sect. 2, we define in Sect. 3 a couple of different LCPs which we have followed. We then present the \(Z_{\mathrm{A}}\) and \(Z_{\mathrm{V}}\) determinations for lattice QCD with \({N_{\mathrm{f}}}=2\) and \({N_{\mathrm{f}}}=3\) quark flavours in Sects. 4 and 5, respectively, together with various tests we have carried out. Section 6 contains a summary of the main results of this work and some concluding remarks. Finally, the paper ends with three technical appendices: Appendix A collects the parameters and results of the simulations, Appendix B provides a detailed discussion on the systematic error estimates for our determinations, and Appendix C gathers our set of chosen fit functions which smoothly interpolate our \(Z_{\mathrm{A,V}}\) results in \(\beta \).

The main results for \({N_{\mathrm{f}}}=2\) are collected in Table 4, while those for \({N_{\mathrm{f}}}=3\) are given in Tables 6 and 7. These results can be directly applied to data obtained from the CLS 2- and 3-flavour configurations, respectively [8, 10]. The \({N_{\mathrm{f}}}=3\) results have, in fact, already been used, and enabled the precision CLS scale setting in Ref. [1] and the accurate quark-mass renormalization of Ref. [23].

2 Renormalization conditions from universality relations

2.1 The Schrödinger functional and chiral field rotations

We start by considering massless two-flavour continuum QCD. The Euclidean space-time is taken to be a hyper-cylinder of volume \(L^4\) with Schrödinger functional boundary conditions [24, 25]. In particular, in the Euclidean time direction, the quark and anti-quark fields satisfy,

$$\begin{aligned} P_+\psi (x)|_{x_0=0}= 0 = \overline{\psi }(x)P_-|_{x_0=0}, \end{aligned}$$
(2.1)

and similarly at time \(x_0=L\) with the change \(P_\pm \rightarrow P_\mp \). The \(\hbox {SU}(2) \times \hbox {SU}(2)\) chiral and flavour symmetry leads to conserved isovector Noether currents, given by

$$\begin{aligned} A_\mu ^a(x)= & {} \overline{\psi }(x)\gamma _\mu \gamma _5\dfrac{\tau ^a}{2} \psi (x),\nonumber \\ V_\mu ^a(x)= & {} \overline{\psi }(x)\gamma _\mu \dfrac{\tau ^a}{2} \psi (x), \end{aligned}$$
(2.2)

with Pauli matrices \(\tau ^a\) and isospin index \(a=1,2,3\). SF correlation functions of these currents with isovector boundary sources \({\mathcal O}_5^a\) and \({\mathcal O}^a_k\) have been defined in [26, 27] and are given by

$$\begin{aligned} \begin{aligned}&\langle A^a_0(x) \mathcal {O}^{b}_5\rangle = -\delta ^{ab} f_{\mathrm{A}}(x_0), \\&\sum _{k=1}^3\langle V^a_k(x) \mathcal {O}^{b}_k\rangle = -3\,\delta ^{ab} k_{\mathrm{V}}(x_0). \end{aligned} \end{aligned}$$
(2.3)

Passing from the isospin notation to fields with definite flavour assignments,

$$\begin{aligned} A^{f_1f_2}_\mu (x)= & {} \overline{\psi }{}_{f_1}(x)\gamma _\mu \gamma _5\psi _{f_2}(x),\nonumber \\ V^{f_1f_2}_\mu (x)= & {} \overline{\psi }{}_{f_1}(x)\gamma _\mu \psi _{f_2}(x), \end{aligned}$$
(2.4)

and similarly for the boundary sources, the correlation functions for isospin indices \(a=1,2\), can be written in terms of the flavour off-diagonal fields,

$$\begin{aligned} f_{\mathrm{A}}(x_0)= & {} -{1\over 2} \langle A^{ud}_0(x) \mathcal {O}^{du}_5\rangle ,\nonumber \\ k_{\mathrm{V}}(x_0)= & {} -{1\over 6} \sum _{k=1}^3 \langle V_k^{ud}(x)\mathcal {O}^{du}_k\rangle . \end{aligned}$$
(2.5)

For the flavour diagonal fields in the isospin \(a=3\) components, e.g.

$$\begin{aligned} A_\mu ^{3} = \frac{1}{2}\left( A_\mu ^{uu} - A_\mu ^{dd}\right) , \end{aligned}$$
(2.6)

one may use flavour symmetry to write

$$\begin{aligned} f_{\mathrm{A}}(x_0) = -\frac{1}{2} \langle A^{uu'}_0(x) \mathcal {O}^{u'u}_5\rangle , \end{aligned}$$
(2.7)

and analogously for \(k_{\mathrm{V}}\). Note that the additional up-type flavour \(u'\) is merely a notational device to indicate the fermionic contractions taken into account when applying Wick’s theorem. Indeed, the sum of all the disconnected contributions for the flavour diagonal \(a=3\) components of SF correlation functions cancels exactly due to flavour symmetry.

We now apply a flavour diagonal chiral rotation to the fields,

$$\begin{aligned} \psi \rightarrow \exp \left( i\dfrac{\alpha }{2}\gamma _5\tau ^3\right) \psi ,\quad \overline{\psi }\rightarrow \overline{\psi }\exp \left( i\dfrac{\alpha }{2}\gamma _5\tau ^3\right) . \end{aligned}$$
(2.8)

Choosing the rotation angle \(\alpha =\pi /2\) then leads to the chirally rotated SF boundary conditions,

$$\begin{aligned} \tilde{Q}_+\psi (x)|_{x_0=0}= 0 = \overline{\psi }(x)\tilde{Q}_+|_{x_0=0}, \end{aligned}$$
(2.9)

with projectors \(\tilde{Q}_\pm = \frac{1}{2}(1 \pm i\gamma _0\gamma _5\tau ^3)\). Analogous boundary conditions with reverted projectors are obtained at \(x_0=L\). Applying the same chiral field rotation to the axial currents,

$$\begin{aligned} A^{ud}_\mu (x) \rightarrow -iV^{ud}_\mu (x),\quad A^{uu}_\mu (x) \rightarrow A^{uu}_\mu (x), \end{aligned}$$
(2.10)

one obtains either a vector current or remains with an axial current, depending on the flavour assignments. If the chiral rotation of the field variables is performed as a change of variables in the functional integral, one arrives at the formal continuum identities

$$\begin{aligned} f_{\mathrm{A}} = g^{uu'}_{\mathrm{A}} = -ig^{ud}_{\mathrm{V}},\quad k_{\mathrm{V}} = l^{uu'}_{\mathrm{V}} = -il^{ud}_{\mathrm{A}}, \end{aligned}$$
(2.11)

where the g- and l-functions are defined with \(\chi \hbox {SF}\) boundary conditions, Eq. (2.9), for instance

$$\begin{aligned} g_{\mathrm{A}}^{f_1f_2}(x_0) = -\frac{1}{2} \left\langle A^{f_1f_2}_0(x) \mathcal {Q}^{f_2f_1}_5\right\rangle _{({\tilde{Q}}_+)}. \end{aligned}$$
(2.12)

Here, the boundary operators \(\mathcal {Q}^{f_1f_2}_5\) denote the chirally rotated versions of their SF counterparts, \(\mathcal {O}^{f_1f_2}_5\). For the complete expressions and further details we refer to Ref. [20].

Regarding the case of QCD with \({N_{\mathrm{f}}}=3\) quark flavours we note that the very same steps can be taken provided the massless third quark does not take part in the chiral rotation and thus remains with standard SF boundary conditions [14]. Correlation functions are then considered for the doublet fields only, i.e. the third quark never appears as a valence quark.

2.2 Renormalization conditions

In the lattice regularized theory with Wilson type quarks, relations such as (2.11) can only be expected to hold after renormalization and up to cutoff effects. One first has to ensure that massless QCD with \(\chi \mathrm{SF}\) boundary conditions has been correctly regularized. This is achieved by tuning the bare mass parameter \(m_0\) to its critical value, \(m_{\mathrm{cr}}\), where the axial current is conserved, and by tuning a boundary counterterm coefficient \(z_f\) such that physical parity is restored (cf. [20] for more details). In terms of the bare \(\chi \hbox {SF}\) correlation functions one may choose the two conditions,

$$\begin{aligned} m = {\tilde{\partial }_0 g_{\mathrm{A}}^{ud}(x_0)\over 2g_{\mathrm{P}}^{ud}(x_0)}\bigg |_{x_0=L/2} = 0, \quad g_{\mathrm{A}}^{ud}(L/2) = 0 \end{aligned}$$
(2.13)

(with \(\tilde{\partial }_0\) the symmetric lattice derivative). The division by the pseudo-scalar correlation function \(g_{\mathrm{P}}^{ud}\) is not strictly necessary; it is done for convenience, as it gives rise to the definition of a (bare) PCAC quark mass \(m\). Solutions to these equations define \(m_{\mathrm{cr}}\) and \(z_f^*\) as functions of the bare coupling \(g_0\) and the lattice size \(L/a\).
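As an illustration of the first condition in Eq. (2.13), the following minimal sketch evaluates the bare PCAC mass from two correlation functions stored as arrays indexed by \(x_0/a\). The function name `pcac_mass` and the array layout are assumptions made for this example; the correlators are assumed to be real-valued ensemble averages, and all quantities are in lattice units.

```python
import numpy as np

def pcac_mass(g_A, g_P, x0):
    """Bare PCAC mass a*m at time slice x0 (all quantities in lattice units).

    g_A, g_P : 1D arrays holding g_A^{ud}(x0) and g_P^{ud}(x0) for x0/a = 0..T/a
    x0       : integer time slice; Eq. (2.13) uses x0 = L/2
    """
    g_A = np.asarray(g_A, dtype=float)
    g_P = np.asarray(g_P, dtype=float)
    # Symmetric lattice derivative: [g_A(x0 + a) - g_A(x0 - a)] / (2a), with a = 1
    d0_g_A = 0.5 * (g_A[x0 + 1] - g_A[x0 - 1])
    # m = d0 g_A / (2 g_P)
    return d0_g_A / (2.0 * g_P[x0])
```

For a lattice with \(L/a=8\), the condition would be evaluated as `pcac_mass(g_A, g_P, 4)` and driven to zero by adjusting \(m_0\).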

Once the lattice regularization is correctly implemented, we expect e.g.

$$\begin{aligned} Z_{\mathrm{A}} Z_{\zeta }^2 g^{uu'}_{\mathrm{A}}(x_0) =-i Z_{\mathrm{V}} Z_{\zeta }^2 g^{ud}_{\mathrm{V}}(x_0) + \mathrm{O}(a^2), \end{aligned}$$
(2.14)

where \(Z_{\zeta }\) renormalizes a boundary quark or anti-quark field [14, 26, 28] and \(Z_{\mathrm{A,V}}\) are the current normalization constants of interest. Requiring such identities to hold exactly at finite lattice spacing thus fixes the relative normalization of axial and vector current. Replacing the latter by the exactly conserved lattice vector current \({\widetilde{V}}_\mu (x)\) (cf. Ref. [20]), for which \(Z_{\mathrm{\widetilde{V}}}=1\), one may obtain \(Z_{\mathrm{A}}\) from either one of the ratios

$$\begin{aligned} R^g_{\mathrm{A}} = {-ig_{\mathrm{\widetilde{V}}}^{ud}(x_0)\over \phantom {-i} g_{\mathrm{A}}^{uu'}(x_0)}\bigg |_{x_0=L/2}\quad \text {or}\quad R^l_{\mathrm{A}} = {il_{\mathrm{\widetilde{V}}}^{uu'}(x_0)\over \phantom {i} l_{\mathrm{A}}^{ud}(x_0)}\bigg |_{x_0=L/2}. \end{aligned}$$
(2.15)

Assuming that the parameters \(x_0\) (here set to \(L/2\)), the boundary angle \(\theta \) [29], the background gauge field [24], and the precise definition for the zero mass and \(\alpha =\pi /2\) point (2.13) are fixed, we define, on an \((L/a)^4\) lattice and for a given bare coupling \(g_0^2=6/\beta \),

$$\begin{aligned} Z^{g,l}_{\mathrm{A}}(\beta ,L/a) = R^{g,l}_{\mathrm{A}}. \end{aligned}$$
(2.16)

Finally the choice of a line of constant physics (cf. Sect. 3) defines a smooth function \((L/a)(\beta )\) such that the normalization constants become functions of \(\beta \) alone, with the difference between any two definitions vanishing smoothly with a rate \(\propto a^2\).

We also comment on the appearance of a second up-type flavour \(u'\) in (2.15). When applying the chiral rotation (2.8) to the diagonal components of \(f_{\mathrm{A}}\), the disconnected diagrams are mapped to disconnected diagrams on the \(\chi \hbox {SF}\) side which can be shown to add up to a pure cutoff effect. Their omission is thus perfectly legitimate, even if the formulation of the renormalization conditions then has an element of partial quenching to it. The situation is comparable with the Ward identity method in two-flavour QCD [12, 21], where a fictitious s-quark can be introduced to eliminate the disconnected diagrams.

Even though there exists a conserved vector current, in practice the local current is often used and then requires renormalization, too. Its renormalization constant can be obtained from,

$$\begin{aligned} R^g_{\mathrm{V}} = {g_{\mathrm{\widetilde{V}}}^{ud}(x_0)\over g_{\mathrm{V}}^{ud}(x_0)}\bigg |_{x_0=L/2}\quad \text {or}\quad R^l_{\mathrm{V}} = {l_{\mathrm{\widetilde{V}}}^{uu'}(x_0)\over l_{\mathrm{V}}^{uu'}(x_0)}\bigg |_{x_0=L/2}. \end{aligned}$$
(2.17)

The same remarks as for the axial current normalization apply here, and with definite choices for all parameters we set,

$$\begin{aligned} Z^{g,l}_{\mathrm{V}}(\beta ,L/a) = R^{g,l}_{\mathrm{V}}. \end{aligned}$$
(2.18)

As in the case of the axial current normalization conditions, only 2-point functions are required, which connect the boundary quark bilinear sources with the currents in the bulk. This is a major advantage over the Ward identity method [12, 21] where 3- and 4-point functions are required. Hence, one expects better statistical precision from the simpler 2-point functions, and this will be confirmed below. Furthermore, as discussed in [20], the cutoff effects in the ratios are \(\hbox {O}(a^2)\), due to the mechanism of automatic \(\hbox {O}(a)\) improvement [30], even if the PCAC mass and the axial current are not \(\hbox {O}(a)\) improved by the counterterm \(\propto c_{\mathrm{A}}\) [26], or if the vector currents are not improved by the corresponding counterterms \(\propto c_{\mathrm{V}}, c_{\mathrm{\widetilde{V}}}\) [27, 31].

Finally, we emphasize that similar renormalization conditions can be devised for other finite renormalization constants. An interesting example is the ratio \(Z_{\mathrm{P}}/Z_{\mathrm{S}}\), where \(Z_{\mathrm{P}}\) and \(Z_{\mathrm{S}}\) are the pseudo-scalar and scalar renormalization constant, respectively. We refer the reader to Ref. [20] for more details.

3 Lines of constant physics and choice of renormalization conditions

3.1 General considerations

A line of constant physics requires specifying a physical (length) scale \(r\) which is kept fixed as the continuum limit is taken. A typical choice would be the pion decay constant, \(r=1/f_\pi \), either at the physical quark masses or in the chiral limit. Once calculated for a range of lattice spacings, this scale defines a function \((r/a)(\beta )\) of the bare coupling \(\beta =6/g_0^2\) which fixes the lattice spacing \(a\) in units of the chosen physical scale. Choosing the spatial lattice extent \(L/a\), at a given \(\beta \), such that

$$\begin{aligned} \dfrac{(L/a)(\beta )}{(r/a)(\beta )} = L/r = C_r \end{aligned}$$
(3.1)

(with a numerical constant \(C_r\)) then fixes the spatial size of the finite volume system in units of \(r\). In practice we will choose \(C_r\) such that the physical size of \(L\) will be somewhat larger than half a femtometre. Note that this equation can be read in two ways: first, if one fixes \(C_r\) and then chooses a set of \(\beta \)-values for which \(r/a\) is known, one obtains a corresponding set of values \((L/a)(\beta )\), which will not necessarily be integers. Evaluating the normalization constants at these non-integer lattice sizes then requires some interpolation of results from neighbouring integer \(L/a\)-values at the same \(\beta \). Alternatively, one could choose a set of integer \(L/a\)-values such that a choice for \(C_r\) will imply a set of \(\beta \)-values. In general this means that the data for \(r/a\) may have to be interpolated in \(\beta \). We will here choose the first option, with the set of \(\beta \)-values taken over from the large volume simulations by the CLS project [8, 10].

Having set the scale one needs to ensure the correlation functions are calculated in the desired situation of massless QCD and for the chosen chirally rotated boundary conditions at \(\alpha =\pi /2\). This means one needs to tune the bare quark mass \(am_0\) and \(z_f\) as functions of \(\beta \). We will discuss this in more detail below. Finally, the correlation functions depend on kinematic parameters, such as \(x_0\) or background field parameters such as \(\theta \). We have already set \(x_0=L/2\) in Eqs. (2.15), (2.17) and we choose \(\theta =0\) and work with vanishing SU(3) background field.

With these parameter choices we will have, for a given r and \(C_r\) in Eq. (3.1), two definitions each for \(Z_{\mathrm{A}}\) and \(Z_{\mathrm{V}}\), namely

$$\begin{aligned} Z^{g,l}_{\mathrm{A,V}}(\beta ) = R_{\mathrm{A,V}}^{g,l}(\beta ,a/L)\big \vert _{L/r=C_r;\, m=0;\, \alpha =\pi /2}\,, \end{aligned}$$
(3.2)

either based on the g- or the l-ratios. We then expect e.g. that

$$\begin{aligned} Z_{\mathrm{A}}^g(\beta ) = Z_{\mathrm{A}}^l(\beta ) + \mathrm{O}(a^2), \end{aligned}$$
(3.3)

where the \(a^2\)-effects are now expected to be smooth functions of the bare coupling.

3.2 Perturbative subtraction of cutoff effects

A possible refinement consists in using perturbation theory to reduce the cutoff effects. This requires computing the R-ratios (2.15), (2.17) perturbatively, with exactly the same parameter choices as in the numerical simulations. We have performed this calculation to 1-loop order,

$$\begin{aligned} R_{\mathrm{A,V}}^{g,l}(g_0^2,a/L) = R_{\mathrm{A,V}}^{g,l(0)}(a/L) + g_0^2 R_{\mathrm{A,V}}^{g,l(1)}(a/L) + \mathrm{O}(g_0^4), \end{aligned}$$
(3.4)

and for the chosen parameters we always find \(R_{\mathrm{A,V}}^{g,l(0)}(a/L)=1\), exactly. We may then define a 1-loop correction factor,

$$\begin{aligned} r^{g,l}_{\mathrm{A,V}}(\beta ,L/a) = \dfrac{1+g_0^2 R_{\mathrm{A,V}}^{g,l(1)}(0)}{1+g_0^2 R_{\mathrm{A,V}}^{g,l(1)}(a/L)}\,, \end{aligned}$$
(3.5)

and results for the coefficients \(R_{\mathrm{A,V}}^{g,l(1)}\) are collected in Table 1, for the relevant lattice resolutions \(L/a\) and the two lattice gauge actions used by CLS. Note that the 1-loop results are \({N_{\mathrm{f}}}\)-independent and are thus obtained along the lines of Ref. [20], the only difference being the form of the free gluon propagator in the case of the Lüscher–Weisz gauge action [32]. As an aside we note that our results converge to the known 1-loop results \(Z_{\mathrm{A,V}}^{(1)}\) for an infinitely extended lattice [33,34,35], i.e. for \(a/L=0\). We also observe that the 1-loop cutoff effects for the l-definitions are generally much smaller than for the g-definitions.

Table 1 Finite \(L/a\) estimators for the current normalization constants at 1-loop order, and our estimates for their asymptotic values; the latter agree with previous results in the literature [33,34,35]. All results are given for \(\mathrm{SU(3)}\)

For given L / a and \(\beta \), the perturbatively improved current normalization constants are now defined by

$$\begin{aligned} Z_{\mathrm{A,V,\,sub}}^{g,l}(\beta ,L/a) = r_{\mathrm{A,V}}^{g,l}(\beta ,L/a)\times Z_{\mathrm{A,V}}^{g,l}(\beta ,L/a), \end{aligned}$$
(3.6)

and, by construction, the \(\hbox {O}(a^2)\) cutoff effects are subtracted to O(\(g_0^2\)), reducing them to O(\(a^2g_0^4\)). The subtracted data for the Z-factors are then treated as before: a choice of a line of constant physics implies a set of \(\beta \)- and corresponding \(L/a\)-values to which the data must be interpolated. We will see evidence for the effectiveness of this perturbative subtraction of cutoff effects in Sects. 4 and 5.
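The subtraction of Eqs. (3.5) and (3.6) amounts to a single multiplicative factor per data point. The sketch below shows the arithmetic; the function names are hypothetical, and the 1-loop coefficients \(R^{(1)}(0)\) and \(R^{(1)}(a/L)\) must be supplied by the user (for instance from Table 1), since no numerical values are assumed here.

```python
def one_loop_correction(beta, R1_continuum, R1_lattice):
    """Correction factor of Eq. (3.5).

    beta         : bare coupling parameter, g0^2 = 6 / beta
    R1_continuum : 1-loop coefficient R^(1) at a/L = 0
    R1_lattice   : 1-loop coefficient R^(1) at the simulated a/L
    """
    g0_sq = 6.0 / beta
    return (1.0 + g0_sq * R1_continuum) / (1.0 + g0_sq * R1_lattice)

def subtract_cutoff_effects(Z, beta, R1_continuum, R1_lattice):
    """Perturbatively improved Z-factor of Eq. (3.6): Z_sub = r * Z."""
    return one_loop_correction(beta, R1_continuum, R1_lattice) * Z
```

When the finite-\(L/a\) coefficient coincides with its asymptotic value, the factor is exactly one and the data are left unchanged, as expected.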

3.3 Choices of LCP for \({N_{\mathrm{f}}}=2\) and \({N_{\mathrm{f}}}=3\)

In order to fix the physical scale \(r\), we choose either the kaon decay constant \(r=1/f_K\) (\({N_{\mathrm{f}}}=2\)), or the gradient flow scale \(r=\sqrt{8t_0}\) (\({N_{\mathrm{f}}}=3\)) [36] (see footnote 1). In order to fix the respective constants \(C_r\) we proceed as follows. Given the set of values \(\beta _i\) for \(i=1,2,\ldots \) (taken from CLS), we choose as a reference value \(\beta _{\mathrm{ref}}\) either the largest or the smallest of the set. Choosing an integer lattice size \(L/a\) at the reference point \(\beta _{\mathrm{ref}}\) now fixes \(C_r\) through

$$\begin{aligned} C_r = \dfrac{(L/a)(\beta _{\mathrm{ref}})}{(r/a)(\beta _{\mathrm{ref}})}. \end{aligned}$$
(3.7)

Having set the scale in this way, the \(L/a\)-values at the remaining \(\beta _i\) follow from Eq. (3.1). For all our choices the physical size of our space-time extent will be \(L \approx 0.6-0.7\, \mathrm{fm}\). As mentioned before, except at the chosen reference value for \(\beta \) this requires interpolations of simulation results at integer \(L/a\); moreover, our current simulation code, which is based on the openQCD package [37, 38], requires that \(L/a\) is also even.
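The bookkeeping behind Eqs. (3.1) and (3.7) can be sketched as follows. All names are hypothetical, and the \(r/a\) values in the usage note are round placeholder numbers chosen for illustration only, not CLS results.

```python
import math

def lcp_constant(L_over_a_ref, r_over_a_ref):
    """C_r of Eq. (3.7), fixed at the reference coupling beta_ref."""
    return L_over_a_ref / r_over_a_ref

def target_sizes(C_r, r_over_a):
    """Target (L/a)(beta) from Eq. (3.1), for a dict {label: r/a}."""
    return {label: C_r * ra for label, ra in r_over_a.items()}

def bracketing_even_sizes(L_over_a):
    """Even lattice sizes straddling a (generally non-integer) target L/a."""
    lower = 2 * math.floor(L_over_a / 2)
    return lower, lower + 2
```

For example, with the placeholder inputs `lcp_constant(16, 8.0)` and `{"beta_1": 4.6, "beta_ref": 8.0}`, the target at `beta_1` is close to 9.2, so simulations at \(L/a=8\) and \(L/a=10\) would be interpolated.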

3.4 Topology freezing

Numerical simulations of the SF by means of standard Monte Carlo algorithms are known to suffer from the topology freezing problem (see e.g. Ref. [39] for a discussion). A possible solution is to follow the proposal of Ref. [39] and simulate the theory with open-SF boundary conditions. However, if for the given choice of parameters the problem is “mild”, one can circumvent the issue in a straightforward manner by simply imposing the renormalization conditions (2.18) and (2.13) within the trivial topological sector [40, 41]. In a continuum notation, the correlation functions entering these definitions are modified as follows,

$$\begin{aligned} g_{\mathrm{A}}^{ud}(x_0)\, \rightarrow \, g_{\mathrm{A,Q}}^{ud}(x_0) = {-{1\over 2}\langle A^{ud}_0(x) \mathcal {Q}^{du}_5 \delta _{Q,0}\rangle _{({\tilde{Q}}_+)} \over \langle \delta _{Q,0}\rangle _{({\tilde{Q}}_+)}}, \end{aligned}$$
(3.8)

and analogously in all other cases (see footnote 2). Here, the Kronecker \(\delta \) in the functional integral selects the gauge field configurations with topological charge \(Q=0\). Since relations based on chiral flavour symmetries should hold separately in each topological charge sector, this restriction to the trivial sector is a legitimate modification of the current renormalization conditions. It provides a viable solution to the algorithmic problem of topology freezing in cases where this problem becomes marginally relevant; this means when the fraction of topologically non-trivial gauge field configurations in the relevant ensembles is not too large. For our choices of parameters, the percentage of gauge field configurations with \(Q\ne 0\) is generally below 10%, and reaches approximately 30% only in a couple of cases (cf. Tables 8, 9).

On the lattice the topological charge is not unambiguously defined. We follow Refs. [40, 41] and define the trivial topological sector as the set of gauge field configurations for which \(|Q|<0.5\), where Q is discretized in terms of the Wilson flow and the clover definition of the field strength tensor [36]. The flow time t is then kept fixed in physical units by requiring \(\sqrt{8t}=0.6\times L\).
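In terms of Monte Carlo data, the projection of Eq. (3.8) with the lattice definition \(|Q|<0.5\) reduces to averaging over the selected configurations. The following is a minimal sketch under the assumption that per-configuration measurements of the observable and of the flowed topological charge are available as arrays; the function names are illustrative and error analysis is not included.

```python
import numpy as np

def project_trivial_sector(obs, Q, threshold=0.5):
    """Estimate <O delta_{Q,0}> / <delta_{Q,0}> from per-configuration data.

    obs : array of shape (n_cfg,) or (n_cfg, n_t) with the measured observable
    Q   : array of shape (n_cfg,) with the (non-integer) lattice topological charge
    """
    obs = np.asarray(obs)
    mask = np.abs(np.asarray(Q, dtype=float)) < threshold
    if not mask.any():
        raise ValueError("no configurations in the trivial topological sector")
    # The ratio of the two ensemble averages equals the plain average
    # over the configurations that pass the |Q| < threshold cut.
    return obs[mask].mean(axis=0)

def nontrivial_fraction(Q, threshold=0.5):
    """Fraction of configurations outside the trivial sector."""
    return float(np.mean(np.abs(np.asarray(Q, dtype=float)) >= threshold))
```

The second function gives a quick check of whether the freezing problem is "mild" in the sense used above for a given ensemble.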

3.5 On the tuning of \(am_0\) and \(z_f\)

The current normalization conditions require the \(\chi \hbox {SF}\) correlation functions at zero quark mass and with a chiral twist angle of \(\pi /2\). In practice this is achieved by the simultaneous tuning of \(m_0\) and \(z_f\) such that the two conditions in Eq. (2.13) are satisfied. In general a 2-parameter tuning can be quite involved. However, here the non-perturbative \(\hbox {O}(a)\) improvement of the action implies that the \(\hbox {O}(a)\) uncertainty of the zero mass point is very much reduced. Since a change in \(z_f\) merely re-defines the matrix element used to define the PCAC mass, a variation of \(z_f\) is expected to induce a small variation of \(m\) within this \(\hbox {O}(a)\) uncertainty. The latter could in principle be reduced to \(\hbox {O}(a^2)\) by including the \(c_{\mathrm{A}}\)-counterterm to the axial current, but this will not be pursued here. Another important observation is that, once \(m_0\) and \(z_f\) are within \(\hbox {O}(a)\) of their target values, the sensitivity of the PCAC mass to a variation of \(z_f\) is reduced to order \(a^2\) [cf. Appendix B, discussion after Eq. (B.8)]. One is therefore led to conclude that the PCAC mass \(m\) is to a good approximation independent of \(z_f\), and the tuning of \(m_0\) and \(z_f\) thus becomes straightforward; given a reasonable guess for \(z_f\), one can first tune \(m_0\), and then turn to \(z_f\).

Fig. 1 Results for the PCAC mass as a function of the bare quark mass, for different values of \(z_f\). The dashed lines are linear fits to the data, while the solid vertical line indicates the location of our final estimate for \(am_{\mathrm{cr}}(g_0,a/L)\) (see main text). The results are for \(L/a=8\) and \(\beta =5.3\)

As an illustration of this situation we discuss the \({N_{\mathrm{f}}}=2\) case for \(L/a=8\), \(\beta =5.3\). For the tuning we considered 3 values of \(\kappa =1/(2am_0+8)\) and 4 values of \(z_f\). We then generated around 2000 gauge field configurations separated by 10 MDUs for each of the 12 ensembles, and measured the relevant correlation functions. Figure 1 collects the results for the PCAC mass as a function of the bare quark mass, for the 4 different values of \(z_f\). Within statistical errors, the PCAC mass depends linearly on \(m_0\) and is essentially independent of \(z_f\). A linear fit of m vs. \(m_0\) yields an estimate of \(m_0=m_{\mathrm{cr}}(g_0,L/a)\) for which m vanishes: these are collected in Table 2. The results are perfectly compatible with each other, and we take as our estimate for \(m_{\mathrm{cr}}\) the result of a weighted average of these four.
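The two numerical steps described here, a linear fit to locate the zero of the PCAC mass and a weighted average over the \(z_f\) values, can be sketched as below. This is an unweighted least-squares fit and an inverse-variance average, which assumes statistically independent inputs; the text does not specify the fitting or weighting procedure actually used, so both choices are assumptions of this example.

```python
import numpy as np

def zero_of_linear_fit(x, y):
    """Fit y = slope * x + intercept and return the x at which y vanishes."""
    slope, intercept = np.polyfit(x, y, 1)
    return -intercept / slope

def weighted_average(values, errors):
    """Inverse-variance weighted mean and its error (independent inputs assumed)."""
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(errors, dtype=float) ** 2
    mean = np.sum(weights * values) / np.sum(weights)
    return mean, 1.0 / np.sqrt(np.sum(weights))
```

Applied once per \(z_f\) value with `x` the bare masses and `y` the measured PCAC masses, `zero_of_linear_fit` yields one estimate of \(am_{\mathrm{cr}}\) each; `weighted_average` then combines them. The same root-finding step applies to \(g_{\mathrm{A}}^{ud}(L/2)\) as a function of \(z_f\).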

Table 2 Results for \(am_{\mathrm{cr}}(g_0,L/a)\) for four different values of \(z_f\), for \(L/a=8\) and \(\beta =5.3\). The weighted average of the results is also given in the last row of the table

Once the critical bare mass is fixed, a smooth interpolation of \(g_{\mathrm{A}}^{ud}(L/2)\) in \(m_0\) gives the results shown in Fig. 2. Over the chosen range, \(g_{\mathrm{A}}^{ud}(L/2)\) so interpolated is perfectly linear in \(z_f\), and it is thus straightforward to determine the point \(z_f^*\) where \(g_{\mathrm{A}}^{ud}(L/2)\) vanishes, i.e. \(z^*_f= 1.2877(5)\) in this example.

The estimated values of \(am_{\mathrm{cr}}\) and \(z_f^*\) determined in this way turn out to be quite accurate in practice, cf. Table 8 (see footnote 3). We remark that results for \(m_{\mathrm{cr}}\) could also be taken from a different source, for instance from standard SF simulations. In this case only \(z_f\) needs to be tuned. The differences to the above procedure would be \(\hbox {O}(a)\) both in \(m_{\mathrm{cr}}\) and in \(z_f^*\) which, by the mechanism of automatic \(\hbox {O}(a)\) improvement, induce \(\hbox {O}(a^2)\) differences in observables such as the current normalization constants [14, 20]. One also expects that a precise tuning of \(m_0\) is less crucial in the \(\chi \mathrm{SF}\) than in the SF; the quark mass dependence of physical observables around the chiral limit is quadratic rather than linear [42].

3.6 Sources of uncertainties

Besides statistical errors directly affecting the estimators for the current normalization constants, the other source of uncertainty is the precision to which a line of constant physics can be followed. In principle this latter effect is also of a statistical nature; however, some elements of modelling or estimation may be involved when propagating these errors to the normalization constants, so that labelling these effects as systematic is partly justified.

Fig. 2 Results for \(g_{\mathrm{A}}^{ud}(L/2)\) as a function of \(z_f\). The dashed line is a linear fit to the data, while the solid vertical line indicates the location of our final estimate for \(z_f^*\) (see main text). The values of \(g_{\mathrm{A}}^{ud}(L/2)\) come from an interpolation to \(\kappa =0.1361722\), and are for \(L/a=8\) and \(\beta =5.3\)

Our procedure consists of the following steps:

  1. The LCP together with the set of values \(\beta _i\) translates to target values \((L/a)(\beta _i)\). At each \(\beta _i\) we choose lattices with even \(L/a\) straddling the target values. We here anticipate that with our choices of LCPs the required lattice sizes are in the range \(L/a=8\) to \(L/a=16\). Note that all target values \((L/a)(\beta _i)\) come with statistical errors except for \(\beta =\beta _{\mathrm{ref}}\), where, by definition, \(L/a\) is given as an (even) integer.

  2. For given \(\beta \) and \(L/a\) we determine the solutions \(am_{0} = am_{\mathrm{cr}}\) and \(z_f=z_f^*\) of Eq. (2.13). In order to find their statistical errors which follow from the statistical uncertainties on \(m\) and \(g_{\mathrm{A}}^{ud}(L/2)\), we use estimates for the relevant derivatives,

    $$\begin{aligned} {\partial mL\over \partial m_0L},\quad {\partial mL\over \partial z_f},\quad {\partial g_{\mathrm{A}}^{ud}\over \partial m_0L},\quad {\partial g_{\mathrm{A}}^{ud}\over \partial z_f}. \end{aligned}$$
    (3.9)
  3.

    We then determine the induced error on the Z-factors by estimating their derivatives with respect to the bare parameters,

    $$\begin{aligned} {\partial Z_{\mathrm{A,V}}\over \partial z_f},\quad {\partial Z_{\mathrm{A,V}}\over \partial m_0L}. \end{aligned}$$
    (3.10)

    It turns out that the derivatives (3.9) and (3.10) scale quite well with lattice size and lattice spacing, so that it is unnecessary to evaluate them for all parameter choices. Some cross checks are sufficient. The errors coming from the uncertainties in \(m_0\) and \(z_f\) are then combined in quadrature and added, again in quadrature, to the statistical error.

  4.

    Where necessary, the results for \(Z_{\mathrm{A,V}}\) at the different \(L/a\)-values and fixed \(\beta _i\) are interpolated to the target \((L/a)(\beta _i)\), and the statistical error on \((L/a)(\beta _i)\) is propagated at this point. In the case where only one value of \(L/a\) has been simulated, an estimate for the derivative

    $$\begin{aligned} {\partial Z_{\mathrm{A,V}}\over \partial (L/a)} \end{aligned}$$
    (3.11)

    is used to assign a systematic error due to the difference \(\Delta (L/a)\equiv L/a-(L/a)(\beta _i)\), also taking into account the statistical uncertainty on \((L/a)(\beta _i)\). The resulting systematic error is again added in quadrature.
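Steps 2 and 3 of the procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the results: the function names are hypothetical, the errors on m and \(g_{\mathrm{A}}^{ud}(L/2)\) are assumed to be uncorrelated, and the derivatives (3.9) and (3.10) are assumed to be given in mutually consistent units (e.g. all with respect to \(m_0L\)).

```python
import math


def tuned_parameter_errors(jacobian, sigma_m, sigma_g):
    """Propagate the errors on (m, g_A) to the tuned parameters (m_0, z_f).

    jacobian = ((dm/dm0, dm/dzf), (dgA/dm0, dgA/dzf)), cf. Eq. (3.9).
    Assumption of this sketch: m and g_A are statistically uncorrelated.
    """
    (a, b), (c, d) = jacobian
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("singular Jacobian: tuning conditions are degenerate")
    # Rows of the inverse Jacobian map (delta m, delta g_A) to (delta m_0, delta z_f).
    inv = ((d / det, -b / det), (-c / det, a / det))
    sigma_m0 = math.hypot(inv[0][0] * sigma_m, inv[0][1] * sigma_g)
    sigma_zf = math.hypot(inv[1][0] * sigma_m, inv[1][1] * sigma_g)
    return sigma_m0, sigma_zf


def total_z_error(sigma_stat, dz_dm0, dz_dzf, sigma_m0, sigma_zf):
    """Combine the tuning errors in quadrature and add them to the statistical error.

    dz_dm0 and dz_dzf are the derivatives of Eq. (3.10).
    """
    sigma_tuning = math.hypot(dz_dm0 * sigma_m0, dz_dzf * sigma_zf)
    return math.hypot(sigma_stat, sigma_tuning)
```

The explicit 2x2 inversion keeps the sketch free of dependencies; with correlated inputs one would instead propagate the full covariance matrix of m and \(g_{\mathrm{A}}^{ud}\).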

We emphasize that all systematic effects become essentially statistical errors provided enough data is produced to estimate the derivatives required to propagate the errors to the normalization constants. In the following two sections we will present the lattice set-up and results for \({N_{\mathrm{f}}}=2\) and \({N_{\mathrm{f}}}=3\) lattice QCD. We will also come back to some of the above points.

4 Numerical results for \({N_{\mathrm{f}}}=2\) flavours

4.1 Lattice set-up and parameter choices

The CLS large volume simulations of 2-flavour QCD [8] were performed using non-perturbatively \(\hbox {O}(a)\) improved Wilson quarks and the Wilson gauge action. The matching to CLS data via the bare coupling requires that we use the same action in the \(\chi \hbox {SF}\). As for the details of the action near the time boundaries we refer to Ref. [20]. In particular the counterterm coefficients \(c_{\mathrm{t}}(g_0)\) and \(d_s(g_0)\) were set to their perturbative one-loop values using the results of that reference. In general, the incomplete cancellation of boundary \(\hbox {O}(a)\) artefacts implies some remnant \(\hbox {O}(a)\) effects in observables. However, for the estimators of the current normalization constants, Eqs. (2.15), (2.17), it can be shown that such \(\hbox {O}(a)\) effects only cause \(\hbox {O}(a^2)\) differences [20].

The CLS simulations were carried out for 3 values of the lattice spacing [8], corresponding to the \(\beta \)-values 5.2, 5.3 and 5.5. For future applications we have added a finer lattice spacing corresponding to \(\beta =5.7\). We choose the smallest CLS-value \(\beta =5.2\) as reference value and set

$$\begin{aligned} L/a=8 \quad \mathrm{at} \quad \beta =5.2, \end{aligned}$$
(4.1)

to define the starting point for the line of constant physics. We then fix the space-time volume of the \(\chi \mathrm{SF}\) simulations in terms of the kaon decay constant, \(f_K\), evaluated at physical quark masses. Taking \(af_K\) from Table 3 at \(\beta =5.2\) yields

$$\begin{aligned} f_KL = 0.4744(74), \end{aligned}$$
(4.2)

and corresponds to \(L\approx 0.6\,\mathrm{fm}\). Imposing this condition at the other \(\beta \)-values then leads to the (non-integer) \(L/a\)-values given in Table 3. The quoted errors combine the statistical uncertainty propagated from Eq. (4.2) with the error on \(af_K\) at the given \(\beta \).
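The translation of condition (4.2) into a target lattice size amounts to dividing \(f_KL\) by \(af_K\). The following sketch illustrates this; the function name is hypothetical and the two inputs are treated as uncorrelated, which is an assumption that does not hold at the reference point \(\beta =5.2\), where \(L/a=8\) is exact by definition.

```python
import math


def target_lattice_size(fk_l, fk_l_err, a_fk, a_fk_err):
    """Return (L/a, error) from the condition f_K L = fk_l and a measured a*f_K.

    Assumption of this sketch: the errors on fk_l and a_fk are uncorrelated,
    so their relative errors are added in quadrature.
    """
    size = fk_l / a_fk
    rel_err = math.hypot(fk_l_err / fk_l, a_fk_err / a_fk)
    return size, size * rel_err


# Consistency check at the reference point: L/a = 8 at beta = 5.2 and
# f_K L = 0.4744 imply a*f_K = 0.4744 / 8, which maps back to L/a = 8.
size_ref, _ = target_lattice_size(0.4744, 0.0074, 0.4744 / 8, 0.0)
```

At the other \(\beta \)-values one would pass the corresponding \(af_K\) and its error from Table 3.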

Table 3 Values of \(af_K\) used to determine \((L/a)(\beta )\) such as to satisfy the condition (4.2) for the given \(\beta \). The \(\chi \mathrm{SF}\) simulations were performed at the neighbouring even integer \(L/a\)-values given in the last column
Table 4 Results for \(Z_{\mathrm{A,V}}\), both g and l definitions, for \({N_{\mathrm{f}}}=2\) non-perturbatively \(\hbox {O}(a)\) improved Wilson fermions and Wilson gauge action. The lower part of the table contains the same results after subtraction of the one-loop cutoff effects, cf. Eq. (3.6)

While the first 3 results for \(af_K\) in Table 3 have been directly measured [8], we have estimated \(af_K\) at the fourth value, \(\beta =5.7\), as follows: with \(af_K\) at \(\beta =5.5\) taken as starting point we used the three-loop \(\beta \)-function for the bare coupling [43], in order to determine the ratio of lattice spacings. The error is obtained by summing (in quadrature) the statistical error propagated from the result at \(\beta =5.5\), and a systematic error due to the use of perturbation theory. The latter is estimated as the difference between the non-perturbative result for \(af_K\) at \(\beta =5.5\), and the same perturbative procedure, applied between \(\beta =5.3\) and \(\beta =5.5\). This systematic error is about 2.7 times larger than the statistical one, and thus dominates the error on \(L/a\) at \(\beta =5.7\).
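To illustrate how a perturbative \(\beta \)-function yields a ratio of lattice spacings, the sketch below uses the asymptotic scaling formula truncated at two loops, where only the universal coefficients \(b_0\) and \(b_1\) enter. This is a deliberate simplification: the estimate described above uses the three-loop \(\beta \)-function for the bare coupling [43], whose additional coefficient depends on the lattice action and is not reproduced here. The function names are hypothetical.

```python
import math


def beta_coefficients(n_f):
    """Universal one- and two-loop beta-function coefficients for SU(3)."""
    b0 = (11.0 - 2.0 * n_f / 3.0) / (16.0 * math.pi**2)
    b1 = (102.0 - 38.0 * n_f / 3.0) / (16.0 * math.pi**2) ** 2
    return b0, b1


def a_lambda_two_loop(beta, n_f):
    """a*Lambda from two-loop asymptotic scaling, with g0^2 = 6/beta."""
    g0_sq = 6.0 / beta
    b0, b1 = beta_coefficients(n_f)
    return math.exp(-1.0 / (2.0 * b0 * g0_sq)) * (b0 * g0_sq) ** (-b1 / (2.0 * b0**2))


def spacing_ratio(beta_from, beta_to, n_f):
    """Ratio a(beta_to) / a(beta_from); the Lambda parameter cancels."""
    return a_lambda_two_loop(beta_to, n_f) / a_lambda_two_loop(beta_from, n_f)
```

Multiplying \(af_K\) at \(\beta =5.5\) by `spacing_ratio(5.5, 5.7, 2)` would then give a two-loop estimate at \(\beta =5.7\); repeating the exercise between \(\beta =5.3\) and \(\beta =5.5\), where the non-perturbative result is known, gives a handle on the truncation error in the spirit of the procedure described above.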

Except for \(\beta = 5.3\), the target values \((L/a)(\beta _i)\) resulting from condition (4.2) are very close to even integer values of \(L/a\), so that interpolations between simulations at different \(L/a\) can be avoided. At \(\beta =5.3\) we simulated at the three \(L/a\)-values given in the last column of Table 3 and interpolated to the target value (see Appendix B.4 for more details). For each choice of \(\beta \) and \(L/a\), along the lines of the discussion in Sect. 3.5, we have carried out various tuning runs covering a range of \(am_0\) and \(z_f\), so as to determine the parameters satisfying the conditions (2.13). The values of the tuned parameters and the results for m and \(g_{\mathrm{A}}^{ud}(L/2)\) at these parameters are given in Table 8.
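An interpolation in \(L/a\) of the kind mentioned above can be sketched as follows. The example uses a plain linear interpolation between the two simulated sizes straddling the target and propagates the uncertainty of the target itself through the local slope; this is an illustrative assumption, and the functional form actually used is the one described in Appendix B.4. The function name and arguments are hypothetical.

```python
import math


def interpolate_z(x1, z1, s1, x2, z2, s2, x_target, s_target=0.0):
    """Linearly interpolate Z between (x1, z1 +- s1) and (x2, z2 +- s2).

    x denotes L/a. The error combines, in quadrature, the two (assumed
    independent) statistical errors and the uncertainty s_target on the
    target L/a, the latter multiplied by the slope dZ/d(L/a).
    """
    t = (x_target - x1) / (x2 - x1)
    slope = (z2 - z1) / (x2 - x1)
    value = (1.0 - t) * z1 + t * z2
    error = math.sqrt(
        ((1.0 - t) * s1) ** 2 + (t * s2) ** 2 + (slope * s_target) ** 2
    )
    return value, error
```

When only a single \(L/a\) is available, the same slope term, with an externally estimated derivative, provides the systematic error assigned to the mismatch between the simulated and the target lattice size.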

4.2 Results and error budget

In Table 4 we collect the results for \(Z_{\mathrm{A,V}}\), both g and l definitions, at the four values of the lattice spacing. The statistics range from 1800 to 12,000 measurements depending on the ensemble, cf. Table 8. The quoted uncertainties combine the statistical and systematic errors. The statistical errors are at the level of 0.1–0.4\(\permille \), depending on the Z-factor and ensemble considered. Hence a significant contribution to the error comes from systematic uncertainties.

Fig. 3

Comparison of different \(Z_{\mathrm{A}}\) determinations for \({N_{\mathrm{f}}}=2\), obtained from WIs in the standard SF and from universality relations in the \(\chi \mathrm{SF}\). The effect of the perturbative one-loop improvement of the \(\chi \mathrm{SF}\) results is also shown (right panel). The \(\chi \mathrm{SF}\) results are those of Table 4. The individual SF points are taken from Refs. [8, 44], and are slightly displaced on the x-axis for better clarity. The solid black line corresponds to the SF results from the fit formula of Ref. [8], and the dashed lines delimit the \(1\sigma \) region of the fit. Note that the SF fit formula is obtained by considering additional points with \(g_0^2<1\), here not shown, and by enforcing the perturbative 1-loop behaviour for \(g_0^2\rightarrow 0\) (see Ref. [8] for the details)

Fig. 4

Comparison of different \(Z_{\mathrm{V}}\) determinations for \({N_{\mathrm{f}}}=2\), obtained from WIs in the standard SF and from universality relations in the \(\chi \mathrm{SF}\). The effect of the perturbative one-loop improvement of the \(\chi \mathrm{SF}\) results is also shown (right panel). The \(\chi \mathrm{SF}\) results are those of Table 4. The individual SF points are taken from Ref. [21], and are slightly displaced on the x-axis for better clarity. The solid black line corresponds to the SF results from the fit formula of Ref. [21], and the dashed lines delimit the \(1\sigma \) region of the fit. Note that the SF fit formula is obtained by considering additional points with \(g_0^2<1\), here not shown, and by enforcing the perturbative 1-loop behaviour for \(g_0^2\rightarrow 0\) (see Ref. [21] for the details)

As discussed in Sect. 3.6, systematic errors result from uncertainties or deviations in following a chosen LCP. These correspond to statistical errors and deviations from zero in m and \(g_{\mathrm{A}}^{ud}(L/2)\), to uncertainties in the target lattice extent \(L/a\) and, if applicable, to systematic errors arising from inter- or extrapolations from the simulated lattice sizes. Tables 3 and 8 contain the relevant information for the case \({N_{\mathrm{f}}}=2\). The propagation of these uncertainties to the Z-factors is then performed following the steps outlined in Sect. 3.6. We have carried out some additional simulations to estimate the derivatives in Eqs. (3.9), (3.10), and some perturbative calculations to check the expected scaling of the derivatives with the lattice size. We defer a detailed discussion to Appendix B. Here we just note that with our statistics and our rather conservative approach, the propagated uncertainties are typically larger than the statistical errors for the R-estimators Eqs. (2.15), (2.17) (cf. Tables 10, 11).

4.2.1 Effect of perturbative one-loop improvement

As discussed in Sect. 3.2, we have also computed the relevant \(\chi \hbox {SF}\) correlation functions in perturbation theory to order \(g_0^2=6/\beta \). Besides consistency checks and qualitative insight, the main application consists in the perturbative subtraction of cutoff effects from the data. Note that this requires emulating the non-perturbative procedure in all details, in particular the determination of \(am_{\mathrm{cr}}\) and \(z_f^*\) according to Eq. (2.13). The lower part of Table 4 contains the results for \(Z_{\mathrm{A,V}}\) after perturbative improvement. Comparing with the unimproved results in the upper part of Table 4, one can see that the g-definitions are more affected, and are brought closer to the corresponding l-definitions by the perturbative improvement (cf. also Fig. 5). In any case, the perturbative corrections are at the level of 1 per cent at most.

In conclusion, our final results for \(Z_{\mathrm{A,V}}\), either with or without perturbative improvement, turn out to be very precise and improve significantly on the standard SF determination based on chiral Ward identities (WIs) [8, 21, 44]. This is particularly true for the case of \(Z_{\mathrm{A}}\), which can be appreciated in Fig. 3 where the determinations of Table 4 are compared with those of Refs. [8, 44]. In Fig. 4 we show instead a comparison for the case of \(Z_{\mathrm{V}}\), as obtained from the \(\chi \mathrm{SF}\), cf. Table 4, and from the standard SF (cf. Ref. [21]). We note that a relevant contribution to the error of our results comes from propagating the uncertainties associated with maintaining the condition (4.2), i.e. keeping L constant (cf. Tables 10, 11). We anticipate that due to the much more accurate knowledge of the LCP in terms of \(t_0\) (cf. Table 5), and by using interpolations in \(L/a\) at all relevant \(\beta \) values, this source of error will be essentially eliminated in the case of \({N_{\mathrm{f}}}=3\) (cf. Sect. 5).

Table 5 CLS \(\beta \)-values and corresponding results for \(t_0/a^2\) in the SU(3) flavour symmetric limit [1, 2]. The latter are used to determine the lattice sizes \((L_{1,2}/a)(\beta _i)\) which satisfy the conditions (5.1). The \(\chi \mathrm{SF}\) simulations are performed at the neighbouring \(L/a\)-values given in the last column of the table
Fig. 5

Continuum limit of the ratios between the l and g definitions of \(Z_{\mathrm{A}}\) (left panel) and \(Z_{\mathrm{V}}\) (right panel) for the case of \({N_{\mathrm{f}}}=2\) quark-flavours; the effect of subtracting the lattice artefacts from the Z-factors to \(\hbox {O}(g_0^2)\) is also shown. The dashed lines correspond to linear fits to the data, constrained to extrapolate to 1 for \(a/L=0\)

4.3 Universality and automatic \(\hbox {O}(a)\) improvement

The \(\chi \mathrm{SF}\) determinations (2.15) and (2.17) are expected to be automatically \(\hbox {O}(a)\) improved once the bare parameters \(m_0\) and \(z_f\) are properly tuned (cf. Sect. 2.2). This means that neither bulk nor boundary \(\hbox {O}(a)\) counterterms are necessary to cancel \(\hbox {O}(a)\) discretization errors in these quantities. This was confirmed to one-loop order in perturbation theory [20] and should hold generally. To test this expectation non-perturbatively, we now look at the ratios between Z-factors coming from the g- and l-definitions. The expectation that these ratios converge to 1 with \(\hbox {O}(a^2)\) corrections is indeed very well borne out by the data, cf. Fig. 5, where we also include fits to this expected behaviour. We emphasize that this is a non-trivial result: even though the bulk action is improved to match the CLS set-up, we did not \(\hbox {O}(a)\) improve the currents entering the definitions (2.15), (2.17) and (2.13). This result thus confirms automatic \(\hbox {O}(a)\) improvement at the non-perturbative level, and, indirectly, the universality relations between the \(\chi \mathrm{SF}\) and SF formulations. A direct way to test universality between the \(\chi \mathrm{SF}\) and SF formulations would be simply to study the continuum scaling of ratios of Z-factors as obtained from one and the other formulation. Provided the SF determinations are properly improved, these should also approach 1 in the continuum limit with \(\hbox {O}(a^2)\) corrections. The large errors on the SF determinations do not allow for a precise test of this expectation. However, the results in Figs. 3 and 4 clearly show that our determinations are in fact compatible with the SF ones within errors.
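A fit constrained to extrapolate to 1 in the continuum limit has a single free parameter and a closed-form solution. The sketch below fits \(R = 1 + c\,x\) by weighted least squares; taking \(x=(a/L)^2\) as the fit variable is an assumption motivated by the expected \(\hbox {O}(a^2)\) scaling, the data points are assumed to be uncorrelated, and the function name is hypothetical.

```python
def fit_ratio_slope(x, ratios, sigmas):
    """Weighted least-squares fit of R = 1 + c*x with the intercept fixed to 1.

    Returns (c, error on c, chi^2). Points are assumed to be uncorrelated.
    """
    num = sum(xi * (ri - 1.0) / si**2 for xi, ri, si in zip(x, ratios, sigmas))
    den = sum(xi**2 / si**2 for xi, si in zip(x, sigmas))
    slope = num / den
    slope_err = den ** -0.5
    chi2 = sum(
        ((ri - 1.0 - slope * xi) / si) ** 2
        for xi, ri, si in zip(x, ratios, sigmas)
    )
    return slope, slope_err, chi2
```

The returned \(\chi ^2\), compared with the number of points minus one, gives a quick indication of whether the data are compatible with pure \(a^2\) scaling.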

5 Numerical results for \({N_{\mathrm{f}}}=3\) flavours

5.1 Lattice set-up and parameter choices

The CLS simulations with \({N_{\mathrm{f}}}=2+1\) flavours of non-perturbatively \(\hbox {O}(a)\) improved Wilson fermions [45] and Lüscher–Weisz (LW) gauge action have been carried out for 5 values of the lattice spacing, with \(\beta \)-values between 3.4 and 3.85 [1, 2, 10]. For completeness we note that CLS has also tried to simulate at a coarser lattice spacing corresponding to \(\beta =3.3\). However, these ensembles have been discarded for the scale determination in [1] due to very large cutoff effects observed e.g. in \(t_0\) [10]. For this reason we will not consider this \(\beta \)-value in our study; we mention, however, that it was adopted as starting point for the Ward identity determination of \(Z_{\mathrm{A}}\) in Ref. [22]. Given the relatively large set of lattice spacings we here consider two different LCPs, with slightly different physical extent, \(L_1\) and \(L_2\), which we define through the gradient flow time \(t_0\) [36]. The associated length scale \(r=\sqrt{8t_0}\) can be interpreted as a smoothing radius, and has been very precisely determined for the CLS \(\beta \)-values \(\ge 3.4\) in [1, 2]. Using this scale we impose the conditions

$$\begin{aligned} L_1/\sqrt{8t_0}= 1.6719(16) \quad \mathrm{and} \quad L_2/\sqrt{8t_0}= 1.5099(30), \end{aligned}$$
(5.1)

where the right hand sides were chosen in order to have exactly,

$$\begin{aligned} L_1/a=8 \quad \mathrm{at} \quad \beta =3.4 \quad \mathrm{and} \quad L_2/a=16 \quad \mathrm{at} \quad \beta =3.85, \end{aligned}$$
(5.2)

respectively. Using the result for \(t_0\) in physical units [1], Eq. (5.1) translates to \(L_1\approx 0.7\,\mathrm{fm}\) and \(L_2\approx 0.6\,\mathrm{fm}\).
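The relation between the conditions (5.1), the lattice sizes and \(t_0/a^2\) is simple enough to spell out in code. The sketch below converts in both directions; the function names are hypothetical, and the only inputs taken from the text are the central values of Eqs. (5.1) and (5.2).

```python
import math


def target_size_from_t0(ratio, t0_over_a2):
    """L/a satisfying L / sqrt(8 t0) = ratio for a given t0/a^2."""
    return ratio * math.sqrt(8.0 * t0_over_a2)


def t0_over_a2_from_size(ratio, l_over_a):
    """Inverse relation: the t0/a^2 implied by a given L/a on the same LCP."""
    return (l_over_a / ratio) ** 2 / 8.0


# Values of t0/a^2 implied by Eqs. (5.1) and (5.2) at the two reference points.
t0_ref_1 = t0_over_a2_from_size(1.6719, 8)    # beta = 3.4, L_1-LCP
t0_ref_2 = t0_over_a2_from_size(1.5099, 16)   # beta = 3.85, L_2-LCP
```

By construction the reference points imply \(t_0/a^2\) of roughly 2.86 and 14.0, respectively; at the remaining \(\beta \)-values, `target_size_from_t0` applied to the measured \(t_0/a^2\) gives the non-integer target sizes.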

In Table 5 we collect the relevant \(\beta \) values of the CLS simulations and the corresponding results for \(t_0/a^2\) [2]. The latter are evaluated for equal up-, down-, and strange-quark masses, which are close to the physical average quark mass (see Refs. [1, 2]). Table 5 also gives the lattice sizes \((L_{1,2}/a)(\beta )\) which satisfy the conditions (5.1). Compared to the \({N_{\mathrm{f}}}=2\) case (cf. Table 3), it is obvious that these \({N_{\mathrm{f}}}=3\) LCPs are much more accurately determined. In order to exploit this higher precision, we performed simulations for several \(L/a\)-values at each \(\beta \) (cf. Table 5). This allowed us to accurately interpolate the Z-factors to the target values (see Appendix B.5 for more details). Table 9 contains a summary of all simulations performed with the corresponding parameters. For both technical and historical reasons, we do not use the finest lattice spacing for the LCP defined in terms of \(L_1\). Following this LCP up to \(\beta =3.85\) would have required simulating lattices with \(L/a=18,20\), which are particularly inconvenient to parallelize with our current simulation program. Note also that CLS simulations at \(\beta =3.85\) are ongoing and currently limited to a single ensemble, so that the LCP with \(L_1\) may remain useful for a while. More importantly, however, the comparison between both LCPs allows us to perform additional tests on our results (cf. Sect. 5.3).

The lattice action we employ for the finite volume simulations matches the CLS action in the bulk, i.e. the Lüscher–Weisz tree-level improved gauge action and 3 flavours of non-perturbatively improved Wilson quarks [45]. Close to the time boundaries of the lattice there is some freedom regarding the implementation of Schrödinger functional boundary conditions. For the gauge fields we choose option B of Ref. [32]; we refer the reader to this reference for the details. Regarding the fermions, two quark flavours satisfy \(\chi \mathrm{SF}\) boundary conditions (option \(\tau =1\) of [14]), while the third one obeys the standard SF boundary conditions [25]. In general, such a mixed set-up increases the number of \(\hbox {O}(a)\) improvement coefficients which need to be tuned in order to eliminate \(\hbox {O}(a)\) discretization errors from the time boundaries. As in the \({N_{\mathrm{f}}}=2\) case, however, one can show that the corresponding counterterms affect the renormalization constants \(Z_{\mathrm{A,V}}\) only at \(\hbox {O}(a^2)\). For definiteness we have used the one-loop estimate \(c_{\mathrm{t}} = 1 + g_0^2 c_{\mathrm{t}}^{(1)}\), where the one-loop coefficient decomposes as follows,

$$\begin{aligned} c_{\mathrm{t}}^{(1)} = c_{\mathrm{t}}^{(1,0)} + 2\times c_{\mathrm{t}}^{(1,1)}(\chi \text {SF}) + 1\times c_{\mathrm{t}}^{(1,1)}(\text {SF}). \end{aligned}$$
(5.3)

The pure gauge contribution is taken from Ref. [46], the fermionic \(\chi \mathrm{SF}\) contribution from Ref. [20] and the SF contribution from Ref. [29] (see footnote 4). Furthermore, we use the tree-level values \(d_s=1/2\) [20] and \(\tilde{c}_{\mathrm{t}}=1\) [26].

5.2 Results and error budget

In Tables 6 and 7 we collect the results for \(Z_{\mathrm{A,V}}\), corresponding to the \(L_1\)- and \(L_2\)-LCP, respectively. The statistics we accumulated for the different ensembles range between 3200 and 31,000 measurements, with exact numbers given in Table 9. The corresponding statistical precision on the Z-factors is between 0.1 and 0.55\(\permille \), depending on the exact quantity and ensemble. The errors quoted in the tables then combine the statistical errors with the systematic errors originating from the uncertainties on the LCPs.

Table 6 \({N_{\mathrm{f}}}=3\) results for \(Z_{\mathrm{A,V}}\) using the \(L_1\)-LCP, both for g and l definitions. The lower part of the table contains the results after subtraction of the one-loop cutoff effects, cf. Eq. (3.6)
Table 7 Same as Table 6 but for the \(L_2\)-LCP

As in the \({N_{\mathrm{f}}}=2\) case, the high statistical precision requires a careful assessment of the systematic errors in order to arrive at reliable error estimates. Tables 5 and 9 contain information on the accuracy with which the chosen LCPs are realized for our simulation parameters. Our estimates for the systematic uncertainties due to deviations from the chosen LCP were then obtained analogously to the case of \({N_{\mathrm{f}}}=2\); we refer the reader to Appendix B for the details. Here it is worth noting that, as in the \({N_{\mathrm{f}}}=2\) case, the propagated uncertainties are typically larger than the statistical errors for the R-estimators, Eqs. (2.15), (2.17), cf. Table 12.

5.2.1 Effect of perturbative one-loop improvement

In the lower halves of Tables 6 and 7 we give the results for \(Z_{\mathrm{A,V}}\) after perturbatively subtracting the lattice artefacts to one-loop order. The results have been obtained by first improving the \(Z_{\mathrm{A,V}}\) determinations for each \(L/a\) and \(g_0\) value, and then interpolating to the proper \((L_{1,2}/a)(\beta )\) (see Appendix B.5).

Comparing the results for \(Z_{\mathrm{A,V}}\) before and after perturbative improvement, one sees that the g-definitions are the most affected, and are brought closer to the corresponding l-definitions. All in all, the effect of the perturbative improvement is at most at the level of a couple of percent (cf. Fig. 7). Hence, not too surprisingly perhaps, the situation is very much the same as for the \({N_{\mathrm{f}}}=2\) case.

In conclusion, our final results for \(Z_{\mathrm{A,V}}\) are very precise for both LCPs. Similarly to the \({N_{\mathrm{f}}}=2\) case, the results for \(Z_{\mathrm{A}}\) are significantly more accurate than the standard SF determination based on Ward identities [22]. This can be appreciated in Fig. 6, where the results from Table 7 are displayed together with the 2 alternative definitions \(Z_{\mathrm{A,0}}\) and \(Z_{\mathrm{A,0}}^\mathrm{con}\) of Ref. [22].

5.3 Universality and automatic \(\hbox {O}(a)\) improvement

Given our estimates for \(Z_{\mathrm{A,V}}\) we can study the approach to the continuum limit of the ratio between different definitions. We begin with Fig. 7 where the ratios between the g- and l-definitions are considered for the \(L_1\)- and \(L_2\)-LCPs; both the results before and after perturbative improvement are shown. The conclusions are very much the same as for the \({N_{\mathrm{f}}}=2\) case. Considering the results before perturbative improvement, along both LCPs, the g and l definitions deviate by at most a couple of per cent. These differences then perfectly scale with \(a^2\) to zero as the continuum limit is approached. If perturbative improvement is implemented, these differences almost vanish even at the coarsest lattice spacings. There is no significant deviation from \(a^2\) scaling; however, some small admixture of higher order effects cannot be excluded either.
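For orientation, when the statistical correlation between numerator and denominator is neglected, the error on such a ratio follows from standard uncorrelated error propagation. Writing \(\sigma _l\) and \(\sigma _g\) for the errors on \(Z^l\) and \(Z^g\) (symbols introduced here only for illustration), one has

$$\begin{aligned} R = {Z^l\over Z^g},\qquad \sigma _R = R\,\sqrt{\left( {\sigma _l\over Z^l}\right) ^2 + \left( {\sigma _g\over Z^g}\right) ^2}. \end{aligned}$$

A positive correlation between the two definitions, which is plausible when both are measured on the same ensembles, would reduce \(\sigma _R\) relative to this estimate.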

Fig. 6

Comparison between different \(Z_{\mathrm{A}}\) determinations for \({N_{\mathrm{f}}}=3\), obtained either from WIs in the standard SF or from universality relations in the \(\chi \mathrm{SF}\). The \(\chi \mathrm{SF}\) results are taken from Table 7 and the effect of the perturbative one-loop improvement is shown in the right panel. The individual SF points labelled \(Z_{\mathrm{A}}^\mathrm{SF}\) and \(Z_{\mathrm{A, con}}^\mathrm{SF}\) are taken from Ref. [22] and correspond to the definitions \(Z_{\mathrm{A,0}}\) and \(Z_{\mathrm{A,0}}^\mathrm{con}\), respectively, of that reference. The solid black line is the fit formula to \(Z_{\mathrm{A}}^\mathrm{SF}\) also given in [22] and the dashed lines delimit the \(1\sigma \) region of the fit. Note that this fit function enforces the perturbative 1-loop behaviour for \(g_0^2\rightarrow 0\)

Fig. 7

Continuum limit of the ratios between the l and g definitions of \(Z_{\mathrm{A}}\) (left panels) and \(Z_{\mathrm{V}}\) (right panels) for the case of \({N_{\mathrm{f}}}=3\) quark-flavours; the effect of subtracting the lattice artefacts from the Z-factors to O(\(g_0^2\)) is also shown. The upper panels show the \(L_1\)-LCP results while the lower ones show those of the \(L_2\)-LCP. In all cases, the dashed lines correspond to linear fits to the data constrained to extrapolate to 1 for \(a/L_{1,2}=0\). Note that the (tiny) effect of the statistical correlation between numerator and denominator has been neglected in these ratios

Fig. 8

Continuum limit of the ratio between the \(Z^l_{\mathrm{X},\,L_2}\) definitions, \(\hbox {X}=\hbox {A}\), V, corresponding to the \(L_2\)-LCP, and the \(Z^g_{\mathrm{X},\,L_1}\) definitions corresponding to the \(L_1\)-LCP. The dashed lines correspond to linear fits to the data constrained to extrapolate to 1 for \(a^2/t_0=0\)

Fig. 9

Continuum limit of the ratios between the \({N_{\mathrm{f}}}=3\) WI determinations of \(Z_{\mathrm{A}}\) of Ref. [22], and the \(\chi \mathrm{SF}\) determinations \(Z_{\mathrm{A}}^{g,l}\) (left panel) and \(Z_{\mathrm{A,\,sub}}^{g,l}\) (right panel) of Table 7. The \(Z_{\mathrm{A}}^\mathrm{SF}\) results are from the fit formula provided in Ref. [22], and correspond to their preferred, \(Z_{\mathrm{A,0}}\), definition. The associated dashed lines (red and blue lines) are linear fits to the data with \(a^2/t_0<0.2\), constrained to extrapolate to 1 for \(a^2/t_0=0\). The \(Z_{\mathrm{A,\,con}}^\mathrm{SF}\) results come instead from a fit of the results for the alternative, \(Z_{\mathrm{A,0}}^\mathrm{con}\), definition considered in Ref. [22]. The latter fit was obtained using the same fit ansatz used in Ref. [22] for \(Z_{\mathrm{A,0}}\). The associated dashed lines (green and magenta lines) are linear fits to all data, constrained to extrapolate to 1 for \(a^2/t_0=0\)

It is also interesting to consider the continuum limit of the ratio between the definitions belonging to different LCPs, i.e. the \(L_1\)- and \(L_2\)-LCP. An example of such a ratio is shown in Fig. 8. In this case, too, the continuum scaling of the ratio is the one expected, and the initial difference is at the 2 per cent level. Apart from providing an important check of universality and automatic \(\hbox {O}(a)\) improvement, these results show that considering one definition or the other for the renormalization of matrix elements of the axial and vector currents will only introduce small \(\hbox {O}(a^2)\) differences over the whole range of lattice spacings covered.

Finally, we look at ratios between \(\chi \mathrm{SF}\) and standard SF determinations. Towards the continuum limit these should also scale like \(1+\mathrm{O}(a^2)\), if the SF determinations are \(\hbox {O}(a)\) improved. In Fig. 9 we show the continuum limit of the ratios between the standard SF determinations of Ref. [22] and the \(\chi \mathrm{SF}\) results of Table 7. We here consider both definitions of this reference, and label them as \(Z_{\mathrm{A}}^\mathrm{SF} = Z_{\mathrm{A,0}}\) and \(Z_{\mathrm{A,\,con}}^\mathrm{SF} = Z_{\mathrm{A,0}}^\mathrm{con}\), respectively (cf. [22] for the exact definitions).

As one can see in Fig. 9, for their preferred definition, \(Z^\mathrm{SF}_{\mathrm{A}}\), the expected scaling is only setting in around \(a^2/t_0<0.2\), where the SF and \(\chi \mathrm{SF}\) determinations differ by a couple of per cent. At the coarsest lattice spacing, corresponding to \(\beta =3.4\), the deviation from the \(\hbox {O}(a^2)\) scaling is significant. The results for \(Z_{\mathrm{A}}^g\) show the largest deviation from the SF determination, which is about 6%. Considering the perturbatively improved \(\chi \mathrm{SF}\) results this difference is somewhat reduced to 4–5%, but \(\hbox {O}(a^2)\) scaling is not observed either. If we consider instead the alternative definition, \(Z^\mathrm{SF}_{\mathrm{A,\,con}}\), the deviation is reduced to about 2 per cent at the coarsest lattice spacing for \(Z_{\mathrm{A}}^l\), while, remarkably, the results for \(Z_{\mathrm{A}}^g\) and \(Z^\mathrm{con}_{\mathrm{A,0}}\) are compatible within errors. In particular, the difference between this SF and both our \(\chi \mathrm{SF}\) determinations is perfectly compatible with an \(\hbox {O}(a^2)\) effect over the whole range of lattice spacings considered. While discretization effects can only be defined with respect to some reference definition, we conclude that the alternative SF definition \(Z_{\mathrm{A,\,con}}^\mathrm{SF}\) is, within errors, perfectly scaling with \(a^2\) for \(\beta \ge 3.4\) relative to all \(\chi \mathrm{SF}\) definitions, whereas the preferred definition \(Z_{\mathrm{A}}^\mathrm{SF}\) of Ref. [22] requires much finer lattices before this expected asymptotic behaviour sets in. With hindsight, \(Z_{\mathrm{A,\,con}}^\mathrm{SF}\) seems to be a better choice within the SF framework and also has been the preferred SF definition within the \({N_{\mathrm{f}}}=2\) setup of Refs. [8, 44].

6 Summary and conclusions

We have used a new method [20] based on the chirally rotated Schrödinger functional [14] to obtain high precision results for the normalization constants of the Noether currents corresponding to non-singlet chiral and flavour symmetries. The matrix elements of these axial and vector currents play a crucial rôle in various contexts of hadronic physics. Our method differs from the traditional Ward identity method [11, 12] in that it compares correlation functions which are related by finite chiral or flavour rotations, rather than infinitesimal ones. The major advantage compared to the Ward identity method consists in the avoidance of 3- and 4-point functions in favour of simple 2-point functions. This very significantly improves on the precision achieved in previous determinations [8, 21, 22, 44, 47]. In particular, for the case of \(Z_{\mathrm{A}}\), we obtain a reduction of the error by up to an order of magnitude (cf. Figs. 3, 6). The relatively poor precision obtained for \(Z_{\mathrm{A}}\) with the traditional Ward identity methods [8, 21, 22, 44] (around the percent level at the coarsest lattice spacings of interest) has now become a limiting factor in several applications. For this reason, our results are in high demand and have already been used in several works [1, 23, 48]. In particular, the precise \({N_{\mathrm{f}}}=2+1\) scale setting from a linear combination of \(f_K\) and \(f_\pi \) in Ref. [1] crucially relies on our values of \(Z_{\mathrm{A}}^l\) in Table 6 and the associated uncertainty is negligible compared to the statistical error of the bare hadronic matrix elements. In turn, the precise scale setting result of [1] is entering almost all studies done with CLS gauge configurations: in particular it has enabled the precise result for the 3-flavour QCD \(\Lambda \)-parameter and thus \(\alpha _s(m_Z)\) by the ALPHA-collaboration [2, 41, 49, 50]. Further applications of our \(Z_{\mathrm{A}}\)-results include the non-perturbative quark mass renormalization factor in [23] and the related determination of the light and strange quark masses [48]. Regarding the \({N_{\mathrm{f}}}=2\) case, the potential improvement of the scale setting in Ref. [8] due to our \(Z_{\mathrm{A}}\)-results would be very significant, too. Tentative estimates anticipate a gain by a factor 3–6 in precision, when going from the finest to the coarsest lattice spacing [51].

In order to maximize the usefulness of our results we have chosen the same actions and the same \(\beta \)-values for \({N_{\mathrm{f}}}=2\) and \({N_{\mathrm{f}}}=3\) lattice QCD as used by the CLS initiative [8, 10]. Hence, anyone working with CLS gauge configurations will be able to directly use our results: for \({N_{\mathrm{f}}}=2\) we recommend using \(Z_{\mathrm{A,V,\,sub}}^l\) from Table 4, and for \({N_{\mathrm{f}}}=3\) we recommend using \(Z_{\mathrm{A,V,\,sub}}^l\) from either Table 6 or Table 7. Although the results for \(Z_{\mathrm{A,V,\,sub}}^l\) are slightly less precise than those for \(Z_{\mathrm{A,V,\,sub}}^g\), their \(L/a\)-interpolations turn out to be more robust. Furthermore, the effect of the perturbative subtraction of cutoff effects is rather small and only marginally significant with current errors. While the precise choice of the \(\chi \mathrm{SF}\) results for the Z-factors is not crucial, it is very important to be consistent and not to switch definitions when changing \(\beta \). Only then are cutoff effects guaranteed to vanish smoothly at a rate \(\propto a^2\).

Our determination of \(Z_{\mathrm{A,V}}(\beta )\) was carried out for each \(\beta \)-value independently, in order to avoid introducing statistical correlations between physics results at different lattice spacings. However, it is straightforward to fit our Z-factors to a smooth function of \(\beta \) (or \(g_0^2\)), which interpolates to any intermediate \(\beta \)-value. We have included a few such fits to our preferred definitions \(Z_{\mathrm{A,V,\,sub}}^l\) in Appendix C. We also include fits which incorporate the expected perturbative behaviour to 1-loop order. However, the high precision obtained in the \(\beta \)-range covered by the data cannot be guaranteed outside this range. If a similar precision is required at higher \(\beta \), an extension of our non-perturbative determination will be required. If \(t_0/a^2\) were known for higher \(\beta \)-values, one could extend our chosen line of constant physics to cover another factor of 2 or so in the lattice spacing. The required simulations of the \(\chi \hbox {SF}\) for lattice sizes up to \(L/a=32\) would be feasible with current resources. Going beyond this range, it may be advisable to choose a different line of constant physics from a finite volume observable, or at least to estimate the errors incurred by deviating from the original choice.
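
An interpolation in \(\beta \) of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions and not a reproduction of the fits in Appendix C: the ansatz is taken to be a polynomial in \(g_0^2 = 6/\beta \) with the tree-level value 1 and a fixed 1-loop coefficient, the errors at different \(\beta \) are treated as uncorrelated, and all numbers (`Z_ONE_LOOP`, `TRUE_COEFFS`, the \(\beta \)-values and errors) are synthetic placeholders rather than values from our tables. The function name `fit_z_factor` is likewise hypothetical.

```python
import numpy as np


def fit_z_factor(beta, z, dz, z_one_loop, n_free=2):
    """Weighted least-squares fit of the assumed ansatz
        Z(g0^2) = 1 + z_one_loop * g0^2 + sum_k c_k * g0^(2k),  k = 2 .. n_free + 1,
    with g0^2 = 6 / beta. The 1-loop coefficient is held fixed; only c_k are fitted.
    """
    g0sq = 6.0 / np.asarray(beta, dtype=float)
    z = np.asarray(z, dtype=float)
    dz = np.asarray(dz, dtype=float)
    powers = np.arange(2, 2 + n_free)

    # Remove the fixed tree-level and 1-loop parts, leaving a linear problem.
    y = z - 1.0 - z_one_loop * g0sq
    design = g0sq[:, None] ** powers[None, :]

    # Dividing each row by its error turns this into a chi^2 minimisation.
    design_w = design / dz[:, None]
    coeffs, *_ = np.linalg.lstsq(design_w, y / dz, rcond=None)
    cov = np.linalg.inv(design_w.T @ design_w)  # covariance of the fitted c_k

    def z_of_beta(b):
        g = 6.0 / np.asarray(b, dtype=float)
        higher = sum(c * g**p for c, p in zip(coeffs, powers))
        return 1.0 + z_one_loop * g + higher

    return coeffs, cov, z_of_beta


# Synthetic, noise-free inputs (NOT results from the tables of this paper).
Z_ONE_LOOP = -0.10                      # hypothetical 1-loop coefficient
TRUE_COEFFS = np.array([-0.02, 0.005])  # hypothetical higher-order terms
betas = np.linspace(3.4, 3.85, 5)       # illustrative beta-values
g0sq = 6.0 / betas
z_data = (1.0 + Z_ONE_LOOP * g0sq
          + TRUE_COEFFS[0] * g0sq**2 + TRUE_COEFFS[1] * g0sq**3)
dz_data = np.full_like(z_data, 1.0e-3)  # hypothetical uncorrelated errors

coeffs, cov, z_of_beta = fit_z_factor(betas, z_data, dz_data, Z_ONE_LOOP)
print(coeffs)          # fitted c_2, c_3
print(z_of_beta(3.6))  # interpolated Z at an intermediate beta-value
```

Fixing the 1-loop term rather than fitting it constrains the behaviour towards large \(\beta \), but, as stated above, this does not guarantee the precision of the interpolating function outside the range covered by the data.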

In applications to hadronic physics one would also like to control the \(\hbox {O}(a)\) effects cancelled by the counterterms to the currents. Close to the chiral limit, one essentially requires the counterterm coefficients \(c_{\mathrm{A,V}}\) [26, 27]. We emphasize that our method of determining the Z-factors does not rely on any assumptions about these counterterms and can therefore be combined with results for \(c_{\mathrm{A,V}}\) from other studies, e.g. [47, 52]. The same remark applies to the b-coefficients multiplying \(\hbox {O}(am)\) counterterms, which have recently been determined for the vector current in Ref. [53].

Looking beyond direct applications of our results in the CLS context, it is clear that the precision gains of this method are generic and could be realized with any other choice of Wilson type fermions. One would need to implement the \(\chi \hbox {SF}\) boundary conditions following Ref. [14], as well as the \(\chi \hbox {SF}\) correlation functions [20]. We also note that the computer resources required are rather modest: our largest lattice size was \(16^4\). Indeed, the main work for the present results went into painstakingly following lines of constant physics, determining the corresponding uncertainties and propagating them to the Z-factors. We have reported many technical details in the hope that any further applications of the method will be able to benefit from our experience. One possible improvement we did not explore is to measure the derivatives (3.9), (3.10) by computing the corresponding operator insertions into the correlation functions directly on the tuned ensembles. This was done e.g. in Refs. [1, 2] for the PCAC mass, \(t_0\), and other observables, and would certainly allow one to further improve on the precision, as no assumptions on the derivatives need to be made.

Possible future applications of the \(\chi \hbox {SF}\) include the determination of the ratio between pseudo-scalar and scalar renormalization constants, \(Z_{\mathrm{P}}/Z_{\mathrm{S}}\). Advantages of the \(\chi \hbox {SF}\) are also expected for scale-dependent problems, such as the renormalization of 4-quark operators, where the contamination by \(\hbox {O}(a)\) effects could be significantly reduced by the mechanism of automatic \(\hbox {O}(a)\) improvement [19]. Finally, the \(\chi \hbox {SF}\) offers new methods for the determination of \(\hbox {O}(a)\) improvement coefficients, which we hope to explore in the future.